fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 15:50:11 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	bb5d09da6c	intel/compiler: Use named NIR intrinsic const index accessors In the early days of NIR, you had to prod at inst->const_index[] directly, but a long while back, we added handy accessor functions that let you use the actual name of the thing you want instead of memorizing the exact order of parameters. Also rewrite a comment I had a hard time parsing. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18067>	2022-08-16 05:44:30 +00:00
Marcin Ślusarz	30c0f2bfbb	intel/compiler: there are 4 types of fences on gfx >= 12.5 Found by code inspection. There's an assert later checking that we haven't overflown this array, so this change probably doesn't matter for any workload. Cc: 22.1 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>	2022-08-02 09:31:24 +00:00
Marcin Ślusarz	2bd148c990	intel/compiler: emit URB fences for TASK/MESH Cc: 22.1 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>	2022-08-02 09:31:24 +00:00
Ian Romanick	377246318a	intel/fs: Eliminate "masked" and "per slot offset" URB messages All of this information can be inferred from the sources. v2: Fix "error: unused variable 'opcode'" detected by marge-bot. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>	2022-07-26 17:25:19 +00:00
Ian Romanick	1b17f8fc5a	intel/fs: Make logical URB read instructions more like other logical instructions No shader-db changes on any Intel platform Fossil-db results: Tiger Lake Instructions in all programs: 156926440 -> 156926470 (+0.0%) Instructions hurt: 15 Cycles in all programs: 7513099349 -> 7513099402 (+0.0%) Cycles hurt: 15 Ice Lake and Skylake had similar results. (Ice Lake shown) Cycles in all programs: 9099036492 -> 9099036489 (-0.0%) Cycles helped: 1 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>	2022-07-26 17:25:19 +00:00
Ian Romanick	349a040f68	intel/fs: Make logical URB write instructions more like other logical instructions The changes to fs_visitor::validate() helped track down a place where I initially forgot to convert a message to the new sources layout. This had caused a different validation failure in dEQP-GLES31.functional.tessellation.tesscoord.triangles_equal_spacing, but this were not detected until after SENDs were lowered. Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19951145 -> 19951133 (<.01%) instructions in affected programs: 2429 -> 2417 (-0.49%) helped: 8 / HURT: 0 total cycles in shared programs: 858904152 -> 858862331 (<.01%) cycles in affected programs: 5702652 -> 5660831 (-0.73%) helped: 2138 / HURT: 1255 Broadwell total cycles in shared programs: 904869459 -> 904835501 (<.01%) cycles in affected programs: 7686744 -> `7652786` (-0.44%) helped: 2861 / HURT: 2050 Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) Instructions in all programs: 141442369 -> 141442032 (-0.0%) Instructions helped: 337 Cycles in all programs: 9099270231 -> 9099036492 (-0.0%) Cycles helped: 40661 Cycles hurt: 28606 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>	2022-07-26 17:25:18 +00:00
Emma Anholt	94bd06256a	intel/fs: Simplify brw_barycentric_mode() args. Reduce a bit of mode lookup noise I was tracing through trying to resolve the previous bug. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17381>	2022-07-19 01:25:47 +00:00
Lionel Landwerlin	2d1f021e16	intel/fs: Set NonPerspectiveBarycentricEnable when the interpolator needs it. [anholt: changed to make all drivers do the right thing by moving the payload barycentric check into the compiler] Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17381>	2022-07-19 01:25:47 +00:00
Ian Romanick	a477587b4a	intel/fs: Add _LOGICAL versions of URB messages The lowering is currently fake. It just changes the opcode from the _LOGICAL version to the non-_LOGICAL version. v2: Remove some rebase cruft. 's/gfx8_//;s/simd8_/' in brw_instruction_name. Both suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>	2022-07-08 19:45:34 +00:00
Lionel Landwerlin	9680e0e4a2	intel/fs: ray query fix for global address With stages dispatching with a mask, we can run into situations where we don't have the global address in all lanes. The existing code always assumed we had the addres in at least lane0. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `bb40e999d1` ("intel/nir: use a single intel intrinsic to deal with ray traversal") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17330>	2022-07-08 00:36:04 +00:00
Lionel Landwerlin	1b6c74c48d	intel/fs: make sure memory writes have landed for thread dispatch The thread dispatch SEND instructions will dispatch new threads immediately even before the caller of the SEND instruction has reached EOT. So we really need to make sure all the memory writes are visible to other threads within the DSS before the SEND. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15755>	2022-07-07 09:48:20 +03:00
Marcin Ślusarz	f4386b81e6	intel: fix typos found by codespell Acked-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17191>	2022-06-27 10:20:55 +00:00
Marcin Ślusarz	f871aa10a1	intel/compiler: assert that base is 0 for [load\|store]_shared intrins Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>	2022-06-22 10:32:13 +00:00
Francisco Jerez	96e7e92f0d	intel/fs/xehp+: Emit scheduling fence for all NIR barriers on platforms with LSC. Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15743>	2022-06-12 12:56:47 +03:00
Tapani Pälli	47773a5d7c	intel/fs: setup SEND message descriptor from nir scope This fixes many tests in following groups on DG2: dEQP-VK.memory_model.* dEQP-VK.fragment_shader_interlock.* v2: use memory scope and setup descriptor also for barriers without defined scope (Curro), use local scope and flush type none with NIR_SCOPE_NONE scope, cleanups (Lionel) v3: use LSC_FENCE_THREADGROUP for NIR_SCOPE_WORKGROUP, remove default case (Curro), use eviction if scope was not defined, use LSC_FENCE_GPU scope for vertex stage v4: use LSC_FENCE_TILE independent of stage for device scope (Curro) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15743>	2022-06-12 12:29:47 +03:00
Kenneth Graunke	9886615958	intel/compiler: Move spill/fill tracking to the register allocator Originally, we had virtual opcodes for scratch access, and let the generator count spills/fills separately from other sends. Later, we started using the generic SHADER_OPCODE_SEND for spills/fills on some generations of hardware, and simply detected stateless messages there. But then we started using stateless messages for other things: - anv uses stateless messages for the buffer device address feature. - nir_opt_large_constants generates stateless messages. - XeHP curbe setup can generate stateless messages. So counting stateless messages is not accurate. Instead, we move the spill/fill accounting to the register allocator, as it generates such things, as well as the load/store_scratch intrinsic handling, as those are basically spill/fills, just at a higher level. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16691>	2022-05-25 06:56:01 +00:00
Marcin Ślusarz	29a778fa6b	intel/compiler: print name of the unhandled intrinsic Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16493>	2022-05-13 09:43:02 +00:00
Lionel Landwerlin	04bd007757	intel/fs: require memory fence commit bit on Gfx9 Fixes a hang on Gfx9 GT1 : dEQP-VK.compute.zero_initialize_workgroup_memory.max_workgroup_memory.128 Tested-by: Mark Janes <markjanes@swizzler.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15596>	2022-04-17 21:24:17 +00:00
Jason Ekstrand	a482877c70	intel/fs: Implement 16-bit [ui]mul_high Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>	2022-04-12 23:19:38 +00:00
Mykhailo Skorokhodov	9c7e750ffe	intel/fs: Enable b2f(inot(a)) and b2i(inot(a)) optimization for Gfx12+ The commit enables the optimization for Intel Gfx12+ graphics. Tigerlake ``` total instructions in shared programs: 1289326 -> 1289015 (-0.02%) instructions in affected programs: 37841 -> 37530 (-0.82%) helped: 78 HURT: 9 helped stats (abs) min: 1 max: 26 x̄: 4.69 x̃: 3 helped stats (rel) min: 0.10% max: 12.50% x̄: 2.07% x̃: 1.21% HURT stats (abs) min: 1 max: 18 x̄: 6.11 x̃: 4 HURT stats (rel) min: 0.16% max: 1.95% x̄: 0.94% x̃: 0.61% 95% mean confidence interval for instructions value: -4.95 -2.20 95% mean confidence interval for instructions %-change: -2.34% -1.18% Instructions are helped. total cycles in shared programs: 105606388 -> 105606442 (<.01%) cycles in affected programs: 620119 -> 620173 (<.01%) helped: 49 HURT: 28 helped stats (abs) min: 2 max: 3618 x̄: 228.63 x̃: 12 helped stats (rel) min: 0.02% max: 23.31% x̄: 4.60% x̃: 1.11% HURT stats (abs) min: 1 max: 2142 x̄: 402.04 x̃: 29 HURT stats (rel) min: 0.01% max: 36.42% x̄: 5.01% x̃: 0.46% 95% mean confidence interval for cycles value: -151.80 153.20 95% mean confidence interval for cycles %-change: -3.00% 0.79% Inconclusive result (value mean confidence interval includes 0). ``` Related-to: `7725d60938` Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14017>	2022-04-12 10:55:05 +00:00
Kenneth Graunke	6fa66ac228	intel/compiler: Implement nir_intrinsic_last_invocation We haven't exposed this intrinsic as it doesn't directly correspond to anything in SPIR-V. However, it's used internally by some NIR passes, namely nir_opt_uniform_atomics(). We reuse most of the infrastructure in brw_find_live_channel, but with LZD/ADD instead of FBL. A new SHADER_OPCODE_FIND_LAST_LIVE_CHANNEL is like SHADER_OPCODE_FIND_LIVE_CHANNEL but from the other side. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Caio Oliveira	f82731d0d7	intel/fs: Fix IsHelperInvocation for the case no discard/demote are used Use emit_predicate_on_sample_mask() helper that does check where to get the correct mask depending on whether discard/demote was used or not. Fixes: `45f5db5a84` ("intel/fs: Implement "demote to helper invocation"") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>	2022-03-25 08:20:27 +00:00
Daniel Schürmann	832d67e99d	nir: rename nir_src_is_dynamically_uniform to nir_src_is_always_uniform As this function doesn't check for any control-flow dependence, it only returns true for statically (or globally) uniform values. The same holds true for is_binding_dynamically_uniform() in nir_opt_gcm(). Rename to better reflect that property. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14994>	2022-03-23 14:02:08 +00:00
Lionel Landwerlin	4ec5da7270	intel/nir/fs: replace COMPUTE \|\| KERNEL by gl_shader_stage_is_compute() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Ian Romanick	19330eeb1d	intel/fs: Force destination types on DP4A instructions Most of the time, this doesn't matter. On the versions with _sat, if the destination type is incorrect, the clamping will not happen correctly. Fixes the following CTS tests: dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_uu_v4i8_out32 v2: Update anv-tgl-fails.txt. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Fixes: `0f809dbf40` ("intel/compiler: Basic support for DP4A instruction") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15417>	2022-03-17 22:39:04 +00:00
Lionel Landwerlin	6d9ae6ec1e	intel: add a new intrinsic to get the shader stage from bindless shaders We'll use this to apply ray tracing operations in our trivial return shader based on the stage we're in. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	bb40e999d1	intel/nir: use a single intel intrinsic to deal with ray traversal In the future we'll want to reuse this intrinsic to deal with ray queries. Ray queries will use a different global pointer and programmatically change the control/level arguments of the trace send instruction. v2: Comment on barrier after sync trace instruction (Caio) Generalize lsc helper (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:25 +00:00
Lionel Landwerlin	c89024e446	intel/fs: don't set allow_sample_mask for CS intrinsics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `77486db867` ("intel/fs: Disable sample mask predication for scratch stores") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Lionel Landwerlin	9d22f8ed23	intel/fs: add support for ACCESS_ENABLE_HELPER v2: Factor out fragment shader masking on send messages (Caio) Update comments (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Lionel Landwerlin	c199f44d17	intel/fs: name sources for A64 opcodes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Lionel Landwerlin	61c9b7a82e	intel/fs: add support for Eu/Thread/Lane id This index will be used for accessing ray query data in memory. v2: Drop a MOV (Caio) v3: Rework back code emission (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Lionel Landwerlin	3dabe93257	intel/fs: rework dss_id opcode into generic opcode We'll want different types of IDs based on topology. Let's make this more flexible and also move the bit shifting code a layer above where it's easier to do bitshifting operations, especially if you need to stash things into temporary registers. v2: Keep previous comment. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Lionel Landwerlin	4deb8e86df	nir: change intel dss_id intrinsic to topology_id This will allow to reuse the same intrinsic for various topology based ID. v2: fix intrinsic comment (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>	2022-02-08 12:55:24 +00:00
Caio Oliveira	8bab8f6422	compiler, intel: Add gl_shader_stage_is_mesh() And replace the previous Intel-specific function. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14823>	2022-02-01 17:41:25 +00:00
Ian Romanick	945fb51fb5	intel/fs: Fix gl_FrontFacing optimization on Gfx12+ It's not obvious why the (gl_FrontFacing ? -1.0 : 1.0) case was handled different for Gfx12+ than for previous generations, and it's not correct. It tries to negate the result as an integer, and it does this before the mask operation that clears the other bits in the value. When we eventually support dual-SIMD8 dispatch, the other front-facing bit is in g1.6 at bit 15, so similar code should be possible there. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `c92fb60007` ("intel/fs/gen12: Implement gl_FrontFacing on gen12+.") Closes: #5876 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14625>	2022-01-20 22:37:18 +00:00
Rohan Garg	af13119993	intel/fs: OpImageQueryLod does not support arrayed images as an operand When we lower SPIR-V to NIR for textures in vtn_handle_texture, we only bump the number of coordinate components when the op is not a lod query. Update the assert to take this into account. This fixes: - dEQP-VK.robustness.robustness2.bind.template.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag Fixes: `231337a1` ("intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13925>	2022-01-07 10:53:35 +00:00
Dave Airlie	e12b0d0d60	intel/compiler: remove gfx6 gather wa from backend. Crocus lowers this in the frontend, they key member is still used but reset prior to backend. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14202>	2021-12-22 21:37:55 +00:00
Jason Ekstrand	3c89dbdbfe	intel/fs: Implement the sample_pos_or_center system value Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>	2021-12-17 16:02:16 +00:00
Jason Ekstrand	ac7255ed1e	intel/fs: Return fs_reg directly from builtin setup helpers There's no good reason why we're allocating them on the heap and returning a pointer. Return the fs_reg directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198>	2021-12-17 16:02:16 +00:00
Ian Romanick	2ca13abcce	intel/fs: Use HF as destination type for F32TOF16 in fquantize2f16 Having an integer destination type instead of a float destination type confuses the SWSB code. This causes problems on some Intel GPUs. Fix this by using the correct type in the destination of the F32TOF16 opcode. Gfx7 doesn't have the HF type, so continue to emit W on that platform. The assertions in brw_F32TO16 (brw_eu_emit.c) are updated to reflect this. In scalar mode, UD is never emitted as a destination type for this opcode, so remove it from the allowed types in the assertion. I also condidered doing something like `de55fd358f` ("intel/fs/xehp: Teach SWSB pass about the exec pipeline of FS_OPCODE_PACK_HALF_2x16_SPLIT."), but Curro recommended that just using the correct types is a better fix. I agree. v2: Add missing changes to fs_generator::generate_pack_half_2x16_split. I'm not sure how I (and the Intel CI) missed that the first time. :( v3: Fix copy-and-paste issue in the v2 fix. Noticed by Tapani. Reviewed-by: Francisco Jerez <currojerez@riseup.net> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14181>	2021-12-15 20:03:51 +00:00
Rafael Antognolli	a026d2d11c	intel/compiler: Assert that unsupported tg4 offsets were lowered for XeHP Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14142>	2021-12-13 16:59:44 -08:00
Jason Ekstrand	b8d04863e2	intel/fs: Drop high_quality_derivatives We've never bothered to hook it up in crocus or iris. If we do in the future, it should probably be a NIR pasa anyway. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Jason Ekstrand	278d12f991	intel/fs,vec4: Drop prog_data binding tables Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Jason Ekstrand	8f3c100d61	intel/fs,vec4: Drop uniform compaction and pull constant support The only driver using these was i965 and it's gone now. This is all dead code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Marcin Ślusarz	bd2c11dfa8	intel/compiler: Load draw_id from XP0 in Task/Mesh shaders Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Caio Oliveira	db23c41537	intel/compiler: Add backend compiler basics for Task/Mesh Task/Mesh stages are CS-like stages, and include many builtins (e.g. workgroup ID/index) and intrinsics (e.g. workgroup memory primitives) originally present only in CS. This commit add two new stages (task and mesh) that 'inherit' from CS by embedding a brw_cs_prog_data in their own prog_data structure, so that CS functionality can be easily reused. They also currently use the same helpers to select the SIMD variant to use -- that was recently added for CS. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Caio Oliveira	18e1c9c542	intel/compiler: Don't stage Task/Mesh outputs in registers Since the outputs are shared among the whole workgroup, these can't be staged in registers as they will not be always visible for all the invocations (to read/flush). If they ever need to be staged, we should use SLM for that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Caio Oliveira	be89ea3231	intel/compiler: Handle per-primitive inputs in FS In Fragment Shader, regular inputs are laid out in the thread payload in a one dword per each half-GRF, that gives room for having the two delta dwords needed for interpolation. Per-primitive inputs are laid out before the regular inputs, and since there's no need to have delta information, they are packed. So half-GRF will be fully filled with 4 dwords of input. When num_per_primitive_inputs is zero (the default case), behavior should be the same as before. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Caio Oliveira	7938c38778	intel/compiler: Properly lower WorkgroupId for Task/Mesh Task/Mesh currently only support a single dimension (both in NV API and HW), so make Y and Z be zero. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Topi Pohjolainen	31e3e32625	intel/compiler: Deprecate ld2dms and use ld2dms_w instead Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>	2021-11-22 21:27:30 -08:00

1 2 3 4 5 ...

496 commits