fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 04:50:11 +01:00

Author	SHA1	Message	Date
Marcin Ślusarz	4fddef33d5	intel/compiler: invalidate all metadata in brw_nir_lower_intersection_shader New "if" blocks were inserted. Fixes: `303378e1dd` ("intel/rt: Add lowering for combined intersection/any-hit shaders") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15924>	2022-04-19 11:43:55 +00:00
Alexey Bozhenko	2d7d907ad1	intel/compiler: fix singleton pointer coverity warning fix brw_kernel::stats member that was declared as a variable but used as a pointer to array of 3 elements CID: 1503279 Signed-off-by: Bozhenko Alexey <oleksii.bozhenko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15975>	2022-04-19 12:36:10 +03:00
Lionel Landwerlin	04bd007757	intel/fs: require memory fence commit bit on Gfx9 Fixes a hang on Gfx9 GT1 : dEQP-VK.compute.zero_initialize_workgroup_memory.max_workgroup_memory.128 Tested-by: Mark Janes <markjanes@swizzler.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15596>	2022-04-17 21:24:17 +00:00
Jason Ekstrand	ad0dc8e4ab	intel/compiler: Set lower_fisnormal Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15985>	2022-04-16 00:26:43 +00:00
Lionel Landwerlin	e11bedb9f5	intel/fs: add a note on possible optimization of root node address Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>	2022-04-13 11:24:49 +00:00
Lionel Landwerlin	9c0805ef91	intel/fs: fix metadata preserve on trace_ray intrinsic `c78be5da30` ("intel/fs: lower ray query intrinsics") introduced a helper function using nir_(push\|pop)_if which invalidated dominance & block_index for the replacement of nir_intrinsic_rt_trace_ray. We can still keep dominance/block_index metadata for the lowering of nir_intrinsic_rt_execute_callable though. This change uses 2 different lowering function with correct metadata preservation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>	2022-04-13 11:24:49 +00:00
Jason Ekstrand	69b5424ea4	intel/nir: Lower 8 and 16-bit bitwise unops Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>	2022-04-12 23:19:38 +00:00
Jason Ekstrand	a482877c70	intel/fs: Implement 16-bit [ui]mul_high Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>	2022-04-12 23:19:38 +00:00
Mykhailo Skorokhodov	9c7e750ffe	intel/fs: Enable b2f(inot(a)) and b2i(inot(a)) optimization for Gfx12+ The commit enables the optimization for Intel Gfx12+ graphics. Tigerlake ``` total instructions in shared programs: 1289326 -> 1289015 (-0.02%) instructions in affected programs: 37841 -> 37530 (-0.82%) helped: 78 HURT: 9 helped stats (abs) min: 1 max: 26 x̄: 4.69 x̃: 3 helped stats (rel) min: 0.10% max: 12.50% x̄: 2.07% x̃: 1.21% HURT stats (abs) min: 1 max: 18 x̄: 6.11 x̃: 4 HURT stats (rel) min: 0.16% max: 1.95% x̄: 0.94% x̃: 0.61% 95% mean confidence interval for instructions value: -4.95 -2.20 95% mean confidence interval for instructions %-change: -2.34% -1.18% Instructions are helped. total cycles in shared programs: 105606388 -> 105606442 (<.01%) cycles in affected programs: 620119 -> 620173 (<.01%) helped: 49 HURT: 28 helped stats (abs) min: 2 max: 3618 x̄: 228.63 x̃: 12 helped stats (rel) min: 0.02% max: 23.31% x̄: 4.60% x̃: 1.11% HURT stats (abs) min: 1 max: 2142 x̄: 402.04 x̃: 29 HURT stats (rel) min: 0.01% max: 36.42% x̄: 5.01% x̃: 0.46% 95% mean confidence interval for cycles value: -151.80 153.20 95% mean confidence interval for cycles %-change: -3.00% 0.79% Inconclusive result (value mean confidence interval includes 0). ``` Related-to: `7725d60938` Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14017>	2022-04-12 10:55:05 +00:00
Ian Romanick	b5fa43952a	intel/fs: Better handle constant sources of FS_OPCODE_PACK_HALF_2x16_SPLIT I noticed that a LOT of fragment shaders in Shadow of the Tomb Raider, for instance, end up with a sequence of NIR like: vec1 32 ssa_2 = load_const (0x00000000 = 0.000000) ... vec1 32 ssa_191 = pack_half_2x16_split ssa_188, ssa_2 vec1 32 ssa_192 = pack_half_2x16_split ssa_189, ssa_2 vec1 32 ssa_193 = pack_half_2x16_split ssa_190, ssa_2 This results in an assembly sequence like: mov(8) g28<1>UD 0x00000000UD mov(8) g21<2>HF g28<8,8,1>F shl(8) g21<1>UD g21<8,8,1>UD 0x00000010UD mov(8) g21<2>HF g25<8,8,1>F mov(8) g19<2>HF g28<8,8,1>F shl(8) g19<1>UD g19<8,8,1>UD 0x00000010UD mov(8) g19<2>HF g23<8,8,1>F mov(8) g20<2>HF g28<8,8,1>F shl(8) g20<1>UD g20<8,8,1>UD 0x00000010UD mov(8) g20<2>HF g24<8,8,1>F After this commit, the generated assembly is: mov(8) g21<1>UD 0x00000000UD mov(8) g21<2>HF g23<8,8,1>F mov(8) g19<1>UD 0x00000000UD mov(8) g19<2>HF g17<8,8,1>F mov(8) g20<1>UD 0x00000000UD mov(8) g20<2>HF g18<8,8,1>F Tiger Lake, Ice Lake, Skylake, and Haswell had similar results. (Ice Lake shown) total instructions in shared programs: 20119086 -> 20119034 (<.01%) instructions in affected programs: 9056 -> 9004 (-0.57%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 16 x̄: 6.50 x̃: 4 helped stats (rel) min: 0.29% max: 1.75% x̄: 1.00% x̃: 0.98% 95% mean confidence interval for instructions value: -11.01 -1.99 95% mean confidence interval for instructions %-change: -1.56% -0.44% Instructions are helped. total cycles in shared programs: 861019414 -> 861021044 (<.01%) cycles in affected programs: 279862 -> 281492 (0.58%) helped: 4 HURT: 2 helped stats (abs) min: 6 max: 936 x̄: 239.00 x̃: 7 helped stats (rel) min: 0.03% max: 8.13% x̄: 2.09% x̃: 0.09% HURT stats (abs) min: 18 max: 2568 x̄: 1293.00 x̃: 1293 HURT stats (rel) min: 0.36% max: 1.14% x̄: 0.75% x̃: 0.75% 95% mean confidence interval for cycles value: -972.56 1515.89 95% mean confidence interval for cycles %-change: -4.77% 2.49% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 17812327 -> 17812263 (<.01%) instructions in affected programs: 9867 -> 9803 (-0.65%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 28 x̄: 8.00 x̃: 4 helped stats (rel) min: 0.32% max: 1.80% x̄: 1.00% x̃: 0.95% 95% mean confidence interval for instructions value: -15.46 -0.54 95% mean confidence interval for instructions %-change: -1.54% -0.47% Instructions are helped. total cycles in shared programs: 904768620 -> 904773291 (<.01%) cycles in affected programs: 454799 -> 459470 (1.03%) helped: 4 HURT: 4 helped stats (abs) min: 36 max: 586 x̄: 344.50 x̃: 378 helped stats (rel) min: 0.47% max: 4.04% x̄: 2.01% x̃: 1.77% HURT stats (abs) min: 1 max: 5572 x̄: 1512.25 x̃: 238 HURT stats (rel) min: <.01% max: 2.77% x̄: 1.46% x̃: 1.53% 95% mean confidence interval for cycles value: -1122.40 2290.15 95% mean confidence interval for cycles %-change: -2.26% 1.71% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 18581 -> 18579 (-0.01%) spills in affected programs: 323 -> 321 (-0.62%) helped: 1 HURT: 0 total fills in shared programs: 24985 -> 24981 (-0.02%) fills in affected programs: 1348 -> 1344 (-0.30%) helped: 1 HURT: 0 Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) Instructions in all programs: 143585431 -> 143513657 (-0.0%) Instructions helped: 14403 Cycles in all programs: 8439312778 -> 8439371578 (+0.0%) Cycles helped: 10570 Cycles hurt: 3290 Gained: 146 Lost: 74 All of the lost and gained fossil-db shaders are SIMD32 fragment shaders. 14,247 of the affected shaders are from Shadow of the Tomb Raider. 154 are from Batman Arkham Origins, and the remaining two are from Octopath Traveler. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15089>	2022-04-07 18:26:23 +00:00
Ian Romanick	c08302670b	intel/compiler: Fix sample_d messages on DG2 DG2 can only do sample_d and sample_d_c on 1D and 2D surfaces. The maximum number of gradient components and coordinate components should be 2. In spite of this limitation, the Bspec lists a mysterious R component before the min_lod, so the maximum coordinate components is 3. Fixes the following Vulkan CTS failures on DG2: dEQP-VK.glsl.texture_functions.texturegradclamp.isampler1d_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.isampler2d_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1d_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1d_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2d_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2d_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler1d_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler2d_fragment The Fixes: tag below is a bit misleading. This commit fixes some test cases similar to ones fixed by the Fixes: commit. I just want to make sure this commit gets applied everywhere that commit was also applied. Fixes: `635ed58e52` ("intel/compiler: Lower txd for 3D samplers on XeHP.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15781>	2022-04-07 17:09:28 +00:00
Lionel Landwerlin	b5031bd6f7	intel/nir: don't report progress on rayqueries if no queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15769>	2022-04-07 08:24:19 +00:00
Ian Romanick	7fd1955412	nir: intel/compiler: Lower TXD on array surfaces on DG2+ DG2 can only do sample_d and sample_d_c on 1D and 2D surfaces. Cube maps and 3D surfaces were already handled, but 1D array and 2D array surfaces were not. Fixes the following Vulkan CTS failures on DG2: dEQP-VK.glsl.texture_functions.texturegradclamp.isampler1darray_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.isampler2darray_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1darray_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1darray_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2darray_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2darray_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler1darray_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler2darray_fragment The Fixes: tag below is a bit misleading. This commit adds another lowering, similar to the one in the Fixes: commit, that probably should have been added at the same time. I just want to make sure this commit gets applied everywhere that commit was also applied. Fixes: `635ed58e52` ("intel/compiler: Lower txd for 3D samplers on XeHP.") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15681>	2022-03-31 12:59:18 -07:00
Lionel Landwerlin	684a4ea30c	intel/clc: fix missing pointer write Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `346a7f14fb` ("intel/compiler: Add code for compiling CL-style SPIR-V kernels") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15611>	2022-03-30 07:56:25 +00:00
Kenneth Graunke	1967fd3b10	intel/compiler: Call inst->resize_sources before setting the sources You should probably resize the sources array before accessing entries that might be out of bounds. inst->resize_sources() always allocates enough space for at least 3 sources, so this is really only an issue when there are 4+ sources. Fixes: `a920979d4f` ("intel/fs: Use split sends for surface writes on gen9+") Fixes: `4f86a70599` ("intel/fs: Lower DW untyped r/w messages to LSC when available") Fixes: `d372abe397` ("intel/fs: Add surface OWORD BLOCK opcodes") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15632>	2022-03-29 13:06:17 -07:00
Georg Lehmann	922916bf64	nir: Move lower_usub_sat64 to nir_lower_int64_options. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421>	2022-03-28 20:02:52 +00:00
Kenneth Graunke	823745dc27	intel/compiler: Use nir_opt_uniform_atomics() In general, an atomic intrinsic may perform separate atomics for every enabled SIMD channel, as each channel may operate on different memory. However, an extremely common case is for all channels to access the same memory location. In this case, we can simply perform a reduction/scan across the subgroup, and perform one atomic for the whole subgroup, rather than one per channel. For example, if an intrinsic says to take the minimum value of the existing memory and the value in each channel, we can do a thread-local minimum of all enabled channels, then do a single atomic to take the minimum of that and the existing memory. Our hardware doesn't optimize the case where multiple channels ask for atomics on the same memory location; it assumes the compiler will do so. nir_opt_uniform_atomics() uses divergence analysis to detect this case, adds the necessary subgroup operations, and moves the atomic inside a conditional that disables all but a single invocation. It even detects cases where the shader code already performs this kind of optimization, and avoids doing it a second time. This may not be the optimal solution for us. In the backend, we could detect this case and emit send(1) instructions with NoMask, rather than generating if...send(16)...endif, and a lot of unnecessary ALU ops. But it's simple to do, reuses the same path as ACO, and still provides most of the benefit by cutting up to 16x atomics down to a single atomic, which is more merciful to the memory bus. Improves performance of Shadow of the Tomb Raider by 5.5% on XeHP. Improves performance of a customer-internal benchmark on XeHP at 3840x2160 and low settings by approximately 30%. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Kenneth Graunke	49ef23f4a6	intel/compiler: Convert to LCSSA and use divergence analysis. We'll use this more shortly. For now, enable it to separately in case anything bisects to this. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Kenneth Graunke	b3942beecf	intel/compiler: Set divergence analysis options Although we don't use divergence analysis yet, we've had several work-in-progress series that make use of it. We may as well set our options so that those series can assume they're in place. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Kenneth Graunke	6fa66ac228	intel/compiler: Implement nir_intrinsic_last_invocation We haven't exposed this intrinsic as it doesn't directly correspond to anything in SPIR-V. However, it's used internally by some NIR passes, namely nir_opt_uniform_atomics(). We reuse most of the infrastructure in brw_find_live_channel, but with LZD/ADD instead of FBL. A new SHADER_OPCODE_FIND_LAST_LIVE_CHANNEL is like SHADER_OPCODE_FIND_LIVE_CHANNEL but from the other side. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Caio Oliveira	c32d386ce2	intel/compiler: Inline TUE map computation into TUE Input lowering Refactor since the TUE compute function is simpler now and the comments make sense being near the lowering. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>	2022-03-25 23:29:19 +00:00
Caio Oliveira	c36ae42e4c	intel/compiler: Use nir_var_mem_task_payload Instead of reusing the in/out slot mechanism, use a separated NIR variable mode. This will make easier later to implement staging the output in shared memory (and storing all at the end to the URB). Note to get 64-bit type support we currently rely on the brw_nir_lower_mem_access_bit_sizes() pass. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>	2022-03-25 23:29:19 +00:00
Caio Oliveira	f82731d0d7	intel/fs: Fix IsHelperInvocation for the case no discard/demote are used Use emit_predicate_on_sample_mask() helper that does check where to get the correct mask depending on whether discard/demote was used or not. Fixes: `45f5db5a84` ("intel/fs: Implement "demote to helper invocation"") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>	2022-03-25 08:20:27 +00:00
Caio Oliveira	bb311c22df	intel/fs: Initialize the sample mask in flags register when using demote Without this change, a check for "is helper invocation" could read uninitialized values. Fixes: `45f5db5a84` ("intel/fs: Implement "demote to helper invocation"") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>	2022-03-25 08:20:27 +00:00
Mark Janes	85e314db5d	Revert "intel/fs: handle interpolation modes for at_sample and at_offset too" This reverts commit `5afbb0e730`. Closes: #6198 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15534>	2022-03-23 11:03:47 -07:00
Daniel Schürmann	832d67e99d	nir: rename nir_src_is_dynamically_uniform to nir_src_is_always_uniform As this function doesn't check for any control-flow dependence, it only returns true for statically (or globally) uniform values. The same holds true for is_binding_dynamically_uniform() in nir_opt_gcm(). Rename to better reflect that property. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14994>	2022-03-23 14:02:08 +00:00
Lionel Landwerlin	df059c6781	intel/clc: deal with SPIRV-Tools linker new behavior We're now required to provide all modules to link at the same SPIRV version. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>	2022-03-23 10:24:31 +00:00
Lionel Landwerlin	21aa1d3de1	intel/clc: fixup shared memory offsets We're running the io lowering twice so need to reset some fields so the offset don't go over what is really needed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>	2022-03-23 10:24:31 +00:00
Lionel Landwerlin	de9c2312ea	intel/clc: compile fix Fixes: `c15bf88f01` ("intel: Add a little OpenCL C compiler binary") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>	2022-03-23 10:24:31 +00:00
Lionel Landwerlin	a7f264f33a	intel/clc: add option to printout kernel prog_data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>	2022-03-23 10:24:31 +00:00
Lionel Landwerlin	451f907d16	intel/kernel: enable linkage cap Linkage should have happened before this in intel_clc. This just silence a parser warning. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>	2022-03-23 10:24:31 +00:00
Lionel Landwerlin	bb4ff3e6e2	intel/kernel: enable groups caps This is roughly the same as SpvCapabilityGroupNonUniform (subgroup_basic). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15486>	2022-03-23 10:24:31 +00:00
Iván Briano	5afbb0e730	intel/fs: handle interpolation modes for at_sample and at_offset too Cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15424>	2022-03-22 19:05:05 +00:00
Lionel Landwerlin	9ca29c687b	intel/clc: disable tool prior to Gfx12.5 platforms This tool is currently only aimed at Gfx version 12.5+ with COMPUTE_WALKER. We could make it work on earlier platforms but they require pushing gl_SubgroupInvocation and the CLC code is missing the back-end compiler set-up bits for that. v2: Commit description by Jason Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Lionel Landwerlin	c735c4ca85	intel/clc: specify supported extensions Having everything ever known to man is confusing our SPIRV parser. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Lionel Landwerlin	a29b1d5716	intel/clc: allow producing SPIRV files Useful to debug the parser. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Lionel Landwerlin	77e929a527	intel/clc: allow multiple CL files to be compiled together v2: use util_dynarray_append() (Jason) identation fixes (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Jason Ekstrand	c15bf88f01	intel: Add a little OpenCL C compiler binary v2: Fix up indentation (Marcin) s/gen/gfx/ (Marcin) Deal with fd closing in success/fail cases (Marin) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Lionel Landwerlin	ec6e247a40	intel/fs: handle inline data on OpenCL style kernels This is for Gfx12.5 with the COMPUTE_WALKER::Inline Data payload. We do this in a similar way to the compute kernels. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Jason Ekstrand	4d8e788663	intel/kernel: Implement some Intel built-in functions v2: Document mangled function names (Marcin) Fixup progress & metadata (Marcin) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Jason Ekstrand	346a7f14fb	intel/compiler: Add code for compiling CL-style SPIR-V kernels v2: simplify INTEL_DEBUG expressions (Marcin) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Jason Ekstrand	d1bddfba6b	intel/nir: Add optimizations to help OpenCL-style kernels Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Lionel Landwerlin	4ec5da7270	intel/nir/fs: replace COMPUTE \|\| KERNEL by gl_shader_stage_is_compute() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13171>	2022-03-21 11:26:44 +00:00
Ian Romanick	19330eeb1d	intel/fs: Force destination types on DP4A instructions Most of the time, this doesn't matter. On the versions with _sat, if the destination type is incorrect, the clamping will not happen correctly. Fixes the following CTS tests: dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_packed_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.all_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_packed_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.limits_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_packed_uu_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_ss_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_su_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_us_v4i8_out32 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.small_uu_v4i8_out32 v2: Update anv-tgl-fails.txt. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Fixes: `0f809dbf40` ("intel/compiler: Basic support for DP4A instruction") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15417>	2022-03-17 22:39:04 +00:00
Sagar Ghuge	2e336c602d	intel/fs: Add Wa_14014435656 For any fence greater than local scope, always set flush type to at least invalidate so that fence goes on properly. v2: Fixup condition to trigger workaround (Lionel) v3: Simplify workaround (Curro) v4: Don't drop the existing WA (Curro) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: 22.0 <mesa-stable> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>	2022-03-17 14:18:02 +00:00
Sagar Ghuge	6031ad4bf6	intel/fs: Add Wa_22013689345 v2: Use a simpler framework (Lionel) v3: Rebase, add task/mesh (Lionel) v4: Fixup fence exec size (SIMDX -> SIMD1) v5: Fix invalidate_analysis, add finishme comment (Curro) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: 22.0 <mesa-stable> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14947>	2022-03-17 14:18:02 +00:00
Ernst Sjöstrand	e5f3689cff	intel/compiler: Fix non-trivial designated initializer Not supported by GCC 7. src/compiler/nir/nir_builder_opcodes.h:14156:118: sorry, unimplemented: non-trivial designated initializers not supported src/intel/compiler/brw_mesh.cpp:515:7: note: in expansion of macro ‘nir_store_per_primitive_output’ Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Fixes: `bc4f8c073a` ("intel/compiler: inject MUE initialization") Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15360>	2022-03-14 09:56:04 +00:00
Dave Airlie	7edda218fd	intel: add some missing debug recompile info. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15315>	2022-03-12 08:12:41 +00:00
Marcin Ślusarz	81df66bfff	intel/compiler: mark some variables as per-primitive in FS if they come from MS Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>	2022-03-09 16:52:59 +00:00
Marcin Ślusarz	8c16ce53a9	intel/compiler: handle ViewportIndex, PrimitiveID and Layer in MUE setup Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>	2022-03-09 16:52:59 +00:00

... 3 4 5 6 7 ...

2263 commits