fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 07:08:05 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	3dab7b0a45	nir/tests: add tests for nir_move_terminate_out_of_loops Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33479>	2025-05-09 17:20:29 +00:00
Daniel Schürmann	c59356e6a5	nir: add option to move terminate{_if} out of loops Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33479>	2025-05-09 17:20:29 +00:00
Sil Vilerino	150fa795fe	nir: Only build nir headers for mediafoundation/d3d12-no-graphics paired build Reviewed-by: <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34845>	2025-05-09 16:34:00 +00:00
Georg Lehmann	ba63263f32	nir: add bfdot2_bfadd and use it for lowering bfdot if supported Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:26 +00:00
Georg Lehmann	02e743c99e	nir: add an option to lower bf2f and f2bf Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:25 +00:00
Georg Lehmann	e8f5c335ff	radv,aco,nir: keep the A and B base type for cmat_muladd_amd With bfloat16, and the two fp8 formats in the future, using just the bit size to identify the types is no longer possible. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:25 +00:00
Rhys Perry	ddef4bddf8	ac/nir: round components when lowering 8/16-bit loads to 32-bit Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>	2025-05-08 13:30:50 +00:00
Rhys Perry	f538cae743	nir/algebraic: optimize ior(unpack_4x8, unpack_4x8<<8) to unpack_32_2x16 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>	2025-05-08 13:30:50 +00:00
Rhys Perry	10f4264936	nir/search: extend swizzle_y Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>	2025-05-08 13:30:50 +00:00
Job Noorman	6a57bfb004	nir/lower_io_to_vector: remove can_read_output assert Since we're not creating new output reads, just vectorizing existing ones, this isn't the place to assert whether we can actually read outputs. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <anholt@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34784>	2025-05-08 08:18:24 +00:00
Lionel Landwerlin	9d342081e7	brw/nir: add intrinsics to read attribute payload register indirectly Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	c467444670	brw/nir: use a new intrinsic for fs_msaa_flag Avoid NIR code doing offset computations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Marek Olšák	f58c0cbb6a	nir: split _accessed_indirectly bitmasks into _read/written_indirectly for AMD Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>	2025-05-08 02:54:12 +00:00
Marek Olšák	afd8fefb79	nir: add shader_info::tess::tcs_cross_invocation_outputs_written for AMD Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>	2025-05-08 02:54:12 +00:00
Alyssa Rosenzweig	5788770d91	nir: add nir_lower_default_point_size pass this is useful across drivers for maint5 semantics on mobile hw. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34762>	2025-05-06 17:07:00 +00:00
Rhys Perry	75880655f8	nir/lower_gs_intrinsics: silence warning ../../../../../../../mesa/src/compiler/nir/nir_lower_gs_intrinsics.c: In function ‘nir_lower_gs_intrinsics’: ../../../../../../../mesa/src/compiler/nir/nir_lower_gs_intrinsics.c:523:93: warning: ‘state’ may be used uninitialized [-Wmaybe-uninitialized] 523 \| state.decomposed_primitive_count_vars[i] = state.decomposed_primitive_count_vars[0]; \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ ../../../../../../../mesa/src/compiler/nir/nir_lower_gs_intrinsics.c:464:17: note: ‘state’ declared here 464 \| struct state state; \| ^~~~~ It's always initialized by the first iteration of the loop, but GCC doesn't seem to know that. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34785>	2025-05-05 11:45:42 +00:00
Rhys Perry	bc49045294	nir/opt_shrink_vectors: add assume to silence warning ../../../../../../../mesa/src/compiler/nir/nir_opt_shrink_vectors.c: In function ‘shrink_dest_to_read_mask’: ../../../../../../../mesa/src/compiler/nir/nir_opt_shrink_vectors.c:140:36: warning: writing 16 bytes into a region of size 15 [-Wstringop-overflow=] 140 \| swizzle[first_bit + i] = i; \| ~~~~~~~~~~~~~~~~~~~~~~~^~~ ../../../../../../../mesa/src/compiler/nir/nir_opt_shrink_vectors.c:138:18: note: at offset [1, 15] into destination object ‘swizzle’ of size 16 138 \| uint8_t swizzle[NIR_MAX_VEC_COMPONENTS] = { 0 }; \| ^~~~~~~ Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34785>	2025-05-05 11:45:42 +00:00
Ella Stanforth	32d9afdf73	nir/printf: add new helper to printf at a specific pixel. Debugging with nir_printf_fmt can result in overwhelming information. This allows us to filter for a pixel we care about. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34737>	2025-05-05 06:20:18 +00:00
Ella Stanforth	43f22110e7	nir/printf: break out va_list handling Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34737>	2025-05-05 06:20:17 +00:00
Christian Gmeiner	f17d350001	lima: Move fdot lowering from NIR to lima This change relocates the fdot lowering from the generic NIR to the lima, since lima is the only consumer of this particular lowering. This avoids potential conflicts with the similar fdot lowering already present in nir_lower_alu_width. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34757>	2025-04-30 17:33:38 +00:00
Caio Oliveira	a38960e8f3	brw, nir: Use glsl_base_type instead of nir_alu_type for @dpas_intel This will allow including types that don't have a nir_alu_type equivalent, like bfloat16. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	cf4021f93c	nir: Add opcodes for BFloat16 SPV_KHR_bfloat16 requires a small set of operations, since it doesn't support all the arithmetic ops. This patch adds conversions to/from Float32 and also the necessary ops (bfdot, bffma, bfmul) to implement SpvOpDot using the same lowering approach than the Float32 counterpart. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Rohan Garg	9e5d7eb88d	compiler/types: add a bfloat16 type Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Marek Olšák	55db7fc18c	nir/opt_varyings: group TES inputs based on whether they are used by POS or VAR If the optional flag is set, compaction groups TES inputs based on which outputs they are used for: - inputs generating only POS/CLIP outputs are first - inputs generating both POS/CLIP and VAR outputs are next - inputs generating only VAR outputs are last shader-db with ACO: 143 shaders have -1.44% average decrease in code size. There are fewer input loads and more of them are vec4 instead of vec1-3. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32262>	2025-04-23 17:47:37 +00:00
Marek Olšák	f15399af0f	nir: add gathering passes that gather which inputs affect specific outputs The first pass computes which shader instructions contribute to each output. It can be used to query how data flows within shaders towards outputs. The second pass computes which shader input components and which types of memory loads are used to compute shader outputs. The third pass uses the second pass to gather which input components are used to compute pos and clip dist outputs, which input components are used to compute all other outputs, and which input components are used to compute both. This will be used by compaction in nir_opt_varyings for drivers that split TES into a separate position cull shader and varying shader to make it less likely that the same vec4 inputs are needed in both. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32262>	2025-04-23 17:47:37 +00:00
Karol Herbst	33965bb21b	nir_lower_mem_access_bit_sizes: fix negative chunk offsets With a 64 bit pointer model, instead of doing -1 the pass ended up doing +4294967295. The reason here was some implicit integer conversion going horribly wrong, so just do the offset math in 64 bit to get a nice result. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13023 Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34669>	2025-04-23 16:59:56 +00:00
Ella Stanforth	b38c4e8982	nir/alpha_to_coverage: Add an intrinsic for better dithering Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33942>	2025-04-23 09:03:41 +00:00
Ella Stanforth	d3aedbfe9d	asahi/lib: Move alpha_to_one and alpha_to_coverage lowering to common code. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33942>	2025-04-23 09:03:41 +00:00
Georg Lehmann	6d7e67d986	nir,amd: add neg_lo/hi modifiers to cmat_matmul_amd Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:55 +00:00
Georg Lehmann	3e26fc4498	nir/opt_algebraic: disable fsat(a + 1.0) opt if a can be NaN Foz-DB Navi21: Totals from 9 (0.01% of 79789) affected shaders: Instrs: 6782 -> 6796 (+0.21%); split: -0.03%, +0.24% CodeSize: 40020 -> 40108 (+0.22%); split: -0.04%, +0.26% Latency: 23764 -> 23758 (-0.03%) InvThroughput: 6424 -> 6431 (+0.11%); split: -0.08%, +0.19% SClause: 273 -> 275 (+0.73%) Copies: 338 -> 339 (+0.30%) VALU: 5138 -> 5147 (+0.18%); split: -0.06%, +0.23% SALU: 349 -> 350 (+0.29%) SMEM: 498 -> 500 (+0.40%) Fixes: `a4a3487aae` ("nir/opt_algebraic: optimize patterns from Skia") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	a60d61cce8	nir: improve fadd is_a_number analysis by using the range Foz-DB Navi21: Totals from 145 (0.18% of 79789) affected shaders: Instrs: 168553 -> 168391 (-0.10%); split: -0.10%, +0.00% CodeSize: 926708 -> 926684 (-0.00%) Latency: 2210456 -> 2210329 (-0.01%); split: -0.01%, +0.00% InvThroughput: 545992 -> 545768 (-0.04%) SClause: 3084 -> 3085 (+0.03%) VALU: 129521 -> 129360 (-0.12%) SALU: 13085 -> 13084 (-0.01%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	a6fd9f488a	nir: add is_a_number analysis for ffma Foz-DB Navi21: Totals from 508 (0.64% of 79789) affected shaders: Instrs: 796183 -> 795838 (-0.04%) CodeSize: 4303420 -> 4303384 (-0.00%); split: -0.00%, +0.00% Latency: 7806095 -> 7805458 (-0.01%); split: -0.01%, +0.00% InvThroughput: 1377028 -> 1376824 (-0.01%); split: -0.01%, +0.00% Copies: 63297 -> 63299 (+0.00%); split: -0.00%, +0.00% PreVGPRs: 29818 -> 29819 (+0.00%) VALU: 562067 -> 561885 (-0.03%); split: -0.03%, +0.00% SALU: 89896 -> 89733 (-0.18%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	cb6d035925	nir: add range analysis for ffmaz Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	8ad695195e	nir/opt_algebraic: turn exact fmin(1.0, a) into fsat if a is not NaN and not negative Foz-DB Navi21: Totals from 2456 (3.08% of 79789) affected shaders: Instrs: 3415398 -> 3413352 (-0.06%); split: -0.06%, +0.00% CodeSize: 18781096 -> 18776092 (-0.03%); split: -0.03%, +0.00% VGPRs: 158512 -> 158528 (+0.01%) Latency: 39528900 -> 39526687 (-0.01%); split: -0.01%, +0.00% InvThroughput: 10612237 -> 10609296 (-0.03%); split: -0.03%, +0.00% VClause: 71028 -> 71034 (+0.01%) SClause: 93971 -> 93975 (+0.00%); split: -0.00%, +0.01% Copies: 257525 -> 257521 (-0.00%); split: -0.01%, +0.01% VALU: 2483374 -> 2481325 (-0.08%); split: -0.09%, +0.00% SALU: 348207 -> 348211 (+0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Georg Lehmann	18a0de1834	nir/opt_algebraic: optimize fmax(ffma(a, b, c), 0.0) to fsat Foz-DB Navi21: Totals from 2621 (3.28% of 79789) affected shaders: MaxWaves: 55744 -> 55736 (-0.01%) Instrs: 2840180 -> 2832647 (-0.27%); split: -0.27%, +0.00% CodeSize: 15497364 -> 15464692 (-0.21%); split: -0.21%, +0.00% VGPRs: 138448 -> 138456 (+0.01%) Latency: 22319512 -> 22307018 (-0.06%); split: -0.06%, +0.01% InvThroughput: 5745108 -> 5729197 (-0.28%); split: -0.28%, +0.00% Copies: 110279 -> 110268 (-0.01%); split: -0.04%, +0.03% VALU: 2210578 -> 2203211 (-0.33%); split: -0.33%, +0.00% SALU: 169014 -> 168841 (-0.10%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Georg Lehmann	f71fc26393	nir/opt_algebraic: generalize fmax(fadd(a, b), 0.0) to fsat by not requiring fneg Not a large effect, but it's positive and makes the pattern simpler. Foz-DB Navi21: Totals from 1 (0.00% of 79789) affected shaders: Instrs: 145 -> 138 (-4.83%) CodeSize: 784 -> 756 (-3.57%) Latency: 1495 -> 1487 (-0.54%) InvThroughput: 210 -> 196 (-6.67%) VALU: 103 -> 96 (-6.80%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Alyssa Rosenzweig	f1aeb46a34	nir: factor out nir_verts_in_output_prim helper very useful for geometry shader lowering code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34638>	2025-04-22 12:47:54 +00:00
Job Noorman	f269c7b3b5	nir/opt_shrink_vectors: enable for load_ubo_vec4 Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34600>	2025-04-18 15:56:02 +00:00
Konstantin Seurer	978e9b670e	aco,nir: Add support for new GFX12 ray tracing instructions Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can handle the new BVH format. Both instructions write up to 10 VGPRs so they need to use a vec16 definition in nir. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>	2025-04-17 20:20:40 +00:00
Caio Oliveira	33295b2249	spirv, nir: Allow non-Aliased workgroup memory blocks Allocate space for the aliased region first, then allocate the non-Aliased blocks in sequence after that. SPV_KHR_workgroup_memory_explicit_layout extension added support for having Blocks of workgroup (shared) memory, which include layout decoration. For that extension all such blocks must be decorated with Aliased. SPV_KHR_untyped_pointers extension lifts that requirement, allowing blocks that don't alias in workgroup memory. They are still explicitly laid out. The motivation is that untyped pointers provide a different mechanism to obtain the same effect as the Aliased blocks. Instead of having two Aliased variables with different types, have a single variable and use an untyped pointer with a different type to access it. This patch is a preparation for supporting untyped pointers. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:18 +00:00
Caio Oliveira	fd0a7efb5a	spirv, nir: Delay calculation of shared_size when using explicit layout Move the calculation to nir_lower_vars_to_explicit_types(). This consolidates the check of shader_info::shared_memory_explicit_layout in a single place instead of in all drivers. This is motivated by SPV_KHR_untyped_pointers. Before that extension we had essentially two modes for shared memory variables - No layout decorations in the SPIR-V, and both internal layout and driver location was _given by the driver_. - Explicitly laid out, i.e. they are blocks, and decorated with Aliased. Because they all alias, we could assign them driver location directly to the start of the shared memory. With the untyped pointers extension, there's a third option, to be added by a later commit - Explicitly laid out, i.e. they are blocks, and NOT decorated with Aliased. Driver location is _given by the driver_. Blocks with and without Aliased can be mixed. The driver location of multiple blocks that don't alias depend on alignment that is driver-specific, which we can more easily do from the nir_lower_vars_to_explicit_types() that already has access to a function to obtain such value. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (hk) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (anv/hasvk) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv) Reviewed-by: Rob Clark <robdclark@gmail.com> (tu) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:17 +00:00
Caio Oliveira	d5ad798140	spirv, radv, intel: Add NIR intrinsic for cmat conversion A cooperative matrix conversion operation was represented in NIR by the cmat_unary_op intrinsic with an nir_alu_op as extra parameter, that was already lowered to a specific conversion operation based on the matrix types. Instead of that, add a new intrinsic `cmat_convert` that is specific for that conversion. In addition to the src/dst matrix descriptions already available, also include the signedness information in the intrinsic (reuse nir_cmat_signed for that). This is needed because different Convert operations define different interpretations for integers, regardless their original type. In this patch, both radv and intel were changed to use the same logic that was previously used to pick the lowered ALU op. This change will help represent cmat conversions involving BFloat16, because it avoids having to create new NIR ALU ops for all the combinations involving BFloat16. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34511>	2025-04-16 23:13:36 +00:00
Ian Romanick	1d2ebeca17	nir/algebraic: Allow fmin(a,a) optimization when flush denorm to zero is not set I was surprised this had any affect on Intel GPUs because we have been unconditionally performing this optimization in the backend since June 2014. Once that error is fixed (later in this MR), this change prevents a couple dozen regressions in shader-db and around 90 regressions in fossil-db. Many of the regressions in fossil-db were loss of SIMD32, and that can be a big deal. v2: Add 64-bit too. Suggested by Alyssa. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 16970141 -> 16970139 (<.01%) instructions in affected programs: 40 -> 38 (-5.00%) helped: 2 / HURT: 0 total cycles in shared programs: 914617580 -> 914617548 (<.01%) cycles in affected programs: 3428 -> 3396 (-0.93%) helped: 2 / HURT: 0 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Cycle count: 30546028462 -> 30546025224 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 237017827 -> 237017731 (-0.00%) Totals from 83 (0.01% of 706657) affected shaders: Cycle count: 3042978 -> 3039740 (-0.11%); split: -0.13%, +0.02% Non SSA regs after NIR: 78997 -> 78901 (-0.12%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
Alyssa Rosenzweig	63eb27d166	nir: add sampler LOD bias lowering this is a cleaned up version of the lowering originally written for asahi, moved to common code so it can be shared with an upcoming Vulkan implementation (not honeykrisp). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:50 +00:00
Alyssa Rosenzweig	9de7ea875d	nir: handle mismatched bias/lod bitsizes the sampler lod bias lowering uses fp16 for perf on AGX. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:49 +00:00
Alyssa Rosenzweig	2e15b42eec	nir: unvendor lod_bias(_agx) this will be useful for other backends. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:49 +00:00
Connor Abbott	2f93137308	nir/opt_preamble: Handle load_global_ir3 fossil-db results with turnip: Totals from 994 (0.60% of 165023) affected shaders: MaxWaves: 10720 -> 11528 (+7.54%); split: +7.57%, -0.04% Instrs: 1032004 -> 972314 (-5.78%); split: -5.99%, +0.21% CodeSize: 1847536 -> 1942472 (+5.14%); split: -0.11%, +5.25% NOPs: 261089 -> 233279 (-10.65%); split: -10.89%, +0.23% MOVs: 57217 -> 51434 (-10.11%); split: -14.11%, +4.00% Full: 16412 -> 14647 (-10.75%); split: -10.96%, +0.21% (ss): 23330 -> 25594 (+9.70%); split: -5.51%, +15.21% (sy): 17803 -> 15711 (-11.75%); split: -11.93%, +0.18% (ss)-stall: 96387 -> 107976 (+12.02%); split: -5.14%, +17.17% (sy)-stall: 952952 -> 765754 (-19.64%); split: -19.84%, +0.19% STPs: 494 -> 327 (-33.81%) LDPs: 1447 -> 1163 (-19.63%) Early-preamble: 668 -> 22 (-96.71%) Cat0: 280935 -> 251779 (-10.38%); split: -10.60%, +0.22% Cat1: 93400 -> 84766 (-9.24%); split: -11.79%, +2.55% Cat2: 343880 -> 337270 (-1.92%); split: -3.20%, +1.28% Cat3: 189311 -> 180918 (-4.43%) Cat4: 21008 -> 19920 (-5.18%) Cat5: 17788 -> 17783 (-0.03%) Cat6: 45786 -> 39531 (-13.66%) Cat7: 39896 -> 40347 (+1.13%); split: -0.43%, +1.56% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483>	2025-04-14 16:53:34 +00:00
Erik Faye-Lund	1d5da22dfd	nir/lower_tex: avoid undefined-behavior When texture_index and sampler_index are over 32, we can't really check for them in a single 32-bit word. This happens among other things when Panfrost uses preload shaders on v9 and later. Otherwise, we trigger undefined behavior. We're already doing this for textures in one case, let's be consistent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>	2025-04-14 11:22:43 +00:00
Erik Faye-Lund	41b136f674	nir/lower_tex: use texture_mask instead of shifting on use In commit `292ac71a4a` ("nir/lower_tex: handle deref casts"), we avoided using texture_index when a texture instruction contained a variable deref. There's no good reason why this should be done to some of the lowering, but not all. So let's fix up code-paths that were added after this change to do the same. The first two patches here crossed paths with the commit that introduced texture_mask, so it's not strange that the change was missed. The last one seems to have just copied what was done around it, propagating the issue. Fixes: `880b00dc59` ("nir/lower_tex: Add support for lowering YUYV formats") Fixes: `1358d93650` ("nir/lower_tex: Add support for lowering Y41x formats") Fixes: `65d6f5aed2` ("nir: add options to lower y_vu, yv_yu, yx_xvxu and xy_vxux") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>	2025-04-14 11:22:43 +00:00
Konstantin Seurer	cb31b5a958	clc,libcl: Clean up CL includes This patch does a couple of things to make CL integration with drivers as seamless as possible: - We pull in opencl-c.h and opencl-c-base.h to stop relying on system headers. - Parts of libcl.h are moved to new headers that are incomplete CL-safe variants of libc headers. - A couple of util headers are changed to remove now unnecessary __OPENCL_VERSION__ guards and make more headers CL safe. - Drivers now include src/compiler/libcl and use headers like macros.h,u_math.h instead of libcl.h. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>	2025-04-11 21:27:37 +00:00
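
Commit 33965bb21b above describes a -1 offset turning into +4294967295 under a 64-bit pointer model. The C sketch below shows how that class of bug can arise and how doing the math in 64 bits avoids it. It is an illustration only: the function and parameter names are hypothetical and are not taken from nir_lower_mem_access_bit_sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the offset delta is computed in 32-bit unsigned
 * arithmetic and only afterwards widened to 64 bits. */
uint64_t chunk_addr_narrow(uint64_t base, uint32_t chunk_start,
                           uint32_t align_offset)
{
   /* 0 - 1 wraps to 0xFFFFFFFF in uint32_t. */
   uint32_t delta = chunk_start - align_offset;

   /* The wrapped value is zero-extended, so this adds 4294967295
    * instead of subtracting 1. */
   return base + delta;
}

/* Hypothetical helper: the same computation with the delta held in a
 * signed 64-bit integer, so a negative result stays negative. */
uint64_t chunk_addr_wide(uint64_t base, uint32_t chunk_start,
                         uint32_t align_offset)
{
   int64_t delta = (int64_t)chunk_start - (int64_t)align_offset;

   /* Converting -1 to uint64_t is well defined (modulo 2^64), and the
    * addition then yields base - 1. */
   return base + (uint64_t)delta;
}

/* Small self-check contrasting the two variants. */
void chunk_addr_demo(void)
{
   assert(chunk_addr_narrow(0x1000, 0, 1) == 0x1000 + UINT64_C(4294967295));
   assert(chunk_addr_wide(0x1000, 0, 1) == 0xFFF);
}
```

With a 32-bit address the two variants would agree after truncation, which is presumably why the problem only showed up with 64-bit pointers.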


6168 commits
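
Commit 1d5da22dfd in the log above notes that an index above the width of a 32-bit word cannot be tested against a single 32-bit mask without undefined behavior. The C sketch below shows one way to guard such a test. It assumes the guard is a plain range check, and the helper name is hypothetical; the actual change in nir_lower_tex may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: report whether bit `index` is set in a 32-bit
 * option mask. Shifting a 32-bit value by 32 or more is undefined
 * behavior in C, so the range check must come before the shift. */
bool index_in_mask32(uint32_t mask, unsigned index)
{
   return index < 32 && (mask & (UINT32_C(1) << index)) != 0;
}

/* Small self-check covering in-range and out-of-range indices. */
void index_in_mask32_demo(void)
{
   assert(index_in_mask32(0x5, 0));
   assert(!index_in_mask32(0x5, 1));
   assert(index_in_mask32(UINT32_C(0x80000000), 31));
   /* Out of range: never reported as set, and no shift is evaluated. */
   assert(!index_in_mask32(UINT32_C(0xFFFFFFFF), 32));
}
```

The short-circuit `&&` is what makes this safe: when the index is out of range, the shift expression is never evaluated.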