fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 18:08:15 +02:00

Author	SHA1	Message	Date
Rhys Perry	bc49045294	nir/opt_shrink_vectors: add assume to silence warning ../../../../../../../mesa/src/compiler/nir/nir_opt_shrink_vectors.c: In function ‘shrink_dest_to_read_mask’: ../../../../../../../mesa/src/compiler/nir/nir_opt_shrink_vectors.c:140:36: warning: writing 16 bytes into a region of size 15 [-Wstringop-overflow=] 140 | swizzle[first_bit + i] = i; | ~~~~~~~~~~~~~~~~~~~~~~~^~~ ../../../../../../../mesa/src/compiler/nir/nir_opt_shrink_vectors.c:138:18: note: at offset [1, 15] into destination object ‘swizzle’ of size 16 138 | uint8_t swizzle[NIR_MAX_VEC_COMPONENTS] = { 0 }; | ^~~~~~~ Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34785>	2025-05-05 11:45:42 +00:00
Ella Stanforth	32d9afdf73	nir/printf: add new helper to printf at a specific pixel. Debugging with nir_printf_fmt can result in overwhelming information. This allows us to filter for a pixel we care about. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34737>	2025-05-05 06:20:18 +00:00
Ella Stanforth	43f22110e7	nir/printf: break out va_list handling Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34737>	2025-05-05 06:20:17 +00:00
Rhys Perry	1d7a988ec2	vtn: use nir_const_value_for_raw_uint for bfloat SpecConstantOp/FConvert I'm not sure how this was supposed to ensure padding was zero, and it doesn't seem to work for me (GCC 15.0.1). Fixes a NIR validation failure with dEQP-VK.glsl.bfloat16.constant.compute and RADV. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `90e1b12890` ("spirv: Add bfloat16 support to SpecConstantOp") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34769>	2025-05-01 10:52:30 +00:00
Christian Gmeiner	f17d350001	lima: Move fdot lowering from NIR to lima This change relocates the fdot lowering from the generic NIR to the lima, since lima is the only consumer of this particular lowering. This avoids potential conflicts with the similar fdot lowering already present in nir_lower_alu_width. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34757>	2025-04-30 17:33:38 +00:00
Rohan Garg	2bbe042e87	spirv: Enable bfloat16 capabilities Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	e0b195cadb	spirv: Use bfdot for SpvOpDot with BFloat16 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	2807097690	spirv: Implement Conversions to/from bfloat16 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	90e1b12890	spirv: Add bfloat16 support to SpecConstantOp Handle bfloat16 by converting sources to float, performing the operation, and converting result back to bfloat16 if needed. This is done because not all ALU ops have a `bf` version in NIR. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Rohan Garg	dc8074683d	spirv: construct a bfloat16 from the given SPIR-V bitsize and encoding Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	fb6ae2eac1	spirv: Refactor to use glsl_type to pick ALU ops Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	bba607ac2b	spirv: Move Convert opcodes handling to its own function Take the opportunity to add a comment about why the bit_size comes from the NIR def and not the original type. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	a38960e8f3	brw, nir: Use glsl_base_type instead of nir_alu_type for @dpas_intel This will allow including types that don't have a nir_alu_type equivalent, like bfloat16. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Caio Oliveira	cf4021f93c	nir: Add opcodes for BFloat16 SPV_KHR_bfloat16 requires a small set of operations, since it doesn't support all the arithmetic ops. This patch adds conversions to/from Float32 and also the necessary ops (bfdot, bffma, bfmul) to implement SpvOpDot using the same lowering approach as the Float32 counterpart. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Rohan Garg	9e5d7eb88d	compiler/types: add a bfloat16 type Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Dmitry Baryshkov	419a9e9d42	mesa-clc: add an option to force inclusion of OpenCL headers Currently mesa-clc bundles OpenCL headers from Clang only if the static LLVM is used (which means Clang / LLVM are not present on the target system). In some cases (e.g. when building in an OpenEmbedded environment) it is desirable to have a shared LLVM library, but skip installing the whole Clang runtime just to compile shaders. Add an option that forces OpenCL headers to be bundled with the mesa-clc binary. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34551>	2025-04-24 11:40:15 +00:00
Marek Olšák	55db7fc18c	nir/opt_varyings: group TES inputs based on whether they are used by POS or VAR If the optional flag is set, compaction groups TES inputs based on which outputs they are used for: - inputs generating only POS/CLIP outputs are first - inputs generating both POS/CLIP and VAR outputs are next - inputs generating only VAR outputs are last shader-db with ACO: 143 shaders have -1.44% average decrease in code size. There are fewer input loads and more of them are vec4 instead of vec1-3. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32262>	2025-04-23 17:47:37 +00:00
Marek Olšák	f15399af0f	nir: add gathering passes that gather which inputs affect specific outputs The first pass computes which shader instructions contribute to each output. It can be used to query how data flows within shaders towards outputs. The second pass computes which shader input components and which types of memory loads are used to compute shader outputs. The third pass uses the second pass to gather which input components are used to compute pos and clip dist outputs, which input components are used to compute all other outputs, and which input components are used to compute both. This will be used by compaction in nir_opt_varyings for drivers that split TES into a separate position cull shader and varying shader to make it less likely that the same vec4 inputs are needed in both. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32262>	2025-04-23 17:47:37 +00:00
Karol Herbst	33965bb21b	nir_lower_mem_access_bit_sizes: fix negative chunk offsets With a 64 bit pointer model, instead of doing -1 the pass ended up doing +4294967295. The reason here was some implicit integer conversion going horribly wrong, so just do the offset math in 64 bit to get a nice result. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13023 Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34669>	2025-04-23 16:59:56 +00:00
Ella Stanforth	b38c4e8982	nir/alpha_to_coverage: Add an intrinsic for better dithering Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33942>	2025-04-23 09:03:41 +00:00
Ella Stanforth	d3aedbfe9d	asahi/lib: Move alpha_to_one and alpha_to_coverage lowering to common code. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33942>	2025-04-23 09:03:41 +00:00
Georg Lehmann	6d7e67d986	nir,amd: add neg_lo/hi modifiers to cmat_matmul_amd Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>	2025-04-22 16:08:55 +00:00
Georg Lehmann	3e26fc4498	nir/opt_algebraic: disable fsat(a + 1.0) opt if a can be NaN Foz-DB Navi21: Totals from 9 (0.01% of 79789) affected shaders: Instrs: 6782 -> 6796 (+0.21%); split: -0.03%, +0.24% CodeSize: 40020 -> 40108 (+0.22%); split: -0.04%, +0.26% Latency: 23764 -> 23758 (-0.03%) InvThroughput: 6424 -> 6431 (+0.11%); split: -0.08%, +0.19% SClause: 273 -> 275 (+0.73%) Copies: 338 -> 339 (+0.30%) VALU: 5138 -> 5147 (+0.18%); split: -0.06%, +0.23% SALU: 349 -> 350 (+0.29%) SMEM: 498 -> 500 (+0.40%) Fixes: `a4a3487aae` ("nir/opt_algebraic: optimize patterns from Skia") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	a60d61cce8	nir: improve fadd is_a_number analysis by using the range Foz-DB Navi21: Totals from 145 (0.18% of 79789) affected shaders: Instrs: 168553 -> 168391 (-0.10%); split: -0.10%, +0.00% CodeSize: 926708 -> 926684 (-0.00%) Latency: 2210456 -> 2210329 (-0.01%); split: -0.01%, +0.00% InvThroughput: 545992 -> 545768 (-0.04%) SClause: 3084 -> 3085 (+0.03%) VALU: 129521 -> 129360 (-0.12%) SALU: 13085 -> 13084 (-0.01%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	a6fd9f488a	nir: add is_a_number analysis for ffma Foz-DB Navi21: Totals from 508 (0.64% of 79789) affected shaders: Instrs: 796183 -> 795838 (-0.04%) CodeSize: 4303420 -> 4303384 (-0.00%); split: -0.00%, +0.00% Latency: 7806095 -> 7805458 (-0.01%); split: -0.01%, +0.00% InvThroughput: 1377028 -> 1376824 (-0.01%); split: -0.01%, +0.00% Copies: 63297 -> 63299 (+0.00%); split: -0.00%, +0.00% PreVGPRs: 29818 -> 29819 (+0.00%) VALU: 562067 -> 561885 (-0.03%); split: -0.03%, +0.00% SALU: 89896 -> 89733 (-0.18%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	cb6d035925	nir: add range analysis for ffmaz Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	8ad695195e	nir/opt_algebraic: turn exact fmin(1.0, a) into fsat if a is not NaN and not negative Foz-DB Navi21: Totals from 2456 (3.08% of 79789) affected shaders: Instrs: 3415398 -> 3413352 (-0.06%); split: -0.06%, +0.00% CodeSize: 18781096 -> 18776092 (-0.03%); split: -0.03%, +0.00% VGPRs: 158512 -> 158528 (+0.01%) Latency: 39528900 -> 39526687 (-0.01%); split: -0.01%, +0.00% InvThroughput: 10612237 -> 10609296 (-0.03%); split: -0.03%, +0.00% VClause: 71028 -> 71034 (+0.01%) SClause: 93971 -> 93975 (+0.00%); split: -0.00%, +0.01% Copies: 257525 -> 257521 (-0.00%); split: -0.01%, +0.01% VALU: 2483374 -> 2481325 (-0.08%); split: -0.09%, +0.00% SALU: 348207 -> 348211 (+0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Georg Lehmann	18a0de1834	nir/opt_algebraic: optimize fmax(ffma(a, b, c), 0.0) to fsat Foz-DB Navi21: Totals from 2621 (3.28% of 79789) affected shaders: MaxWaves: 55744 -> 55736 (-0.01%) Instrs: 2840180 -> 2832647 (-0.27%); split: -0.27%, +0.00% CodeSize: 15497364 -> 15464692 (-0.21%); split: -0.21%, +0.00% VGPRs: 138448 -> 138456 (+0.01%) Latency: 22319512 -> 22307018 (-0.06%); split: -0.06%, +0.01% InvThroughput: 5745108 -> 5729197 (-0.28%); split: -0.28%, +0.00% Copies: 110279 -> 110268 (-0.01%); split: -0.04%, +0.03% VALU: 2210578 -> 2203211 (-0.33%); split: -0.33%, +0.00% SALU: 169014 -> 168841 (-0.10%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Georg Lehmann	f71fc26393	nir/opt_algebraic: generalize fmax(fadd(a, b), 0.0) to fsat by not requiring fneg Not a large effect, but it's positive and makes the pattern simpler. Foz-DB Navi21: Totals from 1 (0.00% of 79789) affected shaders: Instrs: 145 -> 138 (-4.83%) CodeSize: 784 -> 756 (-3.57%) Latency: 1495 -> 1487 (-0.54%) InvThroughput: 210 -> 196 (-6.67%) VALU: 103 -> 96 (-6.80%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Alyssa Rosenzweig	f1aeb46a34	nir: factor out nir_verts_in_output_prim helper very useful for geometry shader lowering code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34638>	2025-04-22 12:47:54 +00:00
Job Noorman	f269c7b3b5	nir/opt_shrink_vectors: enable for load_ubo_vec4 Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34600>	2025-04-18 15:56:02 +00:00
Konstantin Seurer	978e9b670e	aco,nir: Add support for new GFX12 ray tracing instructions Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can handle the new BVH format. Both instructions write up to 10 VGPRs so they need to use a vec16 definition in nir. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>	2025-04-17 20:20:40 +00:00
Caio Oliveira	33295b2249	spirv, nir: Allow non-Aliased workgroup memory blocks Allocate space for the aliased region first, then allocate the non-Aliased blocks in sequence after that. SPV_KHR_workgroup_memory_explicit_layout extension added support for having Blocks of workgroup (shared) memory, which include layout decoration. For that extension all such blocks must be decorated with Aliased. SPV_KHR_untyped_pointers extension lifts that requirement, allowing blocks that don't alias in workgroup memory. They are still explicitly laid out. The motivation is that untyped pointers provide a different mechanism to obtain the same effect as the Aliased blocks. Instead of having two Aliased variables with different types, have a single variable and use an untyped pointer with a different type to access it. This patch is a preparation for supporting untyped pointers. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:18 +00:00
Caio Oliveira	fd0a7efb5a	spirv, nir: Delay calculation of shared_size when using explicit layout Move the calculation to nir_lower_vars_to_explicit_types(). This consolidates the check of shader_info::shared_memory_explicit_layout in a single place instead of in all drivers. This is motivated by SPV_KHR_untyped_pointers. Before that extension we had essentially two modes for shared memory variables - No layout decorations in the SPIR-V, and both internal layout and driver location were _given by the driver_. - Explicitly laid out, i.e. they are blocks, and decorated with Aliased. Because they all alias, we could assign them driver location directly to the start of the shared memory. With the untyped pointers extension, there's a third option, to be added by a later commit - Explicitly laid out, i.e. they are blocks, and NOT decorated with Aliased. Driver location is _given by the driver_. Blocks with and without Aliased can be mixed. The driver location of multiple blocks that don't alias depends on alignment that is driver-specific, which we can more easily do from the nir_lower_vars_to_explicit_types() that already has access to a function to obtain such value. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (hk) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (anv/hasvk) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv) Reviewed-by: Rob Clark <robdclark@gmail.com> (tu) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:17 +00:00
Caio Oliveira	d5ad798140	spirv, radv, intel: Add NIR intrinsic for cmat conversion A cooperative matrix conversion operation was represented in NIR by the cmat_unary_op intrinsic with an nir_alu_op as extra parameter, that was already lowered to a specific conversion operation based on the matrix types. Instead of that, add a new intrinsic `cmat_convert` that is specific for that conversion. In addition to the src/dst matrix descriptions already available, also include the signedness information in the intrinsic (reuse nir_cmat_signed for that). This is needed because different Convert operations define different interpretations for integers, regardless of their original type. In this patch, both radv and intel were changed to use the same logic that was previously used to pick the lowered ALU op. This change will help represent cmat conversions involving BFloat16, because it avoids having to create new NIR ALU ops for all the combinations involving BFloat16. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34511>	2025-04-16 23:13:36 +00:00
Ian Romanick	1d2ebeca17	nir/algebraic: Allow fmin(a,a) optimization when flush denorm to zero is not set I was surprised this had any effect on Intel GPUs because we have been unconditionally performing this optimization in the backend since June 2014. Once that error is fixed (later in this MR), this change prevents a couple dozen regressions in shader-db and around 90 regressions in fossil-db. Many of the regressions in fossil-db were loss of SIMD32, and that can be a big deal. v2: Add 64-bit too. Suggested by Alyssa. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 16970141 -> 16970139 (<.01%) instructions in affected programs: 40 -> 38 (-5.00%) helped: 2 / HURT: 0 total cycles in shared programs: 914617580 -> 914617548 (<.01%) cycles in affected programs: 3428 -> 3396 (-0.93%) helped: 2 / HURT: 0 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Cycle count: 30546028462 -> 30546025224 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 237017827 -> 237017731 (-0.00%) Totals from 83 (0.01% of 706657) affected shaders: Cycle count: 3042978 -> 3039740 (-0.11%); split: -0.13%, +0.02% Non SSA regs after NIR: 78997 -> 78901 (-0.12%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
Alyssa Rosenzweig	63eb27d166	nir: add sampler LOD bias lowering this is a cleaned up version of the lowering originally written for asahi, moved to common code so it can be shared with an upcoming Vulkan implementation (not honeykrisp). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:50 +00:00
Alyssa Rosenzweig	9de7ea875d	nir: handle mismatched bias/lod bitsizes the sampler lod bias lowering uses fp16 for perf on AGX. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:49 +00:00
Alyssa Rosenzweig	2e15b42eec	nir: unvendor lod_bias(_agx) this will be useful for other backends. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:49 +00:00
Connor Abbott	2f93137308	nir/opt_preamble: Handle load_global_ir3 fossil-db results with turnip: Totals from 994 (0.60% of 165023) affected shaders: MaxWaves: 10720 -> 11528 (+7.54%); split: +7.57%, -0.04% Instrs: 1032004 -> 972314 (-5.78%); split: -5.99%, +0.21% CodeSize: 1847536 -> 1942472 (+5.14%); split: -0.11%, +5.25% NOPs: 261089 -> 233279 (-10.65%); split: -10.89%, +0.23% MOVs: 57217 -> 51434 (-10.11%); split: -14.11%, +4.00% Full: 16412 -> 14647 (-10.75%); split: -10.96%, +0.21% (ss): 23330 -> 25594 (+9.70%); split: -5.51%, +15.21% (sy): 17803 -> 15711 (-11.75%); split: -11.93%, +0.18% (ss)-stall: 96387 -> 107976 (+12.02%); split: -5.14%, +17.17% (sy)-stall: 952952 -> 765754 (-19.64%); split: -19.84%, +0.19% STPs: 494 -> 327 (-33.81%) LDPs: 1447 -> 1163 (-19.63%) Early-preamble: 668 -> 22 (-96.71%) Cat0: 280935 -> 251779 (-10.38%); split: -10.60%, +0.22% Cat1: 93400 -> 84766 (-9.24%); split: -11.79%, +2.55% Cat2: 343880 -> 337270 (-1.92%); split: -3.20%, +1.28% Cat3: 189311 -> 180918 (-4.43%) Cat4: 21008 -> 19920 (-5.18%) Cat5: 17788 -> 17783 (-0.03%) Cat6: 45786 -> 39531 (-13.66%) Cat7: 39896 -> 40347 (+1.13%); split: -0.43%, +1.56% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483>	2025-04-14 16:53:34 +00:00
Erik Faye-Lund	1d5da22dfd	nir/lower_tex: avoid undefined-behavior When texture_index and sampler_index are over 32, we can't really check for them in a single 32-bit word. This happens among other things when Panfrost uses preload shaders on v9 and later. Otherwise, we trigger undefined behavior. We're already doing this for textures in one case, let's be consistent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>	2025-04-14 11:22:43 +00:00
Erik Faye-Lund	41b136f674	nir/lower_tex: use texture_mask instead of shifting on use In commit `292ac71a4a` ("nir/lower_tex: handle deref casts"), we avoided using texture_index when a texture instruction contained a variable deref. There's no good reason why this should be done to some of the lowering, but not all. So let's fix up code-paths that were added after this change to do the same. The first two patches here crossed paths with the commit that introduced texture_mask, so it's not strange that the change was missed. The last one seems to have just copied what was done around it, propagating the issue. Fixes: `880b00dc59` ("nir/lower_tex: Add support for lowering YUYV formats") Fixes: `1358d93650` ("nir/lower_tex: Add support for lowering Y41x formats") Fixes: `65d6f5aed2` ("nir: add options to lower y_vu, yv_yu, yx_xvxu and xy_vxux") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>	2025-04-14 11:22:43 +00:00
Konstantin Seurer	cb31b5a958	clc,libcl: Clean up CL includes This patch does a couple of things to make CL integration with drivers as seamless as possible: - We pull in opencl-c.h and opencl-c-base.h to stop relying on system headers. - Parts of libcl.h are moved to new headers that are incomplete CL-safe variants of libc headers. - A couple of util headers are changed to remove now unnecessary __OPENCL_VERSION__ guards and make more headers CL safe. - Drivers now include src/compiler/libcl and use headers like macros.h,u_math.h instead of libcl.h. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>	2025-04-11 21:27:37 +00:00
Konstantin Seurer	a80fab3e87	clc: Allow bitfields Bitfields are not officially supported by OpenCL but there is a clang extension that adds support. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>	2025-04-11 21:27:37 +00:00
Konstantin Seurer	ed07aab147	clc: Print errors when initializing clang fails It's nice to know what actually went wrong. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>	2025-04-11 21:27:37 +00:00
Caio Oliveira	2ed79f80ba	nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset Otherwise this would require combining two values to produce a single (new bit-size) channel, which vectorize_stores() doesn't handle. The pass can still keep trying smaller bit-sizes. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12946 Fixes: `ce9205c03b` ("nir: add a load/store vectorization pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34414>	2025-04-11 19:17:17 +00:00
Georg Lehmann	d046ecf95a	nir/opt_algebraic: optimize open coded ffract Foz-DB Navi21: Totals from 274 (0.34% of 79789) affected shaders: Instrs: 522630 -> 522181 (-0.09%); split: -0.09%, +0.01% CodeSize: 2880668 -> 2878940 (-0.06%); split: -0.07%, +0.01% VGPRs: 14488 -> 14464 (-0.17%) Latency: 4092358 -> 4091243 (-0.03%); split: -0.04%, +0.01% InvThroughput: 1014148 -> 1013471 (-0.07%); split: -0.07%, +0.00% VClause: 11646 -> 11639 (-0.06%) SClause: 18614 -> 18611 (-0.02%) Copies: 56248 -> 56309 (+0.11%); split: -0.05%, +0.16% PreVGPRs: 13649 -> 13647 (-0.01%) VALU: 359733 -> 359285 (-0.12%); split: -0.13%, +0.01% SALU: 59719 -> 59720 (+0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33369>	2025-04-11 12:36:02 +00:00
Konstantin Seurer	ba001626ac	nir: Turn the format string index into a const index It is already expected to be constant. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34208>	2025-04-10 19:31:37 +00:00
Konstantin Seurer	d21926bc04	spirv: Emit code for NonSemantic.DebugPrintf if supported This can be useful for debugging code in situations where VVL cannot be used. (DGC, meta shaders) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34208>	2025-04-10 19:31:37 +00:00
Boris Brezillon	4f4ac56145	pan/va: Support relaxed waits on read-only render targets On Valhall we can optimize lower waits, which wait for both readers and writers, into resource_waits which only wait for writers, allowing threads accessing read-only resources to execute concurrently. Let's use that on LD_TILE instructions so we can optimize the read-only case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
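
Note on 33965bb21b (nir_lower_mem_access_bit_sizes: fix negative chunk offsets): the commit message says an implicit integer conversion turned -1 into +4294967295 under a 64-bit pointer model. The sketch below only illustrates that class of bug in plain C; it is not the Mesa code, and the function names are hypothetical. It assumes the offset passed through a 32-bit unsigned value before being added to a 64-bit address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: the delta goes through a 32-bit
 * unsigned value, so -1 is zero-extended to 4294967295 when it is added
 * to the 64-bit address. */
uint64_t
apply_delta_narrow(uint64_t addr, int32_t delta)
{
   uint32_t off = (uint32_t)delta;
   return addr + off;
}

/* Hypothetical fixed variant: widen to a signed 64-bit value first so the
 * sign is preserved, then add with wraparound modulo 2^64. */
uint64_t
apply_delta_wide(uint64_t addr, int32_t delta)
{
   int64_t off = delta;
   return addr + (uint64_t)off;
}

/* Small self-check showing the difference for a delta of -1. */
void
check_delta_helpers(void)
{
   assert(apply_delta_narrow(0x1000, -1) ==
          UINT64_C(0x1000) + UINT64_C(4294967295));
   assert(apply_delta_wide(0x1000, -1) == UINT64_C(0xfff));
}
```

Doing the arithmetic in 64 bits from the start, as the commit message describes, avoids relying on where the compiler happens to insert the widening conversion.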

10471 commits
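
Note on 1d5da22dfd (nir/lower_tex: avoid undefined-behavior): the commit message points out that indices above 32 cannot be tested against a single 32-bit word. In C, shifting a 32-bit value by 32 or more is undefined, so one common guard is to range-check the index before building the bit. The sketch below is an illustration only, with a hypothetical function name; treating out-of-range indices as "not set" is an assumption here and may differ from what the actual patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: test bit `index` of a 32-bit mask
 * without ever shifting by 32 or more. */
bool
index_in_mask32(uint32_t mask, unsigned index)
{
   /* Assumption: an index that does not fit in the word counts as unset. */
   if (index >= 32)
      return false;

   return (mask & (UINT32_C(1) << index)) != 0;
}

/* Small self-check covering in-range and out-of-range indices. */
void
check_index_in_mask32(void)
{
   assert(index_in_mask32(0x5, 0));
   assert(!index_in_mask32(0x5, 1));
   assert(!index_in_mask32(UINT32_MAX, 32));
}
```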