fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 20:08:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	3e26fc4498	nir/opt_algebraic: disable fsat(a + 1.0) opt if a can be NaN Foz-DB Navi21: Totals from 9 (0.01% of 79789) affected shaders: Instrs: 6782 -> 6796 (+0.21%); split: -0.03%, +0.24% CodeSize: 40020 -> 40108 (+0.22%); split: -0.04%, +0.26% Latency: 23764 -> 23758 (-0.03%) InvThroughput: 6424 -> 6431 (+0.11%); split: -0.08%, +0.19% SClause: 273 -> 275 (+0.73%) Copies: 338 -> 339 (+0.30%) VALU: 5138 -> 5147 (+0.18%); split: -0.06%, +0.23% SALU: 349 -> 350 (+0.29%) SMEM: 498 -> 500 (+0.40%) Fixes: `a4a3487aae` ("nir/opt_algebraic: optimize patterns from Skia") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	a60d61cce8	nir: improve fadd is_a_number analysis by using the range Foz-DB Navi21: Totals from 145 (0.18% of 79789) affected shaders: Instrs: 168553 -> 168391 (-0.10%); split: -0.10%, +0.00% CodeSize: 926708 -> 926684 (-0.00%) Latency: 2210456 -> 2210329 (-0.01%); split: -0.01%, +0.00% InvThroughput: 545992 -> 545768 (-0.04%) SClause: 3084 -> 3085 (+0.03%) VALU: 129521 -> 129360 (-0.12%) SALU: 13085 -> 13084 (-0.01%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	a6fd9f488a	nir: add is_a_number analysis for ffma Foz-DB Navi21: Totals from 508 (0.64% of 79789) affected shaders: Instrs: 796183 -> 795838 (-0.04%) CodeSize: 4303420 -> 4303384 (-0.00%); split: -0.00%, +0.00% Latency: 7806095 -> 7805458 (-0.01%); split: -0.01%, +0.00% InvThroughput: 1377028 -> 1376824 (-0.01%); split: -0.01%, +0.00% Copies: 63297 -> 63299 (+0.00%); split: -0.00%, +0.00% PreVGPRs: 29818 -> 29819 (+0.00%) VALU: 562067 -> 561885 (-0.03%); split: -0.03%, +0.00% SALU: 89896 -> 89733 (-0.18%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	cb6d035925	nir: add range analysis for ffmaz Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:05 +00:00
Georg Lehmann	8ad695195e	nir/opt_algebraic: turn exact fmin(1.0, a) into fsat if a is not NaN and not negative Foz-DB Navi21: Totals from 2456 (3.08% of 79789) affected shaders: Instrs: 3415398 -> 3413352 (-0.06%); split: -0.06%, +0.00% CodeSize: 18781096 -> 18776092 (-0.03%); split: -0.03%, +0.00% VGPRs: 158512 -> 158528 (+0.01%) Latency: 39528900 -> 39526687 (-0.01%); split: -0.01%, +0.00% InvThroughput: 10612237 -> 10609296 (-0.03%); split: -0.03%, +0.00% VClause: 71028 -> 71034 (+0.01%) SClause: 93971 -> 93975 (+0.00%); split: -0.00%, +0.01% Copies: 257525 -> 257521 (-0.00%); split: -0.01%, +0.01% VALU: 2483374 -> 2481325 (-0.08%); split: -0.09%, +0.00% SALU: 348207 -> 348211 (+0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Georg Lehmann	18a0de1834	nir/opt_algebraic: optimize fmax(ffma(a, b, c), 0.0) to fsat Foz-DB Navi21: Totals from 2621 (3.28% of 79789) affected shaders: MaxWaves: 55744 -> 55736 (-0.01%) Instrs: 2840180 -> 2832647 (-0.27%); split: -0.27%, +0.00% CodeSize: 15497364 -> 15464692 (-0.21%); split: -0.21%, +0.00% VGPRs: 138448 -> 138456 (+0.01%) Latency: 22319512 -> 22307018 (-0.06%); split: -0.06%, +0.01% InvThroughput: 5745108 -> 5729197 (-0.28%); split: -0.28%, +0.00% Copies: 110279 -> 110268 (-0.01%); split: -0.04%, +0.03% VALU: 2210578 -> 2203211 (-0.33%); split: -0.33%, +0.00% SALU: 169014 -> 168841 (-0.10%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Georg Lehmann	f71fc26393	nir/opt_algebraic: generalize fmax(fadd(a, b), 0.0) to fsat by not requiring fneg Not a large effect, but it's positive and makes the pattern simpler. Foz-DB Navi21: Totals from 1 (0.00% of 79789) affected shaders: Instrs: 145 -> 138 (-4.83%) CodeSize: 784 -> 756 (-3.57%) Latency: 1495 -> 1487 (-0.54%) InvThroughput: 210 -> 196 (-6.67%) VALU: 103 -> 96 (-6.80%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34125>	2025-04-22 14:23:04 +00:00
Alyssa Rosenzweig	f1aeb46a34	nir: factor out nir_verts_in_output_prim helper very useful for geometry shader lowering code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34638>	2025-04-22 12:47:54 +00:00
Job Noorman	f269c7b3b5	nir/opt_shrink_vectors: enable for load_ubo_vec4 Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34600>	2025-04-18 15:56:02 +00:00
Konstantin Seurer	978e9b670e	aco,nir: Add support for new GFX12 ray tracing instructions Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can handle the new BVH format. Both instructions write up to 10 VGPRs so they need to use a vec16 definition in nir. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>	2025-04-17 20:20:40 +00:00
Caio Oliveira	33295b2249	spirv, nir: Allow non-Aliased workgroup memory blocks Allocate space for the aliased region first, then allocate the non-Aliased blocks in sequence after that. SPV_KHR_workgroup_memory_explicit_layout extension added support for having Blocks of workgroup (shared) memory, which include layout decoration. For that extension all such blocks must be decorated with Aliased. SPV_KHR_untyped_pointers extension lifts that requirement, allowing blocks that don't alias in workgroup memory. They are still explicitly laid out. The motivation is that untyped pointers provide a different mechanism to obtain the same effect as the Aliased blocks. Instead of having two Aliased variables with different types, have a single variable and use an untyped pointer with a different type to access it. This patch is a preparation for supporting untyped pointers. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:18 +00:00
Caio Oliveira	fd0a7efb5a	spirv, nir: Delay calculation of shared_size when using explicit layout Move the calculation to nir_lower_vars_to_explicit_types(). This consolidates the check of shader_info::shared_memory_explicit_layout in a single place instead of in all drivers. This is motivated by SPV_KHR_untyped_pointers. Before that extension we had essentially two modes for shared memory variables - No layout decorations in the SPIR-V, and both internal layout and driver location was _given by the driver_. - Explicitly laid out, i.e. they are blocks, and decorated with Aliased. Because they all alias, we could assign them driver location directly to the start of the shared memory. With the untyped pointers extension, there's a third option, to be added by a later commit - Explicitly laid out, i.e. they are blocks, and NOT decorated with Aliased. Driver location is _given by the driver_. Blocks with and without Aliased can be mixed. The driver location of multiple blocks that don't alias depend on alignment that is driver-specific, which we can more easily do from the nir_lower_vars_to_explicit_types() that already has access to a function to obtain such value. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (hk) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (anv/hasvk) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv) Reviewed-by: Rob Clark <robdclark@gmail.com> (tu) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34139>	2025-04-17 19:13:17 +00:00
Caio Oliveira	d5ad798140	spirv, radv, intel: Add NIR intrinsic for cmat conversion A cooperative matrix conversion operation was represented in NIR by the cmat_unary_op intrinsic with an nir_alu_op as extra parameter, that was already lowered to a specific conversion operation based on the matrix types. Instead of that, add a new intrinsic `cmat_convert` that is specific for that conversion. In addition to the src/dst matrix descriptions already available, also include the signedness information in the intrinsic (reuse nir_cmat_signed for that). This is needed because different Convert operations define different interpretations for integers, regardless their original type. In this patch, both radv and intel were changed to use the same logic that was previously used to pick the lowered ALU op. This change will help represent cmat conversions involving BFloat16, because it avoids having to create new NIR ALU ops for all the combinations involving BFloat16. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34511>	2025-04-16 23:13:36 +00:00
Ian Romanick	1d2ebeca17	nir/algebraic: Allow fmin(a,a) optimization when flush denorm to zero is not set I was surprised this had any effect on Intel GPUs because we have been unconditionally performing this optimization in the backend since June 2014. Once that error is fixed (later in this MR), this change prevents a couple dozen regressions in shader-db and around 90 regressions in fossil-db. Many of the regressions in fossil-db were loss of SIMD32, and that can be a big deal. v2: Add 64-bit too. Suggested by Alyssa. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 16970141 -> 16970139 (<.01%) instructions in affected programs: 40 -> 38 (-5.00%) helped: 2 / HURT: 0 total cycles in shared programs: 914617580 -> 914617548 (<.01%) cycles in affected programs: 3428 -> 3396 (-0.93%) helped: 2 / HURT: 0 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Cycle count: 30546028462 -> 30546025224 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 237017827 -> 237017731 (-0.00%) Totals from 83 (0.01% of 706657) affected shaders: Cycle count: 3042978 -> 3039740 (-0.11%); split: -0.13%, +0.02% Non SSA regs after NIR: 78997 -> 78901 (-0.12%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>	2025-04-15 23:59:31 +00:00
Alyssa Rosenzweig	63eb27d166	nir: add sampler LOD bias lowering this is a cleaned up version of the lowering originally written for asahi, moved to common code so it can be shared with an upcoming Vulkan implementation (not honeykrisp). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:50 +00:00
Alyssa Rosenzweig	9de7ea875d	nir: handle mismatched bias/lod bitsizes the sampler lod bias lowering uses fp16 for perf on AGX. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:49 +00:00
Alyssa Rosenzweig	2e15b42eec	nir: unvendor lod_bias(_agx) this will be useful for other backends. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34507>	2025-04-15 14:10:49 +00:00
Connor Abbott	2f93137308	nir/opt_preamble: Handle load_global_ir3 fossil-db results with turnip: Totals from 994 (0.60% of 165023) affected shaders: MaxWaves: 10720 -> 11528 (+7.54%); split: +7.57%, -0.04% Instrs: 1032004 -> 972314 (-5.78%); split: -5.99%, +0.21% CodeSize: 1847536 -> 1942472 (+5.14%); split: -0.11%, +5.25% NOPs: 261089 -> 233279 (-10.65%); split: -10.89%, +0.23% MOVs: 57217 -> 51434 (-10.11%); split: -14.11%, +4.00% Full: 16412 -> 14647 (-10.75%); split: -10.96%, +0.21% (ss): 23330 -> 25594 (+9.70%); split: -5.51%, +15.21% (sy): 17803 -> 15711 (-11.75%); split: -11.93%, +0.18% (ss)-stall: 96387 -> 107976 (+12.02%); split: -5.14%, +17.17% (sy)-stall: 952952 -> 765754 (-19.64%); split: -19.84%, +0.19% STPs: 494 -> 327 (-33.81%) LDPs: 1447 -> 1163 (-19.63%) Early-preamble: 668 -> 22 (-96.71%) Cat0: 280935 -> 251779 (-10.38%); split: -10.60%, +0.22% Cat1: 93400 -> 84766 (-9.24%); split: -11.79%, +2.55% Cat2: 343880 -> 337270 (-1.92%); split: -3.20%, +1.28% Cat3: 189311 -> 180918 (-4.43%) Cat4: 21008 -> 19920 (-5.18%) Cat5: 17788 -> 17783 (-0.03%) Cat6: 45786 -> 39531 (-13.66%) Cat7: 39896 -> 40347 (+1.13%); split: -0.43%, +1.56% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34483>	2025-04-14 16:53:34 +00:00
Erik Faye-Lund	1d5da22dfd	nir/lower_tex: avoid undefined-behavior When texture_index and sampler_index are over 32, we can't really check for them in a single 32-bit word. This happens among other things when Panfrost uses preload shaders on v9 and later. Otherwise, we trigger undefined behavior. We're already doing this for textures in one case, let's be consistent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>	2025-04-14 11:22:43 +00:00
Erik Faye-Lund	41b136f674	nir/lower_tex: use texture_mask instead of shifting on use In commit `292ac71a4a` ("nir/lower_tex: handle deref casts"), we avoided using texture_index when a texture instruction contained a variable deref. There's no good reason why this should be done to some of the lowering, but not all. So let's fix up code-paths that were added after this change to do the same. The first two patches here crossed paths with the commit that introduced texture_mask, so it's not strange that the change was missed. The last one seems to have just copied what was done around it, propagating the issue. Fixes: `880b00dc59` ("nir/lower_tex: Add support for lowering YUYV formats") Fixes: `1358d93650` ("nir/lower_tex: Add support for lowering Y41x formats") Fixes: `65d6f5aed2` ("nir: add options to lower y_vu, yv_yu, yx_xvxu and xy_vxux") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>	2025-04-14 11:22:43 +00:00
Konstantin Seurer	cb31b5a958	clc,libcl: Clean up CL includes This patch does a couple of things to make CL integration with drivers as seamless as possible: - We pull in opencl-c.h and opencl-c-base.h to stop relying on system headers. - Parts of libcl.h are moved to new headers that are incomplete CL-safe variants of libc headers. - A couple of util headers are changed to remove now unnecessary __OPENCL_VERSION__ guards and make more headers CL safe. - Drivers now include src/compiler/libcl and use headers like macros.h,u_math.h instead of libcl.h. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>	2025-04-11 21:27:37 +00:00
Caio Oliveira	2ed79f80ba	nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset Otherwise this would require combining two values to produce a single (new bit-size) channel, which vectorize_stores() doesn't handle. The pass can still keep trying smaller bit-sizes. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12946 Fixes: `ce9205c03b` ("nir: add a load/store vectorization pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34414>	2025-04-11 19:17:17 +00:00
Georg Lehmann	d046ecf95a	nir/opt_algebraic: optimize open coded ffract Foz-DB Navi21: Totals from 274 (0.34% of 79789) affected shaders: Instrs: 522630 -> 522181 (-0.09%); split: -0.09%, +0.01% CodeSize: 2880668 -> 2878940 (-0.06%); split: -0.07%, +0.01% VGPRs: 14488 -> 14464 (-0.17%) Latency: 4092358 -> 4091243 (-0.03%); split: -0.04%, +0.01% InvThroughput: 1014148 -> 1013471 (-0.07%); split: -0.07%, +0.00% VClause: 11646 -> 11639 (-0.06%) SClause: 18614 -> 18611 (-0.02%) Copies: 56248 -> 56309 (+0.11%); split: -0.05%, +0.16% PreVGPRs: 13649 -> 13647 (-0.01%) VALU: 359733 -> 359285 (-0.12%); split: -0.13%, +0.01% SALU: 59719 -> 59720 (+0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33369>	2025-04-11 12:36:02 +00:00
Konstantin Seurer	ba001626ac	nir: Turn the format string index into a const index It is already expected to be constant. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34208>	2025-04-10 19:31:37 +00:00
Boris Brezillon	4f4ac56145	pan/va: Support relaxed waits on read-only render targets On Valhall we can optimize lower waits, which wait for both readers and writers, into resource_waits which only wait for writers, allowing threads accessing read-only resources to execute concurrently. Let's use that on LD_TILE instructions so we can optimize the read-only case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	20275d6521	pan/bi: Introduce two intrinsics to support input attachment remapping In order to dynamically load the content of the tile buffer, we need to know the target (color, depth or stencil) and the conversion to apply. Let's define the load_input_attachment_{target,conv}_pan intrinsics so we can dissociate the logic lowering input attachment loads into load_converted_output_pan, and the part optimizing the shader when input attachment map is passed at compile time. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	f3be0836b7	pan/bi: Pass an explicit sampleid to load_converted_output_pan Needed if we want to lower multisample input attachment loads to tile buffer loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Boris Brezillon	cdeda45282	pan/bi: Pass load_converted_output_pan target through a source This allows us to pass a dynamic render target which will be needed to support VK_KHR_dynamic_rendering_local_read. While at it, we also enable support for depth/stencil tile loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>	2025-04-10 13:17:53 +00:00
Alyssa Rosenzweig	c2a3c70086	nir/lower_tex: use vector_insert_imm was in the area. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>	2025-04-08 19:04:47 +00:00
Alyssa Rosenzweig	c23201ad8a	nir/lower_blend: disable logic ops for unsupported formats Fixes new Vulkan CTS cases on Honeykrisp (and probably panvk and whatever) dEQP-VK.pipeline.shader_object_unlinked_binary.logic_op_na_formats.* Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>	2025-04-08 19:04:47 +00:00
Alyssa Rosenzweig	54ccc8ed0b	nir/lower_blend: refactor logicop variables This pulls out the logicop_func variable from the options struct, so we can modify it in the next commit in a central place. It then refactors out the format variable from the options struct since we end up duplicating options->format[rt] a zillion times and passing in both an options struct and a logicop func override is confusing so this will just make everything neater and self-contained next commit. no functional change. Cc'd to make the next commit cherrypickable. Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34426>	2025-04-08 19:04:46 +00:00
Faith Ekstrand	6aa2c152b8	nak,nir: Add an image_load_raw_nv intrinsic Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34336>	2025-04-08 04:06:45 +00:00
Marek Olšák	1d5c42528b	nir/opt_algebraic: lower 16-bit imul_high & umul_high Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>	2025-04-07 19:44:22 +00:00
Timothy Arceri	d8782db3a4	glsl: fix regression in ubo cloning Fixes KHR-GL46.layout_binding.block_layout_binding_block_VertexShader with radeonsi. Fixes: `2b2132d2ac` ("nir: fix uniform cloning helper") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34337>	2025-04-06 19:43:47 +10:00
Konstantin	e7a44de184	nir/tests: Do not rely on __LINE__ __LINE__ can be inconsistent when using different compilers. This patch changes the test runner to do a simple string find/replace of the test source file instead of looking for the line where the reference string starts. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33980>	2025-04-04 19:01:01 +00:00
Timur Kristóf	a530890e75	nir/print: Fix variable mode for arrayed output load intrinsics. This helps print the names of varyings correctly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	96d11d0f56	nir/opt_varyings: Fix assertion when deduplicating TCS outputs. When deduplicating TCS outputs, we may find outputs that aren't loaded by the shader itself. This previously hit a bad assertion. Fixes: `c66967b5cb` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12410 Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Timur Kristóf	a29b5857f7	nir/xfb: Preserve some xfb information when gathering from intrinsics. We need to remember which streamout buffers and streams were enabled, even if the shader doesn't actually write any outputs to them, because the API requires that we count vertices created by this shader towards queries against those streams. That information can be gathered by nir_gather_xfb_info_with_varyings from the original NIR I/O variables that we get from the frontend, but it isn't included in any intrinsics so would be otherwise lost here. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34317>	2025-04-03 19:54:51 +00:00
Faith Ekstrand	a3935c7aa2	nak,nir: Generalize nak_nir_split_64bit_conversions and move it to NIR This pass was originally based on a similar pass from Intel but it's grown support for some fancy stuff like fp64 -> fp16 conversion splitting with proper rounding. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34126>	2025-03-29 03:02:17 +00:00
Lionel Landwerlin	772beb0ebf	nir: add support for lowering non uniform texture offsets Intel HW only has support for non-uniform offsets for TG4 operations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Georg Lehmann	2b1fc1a7fe	nir: add option to keep mul24_relaxed Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33871>	2025-03-27 06:24:15 +00:00
Timothy Arceri	2b2132d2ac	nir: fix uniform cloning helper glsl allows for ubos to have the same name but different bindings. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Fixes: `b47b8d16d9` ("nir: expose reusable linking helpers for cloning uniform loads") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12852 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34138>	2025-03-25 06:54:53 +00:00
Connor Abbott	1621080df7	compiler,nir: Gather needs_full_quad_helper_invocations info This is needed on Qualcomm, where there are separate fields to enable just 3 fragments and all 4 fragments. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fixes: `264d8a6766` ("ir3: Set need_full_quad depending on info.fs.require_full_quads") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>	2025-03-14 21:55:58 +00:00
Connor Abbott	7a55e13939	nir, compiler: Rename needs_quad_helper_invocations This currently treats coarse and fine derivatives the same, but Qualcomm needs to know whether just coarse derivatives are used or fine derivatives/quad ops are also used. Rename this to needs_coarse_quad_helper_invocations to make clear the difference from the new field, needs_full_quad_helper_invocations. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `264d8a6766` ("ir3: Set need_full_quad depending on info.fs.require_full_quads") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>	2025-03-14 21:55:57 +00:00
Karol Herbst	3a9954c117	nir/serialize: fix decoding of is_return and is_uniform Fixes: `3321a56d1d` ("nir: Serialize all parameter attributes") Fixes: `26cbb6b933` ("nir: Add parameter divergence info") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34052>	2025-03-14 15:01:32 +00:00
Georg Lehmann	b386659588	nir/opt_algebraic: create ubfe from (a & mask) >> c Foz-DB Navi21: Totals from 917 (1.16% of 79188) affected shaders: Instrs: 2549482 -> 2544997 (-0.18%); split: -0.18%, +0.00% CodeSize: 13781648 -> 13763616 (-0.13%); split: -0.13%, +0.00% Latency: 24832087 -> 24825199 (-0.03%); split: -0.04%, +0.01% InvThroughput: 5921339 -> 5914799 (-0.11%); split: -0.12%, +0.01% VClause: 59910 -> 59898 (-0.02%); split: -0.02%, +0.00% SClause: 62294 -> 62293 (-0.00%) Copies: 221015 -> 220988 (-0.01%); split: -0.02%, +0.01% VALU: 1717280 -> 1713332 (-0.23%); split: -0.23%, +0.00% SALU: 359390 -> 358910 (-0.13%) VMEM: 101966 -> 101924 (-0.04%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33455>	2025-03-14 11:15:04 +00:00
Matt Turner	7534559f2f	nir: Return NULL, not false, from functions returning pointers Reported by clang's `-Wbool-conversion`. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:09 +00:00
Mary Guillemard	e0be93d881	nir: Add Panfrost specific shader_output intrinsic On Avalon, this is a bitfield that holds information on what values a vertex shader should output. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33910>	2025-03-10 07:38:16 +01:00
Alyssa Rosenzweig	bc6b527b52	nir/lower_helper_writes: fix stores after discard We need to use nir_is_helper_invocation instead of nir_load_helper_invocation, to correctly predicate stores after demote. Identified in a Piglit on AGX a year ago but I forgot to upstream this. Fixes: `586da7b329` ("nir: Add nir_lower_helper_writes pass") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33939>	2025-03-08 07:47:40 +00:00
Daniel Schürmann	dbd41e3ddd	nir: set SYSTEM_VALUE_HELPER_INVOCATION read for nir_intrinsic_is_helper_invocation is_helper_invocation is the volatile access of load_helper_invocation. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33492>	2025-03-07 15:44:49 +00:00

6139 commits