fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 22:18:18 +02:00

Author	SHA1	Message	Date
Qiang Yu	799806d85e	all: rename PIPE_SHADER_MESH_TYPES to MESA_SHADER_MESH_STAGES Use command: find . -type f -not -path '/.git/' -exec sed -i 's/\bPIPE_SHADER_MESH_TYPES\b/MESA_SHADER_MESH_STAGES/g' {} + Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:40 +08:00
Qiang Yu	7729920d92	all: rename PIPE_SHADER_MESH to MESA_SHADER_MESH Use command: find . -type f -not -path '/.git/' -exec sed -i 's/\bPIPE_SHADER_MESH\b/MESA_SHADER_MESH/g' {} + Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:39 +08:00
Qiang Yu	f60ea0a3cd	all: rename PIPE_SHADER_COMPUTE to MESA_SHADER_COMPUTE Use command: find . -type f -not -path '/.git/' -exec sed -i 's/PIPE_SHADER_COMPUTE/MESA_SHADER_COMPUTE/g' {} + Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:39 +08:00
Marek Olšák	fee8e92855	nir: use gc_ctx for nir_variable to reduce ralloc/malloc overhead gc_ctx uses a slab allocator. This reduces GLSL compile times by 1-3% with the gallium noop driver. This reduces the number of ralloc_size calls for Heaven shaders by 14.3%. Note that gc_ctx also uses ralloc_size, so the reduction is a net change. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:14 +00:00
Marek Olšák	44350bce1f	nir: add nir_variable_create_zeroed helper This will allow us to switch nir_variable from ralloc to gc_ctx, which uses a slab allocator. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:14 +00:00
Marek Olšák	b769d5dcde	nir: don't use variables as ralloc parents, use the shader instead so that we can switch variables to gc_ctx Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:13 +00:00
Marek Olšák	dadd4e4555	nir/clone: don't call ralloc_strdup with a NULL pointer for intrinsic names No impact, but it was affecting my ralloc_strdup stats for nir_intrinsic_instr names. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:13 +00:00
Marek Olšák	3c4a64e807	nir: eliminate most ralloc/malloc for nir_variable names Store small names in a fixed-sized string in nir_variable. GLSL IR does the same thing. When compiling my shader-db with the gallium noop driver, it improves GLSL compile times by 0.7% (much lower than anticipated). For Unigine Heaven shaders: - it eliminates 95.6% ralloc calls for nir_variable names - the total number of ralloc calls is reduced by 11% It also adds only 16B to nir_variable, while just the ralloc header for the name would occupy 40B. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:12 +00:00
Marek Olšák	96ffc24e4e	nir: add nir_variable_{set,append,steal}_name{f}() to modify nir_variable names Setting variable names currently always uses ralloc, but the new nir_variable_* helpers will mostly eliminate ralloc/malloc in a later commit. This just updates all places that touch nir_variable names to use the new helpers. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:12 +00:00
Marek Olšák	05749922b0	nir: don't allocate nir_constant::elements if there are none Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:11 +00:00
Dave Airlie	b1242e6b30	spirv: move cmat store barrier after the store. Fixes: `b98f87612b` ("spirv: Implement SPV_KHR_cooperative_matrix") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36583>	2025-08-05 22:28:03 +00:00
Job Noorman	ae66bd1c00	nir/opt_uniform_subgroup: use ballot_bit_count Using bit_count on the result of ballot doesn't work for targets where ballot's num_components > 1. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Fixes: `d2e1e4442a` ("ir3: enable nir_opt_uniform_subgroup") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35669>	2025-08-05 17:09:27 +00:00
Antonio Ospite	5649a0aa06	libcl: avoid calling UNREACHABLE(str) macro without arguments In commit `9ced3148ca` ("util: avoid calling UNREACHABLE(str) macro without arguments", 2025-07-30) the argument type check in the UNREACHABLE(str) macro in src/util/macros.h was improved to also avoid calling it without arguments, but the definition in src/compiler/libcl/libcl.h was not updated. Apply a similar change to src/compiler/libcl/libcl.h to keep the C and CL macros in sync. Fixes: `9ced3148ca` ("util: avoid calling UNREACHABLE(str) macro without arguments", 2025-07-30) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> on gfx8 (Polaris 20) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36508>	2025-08-04 23:15:18 +02:00
Georg Lehmann	1d885fab9c	nir/opt_algebraic: optimize pack_half_rtz of b2f Foz-DB Navi21: Totals from 13 (0.02% of 80255) affected shaders: Instrs: 2313 -> 2306 (-0.30%); split: -0.35%, +0.04% CodeSize: 13452 -> 13480 (+0.21%) Latency: 12066 -> 12013 (-0.44%); split: -0.45%, +0.01% InvThroughput: 2172 -> 2163 (-0.41%) Copies: 112 -> 114 (+1.79%) VALU: 1480 -> 1472 (-0.54%) SALU: 154 -> 155 (+0.65%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Georg Lehmann	bc3b09c5dd	nir/opt_algebraic: optimize pack_half_rtz of bcsel with constant Foz-DB Navi21: Totals from 448 (0.56% of 80255) affected shaders: Instrs: 345474 -> 344791 (-0.20%); split: -0.20%, +0.00% CodeSize: 1917784 -> 1913324 (-0.23%); split: -0.25%, +0.02% VGPRs: 22344 -> 22416 (+0.32%) Latency: 2320847 -> 2318161 (-0.12%); split: -0.13%, +0.01% InvThroughput: 543008 -> 541722 (-0.24%) SClause: 11450 -> 11459 (+0.08%) Copies: 19991 -> 19949 (-0.21%); split: -0.23%, +0.02% PreSGPRs: 19129 -> 19114 (-0.08%) PreVGPRs: 19695 -> 19696 (+0.01%); split: -0.01%, +0.01% VALU: 257627 -> 256948 (-0.26%) SALU: 30432 -> 30422 (-0.03%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Georg Lehmann	8512479097	nir/opt_algebraic: create 16bit fmin/fmax if only used by pack_half_2x16_rtz_split Foz-DB Navi21: Totals from 1842 (2.30% of 80066) affected shaders: Instrs: 869152 -> 866751 (-0.28%) CodeSize: 4687316 -> 4682496 (-0.10%); split: -0.14%, +0.03% VGPRs: 75216 -> 75312 (+0.13%) Latency: 7297749 -> 7297929 (+0.00%); split: -0.01%, +0.02% InvThroughput: 1864933 -> 1860706 (-0.23%); split: -0.23%, +0.00% Copies: 52679 -> 52463 (-0.41%) VALU: 665076 -> 662890 (-0.33%) SALU: 56226 -> 56010 (-0.38%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Georg Lehmann	22afe83473	nir/opt_algebraic: remove fneg around fmin/fmax Foz-DB Navi21: Totals from 282 (0.35% of 80255) affected shaders: Instrs: 310515 -> 309755 (-0.24%) CodeSize: 1721236 -> 1714540 (-0.39%) Latency: 1366446 -> 1365141 (-0.10%); split: -0.10%, +0.00% InvThroughput: 352528 -> 351097 (-0.41%); split: -0.41%, +0.00% Copies: 24623 -> 24630 (+0.03%) VALU: 231716 -> 230951 (-0.33%) SALU: 28774 -> 28779 (+0.02%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Rhys Perry	d4b329219e	nir/lower_memory_model: remove empty lowered barriers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>	2025-08-04 15:36:51 +00:00
Rhys Perry	0512ba8743	vtn: remove acquire/release around make visible/available barriers These are not necessary and can be expensive. I think they were added because of a misunderstanding of the informative descriptions in the Vulkan memory model, or because the memory model requires make visible/available barriers to have these semantics. Because we use these to implement MakePointerVisible/MakePointerAvailable, we can skip that requirement in NIR. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>	2025-08-04 15:36:51 +00:00
Rhys Perry	ae6e39a8f5	nir: don't move accesses across make visible/available barriers Otherwise, the barrier would no longer affect the access. nir_opt_dead_write_vars should be fine, since it's removing stores, not moving them. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>	2025-08-04 15:36:50 +00:00
Rhys Perry	d54f2ca84f	vtn: fix placement of barriers for MakeAvailable/MakeVisible From Vulkan 1.4.321 spec: The implicit availability operation is program-ordered between the barrier or atomic and all other operations program-ordered before the barrier or atomic. ... The implicit visibility operation is program-ordered between the barrier or atomic and all other operations program-ordered after the barrier or atomic. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>	2025-08-04 15:36:49 +00:00
Mary Guillemard	440e0c283c	libcl: Add stdatomic.h Useful when using C11 atomics with CL C. Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>	2025-08-04 12:12:51 +00:00
Rhys Perry	4c36e08854	glsl_to_nir,vtn: insert barriers around begin/end invocation interlock Backends probably already deal with this, but these would be needed to prevent NIR passes from moving accesses outside the critical section. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36513>	2025-08-04 09:30:06 +00:00
Marek Olšák	8462b1dc71	glsl: switch ir_variable_refcount to linear_ctx Compiling my shader-db with the gallium noop driver is 6.8% faster now. Theoretical stat-based results are below, which don't always reflect real results. When compiling Heaven shaders with the gallium noop driver, 134610 calloc calls are removed. 134610 / ralloc count = 6%, so it's roughly the equivalent of 6% of the cost of all ralloc calls that's removed. The shift from calloc to linear_alloc increases ralloc calls by 0.4%, so the approximate reduction is 6% -> 0.4% overhead change. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>	2025-08-04 02:07:00 +00:00
Marek Olšák	dfe45d1b67	glsl: switch ir_instruction to linear_ctx to eliminate malloc overhead Compiling my shader-db with the gallium noop driver is 3.6% faster now. malloc calls from ralloc+linear_alloc are reduced by 34% when compiling Heaven shaders with the gallium noop driver. That's due to a shift of malloc calls from ralloc to linear_alloc. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>	2025-08-04 02:07:00 +00:00
Marek Olšák	6b2cb71560	glsl: add support for linear_ctx into ir_instruction The type of the "new operator" parameter determines whether ir_instruction is allocated with linear_ctx or ralloc. The ralloc operators will be removed in the next commit. GCC expects classes with virtual functions to have a virtual destructor, but linear_ctx has static assertions that expects that no destructor is present. Remove the assertions, as that's our only option. The destructor is empty including in all derived classes, so it doesn't have to execute. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>	2025-08-04 02:07:00 +00:00
Marek Olšák	ae5b168051	ralloc/linalloc: allow adding custom code to LINEAR_ALLOC new operator for GLSL IR Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>	2025-08-04 02:07:00 +00:00
Marek Olšák	4f2b8e7713	glsl/tests: fix memory leaks Fixes: `09cc5f0c37` - glsl: use pipe_screen::nir_options instead of NirOptions Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>	2025-08-04 02:06:59 +00:00
Alyssa Rosenzweig	e8ff9eb9cb	nir/opt_varyings: link interpolation qualifiers Some hardware (AGX, Imagination, Arm) really want to know the interpolation qualifiers when compiling the vertex shader. Even though we need to handle this dynamic for separate shaders, we can improve performance by linking. nir_opt_varyings already has all the information to do this, so just do so. Note this has to be done in common code for Gallium, which links varyings within the GLSL linker but then presents the linked programs as separate shader objects. This models that nicely, allowing Gallium drivers to optimize without weird sidebands. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>	2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig	66740d9c91	nir: gather interpolation qualifiers we'll want this to be able to link interpolation qualifiers in a simple way with nir_opt_varyings. add the metadata for it and the FS gathering pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>	2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig	b8f50b6317	nir: gather info in opt_varyings_bulk the info is all messed up so we need to do this right after. merge this code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>	2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig	3e8575c037	nir,agx: pull lower_printf_buffer into backend no other users now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>	2025-08-03 21:27:50 +00:00
Alyssa Rosenzweig	1c28fc0a86	nir: add nir_inline_sysval pass a bunch of drivers have versions of this, might as well make a common one. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: John Anthony <john.anthony@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>	2025-08-03 21:27:47 +00:00
Emma Anholt	d5826506ce	nir,agx: Move AGX's loop (generalized) to shared NIR code. When I went to use opt_reassociate for tu, I was advised that you want to do this loop to get the best results. If everyone needs it, let's make it common code and explain what's going on. In the process, also make it skip work appropriately when there's no progress. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>	2025-08-03 20:58:28 +00:00
Emma Anholt	062a35b554	nir/lower_sample_shading: Set the sample qualifier on in vars. This is another step in setting things up, that zink would like to have. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36496>	2025-08-03 20:27:39 +00:00
Emma Anholt	d3ada77a6a	nir: Move ST's force-persample-shading NIR pass to shared code. This is about to grow a little. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36496>	2025-08-03 20:27:39 +00:00
Alyssa Rosenzweig	aca4948997	clc: force exact! across libclc libclc seems to have piles of bugs where it relies on precise floating point behaviours to meet CL precision requirements but doesn't actually disable fast math in its own spir-v. I am tired of playing this whack-a-mole game. Let's just assume that the math in CLC is right and should not be optimized in unsafe ways, and force the exact bit across libclc. This works around a large class of libclc bugs that keep cropping up from innocuous NIR changes. This does not force the exact bit for application shaders using libclc, just for the calculations inside of libclc itself. This seems like the right tradeoff all considered, anything "fast" bypasses libclc anyway. Fixes generated_tests/cl/builtin/math/builtin-float-pow-1.0.generated.cl on drivers using nir_opt_reassociate, and probably other stuff. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36527>	2025-08-01 21:00:47 +00:00
Georg Lehmann	cfd5fbfde1	nir/opt_algebraic: make fmin/fmax(a, #b) 16bit if only used by f2f16 Foz-DB Navi31: Totals from 11 out of 14 FSR4 shaders: Instrs: 58298 -> 58374 (+0.13%); split: -0.08%, +0.21% CodeSize: 397836 -> 398108 (+0.07%); split: -0.08%, +0.15% Latency: 209634 -> 211438 (+0.86%); split: -0.14%, +1.00% InvThroughput: 229152 -> 229314 (+0.07%); split: -0.03%, +0.10% VClause: 826 -> 847 (+2.54%); split: -0.36%, +2.91% Copies: 2954 -> 3040 (+2.91%); split: -1.56%, +4.47% VALU: 49637 -> 49711 (+0.15%); split: -0.06%, +0.21% VOPD: 1916 -> 1400 (-26.93%) These stats looks bad, but it's actually just unlucky RA. Replacing 1 VOPD (two v_dual_max_f32) with 1 VOP3P (v_pk_max_f16) should still be a win from a register bandwidth perspective. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:30 +00:00
Georg Lehmann	3168ebe2c5	nir/range_analysis: look through vec2 Foz-DB Navi31: Totals from 11 out of 14 FSR4 shaders: Instrs: 58987 -> 58298 (-1.17%) CodeSize: 402844 -> 397836 (-1.24%) Latency: 209630 -> 209634 (+0.00%); split: -0.66%, +0.66% InvThroughput: 230240 -> 229152 (-0.47%); split: -0.48%, +0.00% VClause: 838 -> 826 (-1.43%); split: -1.55%, +0.12% Copies: 3019 -> 2954 (-2.15%); split: -2.82%, +0.66% VALU: 50196 -> 49637 (-1.11%) VOPD: 1950 -> 1916 (-1.74%); split: +0.72%, -2.46% Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:29 +00:00
Georg Lehmann	caf89c97de	nir/range_analysis: look through f2f Foz-DB Navi31: Totals from 93 (0.12% of 80273) affected shaders: Instrs: 123927 -> 121073 (-2.30%); split: -2.30%, +0.00% CodeSize: 670832 -> 653332 (-2.61%); split: -2.61%, +0.00% Latency: 337678 -> 322803 (-4.41%); split: -4.41%, +0.00% InvThroughput: 63277 -> 61083 (-3.47%) VClause: 460 -> 373 (-18.91%) SClause: 2178 -> 2100 (-3.58%) Copies: 7637 -> 7744 (+1.40%) PreSGPRs: 4414 -> 4287 (-2.88%) PreVGPRs: 4229 -> 4230 (+0.02%) VALU: 77375 -> 75693 (-2.17%) SALU: 16497 -> 16383 (-0.69%); split: -0.73%, +0.04% VMEM: 561 -> 477 (-14.97%) SMEM: 3197 -> 3113 (-2.63%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:28 +00:00
Georg Lehmann	261239a492	nir/opt_algebraic: use range analysis to detect no-op fmin/fmax Foz-DB Navi31: Totals from 418 (0.52% of 80273) affected shaders: Instrs: 564550 -> 564387 (-0.03%); split: -0.04%, +0.01% CodeSize: 2983860 -> 2982684 (-0.04%); split: -0.05%, +0.01% Latency: 4387264 -> 4386397 (-0.02%); split: -0.02%, +0.00% InvThroughput: 717464 -> 716874 (-0.08%); split: -0.08%, +0.00% Copies: 40126 -> 40125 (-0.00%) VALU: 352128 -> 352003 (-0.04%); split: -0.04%, +0.01% SALU: 50290 -> 50283 (-0.01%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:28 +00:00
Georg Lehmann	a0665e79e9	nir/opt_algebraic: push fsat into bcsel with constant bcsel doesn't have a free clamp modifier on AMD hardware, but what's inside might have free clamp. Foz-DB Navi31: Totals from 873 (1.09% of 80273) affected shaders: MaxWaves: 22008 -> 21968 (-0.18%) Instrs: 4624956 -> 4623950 (-0.02%); split: -0.04%, +0.02% CodeSize: 24152780 -> 24142884 (-0.04%); split: -0.05%, +0.01% VGPRs: 57900 -> 57960 (+0.10%) Latency: 28762622 -> 28749889 (-0.04%); split: -0.06%, +0.02% InvThroughput: 5320810 -> 5320145 (-0.01%); split: -0.02%, +0.00% VClause: 115879 -> 115929 (+0.04%); split: -0.10%, +0.14% SClause: 93058 -> 93059 (+0.00%); split: -0.01%, +0.02% Copies: 335674 -> 335845 (+0.05%); split: -0.05%, +0.10% PreSGPRs: 53819 -> 53843 (+0.04%); split: -0.01%, +0.05% PreVGPRs: 50908 -> 50939 (+0.06%); split: -0.02%, +0.08% VALU: 2816395 -> 2815514 (-0.03%); split: -0.04%, +0.01% SALU: 509988 -> 509987 (-0.00%); split: -0.02%, +0.02% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:27 +00:00
Georg Lehmann	e9e5146848	nir/opt_algebraic: optimize fsat(fmax(a, b)) where b is not positive Foz-DB Navi31: Totals from 946 (1.18% of 80273) affected shaders: Instrs: 4986082 -> 4983988 (-0.04%); split: -0.04%, +0.00% CodeSize: 25998700 -> 25989796 (-0.03%); split: -0.04%, +0.00% Latency: 45514742 -> 45510330 (-0.01%); split: -0.01%, +0.00% InvThroughput: 8163529 -> 8162325 (-0.01%); split: -0.02%, +0.00% VClause: 112105 -> 112104 (-0.00%); split: -0.00%, +0.00% SClause: 109694 -> 109688 (-0.01%) Copies: 372356 -> 372284 (-0.02%); split: -0.03%, +0.01% Branches: 132636 -> 132633 (-0.00%) PreVGPRs: 58997 -> 58979 (-0.03%); split: -0.03%, +0.00% VALU: 3025662 -> 3024191 (-0.05%); split: -0.05%, +0.00% SALU: 551712 -> 551714 (+0.00%); split: -0.00%, +0.00% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:27 +00:00
Alyssa Rosenzweig	bcf1a1c20b	treewide: use nir_def_block Via Coccinelle patch: @@ expression definition; @@ -definition->parent_instr->block +nir_def_block(definition) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <maraeo@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>	2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig	82ae8b1d33	treewide: simplify nir_def_rewrite_uses_after Most of the time with nir_def_rewrite_uses_after, you want to rewrite after the replacement. Make that the default thing to be more ergonomic and to drop parent_instr uses. We leave nir_def_rewrite_uses_after_instr defined if you really want the old signature with an arbitrary after point. Via Coccinelle patch: @@ expression a, b; @@ -nir_def_rewrite_uses_after(a, b, b->parent_instr) +nir_def_rewrite_uses_after_def(a, b) Followed by a bunch of sed. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <maraeo@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>	2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig	cc6e3b84cb	treewide: use nir_def_as_* Via Coccinelle patch: @@ expression definition; @@ -nir_instr_as_alu(definition->parent_instr) +nir_def_as_alu(definition) @@ expression definition; @@ -nir_instr_as_intrinsic(definition->parent_instr) +nir_def_as_intrinsic(definition) @@ expression definition; @@ -nir_instr_as_phi(definition->parent_instr) +nir_def_as_phi(definition) @@ expression definition; @@ -nir_instr_as_load_const(definition->parent_instr) +nir_def_as_load_const(definition) @@ expression definition; @@ -nir_instr_as_deref(definition->parent_instr) +nir_def_as_deref(definition) @@ expression definition; @@ -nir_instr_as_tex(definition->parent_instr) +nir_def_as_tex(definition) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <maraeo@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>	2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig	114bf69956	nir: add nir_def_block helper Another common composition. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <maraeo@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>	2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig	3624f054f2	nir: add nir_def_as_* helpers We want to get rid of nir_def::parent_instr eventually, requiring an accessor function instead nir_def_parent_instr(def), so to mitigate the hit to NIR ergonomics, let's add helpers for common patterns using parent_instr. This gets us an immediate win for NIR ergonomics and then reduces the surface area for the later flag day hiding parent_instr. This commit starts us off by adding compositions for nir_instr_as_* with parent_instr's, which are common. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <maraeo@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>	2025-08-01 15:34:24 +00:00
Lionel Landwerlin	83cb02206c	compiler: add gl_shader_stage_is_graphics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>	2025-08-01 11:35:00 +00:00
Marek Olšák	c64c6a0c31	nir/opt_group_loads: support tex instructions without resource srcs for i915 Fixes: `aa732f6f` - nir/group_loads: handle more loads (or a later commit) Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13624 Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36498>	2025-07-31 23:30:20 -04:00


10886 commits