fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 22:08:10 +02:00

Author	SHA1	Message	Date
Georg Lehmann	de3d04dd72	nir/uub: guard against division by 0 Fixes: `8ee5440073` ("nir/uub: improve ishl/imul with constant sources") Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36805>	2025-08-19 15:49:57 +00:00
Daniel Schürmann	8c8fc7d058	nir/opt_load_store_vectorize: don't vectorize large shared2_amd loads for performance reasons. Totals from 180 (0.23% of 79839) affected shaders: (Navi48) Instrs: 288089 -> 289937 (+0.64%); split: -0.00%, +0.64% CodeSize: 1515884 -> 1527936 (+0.80%); split: -0.00%, +0.80% VGPRs: 10740 -> 10704 (-0.34%) Latency: 1477965 -> 1478591 (+0.04%); split: -0.09%, +0.14% InvThroughput: 467449 -> 467885 (+0.09%); split: -0.02%, +0.11% VClause: 5012 -> 5010 (-0.04%); split: -0.08%, +0.04% SClause: 6509 -> 6512 (+0.05%); split: -0.02%, +0.06% Copies: 20815 -> 20923 (+0.52%); split: -0.28%, +0.80% Branches: 6019 -> 6018 (-0.02%) PreSGPRs: 7670 -> 7669 (-0.01%) PreVGPRs: 7239 -> 7192 (-0.65%) VALU: 151763 -> 152011 (+0.16%); split: -0.04%, +0.20% SALU: 39199 -> 39202 (+0.01%) VOPD: 877 -> 861 (-1.82%); split: +0.57%, -2.39% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133>	2025-08-19 14:28:14 +00:00
Daniel Schürmann	957b271a9f	nir/opt_load_store_vectorize: only attempt to vectorize shared2 after exhausting other possibilities Totals from 249 (0.31% of 79839) affected shaders: (Navi48) Instrs: 276401 -> 275918 (-0.17%); split: -0.29%, +0.11% CodeSize: 1477072 -> 1474440 (-0.18%); split: -0.26%, +0.08% VGPRs: 12748 -> 12760 (+0.09%); split: -0.28%, +0.38% Latency: 1397959 -> 1398846 (+0.06%); split: -0.10%, +0.16% InvThroughput: 424767 -> 424496 (-0.06%); split: -0.09%, +0.02% VClause: 5183 -> 5186 (+0.06%); split: -0.10%, +0.15% SClause: 6537 -> 6538 (+0.02%); split: -0.05%, +0.06% Copies: 21295 -> 21098 (-0.93%); split: -1.21%, +0.29% Branches: 4324 -> 4325 (+0.02%) PreSGPRs: 9719 -> 9717 (-0.02%) PreVGPRs: 8857 -> 8847 (-0.11%); split: -0.24%, +0.12% VALU: 144514 -> 144334 (-0.12%); split: -0.20%, +0.07% SALU: 38970 -> 38944 (-0.07%); split: -0.08%, +0.01% VOPD: 884 -> 898 (+1.58%); split: +1.92%, -0.34% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133>	2025-08-19 14:28:14 +00:00
Gert Wollny	8c65da0c9d	r600/sfn: cleanup GS shader emission Now that we lower all load_per_vertex_input to r600_load_per_vertex_input we can remove some dead code and also change the intrinsic to use only one source value. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36488>	2025-08-12 14:30:17 +00:00
Georg Lehmann	8818d7367d	nir/opt_load_skip_helpers: optionally handle intrinsics Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Georg Lehmann	cd687e277f	nir: add access for scratch loads To be able to use ACCESS_SKIP_HELPERS. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Georg Lehmann	2d16f457c5	nir: add ACCESS_SKIP_HELPERS Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Georg Lehmann	91572a99bb	nir: rename to nir_opt_load_skip_helpers and add options struct Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Georg Lehmann	fbae0893a6	nir: print skip_helpers for tex instrs Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Georg Lehmann	6577f68ad4	nir/opt_tex_skip_helpers: never require helpers for stores/atomics Helpers never execute stores/atomics. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Georg Lehmann	26e6c4c092	nir/opt_tex_skip_helpers: don't skip helpers for terminate_if source Helpers must be terminated correctly. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610>	2025-08-12 08:56:37 +00:00
Qiang Yu	bfd7f498a5	nir/opt_varying: remove assert for mesh shader crash This assert is not true when mesh shader. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36596>	2025-08-11 01:44:45 +00:00
Alyssa Rosenzweig	8566a566e6	nir: plumb ballot options glsl needs to plumb this from the backend. we should clean up nir_lower_subgroups to use this later but I don't have time to churn everything right now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36649>	2025-08-08 20:51:03 +00:00
Alyssa Rosenzweig	1af0897452	nir/lower_subgroups: add lower_fp64 option This is needed for doubles lowering to do the right thing. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36649>	2025-08-08 20:51:03 +00:00
John Anthony	000bd3046d	nir,spirv: Add support for SPV_ARM_core_builtins Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019>	2025-08-07 11:46:33 +02:00
John Anthony	a68a825aad	nir,agx: unvendor core_id_agx core_id will be used by SPV_ARM_core_builtins Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019>	2025-08-07 11:46:33 +02:00
Qiang Yu	c135ed1eb9	all: rename gl_shader_stage_name to mesa_shader_stage_name Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:41 +08:00
Qiang Yu	807d693421	compiler: rename gl_shader_stage_is_callable to mesa_shader_stage_is_callable Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:41 +08:00
Qiang Yu	4847e0b380	all: rename gl_shader_stage_uses_workgroup to mesa_shader_stage_uses_workgroup Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:41 +08:00
Qiang Yu	7a91473192	all: rename gl_shader_stage_is_compute to mesa_shader_stage_is_compute Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:41 +08:00
Qiang Yu	196569b1a4	all: rename gl_shader_stage to mesa_shader_stage It's not only for GL, change to a generic name. Use command: find . -type f -not -path '/.git/' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} + Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:40 +08:00
Marek Olšák	fee8e92855	nir: use gc_ctx for nir_variable to reduce ralloc/malloc overhead gc_ctx uses a slab allocator. This reduces GLSL compile times by 1-3% with the gallium noop driver. This reduces the number of ralloc_size calls for Heaven shaders by 14.3%. Note that gc_ctx also uses ralloc_size, so the reduction is a net change. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:14 +00:00
Marek Olšák	44350bce1f	nir: add nir_variable_create_zeroed helper This will allow us to switch nir_variable from ralloc to gc_ctx, which uses a slab allocator. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:14 +00:00
Marek Olšák	b769d5dcde	nir: don't use variables as ralloc parents, use the shader instead so that we can switch variables to gc_ctx Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:13 +00:00
Marek Olšák	dadd4e4555	nir/clone: don't call ralloc_strdup with a NULL pointer for intrinsic names No impact, but it was affecting my ralloc_strdup stats for nir_intrinsic_instr names. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:13 +00:00
Marek Olšák	3c4a64e807	nir: eliminate most ralloc/malloc for nir_variable names Store small names in a fixed-sized string in nir_variable. GLSL IR does the same thing. When compiling my shader-db with the gallium noop driver, it improves GLSL compile times by 0.7% (much lower than anticipated). For Unigine Heaven shaders: - it eliminates 95.6% ralloc calls for nir_variable names - the total number of ralloc calls is reduced by 11% It also adds only 16B to nir_variable, while just the ralloc header for the name would occupy 40B. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:12 +00:00
Marek Olšák	96ffc24e4e	nir: add nir_variable_{set,append,steal}_name{f}() to modify nir_variable names Setting variable names currently always uses ralloc, but the new nir_variable_* helpers will mostly eliminate ralloc/malloc in a later commit. This just updates all places that touch nir_variable names to use the new helpers. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:12 +00:00
Marek Olšák	05749922b0	nir: don't allocate nir_constant::elements if there are none Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>	2025-08-05 22:55:11 +00:00
Job Noorman	ae66bd1c00	nir/opt_uniform_subgroup: use ballot_bit_count Using bit_count on the result of ballot doesn't work for targets where ballot's num_components > 1. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Fixes: `d2e1e4442a` ("ir3: enable nir_opt_uniform_subgroup") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35669>	2025-08-05 17:09:27 +00:00
Georg Lehmann	1d885fab9c	nir/opt_algebraic: optimize pack_half_rtz of b2f Foz-DB Navi21: Totals from 13 (0.02% of 80255) affected shaders: Instrs: 2313 -> 2306 (-0.30%); split: -0.35%, +0.04% CodeSize: 13452 -> 13480 (+0.21%) Latency: 12066 -> 12013 (-0.44%); split: -0.45%, +0.01% InvThroughput: 2172 -> 2163 (-0.41%) Copies: 112 -> 114 (+1.79%) VALU: 1480 -> 1472 (-0.54%) SALU: 154 -> 155 (+0.65%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Georg Lehmann	bc3b09c5dd	nir/opt_algebraic: optimize pack_half_rtz of bcsel with constant Foz-DB Navi21: Totals from 448 (0.56% of 80255) affected shaders: Instrs: 345474 -> 344791 (-0.20%); split: -0.20%, +0.00% CodeSize: 1917784 -> 1913324 (-0.23%); split: -0.25%, +0.02% VGPRs: 22344 -> 22416 (+0.32%) Latency: 2320847 -> 2318161 (-0.12%); split: -0.13%, +0.01% InvThroughput: 543008 -> 541722 (-0.24%) SClause: 11450 -> 11459 (+0.08%) Copies: 19991 -> 19949 (-0.21%); split: -0.23%, +0.02% PreSGPRs: 19129 -> 19114 (-0.08%) PreVGPRs: 19695 -> 19696 (+0.01%); split: -0.01%, +0.01% VALU: 257627 -> 256948 (-0.26%) SALU: 30432 -> 30422 (-0.03%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Georg Lehmann	8512479097	nir/opt_algebraic: create 16bit fmin/fmax if only used by pack_half_2x16_rtz_split Foz-DB Navi21: Totals from 1842 (2.30% of 80066) affected shaders: Instrs: 869152 -> 866751 (-0.28%) CodeSize: 4687316 -> 4682496 (-0.10%); split: -0.14%, +0.03% VGPRs: 75216 -> 75312 (+0.13%) Latency: 7297749 -> 7297929 (+0.00%); split: -0.01%, +0.02% InvThroughput: 1864933 -> 1860706 (-0.23%); split: -0.23%, +0.00% Copies: 52679 -> 52463 (-0.41%) VALU: 665076 -> 662890 (-0.33%) SALU: 56226 -> 56010 (-0.38%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Georg Lehmann	22afe83473	nir/opt_algebraic: remove fneg around fmin/fmax Foz-DB Navi21: Totals from 282 (0.35% of 80255) affected shaders: Instrs: 310515 -> 309755 (-0.24%) CodeSize: 1721236 -> 1714540 (-0.39%) Latency: 1366446 -> 1365141 (-0.10%); split: -0.10%, +0.00% InvThroughput: 352528 -> 351097 (-0.41%); split: -0.41%, +0.00% Copies: 24623 -> 24630 (+0.03%) VALU: 231716 -> 230951 (-0.33%) SALU: 28774 -> 28779 (+0.02%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535>	2025-08-04 19:42:22 +00:00
Rhys Perry	d4b329219e	nir/lower_memory_model: remove empty lowered barriers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>	2025-08-04 15:36:51 +00:00
Rhys Perry	ae6e39a8f5	nir: don't move accesses across make visible/available barriers Otherwise, the barrier would no longer affect the access. nir_opt_dead_write_vars should be fine, since it's removing stores, not moving them. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080>	2025-08-04 15:36:50 +00:00
Alyssa Rosenzweig	e8ff9eb9cb	nir/opt_varyings: link interpolation qualifiers Some hardware (AGX, Imagination, Arm) really want to know the interpolation qualifiers when compiling the vertex shader. Even though we need to handle this dynamic for separate shaders, we can improve performance by linking. nir_opt_varyings already has all the information to do this, so just do so. Note this has to be done in common code for Gallium, which links varyings within the GLSL linker but then presents the linked programs as separate shader objects. This models that nicely, allowing Gallium drivers to optimize without weird sidebands. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>	2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig	66740d9c91	nir: gather interpolation qualifiers we'll want this to be able to link interpolation qualifiers in a simple way with nir_opt_varyings. add the metadata for it and the FS gathering pass. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>	2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig	b8f50b6317	nir: gather info in opt_varyings_bulk the info is all messed up so we need to do this right after. merge this code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>	2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig	3e8575c037	nir,agx: pull lower_printf_buffer into backend no other users now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>	2025-08-03 21:27:50 +00:00
Alyssa Rosenzweig	1c28fc0a86	nir: add nir_inline_sysval pass a bunch of drivers have versions of this, might as well make a common one. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: John Anthony <john.anthony@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>	2025-08-03 21:27:47 +00:00
Emma Anholt	d5826506ce	nir,agx: Move AGX's loop (generalized) to shared NIR code. When I went to use opt_reassociate for tu, I was advised that you want to do this loop to get the best results. If everyone needs it, let's make it common code and explain what's going on. In the process, also make it skip work appropriately when there's no progress. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>	2025-08-03 20:58:28 +00:00
Emma Anholt	062a35b554	nir/lower_sample_shading: Set the sample qualifier on in vars. This is another step in setting things up, that zink would like to have. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36496>	2025-08-03 20:27:39 +00:00
Emma Anholt	d3ada77a6a	nir: Move ST's force-persample-shading NIR pass to shared code. This is about to grow a little. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36496>	2025-08-03 20:27:39 +00:00
Georg Lehmann	cfd5fbfde1	nir/opt_algebraic: make fmin/fmax(a, #b) 16bit if only used by f2f16 Foz-DB Navi31: Totals from 11 out of 14 FSR4 shaders: Instrs: 58298 -> 58374 (+0.13%); split: -0.08%, +0.21% CodeSize: 397836 -> 398108 (+0.07%); split: -0.08%, +0.15% Latency: 209634 -> 211438 (+0.86%); split: -0.14%, +1.00% InvThroughput: 229152 -> 229314 (+0.07%); split: -0.03%, +0.10% VClause: 826 -> 847 (+2.54%); split: -0.36%, +2.91% Copies: 2954 -> 3040 (+2.91%); split: -1.56%, +4.47% VALU: 49637 -> 49711 (+0.15%); split: -0.06%, +0.21% VOPD: 1916 -> 1400 (-26.93%) These stats looks bad, but it's actually just unlucky RA. Replacing 1 VOPD (two v_dual_max_f32) with 1 VOP3P (v_pk_max_f16) should still be a win from a register bandwidth perspective. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:30 +00:00
Georg Lehmann	3168ebe2c5	nir/range_analysis: look through vec2 Foz-DB Navi31: Totals from 11 out of 14 FSR4 shaders: Instrs: 58987 -> 58298 (-1.17%) CodeSize: 402844 -> 397836 (-1.24%) Latency: 209630 -> 209634 (+0.00%); split: -0.66%, +0.66% InvThroughput: 230240 -> 229152 (-0.47%); split: -0.48%, +0.00% VClause: 838 -> 826 (-1.43%); split: -1.55%, +0.12% Copies: 3019 -> 2954 (-2.15%); split: -2.82%, +0.66% VALU: 50196 -> 49637 (-1.11%) VOPD: 1950 -> 1916 (-1.74%); split: +0.72%, -2.46% Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:29 +00:00
Georg Lehmann	caf89c97de	nir/range_analysis: look through f2f Foz-DB Navi31: Totals from 93 (0.12% of 80273) affected shaders: Instrs: 123927 -> 121073 (-2.30%); split: -2.30%, +0.00% CodeSize: 670832 -> 653332 (-2.61%); split: -2.61%, +0.00% Latency: 337678 -> 322803 (-4.41%); split: -4.41%, +0.00% InvThroughput: 63277 -> 61083 (-3.47%) VClause: 460 -> 373 (-18.91%) SClause: 2178 -> 2100 (-3.58%) Copies: 7637 -> 7744 (+1.40%) PreSGPRs: 4414 -> 4287 (-2.88%) PreVGPRs: 4229 -> 4230 (+0.02%) VALU: 77375 -> 75693 (-2.17%) SALU: 16497 -> 16383 (-0.69%); split: -0.73%, +0.04% VMEM: 561 -> 477 (-14.97%) SMEM: 3197 -> 3113 (-2.63%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:28 +00:00
Georg Lehmann	261239a492	nir/opt_algebraic: use range analysis to detect no-op fmin/fmax Foz-DB Navi31: Totals from 418 (0.52% of 80273) affected shaders: Instrs: 564550 -> 564387 (-0.03%); split: -0.04%, +0.01% CodeSize: 2983860 -> 2982684 (-0.04%); split: -0.05%, +0.01% Latency: 4387264 -> 4386397 (-0.02%); split: -0.02%, +0.00% InvThroughput: 717464 -> 716874 (-0.08%); split: -0.08%, +0.00% Copies: 40126 -> 40125 (-0.00%) VALU: 352128 -> 352003 (-0.04%); split: -0.04%, +0.01% SALU: 50290 -> 50283 (-0.01%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:28 +00:00
Georg Lehmann	a0665e79e9	nir/opt_algebraic: push fsat into bcsel with constant bcsel doesn't have a free clamp modifier on AMD hardware, but what's inside might have free clamp. Foz-DB Navi31: Totals from 873 (1.09% of 80273) affected shaders: MaxWaves: 22008 -> 21968 (-0.18%) Instrs: 4624956 -> 4623950 (-0.02%); split: -0.04%, +0.02% CodeSize: 24152780 -> 24142884 (-0.04%); split: -0.05%, +0.01% VGPRs: 57900 -> 57960 (+0.10%) Latency: 28762622 -> 28749889 (-0.04%); split: -0.06%, +0.02% InvThroughput: 5320810 -> 5320145 (-0.01%); split: -0.02%, +0.00% VClause: 115879 -> 115929 (+0.04%); split: -0.10%, +0.14% SClause: 93058 -> 93059 (+0.00%); split: -0.01%, +0.02% Copies: 335674 -> 335845 (+0.05%); split: -0.05%, +0.10% PreSGPRs: 53819 -> 53843 (+0.04%); split: -0.01%, +0.05% PreVGPRs: 50908 -> 50939 (+0.06%); split: -0.02%, +0.08% VALU: 2816395 -> 2815514 (-0.03%); split: -0.04%, +0.01% SALU: 509988 -> 509987 (-0.00%); split: -0.02%, +0.02% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:27 +00:00
Georg Lehmann	e9e5146848	nir/opt_algebraic: optimize fsat(fmax(a, b)) where b is not positive Foz-DB Navi31: Totals from 946 (1.18% of 80273) affected shaders: Instrs: 4986082 -> 4983988 (-0.04%); split: -0.04%, +0.00% CodeSize: 25998700 -> 25989796 (-0.03%); split: -0.04%, +0.00% Latency: 45514742 -> 45510330 (-0.01%); split: -0.01%, +0.00% InvThroughput: 8163529 -> 8162325 (-0.01%); split: -0.02%, +0.00% VClause: 112105 -> 112104 (-0.00%); split: -0.00%, +0.00% SClause: 109694 -> 109688 (-0.01%) Copies: 372356 -> 372284 (-0.02%); split: -0.03%, +0.01% Branches: 132636 -> 132633 (-0.00%) PreVGPRs: 58997 -> 58979 (-0.03%); split: -0.03%, +0.00% VALU: 3025662 -> 3024191 (-0.05%); split: -0.05%, +0.00% SALU: 551712 -> 551714 (+0.00%); split: -0.00%, +0.00% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468>	2025-08-01 20:29:27 +00:00
Alyssa Rosenzweig	bcf1a1c20b	treewide: use nir_def_block Via Coccinelle patch: @@ expression definition; @@ -definition->parent_instr->block +nir_def_block(definition) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Marek Olšák <maraeo@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>	2025-08-01 15:34:24 +00:00
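
Commit 3c4a64e807 above describes storing small variable names in a fixed-size string inside nir_variable so that most names need no separate allocation. The following is a minimal sketch of that general technique, not Mesa's code: the struct `named_var`, the functions `var_set_name` and `var_get_name`, and the 16-byte capacity are all hypothetical names and values chosen for illustration (the commit only says the change adds 16B to nir_variable), and plain malloc stands in for ralloc.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity, including the terminating NUL. */
#define SMALL_NAME_CAP 16

/* Hypothetical stand-in for a variable that owns its name. */
struct named_var {
    char *heap_name;                 /* NULL while the name fits inline */
    char small_name[SMALL_NAME_CAP]; /* inline storage for short names */
};

/* Set the name, using the inline buffer when the string fits and
 * falling back to a heap allocation only for long names. */
static void var_set_name(struct named_var *v, const char *name)
{
    free(v->heap_name);              /* free(NULL) is a no-op */
    v->heap_name = NULL;
    v->small_name[0] = '\0';

    if (name == NULL)
        return;

    size_t len = strlen(name);
    if (len < SMALL_NAME_CAP) {
        memcpy(v->small_name, name, len + 1);
    } else {
        v->heap_name = malloc(len + 1);
        if (v->heap_name != NULL)
            memcpy(v->heap_name, name, len + 1);
    }
}

/* Return the current name; an unset name reads as the empty string. */
static const char *var_get_name(const struct named_var *v)
{
    return v->heap_name != NULL ? v->heap_name : v->small_name;
}
```

In this sketch a zero-initialized `struct named_var` is a valid variable with an empty name. Routing every read through an accessor is what lets callers ignore which storage is in use, which is presumably why the series first introduces name helpers (96ffc24e4e) before changing the storage.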

6842 commits