Georg Lehmann
de3d04dd72
nir/uub: guard against division by 0
...
Fixes: 8ee5440073 ("nir/uub: improve ishl/imul with constant sources")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36805 >
2025-08-19 15:49:57 +00:00
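Note on the nir/uub fix above: the log does not show the patched code, so the following is only a minimal sketch of the kind of guard the subject line describes. The helper name `demo_uub_imul_const` and its logic are assumptions for illustration, not Mesa's implementation. It computes a conservative unsigned upper bound for `x * c` with a constant `c`, where the overflow check divides by `c` and therefore needs a zero check first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa's API: conservative unsigned upper bound
 * of (x * c), given that x <= x_bound and c is a known constant. */
static uint32_t demo_uub_imul_const(uint32_t x_bound, uint32_t c)
{
   /* Guard first: without it, UINT32_MAX / c below would divide by zero. */
   if (c == 0)
      return 0; /* x * 0 is always 0 */

   /* If x_bound * c could wrap, no bound tighter than UINT32_MAX is safe. */
   if (x_bound > UINT32_MAX / c)
      return UINT32_MAX;

   /* Sanity check: the product below cannot overflow at this point. */
   assert(x_bound <= UINT32_MAX / c);
   return x_bound * c;
}
```

With this sketch, a zero constant yields a bound of 0 instead of a crash, and a product that could wrap falls back to `UINT32_MAX`.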
Daniel Schürmann
8c8fc7d058
nir/opt_load_store_vectorize: don't vectorize large shared2_amd loads
...
for performance reasons.
Totals from 180 (0.23% of 79839) affected shaders: (Navi48)
Instrs: 288089 -> 289937 (+0.64%); split: -0.00%, +0.64%
CodeSize: 1515884 -> 1527936 (+0.80%); split: -0.00%, +0.80%
VGPRs: 10740 -> 10704 (-0.34%)
Latency: 1477965 -> 1478591 (+0.04%); split: -0.09%, +0.14%
InvThroughput: 467449 -> 467885 (+0.09%); split: -0.02%, +0.11%
VClause: 5012 -> 5010 (-0.04%); split: -0.08%, +0.04%
SClause: 6509 -> 6512 (+0.05%); split: -0.02%, +0.06%
Copies: 20815 -> 20923 (+0.52%); split: -0.28%, +0.80%
Branches: 6019 -> 6018 (-0.02%)
PreSGPRs: 7670 -> 7669 (-0.01%)
PreVGPRs: 7239 -> 7192 (-0.65%)
VALU: 151763 -> 152011 (+0.16%); split: -0.04%, +0.20%
SALU: 39199 -> 39202 (+0.01%)
VOPD: 877 -> 861 (-1.82%); split: +0.57%, -2.39%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133 >
2025-08-19 14:28:14 +00:00
Daniel Schürmann
957b271a9f
nir/opt_load_store_vectorize: only attempt to vectorize shared2 after exhausting other possibilities
...
Totals from 249 (0.31% of 79839) affected shaders: (Navi48)
Instrs: 276401 -> 275918 (-0.17%); split: -0.29%, +0.11%
CodeSize: 1477072 -> 1474440 (-0.18%); split: -0.26%, +0.08%
VGPRs: 12748 -> 12760 (+0.09%); split: -0.28%, +0.38%
Latency: 1397959 -> 1398846 (+0.06%); split: -0.10%, +0.16%
InvThroughput: 424767 -> 424496 (-0.06%); split: -0.09%, +0.02%
VClause: 5183 -> 5186 (+0.06%); split: -0.10%, +0.15%
SClause: 6537 -> 6538 (+0.02%); split: -0.05%, +0.06%
Copies: 21295 -> 21098 (-0.93%); split: -1.21%, +0.29%
Branches: 4324 -> 4325 (+0.02%)
PreSGPRs: 9719 -> 9717 (-0.02%)
PreVGPRs: 8857 -> 8847 (-0.11%); split: -0.24%, +0.12%
VALU: 144514 -> 144334 (-0.12%); split: -0.20%, +0.07%
SALU: 38970 -> 38944 (-0.07%); split: -0.08%, +0.01%
VOPD: 884 -> 898 (+1.58%); split: +1.92%, -0.34%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36133 >
2025-08-19 14:28:14 +00:00
Gert Wollny
8c65da0c9d
r600/sfn: cleanup GS shader emission
...
Now that we lower all load_per_vertex_input to
r600_load_per_vertex_input we can remove some dead code
and also change the intrinsic to use only one source value.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36488 >
2025-08-12 14:30:17 +00:00
Georg Lehmann
8818d7367d
nir/opt_load_skip_helpers: optionally handle intrinsics
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Georg Lehmann
cd687e277f
nir: add access for scratch loads
...
To be able to use ACCESS_SKIP_HELPERS.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Georg Lehmann
2d16f457c5
nir: add ACCESS_SKIP_HELPERS
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Georg Lehmann
91572a99bb
nir: rename to nir_opt_load_skip_helpers and add options struct
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Georg Lehmann
fbae0893a6
nir: print skip_helpers for tex instrs
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Georg Lehmann
6577f68ad4
nir/opt_tex_skip_helpers: never require helpers for stores/atomics
...
Helpers never execute stores/atomics.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Georg Lehmann
26e6c4c092
nir/opt_tex_skip_helpers: don't skip helpers for terminate_if source
...
Helpers must be terminated correctly.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36610 >
2025-08-12 08:56:37 +00:00
Qiang Yu
bfd7f498a5
nir/opt_varying: remove assert for mesh shader crash
...
This assert does not hold for mesh shaders.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36596 >
2025-08-11 01:44:45 +00:00
Alyssa Rosenzweig
8566a566e6
nir: plumb ballot options
...
glsl needs to plumb this from the backend. we should clean up
nir_lower_subgroups to use this later but I don't have time to churn everything
right now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36649 >
2025-08-08 20:51:03 +00:00
Alyssa Rosenzweig
1af0897452
nir/lower_subgroups: add lower_fp64 option
...
This is needed for doubles lowering to do the right thing.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36649 >
2025-08-08 20:51:03 +00:00
John Anthony
000bd3046d
nir,spirv: Add support for SPV_ARM_core_builtins
...
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019 >
2025-08-07 11:46:33 +02:00
John Anthony
a68a825aad
nir,agx: unvendor core_id_agx
...
core_id will be used by SPV_ARM_core_builtins
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019 >
2025-08-07 11:46:33 +02:00
Qiang Yu
c135ed1eb9
all: rename gl_shader_stage_name to mesa_shader_stage_name
...
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569 >
2025-08-06 10:28:41 +08:00
Qiang Yu
807d693421
compiler: rename gl_shader_stage_is_callable to mesa_shader_stage_is_callable
...
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569 >
2025-08-06 10:28:41 +08:00
Qiang Yu
4847e0b380
all: rename gl_shader_stage_uses_workgroup to mesa_shader_stage_uses_workgroup
...
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569 >
2025-08-06 10:28:41 +08:00
Qiang Yu
7a91473192
all: rename gl_shader_stage_is_compute to mesa_shader_stage_is_compute
...
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569 >
2025-08-06 10:28:41 +08:00
Qiang Yu
196569b1a4
all: rename gl_shader_stage to mesa_shader_stage
...
It's not only for GL, so change it to a generic name.
Command used:
find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569 >
2025-08-06 10:28:40 +08:00
Marek Olšák
fee8e92855
nir: use gc_ctx for nir_variable to reduce ralloc/malloc overhead
...
gc_ctx uses a slab allocator. This reduces GLSL compile times by 1-3%
with the gallium noop driver.
This reduces the number of ralloc_size calls for Heaven shaders by 14.3%.
Note that gc_ctx also uses ralloc_size, so the reduction is a net change.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:14 +00:00
Marek Olšák
44350bce1f
nir: add nir_variable_create_zeroed helper
...
This will allow us to switch nir_variable from ralloc to gc_ctx,
which uses a slab allocator.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:14 +00:00
Marek Olšák
b769d5dcde
nir: don't use variables as ralloc parents, use the shader instead
...
so that we can switch variables to gc_ctx
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:13 +00:00
Marek Olšák
dadd4e4555
nir/clone: don't call ralloc_strdup with a NULL pointer for intrinsic names
...
No impact, but it was affecting my ralloc_strdup stats for
nir_intrinsic_instr names.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:13 +00:00
Marek Olšák
3c4a64e807
nir: eliminate most ralloc/malloc for nir_variable names
...
Store small names in a fixed-sized string in nir_variable.
GLSL IR does the same thing.
When compiling my shader-db with the gallium noop driver, it improves GLSL
compile times by 0.7% (much lower than anticipated).
For Unigine Heaven shaders:
- it eliminates 95.6% ralloc calls for nir_variable names
- the total number of ralloc calls is reduced by 11%
It also adds only 16B to nir_variable, while just the ralloc header
for the name would occupy 40B.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:12 +00:00
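Note on the small-name storage described in the entry above: a minimal sketch of the general technique (keep short names in a fixed-size inline buffer, fall back to the heap for long ones). The struct layout, the 16-byte size used here, and the `demo_*` names are assumptions for illustration. The real `nir_variable` fields and helpers may differ, and Mesa would use ralloc rather than `malloc`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SMALL_NAME_SIZE 16 /* assumed size, for illustration only */

struct demo_variable {
   char *name;                             /* points at small_name or heap */
   char small_name[DEMO_SMALL_NAME_SIZE];  /* inline storage for short names */
};

static int demo_name_is_inline(const struct demo_variable *var)
{
   return var->name == var->small_name;
}

static void demo_variable_set_name(struct demo_variable *var, const char *name)
{
   assert(var != NULL);

   /* Release a previous heap-allocated name, if any. */
   if (var->name != NULL && !demo_name_is_inline(var))
      free(var->name);

   if (name == NULL) {
      var->name = NULL;
      return;
   }

   size_t len = strlen(name);
   if (len < DEMO_SMALL_NAME_SIZE) {
      /* Short name: no allocation, copy including the terminator. */
      memcpy(var->small_name, name, len + 1);
      var->name = var->small_name;
   } else {
      /* Long name: fall back to a heap allocation. */
      var->name = malloc(len + 1);
      if (var->name != NULL)
         memcpy(var->name, name, len + 1);
   }
}
```

One consequence of this design is that a struct whose `name` points at its own inline buffer cannot be copied bytewise without fixing up the pointer, which is one reason to route all name changes through helpers.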
Marek Olšák
96ffc24e4e
nir: add nir_variable_{set,append,steal}_name{f}() to modify nir_variable names
...
Setting variable names currently always uses ralloc, but the new
nir_variable_* helpers will mostly eliminate ralloc/malloc in a later
commit.
This just updates all places that touch nir_variable names to use the new
helpers.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:12 +00:00
Marek Olšák
05749922b0
nir: don't allocate nir_constant::elements if there are none
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538 >
2025-08-05 22:55:11 +00:00
Job Noorman
ae66bd1c00
nir/opt_uniform_subgroup: use ballot_bit_count
...
Using bit_count on the result of ballot doesn't work for targets where
ballot's num_components > 1.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Fixes: d2e1e4442a ("ir3: enable nir_opt_uniform_subgroup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35669 >
2025-08-05 17:09:27 +00:00
Georg Lehmann
1d885fab9c
nir/opt_algebraic: optimize pack_half_rtz of b2f
...
Foz-DB Navi21:
Totals from 13 (0.02% of 80255) affected shaders:
Instrs: 2313 -> 2306 (-0.30%); split: -0.35%, +0.04%
CodeSize: 13452 -> 13480 (+0.21%)
Latency: 12066 -> 12013 (-0.44%); split: -0.45%, +0.01%
InvThroughput: 2172 -> 2163 (-0.41%)
Copies: 112 -> 114 (+1.79%)
VALU: 1480 -> 1472 (-0.54%)
SALU: 154 -> 155 (+0.65%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535 >
2025-08-04 19:42:22 +00:00
Georg Lehmann
bc3b09c5dd
nir/opt_algebraic: optimize pack_half_rtz of bcsel with constant
...
Foz-DB Navi21:
Totals from 448 (0.56% of 80255) affected shaders:
Instrs: 345474 -> 344791 (-0.20%); split: -0.20%, +0.00%
CodeSize: 1917784 -> 1913324 (-0.23%); split: -0.25%, +0.02%
VGPRs: 22344 -> 22416 (+0.32%)
Latency: 2320847 -> 2318161 (-0.12%); split: -0.13%, +0.01%
InvThroughput: 543008 -> 541722 (-0.24%)
SClause: 11450 -> 11459 (+0.08%)
Copies: 19991 -> 19949 (-0.21%); split: -0.23%, +0.02%
PreSGPRs: 19129 -> 19114 (-0.08%)
PreVGPRs: 19695 -> 19696 (+0.01%); split: -0.01%, +0.01%
VALU: 257627 -> 256948 (-0.26%)
SALU: 30432 -> 30422 (-0.03%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535 >
2025-08-04 19:42:22 +00:00
Georg Lehmann
8512479097
nir/opt_algebraic: create 16bit fmin/fmax if only used by pack_half_2x16_rtz_split
...
Foz-DB Navi21:
Totals from 1842 (2.30% of 80066) affected shaders:
Instrs: 869152 -> 866751 (-0.28%)
CodeSize: 4687316 -> 4682496 (-0.10%); split: -0.14%, +0.03%
VGPRs: 75216 -> 75312 (+0.13%)
Latency: 7297749 -> 7297929 (+0.00%); split: -0.01%, +0.02%
InvThroughput: 1864933 -> 1860706 (-0.23%); split: -0.23%, +0.00%
Copies: 52679 -> 52463 (-0.41%)
VALU: 665076 -> 662890 (-0.33%)
SALU: 56226 -> 56010 (-0.38%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535 >
2025-08-04 19:42:22 +00:00
Georg Lehmann
22afe83473
nir/opt_algebraic: remove fneg around fmin/fmax
...
Foz-DB Navi21:
Totals from 282 (0.35% of 80255) affected shaders:
Instrs: 310515 -> 309755 (-0.24%)
CodeSize: 1721236 -> 1714540 (-0.39%)
Latency: 1366446 -> 1365141 (-0.10%); split: -0.10%, +0.00%
InvThroughput: 352528 -> 351097 (-0.41%); split: -0.41%, +0.00%
Copies: 24623 -> 24630 (+0.03%)
VALU: 231716 -> 230951 (-0.33%)
SALU: 28774 -> 28779 (+0.02%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36535 >
2025-08-04 19:42:22 +00:00
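Note on the fneg/fmin/fmax entry above: the log does not list the exact patterns that were added, so this is only a sketch of one identity a rule of this kind could rely on, namely that negating both inputs and the result of a minimum gives a maximum. The `demo_*` helpers are hypothetical and use plain comparisons, so NaN and signed-zero behaviour is deliberately left out.

```c
#include <assert.h>

/* Simplified min/max: no NaN or signed-zero handling. */
static float demo_fmin(float a, float b) { return a < b ? a : b; }
static float demo_fmax(float a, float b) { return a > b ? a : b; }

/* Unoptimized form: negate both inputs, take the min, negate the result. */
static float demo_neg_min_neg(float a, float b)
{
   return -demo_fmin(-a, -b);
}

/* Checks that the three negations can be dropped in favour of one fmax. */
static void demo_check_identity(float a, float b)
{
   assert(demo_neg_min_neg(a, b) == demo_fmax(a, b));
}
```

The optimized form needs one instruction instead of four when negation is not a free source modifier.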
Rhys Perry
d4b329219e
nir/lower_memory_model: remove empty lowered barriers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080 >
2025-08-04 15:36:51 +00:00
Rhys Perry
ae6e39a8f5
nir: don't move accesses across make visible/available barriers
...
Otherwise, the barrier would no longer affect the access.
nir_opt_dead_write_vars should be fine, since it's removing stores, not
moving them.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36080 >
2025-08-04 15:36:50 +00:00
Alyssa Rosenzweig
e8ff9eb9cb
nir/opt_varyings: link interpolation qualifiers
...
Some hardware (AGX, Imagination, Arm) really wants to know the interpolation
qualifiers when compiling the vertex shader. Even though we need to handle this
dynamically for separate shaders, we can improve performance by linking.
nir_opt_varyings already has all the information to do this, so just do so.
Note this has to be done in common code for Gallium, which links varyings within
the GLSL linker but then presents the linked programs as separate shader
objects. This models that nicely, allowing Gallium drivers to optimize without
weird sidebands.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501 >
2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig
66740d9c91
nir: gather interpolation qualifiers
...
we'll want this to be able to link interpolation qualifiers in a simple way with
nir_opt_varyings. add the metadata for it and the FS gathering pass.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501 >
2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig
b8f50b6317
nir: gather info in opt_varyings_bulk
...
the info is all messed up so we need to do this right after. merge this
code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501 >
2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig
3e8575c037
nir,agx: pull lower_printf_buffer into backend
...
no other users now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516 >
2025-08-03 21:27:50 +00:00
Alyssa Rosenzweig
1c28fc0a86
nir: add nir_inline_sysval pass
...
a bunch of drivers have versions of this, might as well make a common one.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516 >
2025-08-03 21:27:47 +00:00
Emma Anholt
d5826506ce
nir,agx: Move AGX's loop (generalized) to shared NIR code.
...
When I went to use opt_reassociate for tu, I was advised that you want to
do this loop to get the best results. If everyone needs it, let's make it
common code and explain what's going on.
In the process, also make it skip work appropriately when there's no
progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342 >
2025-08-03 20:58:28 +00:00
Emma Anholt
062a35b554
nir/lower_sample_shading: Set the sample qualifier on in vars.
...
This is another step in setting things up, that zink would like to have.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36496 >
2025-08-03 20:27:39 +00:00
Emma Anholt
d3ada77a6a
nir: Move ST's force-persample-shading NIR pass to shared code.
...
This is about to grow a little.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36496 >
2025-08-03 20:27:39 +00:00
Georg Lehmann
cfd5fbfde1
nir/opt_algebraic: make fmin/fmax(a, #b) 16bit if only used by f2f16
...
Foz-DB Navi31:
Totals from 11 out of 14 FSR4 shaders:
Instrs: 58298 -> 58374 (+0.13%); split: -0.08%, +0.21%
CodeSize: 397836 -> 398108 (+0.07%); split: -0.08%, +0.15%
Latency: 209634 -> 211438 (+0.86%); split: -0.14%, +1.00%
InvThroughput: 229152 -> 229314 (+0.07%); split: -0.03%, +0.10%
VClause: 826 -> 847 (+2.54%); split: -0.36%, +2.91%
Copies: 2954 -> 3040 (+2.91%); split: -1.56%, +4.47%
VALU: 49637 -> 49711 (+0.15%); split: -0.06%, +0.21%
VOPD: 1916 -> 1400 (-26.93%)
These stats look bad, but it's actually just unlucky RA.
Replacing 1 VOPD (two v_dual_max_f32) with 1 VOP3P (v_pk_max_f16)
should still be a win from a register bandwidth perspective.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468 >
2025-08-01 20:29:30 +00:00
Georg Lehmann
3168ebe2c5
nir/range_analysis: look through vec2
...
Foz-DB Navi31:
Totals from 11 out of 14 FSR4 shaders:
Instrs: 58987 -> 58298 (-1.17%)
CodeSize: 402844 -> 397836 (-1.24%)
Latency: 209630 -> 209634 (+0.00%); split: -0.66%, +0.66%
InvThroughput: 230240 -> 229152 (-0.47%); split: -0.48%, +0.00%
VClause: 838 -> 826 (-1.43%); split: -1.55%, +0.12%
Copies: 3019 -> 2954 (-2.15%); split: -2.82%, +0.66%
VALU: 50196 -> 49637 (-1.11%)
VOPD: 1950 -> 1916 (-1.74%); split: +0.72%, -2.46%
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468 >
2025-08-01 20:29:29 +00:00
Georg Lehmann
caf89c97de
nir/range_analysis: look through f2f
...
Foz-DB Navi31:
Totals from 93 (0.12% of 80273) affected shaders:
Instrs: 123927 -> 121073 (-2.30%); split: -2.30%, +0.00%
CodeSize: 670832 -> 653332 (-2.61%); split: -2.61%, +0.00%
Latency: 337678 -> 322803 (-4.41%); split: -4.41%, +0.00%
InvThroughput: 63277 -> 61083 (-3.47%)
VClause: 460 -> 373 (-18.91%)
SClause: 2178 -> 2100 (-3.58%)
Copies: 7637 -> 7744 (+1.40%)
PreSGPRs: 4414 -> 4287 (-2.88%)
PreVGPRs: 4229 -> 4230 (+0.02%)
VALU: 77375 -> 75693 (-2.17%)
SALU: 16497 -> 16383 (-0.69%); split: -0.73%, +0.04%
VMEM: 561 -> 477 (-14.97%)
SMEM: 3197 -> 3113 (-2.63%)
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468 >
2025-08-01 20:29:28 +00:00
Georg Lehmann
261239a492
nir/opt_algebraic: use range analysis to detect no-op fmin/fmax
...
Foz-DB Navi31:
Totals from 418 (0.52% of 80273) affected shaders:
Instrs: 564550 -> 564387 (-0.03%); split: -0.04%, +0.01%
CodeSize: 2983860 -> 2982684 (-0.04%); split: -0.05%, +0.01%
Latency: 4387264 -> 4386397 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 717464 -> 716874 (-0.08%); split: -0.08%, +0.00%
Copies: 40126 -> 40125 (-0.00%)
VALU: 352128 -> 352003 (-0.04%); split: -0.04%, +0.01%
SALU: 50290 -> 50283 (-0.01%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468 >
2025-08-01 20:29:28 +00:00
Georg Lehmann
a0665e79e9
nir/opt_algebraic: push fsat into bcsel with constant
...
bcsel doesn't have a free clamp modifier on AMD hardware,
but what's inside might have free clamp.
Foz-DB Navi31:
Totals from 873 (1.09% of 80273) affected shaders:
MaxWaves: 22008 -> 21968 (-0.18%)
Instrs: 4624956 -> 4623950 (-0.02%); split: -0.04%, +0.02%
CodeSize: 24152780 -> 24142884 (-0.04%); split: -0.05%, +0.01%
VGPRs: 57900 -> 57960 (+0.10%)
Latency: 28762622 -> 28749889 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 5320810 -> 5320145 (-0.01%); split: -0.02%, +0.00%
VClause: 115879 -> 115929 (+0.04%); split: -0.10%, +0.14%
SClause: 93058 -> 93059 (+0.00%); split: -0.01%, +0.02%
Copies: 335674 -> 335845 (+0.05%); split: -0.05%, +0.10%
PreSGPRs: 53819 -> 53843 (+0.04%); split: -0.01%, +0.05%
PreVGPRs: 50908 -> 50939 (+0.06%); split: -0.02%, +0.08%
VALU: 2816395 -> 2815514 (-0.03%); split: -0.04%, +0.01%
SALU: 509988 -> 509987 (-0.00%); split: -0.02%, +0.02%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468 >
2025-08-01 20:29:27 +00:00
Georg Lehmann
e9e5146848
nir/opt_algebraic: optimize fsat(fmax(a, b)) where b is not positive
...
Foz-DB Navi31:
Totals from 946 (1.18% of 80273) affected shaders:
Instrs: 4986082 -> 4983988 (-0.04%); split: -0.04%, +0.00%
CodeSize: 25998700 -> 25989796 (-0.03%); split: -0.04%, +0.00%
Latency: 45514742 -> 45510330 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 8163529 -> 8162325 (-0.01%); split: -0.02%, +0.00%
VClause: 112105 -> 112104 (-0.00%); split: -0.00%, +0.00%
SClause: 109694 -> 109688 (-0.01%)
Copies: 372356 -> 372284 (-0.02%); split: -0.03%, +0.01%
Branches: 132636 -> 132633 (-0.00%)
PreVGPRs: 58997 -> 58979 (-0.03%); split: -0.03%, +0.00%
VALU: 3025662 -> 3024191 (-0.05%); split: -0.05%, +0.00%
SALU: 551712 -> 551714 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36468 >
2025-08-01 20:29:27 +00:00
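Note on the fsat(fmax(a, b)) entry above: a minimal sketch of why the fmax can be dropped when b is known to be not positive. Saturation clamps to [0, 1], and the maximum only changes the value when a is below b, in which case both forms saturate to 0. The `demo_*` helpers are hypothetical, and the replacement by `fsat(a)` is the assumed result of the rule, since the log does not show it.

```c
#include <assert.h>

/* Simplified max: plain comparison, no NaN handling. */
static float demo_sat_fmax(float a, float b) { return a > b ? a : b; }

/* Clamp to [0, 1]; anything not greater than 0 becomes 0. */
static float demo_fsat(float x)
{
   if (!(x > 0.0f))
      return 0.0f;
   return x > 1.0f ? 1.0f : x;
}

/* Original expression; only valid to simplify when b <= 0. */
static float demo_fsat_fmax(float a, float b)
{
   assert(b <= 0.0f);
   return demo_fsat(demo_sat_fmax(a, b));
}

/* Assumed simplified expression. */
static float demo_fsat_only(float a)
{
   return demo_fsat(a);
}
```

For a = -4 and b = -1, the original form computes fsat(-1) = 0 and the simplified form computes fsat(-4) = 0, so the results agree.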
Alyssa Rosenzweig
bcf1a1c20b
treewide: use nir_def_block
...
Via Coccinelle patch:
@@
expression definition;
@@
-definition->parent_instr->block
+nir_def_block(definition)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489 >
2025-08-01 15:34:24 +00:00
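Note on the Coccinelle patch in the entry above: the semantic patch replaces `definition->parent_instr->block` with `nir_def_block(definition)`, which suggests an accessor of roughly the following shape. The `demo_*` structs below are simplified stand-ins and not the real NIR types, so treat this as a sketch of the pattern rather than Mesa's definition.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the block, instruction, and SSA definition. */
struct demo_block { int index; };
struct demo_instr { struct demo_block *block; };
struct demo_def   { struct demo_instr *parent_instr; };

/* Accessor that hides the two-step pointer chase behind one call. */
static inline struct demo_block *demo_def_block(const struct demo_def *def)
{
   assert(def != NULL && def->parent_instr != NULL);
   return def->parent_instr->block;
}
```

Funnelling every caller through one accessor means a later change to how a definition finds its block only has to touch the helper.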