Timothy Arceri
8417f4a8eb
nir: move nir_lower_drawpixels() to the state tracker
...
This is gl specific and a following fix will add more gl specific
params so here we move it to the st to avoid filling nir.h with
more junk.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37037 >
2025-09-07 23:13:22 +00:00
Daniel Schürmann
c78f1d516c
nir/algebraic: add pattern for (a << #b) * #c => a * (#c << #b)
...
Totals from 2545 (3.19% of 79839) affected shaders: (Navi48)
Instrs: 6371003 -> 6364130 (-0.11%); split: -0.12%, +0.01%
CodeSize: 33827548 -> 33812244 (-0.05%); split: -0.06%, +0.01%
Latency: 47451755 -> 47430108 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 10442450 -> 10437159 (-0.05%); split: -0.05%, +0.00%
SClause: 159829 -> 159874 (+0.03%); split: -0.01%, +0.04%
Copies: 500725 -> 500721 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 110482 -> 110478 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 147289 -> 147287 (-0.00%); split: -0.00%, +0.00%
VALU: 3456135 -> 3454241 (-0.05%); split: -0.06%, +0.01%
SALU: 925982 -> 923616 (-0.26%)
VOPD: 1243 -> 1212 (-2.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37173 >
2025-09-06 10:18:42 +00:00
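Note on c78f1d516c: the rewrite (a << #b) * #c => a * (#c << #b) holds under wrapping integer arithmetic, because both sides equal a * c * 2^b modulo the bit width. Since #b and #c are constants, the new shift can be folded at compile time, leaving a single multiply. The following is a minimal sketch that checks the identity for 32-bit values; the helper names are hypothetical and this is not the nir_opt_algebraic rule itself.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit wrapping arithmetic


def shl32(x, n):
    """Left shift with 32-bit wraparound."""
    return (x << n) & MASK32


def imul32(x, y):
    """Multiply with 32-bit wraparound."""
    return (x * y) & MASK32


def before(a, b, c):
    # (a << #b) * #c : one shift and one multiply at run time
    return imul32(shl32(a, b), c)


def after(a, b, c):
    # a * (#c << #b) : the shift acts on constants only, so it can be folded
    return imul32(a, shl32(c, b))


if __name__ == "__main__":
    for a in (0, 1, 7, 0x12345678, 0xFFFFFFFF):
        for b in (0, 1, 5, 31):
            for c in (0, 3, 0x80000001):
                assert before(a, b, c) == after(a, b, c)
    print("identity holds for all sampled values")
```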
Rhys Perry
efe536dbe9
vtn: use vtn_has_decoration more
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37175 >
2025-09-05 15:58:03 +00:00
Christoph Pillmayer
f81f3c85e2
nir/opt_algebraic: Convert a + b + a to b + 2a
...
This allows fusing into one FMA later.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37113 >
2025-09-05 11:39:51 +00:00
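Note on f81f3c85e2: rewriting a + b + a as b + 2a turns two additions into one multiply-add shape, which a backend can then fuse into a single FMA. The commit message does not say whether the pattern targets integer or float adds, so the sketch below covers both under stated assumptions: 32-bit wrapping for integers, and exactly representable inputs for floats (for arbitrary floats the two forms can round differently). Function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit wrapping integers


def original_int(a, b):
    # (a + b) + a : two dependent additions
    return (((a + b) & MASK32) + a) & MASK32


def rewritten_int(a, b):
    # b + 2 * a : one multiply-add shape
    return (b + 2 * a) & MASK32


def original_float(a, b):
    return (a + b) + a


def rewritten_float(a, b):
    # Same shape as fma(2.0, a, b), written without a fused operation
    return b + 2.0 * a


if __name__ == "__main__":
    assert original_int(0xFFFFFFF0, 0x20) == rewritten_int(0xFFFFFFF0, 0x20)
    # 1.5 and 0.25 are exactly representable, so both forms agree here
    assert original_float(1.5, 0.25) == rewritten_float(1.5, 0.25)
```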
Lionel Landwerlin
7cbabcad36
compiler: add stage_is_graphics() helper
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872 >
2025-09-05 07:46:17 +00:00
Lionel Landwerlin
afea98593e
nir: add a new intrinsic for load dynamic tessellation config
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872 >
2025-09-05 07:46:15 +00:00
Rob Clark
d5a8233598
nir/lower-amul: Comment fix
...
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37063 >
2025-09-04 15:21:38 +00:00
Rob Clark
55d77749ed
nir/lower-amul: Fix crash with unused SSBO
...
Since https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12175
we should be able to rely on driver_location for both UBOs and SSBOs.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37063 >
2025-09-04 15:21:38 +00:00
Georg Lehmann
796f0847a6
nir/lower_subgroups: recursively lower ballot scans
...
This should be better for backends that have le/lt mask intrinsics.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:04:00 +00:00
Georg Lehmann
2725eaf9a2
nir/lower_subgroups: change filter to intrinsic callback
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:04:00 +00:00
Georg Lehmann
d14897b2f7
nir/lower_subgroups: don't use get_max_subgroup_size for lowering boolean rotates
...
The lowering won't work with an unknown subgroup size, and we correctly
assert that at the top of the function.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:59 +00:00
Georg Lehmann
516c766c71
spirv: ensure ballot find_lsb/find_msb/bit_count have 32bit result
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:58 +00:00
Georg Lehmann
f8633511be
nir: make ballot find_lsb/msb/bit_count 32bit only
...
The lowering is 32bit only too.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:58 +00:00
Georg Lehmann
276fce4f13
spirv: handle ballot bit_extract separately
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:58 +00:00
Georg Lehmann
b8db8f877d
nir: make ballot_bitfield_extract 1bit only
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:57 +00:00
Georg Lehmann
83326af899
nir/builder: add nir_inverse_ballot_imm
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:56 +00:00
Georg Lehmann
ef8c364d3d
nir: make inverse_ballot 1bit only
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178 >
2025-09-04 14:03:56 +00:00
Simon Perretta
880098158d
nir/nir_lower_calls_to_builtins: trivially handle IA64 mangled functions
...
Using __attribute__((overloadable)) when declaring nir ops with
variable-width params in clc results in their symbol names being (IA64)
mangled; this change enables the mangled names to be handled when later
lowering the calls.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36873 >
2025-09-02 16:04:19 +00:00
Robert Mader
1772380307
nir: Fixup 10/12 bit SW decoder YCbCr formats
...
The highest possible values that can be represented with
16/12/10 bits are 65535/4095/1023, not 65536/4096/1024.
In order to ensure 1023 maps to 65535 in the Sx10 case
we thus need to multiply by 65535 / 1023 ~= 64.06158
instead of 64.
Fixes: a166d7609f ("gles: Add support for 10/12/16 bit SW decoder YCbCr formats")
Suggested-by: Benjamin Otte <otte@redhat.com>
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37077 >
2025-09-02 09:08:51 +00:00
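Note on 1772380307: the arithmetic in the message can be checked directly. An n-bit sample has a maximum of 2^n - 1, so mapping full scale to full scale needs the factor 65535 / (2^n - 1) rather than the power of two 2^(16 - n). The sketch below is only a worked calculation with hypothetical helper names; it is not the shader code the commit changes.

```python
def max_value(bits):
    """Largest value representable in an unsigned field of the given width."""
    return (1 << bits) - 1


def scale_to_16bit(bits):
    """Factor that maps the n-bit maximum onto the 16-bit maximum."""
    return max_value(16) / max_value(bits)


if __name__ == "__main__":
    for bits in (10, 12):
        naive = 1 << (16 - bits)          # 64 for 10-bit, 16 for 12-bit
        exact = scale_to_16bit(bits)
        top = max_value(bits)
        print(bits, "bit: naive", top * naive, "exact", round(top * exact))
```

With the power-of-two factor the 10-bit maximum 1023 lands on 65472, 63 short of 65535; with 65535 / 1023 it lands exactly on 65535.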
Job Noorman
e78bd88a06
nir/opt_offsets: add callback to set need_nuw per intrinsic
...
Whether need_nuw is used is currently decided in two different ways:
- globally through the allow_offset_wrap option;
- per intrinsic but hard-coded in opt_offsets.
Make this more flexible by creating a callback that is called per
intrinsic. This will allow backends to decide, on a per-intrinsic basis,
whether need_nuw is needed.
Note that the main use case for ir3 is to add support for opt_offsets
for global memory accesses. Other intrinsics don't need need_nuw but
global memory accesses do.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114 >
2025-09-01 11:25:07 +00:00
Job Noorman
bc03086320
nir/opt_offsets: rename max_offset_data to cb_data
...
We want to add more callbacks and pass the same data.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114 >
2025-09-01 11:25:07 +00:00
Rhys Perry
2d0f93631c
nir/divergence: make smem load_global_amd uniform
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101 >
2025-08-30 14:55:13 -04:00
Marek Olšák
25294f3dd4
nir/opt_move_to_top: handle load_global_amd with ACCESS_SMEM_AMD
...
to match the behavior of load_smem_amd
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101 >
2025-08-30 14:55:13 -04:00
Marek Olšák
48050dbef6
nir/opt_sink: handle load_global_amd
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101 >
2025-08-30 14:55:13 -04:00
Marek Olšák
219fcd4b32
nir/opt_call: handle load_global(_amd) with SPECULATE as rematerializable
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101 >
2025-08-30 14:55:13 -04:00
Ashley Smith
d9b388af27
mesa: Fix support for GL_EXT_shader_clock
...
Missing 32-bit entry point in GLSL
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 2ce20170 ("mesa: Add support for GL_EXT_shader_clock")
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36041 >
2025-08-29 11:09:04 +00:00
Faith Ekstrand
26e32417b9
nir: Add an option to make lower_phis_to_regs_block() less clever
...
Right now it tries to place reg_write instructions as far up the
predecessor chain as possible. This is useful for a bunch of the passes
that call it since it ensures they don't get placed in dead blocks or in
single successors and things like that. But it screws up NAK's control
flow lowering so we need the option to turn it off and make the pass
place the reg_write instructions in the most obvious place possible.
Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914 >
2025-08-29 01:24:56 +00:00
Dave Airlie
c38170452d
nir: add nir_intrinsic_cmat_load_shared_nv
...
This maps to NAK's OpLdsm
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36363 >
2025-08-28 16:09:07 +02:00
Georg Lehmann
3b06824e4c
nir/opt_algebraic: optimize some post peephole select patterns
...
Foz-DB GFX1201:
Totals from 208 (0.26% of 80287) affected shaders:
Instrs: 427684 -> 426834 (-0.20%); split: -0.22%, +0.02%
CodeSize: 2232616 -> 2228816 (-0.17%); split: -0.20%, +0.03%
Latency: 3993934 -> 3992726 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 569055 -> 568622 (-0.08%); split: -0.09%, +0.01%
SClause: 12932 -> 12927 (-0.04%)
Copies: 22567 -> 22604 (+0.16%); split: -0.47%, +0.63%
Branches: 7671 -> 7658 (-0.17%)
VALU: 222047 -> 221625 (-0.19%)
SALU: 83954 -> 83815 (-0.17%); split: -0.29%, +0.13%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938 >
2025-08-27 09:45:19 +00:00
Georg Lehmann
395893e16b
nir/peephole_select: allows more lowered io
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938 >
2025-08-27 09:45:19 +00:00
Georg Lehmann
e270a7480b
nir/lower_io: fix boolean output stores
...
Stores don't have a definition, we have to check the bit size of the source.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13762
Fixes: c217ee8d35 ("nir: Insert b2b1s around booleans in nir_lower_to")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36966 >
2025-08-27 08:46:34 +00:00
Georg Lehmann
047b95a8c3
nir/shrink_vec_array_vars: detect zero init shared memory using constant initializer
...
More consistent.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36956 >
2025-08-27 06:37:41 +00:00
Georg Lehmann
edc5bea61e
nir/shrink_vec_array_vars: update constant initializer after shrinking
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13751
Fixes: c7df3b4f64 ("nir/shrink_vec_array_vars: allow nir_var_mem_shared")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36956 >
2025-08-27 06:37:41 +00:00
Faith Ekstrand
a1d5e8bfdb
compiler/rust: Fix the DFS loop detection algorithm
...
The previous algorithm just looked at the dominator's loop header.
However, if you have multiple consecutive loops like:
    function_impl {
        loop {
            // Stuff
        }
        loop {
            // Other stuff
        }
    }
then it will look like the second loop is contained in the first loop
because the first loop's header dominates the second loop. This isn't
actually what we want. Instead, we want a node N to be considered part
of a loop with header H if H dominates N and H is reachable from N.
Fixes: 741f7067f1 ("nak: Add loop detection to the CFG")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36524 >
2025-08-27 01:20:05 +00:00
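Note on a1d5e8bfdb: the membership rule stated in the message (N belongs to the loop with header H if H dominates N and H is reachable from N) can be tried on the two-consecutive-loops case. The sketch below is an illustration in Python with a hypothetical CFG and hypothetical helper names; it uses a simple iterative dominator computation and is not the Rust implementation from the commit.

```python
def reachable(succ, start):
    """Nodes reachable from start by following at least one edge."""
    seen, stack = set(), list(succ[start])
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(succ[n])
    return seen


def dominators(succ, entry):
    """Iterative dominator sets: dom[n] is the set of nodes dominating n."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            preds = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*preds) if preds else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def in_loop(succ, dom, header, node):
    """The rule from the commit message: H dominates N and H is reachable from N."""
    return header in dom[node] and header in reachable(succ, node)


# Two consecutive loops: h1/b1 form the first, h2/b2 the second.
CFG = {
    "entry": ["h1"],
    "h1": ["b1", "h2"],
    "b1": ["h1"],
    "h2": ["b2", "exit"],
    "b2": ["h2"],
    "exit": [],
}

if __name__ == "__main__":
    dom = dominators(CFG, "entry")
    print(in_loop(CFG, dom, "h1", "b1"))  # body of the first loop
    print(in_loop(CFG, dom, "h1", "b2"))  # dominated by h1, but h1 is unreachable
    print(in_loop(CFG, dom, "h2", "b2"))  # body of the second loop
```

Here h1 dominates every block of the second loop, which is why a dominance-only test misclassifies them; the reachability condition rules them out because no path leads from the second loop back to h1.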
Georg Lehmann
d0f4b535fe
nir: constant fold txd with 0 ddx/ddy to txl
...
Foz-DB GFX1201:
Totals from 34 (0.04% of 80287) affected shaders:
Instrs: 3111158 -> 3111076 (-0.00%)
CodeSize: 16345020 -> 16344908 (-0.00%); split: -0.00%, +0.00%
Latency: 15378053 -> 15378063 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 2940485 -> 2940477 (-0.00%); split: -0.00%, +0.00%
VClause: 79940 -> 79941 (+0.00%)
Copies: 228205 -> 228159 (-0.02%)
VALU: 1730040 -> 1729994 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36967 >
2025-08-26 06:19:43 +00:00
Dave Airlie
7a96a928a2
nir: add coop mat flexible dimensions lowering.
...
This adds a generic lowering pass for coop mat flexible dimensions.
This should be suitable for all drivers that implement coop mat2 flexible dimensions
or even just lowering sw exposed sizes to hw sizes.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36544 >
2025-08-25 18:55:08 +00:00
Konstantin Seurer
951b187b95
nir: Use nir_def_block in more places
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746 >
2025-08-24 14:03:10 +00:00
Konstantin Seurer
9df7b48d2f
nir: Use nir_def_as_* in more places
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746 >
2025-08-24 14:03:09 +00:00
Pierre-Eric Pelloux-Prayer
e92638b6bf
nir/opt_varyings: fix build with PRINT_RELOCATE_SLOT
...
Fixes: e3d122ed7b ("nir/opt_varyings: completely exclude mediump from type changes")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36411 >
2025-08-23 14:44:29 +00:00
Jesse Natalie
5b3756f231
nir: Add missing #include for c99_alloca.h
...
Fixes: 3dd9a978 ("nir: add new pass nir_lower_io_indirect_loads")
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36940 >
2025-08-22 22:33:50 +00:00
Rhys Perry
2d597b6919
nir/load_store_vectorize: use nir_def_num_lsb_zero in calc_alignment
...
fossil-db (gfx1201):
Totals from 20 (0.03% of 79839) affected shaders:
Instrs: 15370 -> 15251 (-0.77%)
CodeSize: 89764 -> 88952 (-0.90%)
Latency: 150295 -> 149963 (-0.22%)
InvThroughput: 210291 -> 210105 (-0.09%)
Copies: 1337 -> 1320 (-1.27%)
PreVGPRs: 589 -> 590 (+0.17%)
VALU: 7519 -> 7466 (-0.70%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
b03eeb12a9
nir/load_store_vectorize: use nir_def_num_lsb_zero in check_for_robustness
...
fossil-db (gfx1201):
Totals from 499 (0.63% of 79839) affected shaders:
MaxWaves: 14276 -> 14234 (-0.29%)
Instrs: 520883 -> 508159 (-2.44%); split: -2.45%, +0.01%
CodeSize: 2831220 -> 2731080 (-3.54%); split: -3.54%, +0.00%
VGPRs: 27156 -> 27348 (+0.71%)
SpillSGPRs: 360 -> 390 (+8.33%)
Latency: 4473898 -> 4414552 (-1.33%); split: -1.54%, +0.21%
InvThroughput: 494468 -> 493508 (-0.19%); split: -0.62%, +0.43%
VClause: 14211 -> 14060 (-1.06%); split: -1.16%, +0.10%
SClause: 14653 -> 14354 (-2.04%); split: -2.39%, +0.35%
Copies: 36772 -> 37056 (+0.77%); split: -0.65%, +1.42%
Branches: 11502 -> 11486 (-0.14%)
PreSGPRs: 22605 -> 22848 (+1.07%); split: -0.39%, +1.47%
PreVGPRs: 20571 -> 20833 (+1.27%)
VALU: 242982 -> 243151 (+0.07%); split: -0.08%, +0.14%
SALU: 91332 -> 88069 (-3.57%); split: -3.71%, +0.14%
VMEM: 32275 -> 29137 (-9.72%)
SMEM: 26239 -> 22400 (-14.63%)
VOPD: 345 -> 330 (-4.35%)
SClause: 14646 -> 14347 (-2.04%); split: -2.39%, +0.35%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
46da666205
nir/algebraic: allow non-const for iand(iadd()) -> iadd(iand())
...
fossil-db (gfx1201):
Totals from 596 (0.75% of 79839) affected shaders:
Instrs: 691926 -> 691819 (-0.02%); split: -0.11%, +0.09%
CodeSize: 3675216 -> 3675180 (-0.00%); split: -0.08%, +0.08%
VGPRs: 37464 -> 37452 (-0.03%)
Latency: 8566849 -> 8563162 (-0.04%); split: -0.09%, +0.05%
InvThroughput: 1068038 -> 1063279 (-0.45%); split: -0.46%, +0.01%
VClause: 17859 -> 17897 (+0.21%); split: -0.01%, +0.22%
SClause: 16704 -> 16735 (+0.19%); split: -0.07%, +0.26%
Copies: 45422 -> 45395 (-0.06%); split: -0.15%, +0.09%
PreSGPRs: 24345 -> 24351 (+0.02%)
PreVGPRs: 29121 -> 29128 (+0.02%)
VALU: 349959 -> 348117 (-0.53%); split: -0.54%, +0.01%
SALU: 105926 -> 107576 (+1.56%); split: -0.02%, +1.58%
VOPD: 252 -> 234 (-7.14%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
4f83059ac5
nir/algebraic: improve is_unsigned_multiple_of_4 and use it more
...
fossil-db (gfx1201):
Totals from 160 (0.20% of 79839) affected shaders:
MaxWaves: 4008 -> 3952 (-1.40%)
Instrs: 390073 -> 379834 (-2.62%); split: -2.63%, +0.00%
CodeSize: 2126020 -> 2053740 (-3.40%); split: -3.40%, +0.00%
VGPRs: 9492 -> 9612 (+1.26%)
Latency: 6746019 -> 6723893 (-0.33%); split: -0.33%, +0.00%
InvThroughput: 849571 -> 848942 (-0.07%); split: -0.42%, +0.35%
VClause: 11977 -> 11983 (+0.05%); split: -0.20%, +0.25%
SClause: 11828 -> 11824 (-0.03%); split: -0.14%, +0.11%
Copies: 30003 -> 30938 (+3.12%); split: -0.09%, +3.20%
PreSGPRs: 8914 -> 8938 (+0.27%)
PreVGPRs: 7352 -> 7514 (+2.20%); split: -0.04%, +2.24%
VALU: 171829 -> 168829 (-1.75%); split: -1.76%, +0.01%
SALU: 66503 -> 66543 (+0.06%); split: -0.01%, +0.07%
VMEM: 29365 -> 25327 (-13.75%)
VOPD: 864 -> 1013 (+17.25%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
09ab7ff01e
nir: add nir_def_num_lsb_zero
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
51dd513789
nir/search: reorder match_value to check constants first
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
84fe10f939
nir/search: don't clear empty hash tables
...
_mesa_hash_table_clear() memsets the entries, even if it's already empty.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Rhys Perry
2a12624532
nir/search: add nir_search_state
...
A future commit will add another hash table.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760 >
2025-08-22 15:45:55 +00:00
Georg Lehmann
996c07353b
nir/shrink_vec_array_vars: use range analysis for non constant indices
...
Foz-DB Navi21:
Totals from 84 (0.10% of 80255) affected shaders:
MaxWaves: 1700 -> 1806 (+6.24%); split: +6.59%, -0.35%
Instrs: 90479 -> 91278 (+0.88%); split: -0.15%, +1.04%
CodeSize: 499644 -> 504572 (+0.99%); split: -0.10%, +1.08%
VGPRs: 5400 -> 4912 (-9.04%); split: -9.93%, +0.89%
LDS: 292864 -> 152064 (-48.08%)
Latency: 2001405 -> 2002335 (+0.05%); split: -0.01%, +0.06%
InvThroughput: 545293 -> 543073 (-0.41%); split: -0.52%, +0.11%
VClause: 1510 -> 1508 (-0.13%)
SClause: 2096 -> 2097 (+0.05%); split: -0.05%, +0.10%
Copies: 6373 -> 6431 (+0.91%); split: -0.64%, +1.55%
Branches: 1648 -> 1686 (+2.31%); split: -0.36%, +2.67%
PreVGPRs: 3918 -> 3960 (+1.07%); split: -0.03%, +1.10%
VALU: 67591 -> 68107 (+0.76%); split: -0.14%, +0.90%
SALU: 8352 -> 8490 (+1.65%); split: -0.25%, +1.90%
VMEM: 2685 -> 2683 (-0.07%)
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388 >
2025-08-22 13:47:47 +00:00
Georg Lehmann
c7df3b4f64
nir/shrink_vec_array_vars: allow nir_var_mem_shared
...
This should just work.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388 >
2025-08-22 13:47:47 +00:00