Daniel Schürmann
f4e3ab5266
nir/divergence: Ignore divergent_loop_{continue|break} for nir_block::divergent
...
This is already implicitly accounted for.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39934 >
2026-02-19 19:55:33 +00:00
Daniel Schürmann
eabd7cc22c
nir/divergence: Fix nir_block::divergent in presence of divergent breaks
...
If no second pass is necessary, we might miss setting nir_block::divergent
to true, if a loop has a divergent break.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39934 >
2026-02-19 19:55:31 +00:00
Daniel Schürmann
a57b900a59
nir/divergence: rename divergent_loop_cf to divergent_cf
...
in order to better reflect the actual semantics.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39934 >
2026-02-19 19:55:31 +00:00
Georg Lehmann
5d5f99bfe8
nir/opt_algebraic: create more b2f if sign of zero doesn't matter
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966 >
2026-02-19 15:21:27 +00:00
Georg Lehmann
d87943ad3d
nir/opt_algebraic: preserve signed zero when creating new b2f
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966 >
2026-02-19 15:21:27 +00:00
Georg Lehmann
5e544ecd08
nir/opcodes: remove valid_fp_math_ctrl bits from some opcodes
...
This is mostly about conversions.
Conversions from float to int don't care about signed zero
and in the case of plain f2u/f2i, nan and inf are always
undefined too.
Conversions for int to float can't create nan, so they don't
need preserve_nan.
b2f only cares about preserve_sz, and nothing else.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966 >
2026-02-19 15:21:27 +00:00
Georg Lehmann
62f3be87c4
nir/serialize: omit serializing fp_math_ctrl if it has to be 0
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966 >
2026-02-19 15:21:27 +00:00
Alyssa Rosenzweig
5d5c2a6430
nir/opt_intrinsics: use data helpers
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939 >
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
84f3849688
nir/opt_fragdepth: use data helper
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939 >
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
9da61b3ea5
nir/opt_uniform_atomics: use data helper
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939 >
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
76d5436f04
nir/lower_atomics: use data helper
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939 >
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
8fb1d65426
nir: add nir_get_io_data_src
...
This complements our existing nir_get_io_index_src helper. Most, but annoyingly
not all, stores put their data source in source 0. Having a helper for this lets
us reduce special casing in a bunch of random places.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939 >
2026-02-19 14:47:11 +00:00
Maíra Canal
5a1e0112a9
nir: add load_texture_scale intrinsic
...
Add load_texture_scale to the list of intrinsics whose divergence
depends on their sources. This is needed to support running divergence
analysis on VC4.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768 >
2026-02-19 09:57:05 +00:00
Alyssa Rosenzweig
e172f97fdd
nir/opt_constant_folding: optimize ballot(false)
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
always zero. noticed on dEQP-VK.subgroups.ballot.graphics.graphic
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39948 >
2026-02-18 23:40:44 +00:00
Rob Clark
8cc99edb7b
nir: Fill in missing conversion opts
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
I noticed we were missing:
(('u2f16', ('u2u64', 'a@32')), ('u2f16', a))
This was do to coupling the u2f/i2f opts with i2i/u2u in the same loop
(with different positionals). The `if B <= S\ncontinue` doesn't apply
to the second part. So just split these into two loops.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14848
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39899 >
2026-02-18 15:13:21 +00:00
Rhys Perry
fd22c48b2a
nir/algebraic: remove ignore_exact
...
This was used because the exact bit meant something different for
comparisons than it did for the replacement expression, but that isn't the
case anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809 >
2026-02-18 14:04:22 +00:00
Rhys Perry
f44de53586
nir: only set fp_math_ctrl if meaningful
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809 >
2026-02-18 14:04:22 +00:00
Rhys Perry
12df083e0b
nir: fix fmin_agx/fmax_agx constant folding
...
This seems to have two issues:
- since d7e88c0ccd , denormals would be flushed
- it did a f32->u32 conversion
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809 >
2026-02-18 14:04:22 +00:00
Rhys Perry
dfad15df0b
nir/load_store_vectorize: don't update last_entry after a barrier
...
fossil-db (navi31):
Totals from 2 (0.00% of 84369) affected shaders:
Instrs: 7738 -> 7740 (+0.03%)
Latency: 333207 -> 333239 (+0.01%)
InvThroughput: 33320 -> 33324 (+0.01%)
VClause: 382 -> 384 (+0.52%)
VMEM: 656 -> 658 (+0.30%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849 >
2026-02-18 10:09:14 +00:00
Rhys Perry
b203329e0d
nir/load_store_vectorize: more carefully add entries from loop preheader
...
This would fix both stores 'b' and 'c' from being vectorized:
a = load(0)
loop {
b = load(0)
if (break)
store(0)
}
c = load(0)
fossil-db (navi31):
Totals from 8 (0.01% of 84369) affected shaders:
Instrs: 12035 -> 12066 (+0.26%)
CodeSize: 63016 -> 63208 (+0.30%)
Latency: 176091 -> 177013 (+0.52%)
InvThroughput: 43894 -> 43981 (+0.20%)
SClause: 194 -> 196 (+1.03%)
Copies: 803 -> 812 (+1.12%)
VALU: 7666 -> 7675 (+0.12%)
SALU: 1102 -> 1105 (+0.27%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849 >
2026-02-18 10:09:14 +00:00
Alyssa Rosenzweig
f55e87db93
nir: add missing ssbo atomics to nir_get_io_index_src_number
...
Match other SSBO intrinsics and other atomics.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39895 >
2026-02-17 15:42:36 +00:00
Daniel Schürmann
d66de1bb49
glsl_to_nir: emit loop continue construct
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39777 >
2026-02-17 13:15:36 +00:00
Georg Lehmann
6a662a59b7
nir/opt_algebraic: optimize 1.0 - b2f(a) to b2f(inot(a))
...
Which can then be cleaned up further.
Foz-DB Navi48:
Totals from 4156 (3.62% of 114655) affected shaders:
MaxWaves: 102580 -> 102620 (+0.04%)
Instrs: 11696222 -> 11679986 (-0.14%); split: -0.16%, +0.02%
CodeSize: 64452544 -> 64379204 (-0.11%); split: -0.13%, +0.02%
VGPRs: 288256 -> 288172 (-0.03%)
SpillSGPRs: 7290 -> 7297 (+0.10%)
Latency: 160690992 -> 160643825 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 26869332 -> 26849963 (-0.07%); split: -0.09%, +0.02%
VClause: 237078 -> 237003 (-0.03%); split: -0.04%, +0.01%
SClause: 270560 -> 270564 (+0.00%); split: -0.01%, +0.01%
Copies: 936165 -> 937970 (+0.19%); split: -0.07%, +0.26%
Branches: 302981 -> 302992 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 244967 -> 245303 (+0.14%)
PreVGPRs: 232930 -> 232886 (-0.02%); split: -0.02%, +0.00%
VALU: 6200283 -> 6187264 (-0.21%); split: -0.23%, +0.02%
SALU: 1759176 -> 1760275 (+0.06%); split: -0.10%, +0.16%
VOPD: 447502 -> 446194 (-0.29%); split: +0.14%, -0.43%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39917 >
2026-02-17 10:01:21 +00:00
Rhys Perry
c0143829f9
nir/opt_intrinsics: optimize inot(inverse_ballot(const))
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Georg Lehmann
bca5aab2be
nir: let nir_analyze_fp_range take a nir_def
...
This is midly worse for vector constants, but so much simpler.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
474af815ff
nir: rename nir_analyze_range because it's float only
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
f2a59fdea6
nir: remove non float nir_analyse_range support
...
This was always unused/unfinished.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
f7222d6939
nir/opt_algebraic: remove few uses of integer nir_analyze_range
...
Surprisingly, this has an effect on GFX1201:
Totals from 66 (0.08% of 82405) affected shaders:
Instrs: 200725 -> 201517 (+0.39%)
CodeSize: 978676 -> 981488 (+0.29%)
Latency: 291736 -> 291760 (+0.01%)
InvThroughput: 31556 -> 31604 (+0.15%)
Copies: 11928 -> 12588 (+5.53%)
Branches: 14850 -> 15048 (+1.33%)
SALU: 68981 -> 69509 (+0.77%)
I say surprisingly, because nir_analyze_range handles nothing but
constants and bcsel for integers. Maybe rdr2 is actually
hitting some weird bcsel(a, #b, #c) == 0 case where b and c are not 0?
No, I looked at a few of those shaders, and it's just noise from changed
instruction order.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Marek Olšák
aa92b464f3
nir/opt_non_uniform_access: use new query flags
...
NFC for drivers
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
61a96be494
nir/lower_non_uniform_access: add an option not to lower tex & image queries
...
AMD can do non-uniform queries. The RADV change will be in a separate commit.
NFC for drivers.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
a9df891bc6
nir: allow get_ssbo_size to return a 64-bit result
...
to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
c151402f35
nir: add ACCESS to get_ubo_size
...
so that we can set NON_UNIFORM
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
1d09a975bf
nir: handle get_ubo_size as a resource query in nir_shader_gather_info
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Ian Romanick
9017d37e84
nir: Use STACK_ARRAY instead of NIR_VLA
...
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866 >
2026-02-14 01:19:27 +00:00
Ian Romanick
3da828d2dd
spirv: Use STACK_ARRAY instead of NIR_VLA
...
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: 2a023f30a6 ("nir/spirv: Add basic support for types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866 >
2026-02-14 01:19:27 +00:00
Mel Henning
bead49697d
libcl_vk: Add VkCopyMemoryIndirectCommandKHR
...
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869 >
2026-02-13 20:53:47 +00:00
Alyssa Rosenzweig
a517184396
vtn: fix wait_group_events memory scope
...
For reasons I don't fully understand, wait_group_events needs to synchronize
global memory accesses across the device not just the local workgroup. This
matches what libclc does [1] even though we bypass the libclc implemenation
here. It also matches what Intel's OpenCL runtime does [2].
Fixes OpenCL tests on Iris:
basic async_copy_global_to_local
basic async_strided_copy_global_to_local
[1] a9ea1cfe6e/libclc/opencl/lib/generic/async/wait_group_events.cl (L11)
[2] 1f0ea8f11d/IGC/BiFModule/Languages/OpenCL/IBiF_Impl.cl (L182-L190)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39796 >
2026-02-13 19:50:22 +00:00
Marek Olšák
0a9bdcac79
ac: lower load_workgroup_ids for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Daniel Schürmann
88b4221519
nir/clone: Fix cloning indirect call instructions
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: bb40284f76 ('nir: Add indirect calls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39844 >
2026-02-13 11:27:59 +00:00
Sagar Ghuge
1fb8435b77
nir: Add nir_resource_intel_internal entry
...
Will use the load/store_ssbo with nir_resource_intel_internal later in
this series.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160 >
2026-02-12 16:45:22 +00:00
Rhys Perry
fa5d4174c4
nir/search: use memcmp/memcpy/memset
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39808 >
2026-02-12 14:47:06 +00:00
Rhys Perry
5d92942241
nir/search: remove creation of swizzle
...
match_expression() only accesses the first instr->def.num_components
elements, so we don't need to ensure the rest are zero.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39808 >
2026-02-12 14:47:06 +00:00
jiajia Qian
f16d17a454
nir/opt_phi_precision: Fix bit size mismatch when moving widening conversions
...
Add a check to ensure that when load_const can be narrowed, the bit size
from other widening conversion sources must be 16-bit to maintain
consistency across all phi sources.
Signed-off-by: jiajia Qian <jiajia.qian@nxp.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39773 >
2026-02-12 12:27:55 +00:00
Karol Herbst
a274b9c6a8
nak: Fold constant ishl into shared ld/st/atoms
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Totals:
CodeSize: 9459006048 -> 9458124656 (-0.01%); split: -0.01%, +0.00%
Number of GPRs: 47358402 -> 47358138 (-0.00%)
SLM Size: 5409064 -> 5409024 (-0.00%)
Static cycle count: 6129914910 -> 6129436959 (-0.01%); split: -0.01%, +0.00%
Spills to memory: 44471 -> 44453 (-0.04%)
Fills from memory: 44471 -> 44453 (-0.04%)
Spills to reg: 186364 -> 186365 (+0.00%); split: -0.00%, +0.00%
Fills from reg: 226975 -> 226976 (+0.00%); split: -0.00%, +0.00%
Max warps/SM: 50638680 -> 50638804 (+0.00%)
Totals from 9700 (0.83% of 1163204) affected shaders:
CodeSize: 234188480 -> 233307088 (-0.38%); split: -0.43%, +0.05%
Number of GPRs: 567950 -> 567686 (-0.05%)
SLM Size: 39952 -> 39912 (-0.10%)
Static cycle count: 225267269 -> 224789318 (-0.21%); split: -0.26%, +0.05%
Spills to memory: 4792 -> 4774 (-0.38%)
Fills from memory: 4792 -> 4774 (-0.38%)
Spills to reg: 33250 -> 33251 (+0.00%); split: -0.00%, +0.01%
Fills from reg: 27531 -> 27532 (+0.00%); split: -0.00%, +0.01%
Max warps/SM: 349200 -> 349324 (+0.04%)
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39709 >
2026-02-11 03:42:05 +01:00
Karol Herbst
18bf6fb96d
nir: add nvidias shared memory non unform address shift
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39709 >
2026-02-11 03:41:23 +01:00
Georg Lehmann
fbc0562203
nir/algebraic: allow inexact optimizations with sz/inf/nan preserve
...
Vulkan says these options only apply after possible contract/reassoc/transform
optimizations using real number rules.
No Foz-DB Navi48:
Totals from 3923 (4.76% of 82405) affected shaders:
MaxWaves: 113159 -> 113121 (-0.03%); split: +0.01%, -0.05%
Instrs: 6946272 -> 6933510 (-0.18%); split: -0.22%, +0.03%
CodeSize: 38894140 -> 38844432 (-0.13%); split: -0.16%, +0.03%
VGPRs: 206280 -> 206412 (+0.06%); split: -0.06%, +0.12%
Latency: 45991075 -> 45964455 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8555282 -> 8546561 (-0.10%); split: -0.15%, +0.05%
VClause: 159765 -> 159745 (-0.01%); split: -0.05%, +0.04%
SClause: 160199 -> 160263 (+0.04%); split: -0.07%, +0.11%
Copies: 550751 -> 550432 (-0.06%); split: -0.17%, +0.11%
Branches: 192949 -> 192960 (+0.01%)
PreSGPRs: 189198 -> 189314 (+0.06%); split: -0.07%, +0.13%
PreVGPRs: 142732 -> 142544 (-0.13%); split: -0.33%, +0.20%
VALU: 3579904 -> 3569665 (-0.29%); split: -0.34%, +0.05%
SALU: 1072897 -> 1072440 (-0.04%); split: -0.18%, +0.14%
VMEM: 262759 -> 262791 (+0.01%)
SMEM: 246224 -> 246230 (+0.00%)
VOPD: 369734 -> 369207 (-0.14%); split: +0.08%, -0.23%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
4e2f1345d8
nir/opt_algebraic: make fcmp(a+b, 0.0) -> fcmp(a, -b) exact using ninf
...
And remove some cases that never happen because we remove fneg on compare with constants.
Foz-DB Navi48:
Totals from 1305 (1.58% of 82405) affected shaders:
MaxWaves: 32872 -> 32854 (-0.05%)
Instrs: 4554013 -> 4551638 (-0.05%); split: -0.06%, +0.01%
CodeSize: 25269108 -> 25255428 (-0.05%); split: -0.06%, +0.00%
VGPRs: 87660 -> 87732 (+0.08%)
Latency: 33291152 -> 33285023 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 8965288 -> 8963071 (-0.02%); split: -0.03%, +0.00%
VClause: 104008 -> 103947 (-0.06%); split: -0.09%, +0.03%
SClause: 97577 -> 97574 (-0.00%); split: -0.01%, +0.00%
Copies: 372741 -> 372628 (-0.03%); split: -0.05%, +0.02%
Branches: 134076 -> 134072 (-0.00%)
PreSGPRs: 65109 -> 65110 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 68911 -> 68968 (+0.08%); split: -0.01%, +0.10%
VALU: 2247091 -> 2245815 (-0.06%); split: -0.07%, +0.01%
SALU: 810190 -> 810001 (-0.02%); split: -0.02%, +0.00%
VOPD: 205075 -> 205016 (-0.03%); split: +0.04%, -0.07%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
ef7dd040d9
nir/opt_algebraic: make a < 0.0 ? -a : a exact using search helpers
...
Foz-DB Navi21:
Totals from 104 (0.13% of 82405) affected shaders:
Instrs: 175964 -> 175514 (-0.26%); split: -0.26%, +0.00%
CodeSize: 909008 -> 908744 (-0.03%); split: -0.05%, +0.02%
Latency: 1515203 -> 1514560 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 308751 -> 308573 (-0.06%); split: -0.06%, +0.00%
Copies: 10318 -> 10315 (-0.03%); split: -0.06%, +0.03%
PreVGPRs: 5767 -> 5755 (-0.21%)
VALU: 108151 -> 107745 (-0.38%)
VOPD: 738 -> 737 (-0.14%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
0474ad1504
nir/opt_algebraic: make ffract(is_integral) exact using nnan
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
b8d1763e0a
nir/opt_algebraic: make some more fcmp patterns exact using nnan
...
No Foz-DB changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00