Commit graph

11723 commits

Author SHA1 Message Date
Rob Clark
8cc99edb7b nir: Fill in missing conversion opts
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
I noticed we were missing:

    (('u2f16', ('u2u64', 'a@32')), ('u2f16', a))

This was do to coupling the u2f/i2f opts with i2i/u2u in the same loop
(with different positionals).  The `if B <= S\ncontinue` doesn't apply
to the second part.  So just split these into two loops.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14848
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39899>
2026-02-18 15:13:21 +00:00
Rhys Perry
fd22c48b2a nir/algebraic: remove ignore_exact
This was used because the exact bit meant something different for
comparisons than it did for the replacement expression, but that isn't the
case anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
2026-02-18 14:04:22 +00:00
Rhys Perry
f44de53586 nir: only set fp_math_ctrl if meaningful
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
2026-02-18 14:04:22 +00:00
Rhys Perry
12df083e0b nir: fix fmin_agx/fmax_agx constant folding
This seems to have two issues:
- since d7e88c0ccd, denormals would be flushed
- it did a f32->u32 conversion

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
2026-02-18 14:04:22 +00:00
Rhys Perry
dfad15df0b nir/load_store_vectorize: don't update last_entry after a barrier
fossil-db (navi31):
Totals from 2 (0.00% of 84369) affected shaders:
Instrs: 7738 -> 7740 (+0.03%)
Latency: 333207 -> 333239 (+0.01%)
InvThroughput: 33320 -> 33324 (+0.01%)
VClause: 382 -> 384 (+0.52%)
VMEM: 656 -> 658 (+0.30%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849>
2026-02-18 10:09:14 +00:00
Rhys Perry
b203329e0d nir/load_store_vectorize: more carefully add entries from loop preheader
This would fix both stores 'b' and 'c' from being vectorized:
a = load(0)
loop {
   b = load(0)
   if (break)
   store(0)
}
c = load(0)

fossil-db (navi31):
Totals from 8 (0.01% of 84369) affected shaders:
Instrs: 12035 -> 12066 (+0.26%)
CodeSize: 63016 -> 63208 (+0.30%)
Latency: 176091 -> 177013 (+0.52%)
InvThroughput: 43894 -> 43981 (+0.20%)
SClause: 194 -> 196 (+1.03%)
Copies: 803 -> 812 (+1.12%)
VALU: 7666 -> 7675 (+0.12%)
SALU: 1102 -> 1105 (+0.27%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 4ca7ee7bd7 ("nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14825
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39849>
2026-02-18 10:09:14 +00:00
Alyssa Rosenzweig
f55e87db93 nir: add missing ssbo atomics to nir_get_io_index_src_number
Match other SSBO intrinsics and other atomics.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39895>
2026-02-17 15:42:36 +00:00
Daniel Schürmann
d66de1bb49 glsl_to_nir: emit loop continue construct
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39777>
2026-02-17 13:15:36 +00:00
Georg Lehmann
6a662a59b7 nir/opt_algebraic: optimize 1.0 - b2f(a) to b2f(inot(a))
Which can then be cleaned up further.

Foz-DB Navi48:
Totals from 4156 (3.62% of 114655) affected shaders:
MaxWaves: 102580 -> 102620 (+0.04%)
Instrs: 11696222 -> 11679986 (-0.14%); split: -0.16%, +0.02%
CodeSize: 64452544 -> 64379204 (-0.11%); split: -0.13%, +0.02%
VGPRs: 288256 -> 288172 (-0.03%)
SpillSGPRs: 7290 -> 7297 (+0.10%)
Latency: 160690992 -> 160643825 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 26869332 -> 26849963 (-0.07%); split: -0.09%, +0.02%
VClause: 237078 -> 237003 (-0.03%); split: -0.04%, +0.01%
SClause: 270560 -> 270564 (+0.00%); split: -0.01%, +0.01%
Copies: 936165 -> 937970 (+0.19%); split: -0.07%, +0.26%
Branches: 302981 -> 302992 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 244967 -> 245303 (+0.14%)
PreVGPRs: 232930 -> 232886 (-0.02%); split: -0.02%, +0.00%
VALU: 6200283 -> 6187264 (-0.21%); split: -0.23%, +0.02%
SALU: 1759176 -> 1760275 (+0.06%); split: -0.10%, +0.16%
VOPD: 447502 -> 446194 (-0.29%); split: +0.14%, -0.43%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39917>
2026-02-17 10:01:21 +00:00
Rhys Perry
c0143829f9 nir/opt_intrinsics: optimize inot(inverse_ballot(const))
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00
Georg Lehmann
bca5aab2be nir: let nir_analyze_fp_range take a nir_def
This is midly worse for vector constants, but so much simpler.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756>
2026-02-16 18:08:53 +00:00
Georg Lehmann
474af815ff nir: rename nir_analyze_range because it's float only
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756>
2026-02-16 18:08:53 +00:00
Georg Lehmann
f2a59fdea6 nir: remove non float nir_analyse_range support
This was always unused/unfinished.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756>
2026-02-16 18:08:53 +00:00
Georg Lehmann
f7222d6939 nir/opt_algebraic: remove few uses of integer nir_analyze_range
Surprisingly, this has an effect on GFX1201:

Totals from 66 (0.08% of 82405) affected shaders:
Instrs: 200725 -> 201517 (+0.39%)
CodeSize: 978676 -> 981488 (+0.29%)
Latency: 291736 -> 291760 (+0.01%)
InvThroughput: 31556 -> 31604 (+0.15%)
Copies: 11928 -> 12588 (+5.53%)
Branches: 14850 -> 15048 (+1.33%)
SALU: 68981 -> 69509 (+0.77%)

I say surprisingly, because nir_analyze_range handles nothing but
constants and bcsel for integers. Maybe rdr2 is actually
hitting some weird bcsel(a, #b, #c) == 0 case where b and c are not 0?
No, I looked at a few of those shaders, and it's just noise from changed
instruction order.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756>
2026-02-16 18:08:53 +00:00
Marek Olšák
aa92b464f3 nir/opt_non_uniform_access: use new query flags
NFC for drivers

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Marek Olšák
61a96be494 nir/lower_non_uniform_access: add an option not to lower tex & image queries
AMD can do non-uniform queries. The RADV change will be in a separate commit.

NFC for drivers.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Marek Olšák
a9df891bc6 nir: allow get_ssbo_size to return a 64-bit result
to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Marek Olšák
c151402f35 nir: add ACCESS to get_ubo_size
so that we can set NON_UNIFORM

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Marek Olšák
1d09a975bf nir: handle get_ubo_size as a resource query in nir_shader_gather_info
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Ian Romanick
9017d37e84 nir: Use STACK_ARRAY instead of NIR_VLA
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.

Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866>
2026-02-14 01:19:27 +00:00
Ian Romanick
3da828d2dd spirv: Use STACK_ARRAY instead of NIR_VLA
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.

Fixes: 2a023f30a6 ("nir/spirv: Add basic support for types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866>
2026-02-14 01:19:27 +00:00
Mel Henning
bead49697d libcl_vk: Add VkCopyMemoryIndirectCommandKHR
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39869>
2026-02-13 20:53:47 +00:00
Alyssa Rosenzweig
a517184396 vtn: fix wait_group_events memory scope
For reasons I don't fully understand, wait_group_events needs to synchronize
global memory accesses across the device not just the local workgroup. This
matches what libclc does [1] even though we bypass the libclc implemenation
here. It also matches what Intel's OpenCL runtime does [2].

Fixes OpenCL tests on Iris:

   basic async_copy_global_to_local
   basic async_strided_copy_global_to_local

[1] a9ea1cfe6e/libclc/opencl/lib/generic/async/wait_group_events.cl (L11)
[2] 1f0ea8f11d/IGC/BiFModule/Languages/OpenCL/IBiF_Impl.cl (L182-L190)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39796>
2026-02-13 19:50:22 +00:00
Marek Olšák
0a9bdcac79 ac: lower load_workgroup_ids for ACO in NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Daniel Schürmann
88b4221519 nir/clone: Fix cloning indirect call instructions
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: bb40284f76 ('nir: Add indirect calls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39844>
2026-02-13 11:27:59 +00:00
Sagar Ghuge
1fb8435b77 nir: Add nir_resource_intel_internal entry
Will use the load/store_ssbo with nir_resource_intel_internal later in
this series.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>
2026-02-12 16:45:22 +00:00
Rhys Perry
fa5d4174c4 nir/search: use memcmp/memcpy/memset
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39808>
2026-02-12 14:47:06 +00:00
Rhys Perry
5d92942241 nir/search: remove creation of swizzle
match_expression() only accesses the first instr->def.num_components
elements, so we don't need to ensure the rest are zero.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39808>
2026-02-12 14:47:06 +00:00
jiajia Qian
f16d17a454 nir/opt_phi_precision: Fix bit size mismatch when moving widening conversions
Add a check to ensure that when load_const can be narrowed, the bit size

from other widening conversion sources must be 16-bit to maintain

consistency across all phi sources.

Signed-off-by: jiajia Qian <jiajia.qian@nxp.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39773>
2026-02-12 12:27:55 +00:00
Karol Herbst
a274b9c6a8 nak: Fold constant ishl into shared ld/st/atoms
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Totals:
CodeSize: 9459006048 -> 9458124656 (-0.01%); split: -0.01%, +0.00%
Number of GPRs: 47358402 -> 47358138 (-0.00%)
SLM Size: 5409064 -> 5409024 (-0.00%)
Static cycle count: 6129914910 -> 6129436959 (-0.01%); split: -0.01%, +0.00%
Spills to memory: 44471 -> 44453 (-0.04%)
Fills from memory: 44471 -> 44453 (-0.04%)
Spills to reg: 186364 -> 186365 (+0.00%); split: -0.00%, +0.00%
Fills from reg: 226975 -> 226976 (+0.00%); split: -0.00%, +0.00%
Max warps/SM: 50638680 -> 50638804 (+0.00%)

Totals from 9700 (0.83% of 1163204) affected shaders:
CodeSize: 234188480 -> 233307088 (-0.38%); split: -0.43%, +0.05%
Number of GPRs: 567950 -> 567686 (-0.05%)
SLM Size: 39952 -> 39912 (-0.10%)
Static cycle count: 225267269 -> 224789318 (-0.21%); split: -0.26%, +0.05%
Spills to memory: 4792 -> 4774 (-0.38%)
Fills from memory: 4792 -> 4774 (-0.38%)
Spills to reg: 33250 -> 33251 (+0.00%); split: -0.00%, +0.01%
Fills from reg: 27531 -> 27532 (+0.00%); split: -0.00%, +0.01%
Max warps/SM: 349200 -> 349324 (+0.04%)

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39709>
2026-02-11 03:42:05 +01:00
Karol Herbst
18bf6fb96d nir: add nvidias shared memory non unform address shift
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39709>
2026-02-11 03:41:23 +01:00
Georg Lehmann
fbc0562203 nir/algebraic: allow inexact optimizations with sz/inf/nan preserve
Vulkan says these options only apply after possible contract/reassoc/transform
optimizations using real number rules.

No Foz-DB Navi48:
Totals from 3923 (4.76% of 82405) affected shaders:
MaxWaves: 113159 -> 113121 (-0.03%); split: +0.01%, -0.05%
Instrs: 6946272 -> 6933510 (-0.18%); split: -0.22%, +0.03%
CodeSize: 38894140 -> 38844432 (-0.13%); split: -0.16%, +0.03%
VGPRs: 206280 -> 206412 (+0.06%); split: -0.06%, +0.12%
Latency: 45991075 -> 45964455 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8555282 -> 8546561 (-0.10%); split: -0.15%, +0.05%
VClause: 159765 -> 159745 (-0.01%); split: -0.05%, +0.04%
SClause: 160199 -> 160263 (+0.04%); split: -0.07%, +0.11%
Copies: 550751 -> 550432 (-0.06%); split: -0.17%, +0.11%
Branches: 192949 -> 192960 (+0.01%)
PreSGPRs: 189198 -> 189314 (+0.06%); split: -0.07%, +0.13%
PreVGPRs: 142732 -> 142544 (-0.13%); split: -0.33%, +0.20%
VALU: 3579904 -> 3569665 (-0.29%); split: -0.34%, +0.05%
SALU: 1072897 -> 1072440 (-0.04%); split: -0.18%, +0.14%
VMEM: 262759 -> 262791 (+0.01%)
SMEM: 246224 -> 246230 (+0.00%)
VOPD: 369734 -> 369207 (-0.14%); split: +0.08%, -0.23%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
4e2f1345d8 nir/opt_algebraic: make fcmp(a+b, 0.0) -> fcmp(a, -b) exact using ninf
And remove some cases that never happen because we remove fneg on compare with constants.

Foz-DB Navi48:
Totals from 1305 (1.58% of 82405) affected shaders:
MaxWaves: 32872 -> 32854 (-0.05%)
Instrs: 4554013 -> 4551638 (-0.05%); split: -0.06%, +0.01%
CodeSize: 25269108 -> 25255428 (-0.05%); split: -0.06%, +0.00%
VGPRs: 87660 -> 87732 (+0.08%)
Latency: 33291152 -> 33285023 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 8965288 -> 8963071 (-0.02%); split: -0.03%, +0.00%
VClause: 104008 -> 103947 (-0.06%); split: -0.09%, +0.03%
SClause: 97577 -> 97574 (-0.00%); split: -0.01%, +0.00%
Copies: 372741 -> 372628 (-0.03%); split: -0.05%, +0.02%
Branches: 134076 -> 134072 (-0.00%)
PreSGPRs: 65109 -> 65110 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 68911 -> 68968 (+0.08%); split: -0.01%, +0.10%
VALU: 2247091 -> 2245815 (-0.06%); split: -0.07%, +0.01%
SALU: 810190 -> 810001 (-0.02%); split: -0.02%, +0.00%
VOPD: 205075 -> 205016 (-0.03%); split: +0.04%, -0.07%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
ef7dd040d9 nir/opt_algebraic: make a < 0.0 ? -a : a exact using search helpers
Foz-DB Navi21:
Totals from 104 (0.13% of 82405) affected shaders:
Instrs: 175964 -> 175514 (-0.26%); split: -0.26%, +0.00%
CodeSize: 909008 -> 908744 (-0.03%); split: -0.05%, +0.02%
Latency: 1515203 -> 1514560 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 308751 -> 308573 (-0.06%); split: -0.06%, +0.00%
Copies: 10318 -> 10315 (-0.03%); split: -0.06%, +0.03%
PreVGPRs: 5767 -> 5755 (-0.21%)
VALU: 108151 -> 107745 (-0.38%)
VOPD: 738 -> 737 (-0.14%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
0474ad1504 nir/opt_algebraic: make ffract(is_integral) exact using nnan
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
b8d1763e0a nir/opt_algebraic: make some more fcmp patterns exact using nnan
No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
8d52c59505 nir/opt_algebraic: make some fmin/fmax/fsat patterns exact using nsz/nnan
Foz-DB Navi48:
Totals from 90 (0.11% of 82405) affected shaders:
Instrs: 52109 -> 52032 (-0.15%); split: -0.16%, +0.01%
CodeSize: 263916 -> 263900 (-0.01%); split: -0.05%, +0.05%
Latency: 504693 -> 504775 (+0.02%); split: -0.01%, +0.03%
InvThroughput: 81444 -> 81157 (-0.35%)
Copies: 2894 -> 2895 (+0.03%)
VALU: 30097 -> 29991 (-0.35%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
486ea54184 nir/opt_algebraic: make bcsel(fcmp(b, a), b, a) -> fmin/fmax patterns exact
These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.

It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.

Otherwise, the only thing that matters is is whether a is NaN.

Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
aa78083477 nir: make alu fp_math_ctrl helpers const
No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
f55668bb50 nir/opt_algebraic: update flt -> fneu patterns
And remove the ones that are redundant because we already move the fneg to
the constant source.

No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
15b13d5fd4 nir/opt_algebraic: optimize flt/fge(#c, fadd(a, #b))
I guess these were missing because the author forgot flt/fge aren't commutative.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
2355b63cb5 nir/opt_algebraic: use better float control for some fcmp patterns
Foz-DB Navi48:
Totals from 1084 (1.32% of 82405) affected shaders:
Instrs: 1969973 -> 1968947 (-0.05%); split: -0.08%, +0.02%
CodeSize: 11349704 -> 11344884 (-0.04%); split: -0.06%, +0.02%
VGPRs: 59076 -> 59064 (-0.02%); split: -0.06%, +0.04%
Latency: 20766031 -> 20755032 (-0.05%); split: -0.07%, +0.01%
InvThroughput: 2849402 -> 2846733 (-0.09%); split: -0.10%, +0.01%
VClause: 40736 -> 40740 (+0.01%)
SClause: 91835 -> 91832 (-0.00%)
Copies: 217961 -> 217868 (-0.04%); split: -0.07%, +0.02%
Branches: 60045 -> 60031 (-0.02%)
PreSGPRs: 50639 -> 50618 (-0.04%); split: -0.06%, +0.02%
PreVGPRs: 39593 -> 39590 (-0.01%); split: -0.01%, +0.01%
VALU: 960270 -> 959524 (-0.08%); split: -0.10%, +0.02%
SALU: 326638 -> 326680 (+0.01%); split: -0.04%, +0.06%
VOPD: 23963 -> 23929 (-0.14%); split: +0.04%, -0.18%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
7238888d93 nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
We already have patterns to move the negation to the constant.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
03c497f236 nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
Foz-DB Navi48:
Totals from 50 (0.06% of 82405) affected shaders:
CodeSize: 137072 -> 137456 (+0.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
79e4530a9b nir/opt_algebraic: make pattern pushing fmul into bcsel exact
The only special case here is d == -0.0.

Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
a3bc94a3d0 nir/opt_algebraic: remove inexact from floor->trunc pattern
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
da7abb1337 nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
No Foz-DB chagnes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
ea87f1f9bc nir/opt_algebraic: add a - a with nnan
Foz-DB Navi48:
Totals from 576 (0.70% of 82405) affected shaders:
MaxWaves: 16706 -> 16726 (+0.12%)
Instrs: 618677 -> 580965 (-6.10%); split: -6.10%, +0.00%
CodeSize: 3022552 -> 2861612 (-5.32%); split: -5.33%, +0.00%
VGPRs: 28008 -> 28860 (+3.04%); split: -0.51%, +3.56%
Latency: 2689318 -> 2655887 (-1.24%); split: -1.25%, +0.01%
InvThroughput: 403512 -> 393404 (-2.51%); split: -2.51%, +0.00%
VClause: 7584 -> 7577 (-0.09%); split: -0.17%, +0.08%
SClause: 19974 -> 19086 (-4.45%); split: -4.48%, +0.03%
Copies: 43862 -> 40888 (-6.78%); split: -6.87%, +0.09%
Branches: 12457 -> 11407 (-8.43%)
PreSGPRs: 28315 -> 27046 (-4.48%); split: -4.53%, +0.05%
PreVGPRs: 20751 -> 19397 (-6.52%)
VALU: 317224 -> 290151 (-8.53%); split: -8.53%, +0.00%
SALU: 124297 -> 121347 (-2.37%); split: -2.39%, +0.02%
VMEM: 11918 -> 11907 (-0.09%)
SMEM: 27582 -> 26241 (-4.86%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
16db9f79d1 nir/opt_algebraic: remove inexact a * 0.0 patterns
We already have some with nnan,nsz.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
63d199a01e nir: remove special fp_math_ctrl rules
All opcodes should now respect the nan/inf/sz preserving flags.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00