Commit graph

718 commits

Author SHA1 Message Date
Georg Lehmann
b8d1763e0a nir/opt_algebraic: make some more fcmp patterns exact using nnan
No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
8d52c59505 nir/opt_algebraic: make some fmin/fmax/fsat patterns exact using nsz/nnan
Foz-DB Navi48:
Totals from 90 (0.11% of 82405) affected shaders:
Instrs: 52109 -> 52032 (-0.15%); split: -0.16%, +0.01%
CodeSize: 263916 -> 263900 (-0.01%); split: -0.05%, +0.05%
Latency: 504693 -> 504775 (+0.02%); split: -0.01%, +0.03%
InvThroughput: 81444 -> 81157 (-0.35%)
Copies: 2894 -> 2895 (+0.03%)
VALU: 30097 -> 29991 (-0.35%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
486ea54184 nir/opt_algebraic: make bcsel(fcmp(b, a), b, a) -> fmin/fmax patterns exact
These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.

It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.

Otherwise, the only thing that matters is is whether a is NaN.

Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
f55668bb50 nir/opt_algebraic: update flt -> fneu patterns
And remove the ones that are redundant because we already move the fneg to
the constant source.

No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
15b13d5fd4 nir/opt_algebraic: optimize flt/fge(#c, fadd(a, #b))
I guess these were missing because the author forgot flt/fge aren't commutative.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
2355b63cb5 nir/opt_algebraic: use better float control for some fcmp patterns
Foz-DB Navi48:
Totals from 1084 (1.32% of 82405) affected shaders:
Instrs: 1969973 -> 1968947 (-0.05%); split: -0.08%, +0.02%
CodeSize: 11349704 -> 11344884 (-0.04%); split: -0.06%, +0.02%
VGPRs: 59076 -> 59064 (-0.02%); split: -0.06%, +0.04%
Latency: 20766031 -> 20755032 (-0.05%); split: -0.07%, +0.01%
InvThroughput: 2849402 -> 2846733 (-0.09%); split: -0.10%, +0.01%
VClause: 40736 -> 40740 (+0.01%)
SClause: 91835 -> 91832 (-0.00%)
Copies: 217961 -> 217868 (-0.04%); split: -0.07%, +0.02%
Branches: 60045 -> 60031 (-0.02%)
PreSGPRs: 50639 -> 50618 (-0.04%); split: -0.06%, +0.02%
PreVGPRs: 39593 -> 39590 (-0.01%); split: -0.01%, +0.01%
VALU: 960270 -> 959524 (-0.08%); split: -0.10%, +0.02%
SALU: 326638 -> 326680 (+0.01%); split: -0.04%, +0.06%
VOPD: 23963 -> 23929 (-0.14%); split: +0.04%, -0.18%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
7238888d93 nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
We already have patterns to move the negation to the constant.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
03c497f236 nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
Foz-DB Navi48:
Totals from 50 (0.06% of 82405) affected shaders:
CodeSize: 137072 -> 137456 (+0.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
79e4530a9b nir/opt_algebraic: make pattern pushing fmul into bcsel exact
The only special case here is d == -0.0.

Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
a3bc94a3d0 nir/opt_algebraic: remove inexact from floor->trunc pattern
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
da7abb1337 nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
No Foz-DB chagnes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
ea87f1f9bc nir/opt_algebraic: add a - a with nnan
Foz-DB Navi48:
Totals from 576 (0.70% of 82405) affected shaders:
MaxWaves: 16706 -> 16726 (+0.12%)
Instrs: 618677 -> 580965 (-6.10%); split: -6.10%, +0.00%
CodeSize: 3022552 -> 2861612 (-5.32%); split: -5.33%, +0.00%
VGPRs: 28008 -> 28860 (+3.04%); split: -0.51%, +3.56%
Latency: 2689318 -> 2655887 (-1.24%); split: -1.25%, +0.01%
InvThroughput: 403512 -> 393404 (-2.51%); split: -2.51%, +0.00%
VClause: 7584 -> 7577 (-0.09%); split: -0.17%, +0.08%
SClause: 19974 -> 19086 (-4.45%); split: -4.48%, +0.03%
Copies: 43862 -> 40888 (-6.78%); split: -6.87%, +0.09%
Branches: 12457 -> 11407 (-8.43%)
PreSGPRs: 28315 -> 27046 (-4.48%); split: -4.53%, +0.05%
PreVGPRs: 20751 -> 19397 (-6.52%)
VALU: 317224 -> 290151 (-8.53%); split: -8.53%, +0.00%
SALU: 124297 -> 121347 (-2.37%); split: -2.39%, +0.02%
VMEM: 11918 -> 11907 (-0.09%)
SMEM: 27582 -> 26241 (-4.86%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
16db9f79d1 nir/opt_algebraic: remove inexact a * 0.0 patterns
We already have some with nnan,nsz.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
e443229644 nir/opt_algebraic: mark newly created fmulz nan/inf preserving
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
b678899ef8 nir/opt_algebraic: use nan/inf/sz preserve flags instead of exact for cmp/min/max replacement
And remove some, because they should be covered by the search pattern anyway.

Foz-DB Navi48:
Totals from 560 (0.68% of 82405) affected shaders:
MaxWaves: 11279 -> 11291 (+0.11%)
Instrs: 5214229 -> 5214386 (+0.00%); split: -0.02%, +0.02%
CodeSize: 29613884 -> 29616740 (+0.01%); split: -0.01%, +0.02%
VGPRs: 50400 -> 50328 (-0.14%)
Latency: 36481700 -> 36481157 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7309905 -> 7307905 (-0.03%); split: -0.05%, +0.02%
VClause: 131423 -> 131424 (+0.00%); split: -0.00%, +0.00%
SClause: 111485 -> 111499 (+0.01%); split: -0.00%, +0.01%
Copies: 441899 -> 442029 (+0.03%); split: -0.02%, +0.05%
Branches: 165599 -> 165597 (-0.00%)
PreVGPRs: 43558 -> 43525 (-0.08%)
VALU: 2573609 -> 2573324 (-0.01%); split: -0.03%, +0.02%
SALU: 851172 -> 851271 (+0.01%); split: -0.01%, +0.02%
VOPD: 366409 -> 366934 (+0.14%); split: +0.23%, -0.08%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
45cb1d3b6f nir/opt_algebraic: remove unpack_half_2x16_split
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
b18d9c1b33 nir/opt_algebraic: optimize unpack_32_2x16 of extract
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Marek Olšák
a3f022d0a2 nir: reassociate a $op (b ? #c : #d) for div, mod, rem
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This eliminates expensive div, mod, rem opcodes with non-constant src1 being
constant src1 hiding behind bcsel.

gcc and LLVM are missing this.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39560>
2026-02-02 21:34:48 +00:00
Georg Lehmann
ad6f8291bf nir/opt_algebraic: rework ignore_exact to work like other internal conditions
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Georg Lehmann
70f0e75262 nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit
Foz-DB Navi48:
Totals from 177 (0.21% of 82405) affected shaders:
Instrs: 326628 -> 325955 (-0.21%); split: -0.21%, +0.00%
CodeSize: 1726720 -> 1722500 (-0.24%); split: -0.24%, +0.00%
Latency: 5076631 -> 5075700 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 596010 -> 595598 (-0.07%); split: -0.07%, +0.00%
VClause: 3613 -> 3616 (+0.08%)
Copies: 24427 -> 24501 (+0.30%); split: -0.06%, +0.36%
VALU: 182468 -> 182029 (-0.24%); split: -0.24%, +0.00%
SALU: 55449 -> 55452 (+0.01%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39531>
2026-01-29 14:44:37 +00:00
Georg Lehmann
c3e12429c5 nir/opt_algebaric: improve a < 0.0 ? 0.0 : sqrt(a) pattern
Fix the NaN correctness of the original pattern, and add more variants.

Foz-DB Navi48:
Totals from 372 (0.45% of 82405) affected shaders:
Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03%
CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02%
VGPRs: 19452 -> 19104 (-1.79%)
Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05%
InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19%
VClause: 3695 -> 3704 (+0.24%)
Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19%
VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05%
VOPD: 380 -> 372 (-2.11%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
Georg Lehmann
f872c13707 nir/opt_algebraic: use contract instead of inexact for more patterns
These use more precise operations, so contract is enough.

Foz-DB Navi48:
Totals from 248 (0.30% of 82405) affected shaders:
Instrs: 284686 -> 284318 (-0.13%); split: -0.14%, +0.01%
CodeSize: 1528856 -> 1527520 (-0.09%); split: -0.10%, +0.01%
Latency: 2368390 -> 2367345 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 346623 -> 346335 (-0.08%); split: -0.09%, +0.01%
SClause: 6752 -> 6756 (+0.06%); split: -0.12%, +0.18%
Copies: 14685 -> 14694 (+0.06%); split: -0.01%, +0.07%
VALU: 179922 -> 179727 (-0.11%); split: -0.11%, +0.01%
SALU: 28706 -> 28707 (+0.00%)
VOPD: 1196 -> 1198 (+0.17%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
Georg Lehmann
d8ef28671d nir/opt_algebraic: use correct syntax to create exact fsat
Fixes: 3b06824e4c ("nir/opt_algebraic: optimize some post peephole select patterns")

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
2026-01-28 18:46:22 +00:00
Georg Lehmann
b2d9615000 nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00
Georg Lehmann
d06b627d23 nir/opt_algebraic: optimize f2f16_rtz of bcsel with constants
Foz-DB Navi48:
Totals from 145 (0.18% of 82405) affected shaders:
Instrs: 1706001 -> 1705669 (-0.02%); split: -0.03%, +0.01%
CodeSize: 9621036 -> 9620784 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 711 -> 726 (+2.11%); split: -0.56%, +2.67%
Latency: 20066360 -> 20066193 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 4326789 -> 4326763 (-0.00%); split: -0.00%, +0.00%
Copies: 192041 -> 191995 (-0.02%); split: -0.03%, +0.01%
Branches: 75673 -> 75675 (+0.00%); split: -0.00%, +0.01%
VALU: 765163 -> 764835 (-0.04%); split: -0.05%, +0.00%
SALU: 351758 -> 351715 (-0.01%); split: -0.01%, +0.00%
VOPD: 65236 -> 65282 (+0.07%); split: +0.17%, -0.10%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00
Georg Lehmann
ee5492e6dd nir/opt_algebraic: remove f2f16 roundtrip conversions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00
Georg Lehmann
592b6579da nir/opt_algebraic: optimize f2f16_rtz(min/max)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:18 +00:00
Georg Lehmann
2b92c0f06e nir/opt_algebraic: optimize f2f16_rtz(b2f(a))
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:18 +00:00
Emma Anholt
cdec063d37 nir/opt_algebraic: Fix a bit of imad24_ir3's optimization.
The mul is 24-bit sign-extended, so in simplifying we should retain that.
If nothing else, this keeps us on the happy path of mul24s.

I didn't fix the other broken pattern, since it's not really part of this
MR.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:41 +00:00
Emma Anholt
e5a9eae2b5 nir/opt_algebraic_tests: Fix fuzzing levels for multi-component inputs.
We were enumerating enough for a single component, but not all the
combinations.  This helps show that our fdots fail pretty consistently.
And triggers more skipping from the fany_equal16s thanks to varied inputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:41 +00:00
Emma Anholt
7fd0287a89 nir/opt_algebraic_tests: Test !nir_fp_preserve_signed_zero behavior.
Iterate over a set of sign-flips for 0.0s to see if we can find a set that
makes the search and replace sides match.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:41 +00:00
Emma Anholt
68f5bc4f12 nir/opt_algebraic_tests: Rename and use the enum result type more.
As I introduced another layer of iteration for signed zero testing, the
former logic got unwieldy.  In fact, it was already unwieldy enough that I
forgot to clear all_skipped when the assert failed, allowing a failing
test to be marked UNSUPPORTED instead of XFAIL.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
a90163a15a nir/opt_algebraic_tests: Add support for expression swizzles.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
c30c383d4d nir/opt_algebraic_tests: Allow testing of fdot*_replicated opcodes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
173295adf4 nir/opt_algebraic_tests: Allow testing udiv_aligned_4.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
94237c3ea3 nir/opt_algebraic_tests: Allow testing mul/mad_relaxed opcodes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
fd7754fba1 nir/opt_algebraic_tests: Allow testing imad24_ir3.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:39 +00:00
Georg Lehmann
442daeb54a nir/opt_algebraic: use fcanonicalize
Mostly optimizations, some minor fixes but I don't
think they are worth backporting.

Foz-DB Navi21:
Totals from 7570 (9.21% of 82151) affected shaders:
MaxWaves: 204288 -> 204476 (+0.09%); split: +0.09%, -0.00%
Instrs: 4511439 -> 4500261 (-0.25%); split: -0.25%, +0.00%
CodeSize: 23727088 -> 23644388 (-0.35%); split: -0.35%, +0.00%
VGPRs: 290944 -> 290616 (-0.11%); split: -0.12%, +0.01%
SpillSGPRs: 1256 -> 1251 (-0.40%)
Latency: 16738072 -> 16726717 (-0.07%); split: -0.10%, +0.04%
InvThroughput: 3736856 -> 3716631 (-0.54%); split: -0.55%, +0.01%
VClause: 66150 -> 66156 (+0.01%); split: -0.05%, +0.06%
SClause: 93644 -> 93631 (-0.01%); split: -0.02%, +0.01%
Copies: 448816 -> 458584 (+2.18%); split: -0.05%, +2.22%
Branches: 139817 -> 139775 (-0.03%); split: -0.03%, +0.00%
PreSGPRs: 321922 -> 321900 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 239709 -> 238856 (-0.36%); split: -0.39%, +0.03%
VALU: 2595164 -> 2584250 (-0.42%); split: -0.43%, +0.01%
SALU: 839038 -> 838965 (-0.01%); split: -0.02%, +0.01%
VMEM: 137584 -> 137583 (-0.00%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180>
2026-01-19 16:11:29 +00:00
Rhys Perry
625afb0d29 nir: add fcanonicalize
v2(Georg Lehmann):
Always remove fcanonicalize if denorms must be neither flushed nor preserved.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180>
2026-01-19 16:11:29 +00:00
Eric Engestrom
30c2e6dbf2 nir/meson: drop redundant --build-tests in favour of just checking if --out-tests is set
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39350>
2026-01-16 16:55:21 +00:00
Eric Engestrom
246095da49 nir/meson: only try to generate the nir_opt_algebraic tests when requested
Anything listed in a meson target's `output` is expected to exist once
the command has run. If it's missing, meson/ninja will run the command
again to try to generate it, resulting in a ton of files getting
re-generated/re-compiled for no reason.

Fixes: 4c30c44b75 ("nir: Generate unit tests for nir_opt_algebraic")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14667
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39350>
2026-01-16 16:55:21 +00:00
Konstantin Seurer
4c30c44b75 nir: Generate unit tests for nir_opt_algebraic
This catches a number of bugs in the current NIR algebraic optimizations
or opcodes implementations (as fixed in this series, or documented in the
XFAIL tests), and should prevent many future bugs from landing.

This required bumping the test timeout, because s390x is very slow to
emulate in CI.

Closes: #3338
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:43 +00:00
Emma Anholt
df215cc3cd nir/opt_algebraic_tests: Mark patterns as unsupported or xfails.
This way as a pattern author/editor you can immediately see whether it's
getting test coverage and if there are known issues with the pattern.
This will also give us clear outcomes from testing as we fix failing
patterns.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:43 +00:00
Georg Lehmann
93d05cdfd8 nir/opt_algebraic: move fsat last for fsqrt(fsat(a))
This should be exact, even for all special values:

fsqrt(NaN) -> NaN
fsqrt(-0.0) -> 0.0
fsqrt(-Inf) -> NaN
fsqrt(negative finite) -> NaN

So all of these get saturated to +0.0

All numbers >= 1.0 will have a square root >= 1.0,
which will be saturate to 1.0

Moving the fsat guarantees that it can use an output modifier
for hardware that has those, and shouldn't harm other hardware either.

Foz-DB Navi21:
Totals from 255 (0.31% of 82151) affected shaders:
Instrs: 664906 -> 664194 (-0.11%)
CodeSize: 3623500 -> 3619188 (-0.12%)
Latency: 11336397 -> 11335688 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 2716430 -> 2715726 (-0.03%); split: -0.03%, +0.00%
VALU: 442603 -> 441891 (-0.16%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39202>
2026-01-09 07:34:46 +00:00
Ian Romanick
d4a87e85b3 nir/algebraic: Add missing f on F-strings
Without this, nir_algebraic.py was treating "f2i{int_sz}_sat" as the
literal opcode name when it should have been "f2i8_sat" or similar.

Fixes: c49d6e0480 ("nir/algebraic: Elide range clamping of f2u sources")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39031>
2026-01-08 13:19:35 -08:00
Konstantin Seurer
a8224e3e00 nir/opt_algebraic: Do not emit patterns for 64bit booleans
Avoids assertion failures trying to constant-evaluate the pattern with the
new nir_opt_algebraic_pattern_tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:48 +00:00
Konstantin Seurer
211c7db8e3 nir/opt_algebraic: Remove a pattern for 8bit floats
Avoids assertion failures trying to constant-evaluate the pattern with the
new nir_opt_algebraic_pattern_tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:48 +00:00
Emma Anholt
afece95101 nir/opt_algebraic: Fix return type of fdot(vec(a, 0.0, ...), b).
The replace pattern was generating a vector when it should have been
scalar.  Fixes validation failures with the new algebraic unit tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:47 +00:00
Georg Lehmann
c8ce0df2d2 nir/opt_algebraic: replace is_negative_zero with constant -0.0
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Now that nir_search respects the sign of zero, we don't need
a manual helper for this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123>
2026-01-03 12:42:23 +00:00
Georg Lehmann
7d2a946730 nir/opt_algebraic: canonicalize scmp with -0.0
We already do this for non fused comparisons.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123>
2026-01-03 12:42:23 +00:00