Georg Lehmann
b8d1763e0a
nir/opt_algebraic: make some more fcmp patterns exact using nnan
...
No Foz-DB changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
8d52c59505
nir/opt_algebraic: make some fmin/fmax/fsat patterns exact using nsz/nnan
...
Foz-DB Navi48:
Totals from 90 (0.11% of 82405) affected shaders:
Instrs: 52109 -> 52032 (-0.15%); split: -0.16%, +0.01%
CodeSize: 263916 -> 263900 (-0.01%); split: -0.05%, +0.05%
Latency: 504693 -> 504775 (+0.02%); split: -0.01%, +0.03%
InvThroughput: 81444 -> 81157 (-0.35%)
Copies: 2894 -> 2895 (+0.03%)
VALU: 30097 -> 29991 (-0.35%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
486ea54184
nir/opt_algebraic: make bcsel(fcmp(b, a), b, a) -> fmin/fmax patterns exact
...
These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.
It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.
Otherwise, the only thing that matters is is whether a is NaN.
Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
f55668bb50
nir/opt_algebraic: update flt -> fneu patterns
...
And remove the ones that are redundant because we already move the fneg to
the constant source.
No Foz-DB changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
15b13d5fd4
nir/opt_algebraic: optimize flt/fge(#c, fadd(a, #b))
...
I guess these were missing because the author forgot flt/fge aren't commutative.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
2355b63cb5
nir/opt_algebraic: use better float control for some fcmp patterns
...
Foz-DB Navi48:
Totals from 1084 (1.32% of 82405) affected shaders:
Instrs: 1969973 -> 1968947 (-0.05%); split: -0.08%, +0.02%
CodeSize: 11349704 -> 11344884 (-0.04%); split: -0.06%, +0.02%
VGPRs: 59076 -> 59064 (-0.02%); split: -0.06%, +0.04%
Latency: 20766031 -> 20755032 (-0.05%); split: -0.07%, +0.01%
InvThroughput: 2849402 -> 2846733 (-0.09%); split: -0.10%, +0.01%
VClause: 40736 -> 40740 (+0.01%)
SClause: 91835 -> 91832 (-0.00%)
Copies: 217961 -> 217868 (-0.04%); split: -0.07%, +0.02%
Branches: 60045 -> 60031 (-0.02%)
PreSGPRs: 50639 -> 50618 (-0.04%); split: -0.06%, +0.02%
PreVGPRs: 39593 -> 39590 (-0.01%); split: -0.01%, +0.01%
VALU: 960270 -> 959524 (-0.08%); split: -0.10%, +0.02%
SALU: 326638 -> 326680 (+0.01%); split: -0.04%, +0.06%
VOPD: 23963 -> 23929 (-0.14%); split: +0.04%, -0.18%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
7238888d93
nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
...
We already have patterns to move the negation to the constant.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
03c497f236
nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
...
Foz-DB Navi48:
Totals from 50 (0.06% of 82405) affected shaders:
CodeSize: 137072 -> 137456 (+0.28%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
79e4530a9b
nir/opt_algebraic: make pattern pushing fmul into bcsel exact
...
The only special case here is d == -0.0.
Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
a3bc94a3d0
nir/opt_algebraic: remove inexact from floor->trunc pattern
...
This was marked inexact because of me in !21475 , but I don't see why now,
even after checking all the special values.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
da7abb1337
nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
...
No Foz-DB chagnes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
ea87f1f9bc
nir/opt_algebraic: add a - a with nnan
...
Foz-DB Navi48:
Totals from 576 (0.70% of 82405) affected shaders:
MaxWaves: 16706 -> 16726 (+0.12%)
Instrs: 618677 -> 580965 (-6.10%); split: -6.10%, +0.00%
CodeSize: 3022552 -> 2861612 (-5.32%); split: -5.33%, +0.00%
VGPRs: 28008 -> 28860 (+3.04%); split: -0.51%, +3.56%
Latency: 2689318 -> 2655887 (-1.24%); split: -1.25%, +0.01%
InvThroughput: 403512 -> 393404 (-2.51%); split: -2.51%, +0.00%
VClause: 7584 -> 7577 (-0.09%); split: -0.17%, +0.08%
SClause: 19974 -> 19086 (-4.45%); split: -4.48%, +0.03%
Copies: 43862 -> 40888 (-6.78%); split: -6.87%, +0.09%
Branches: 12457 -> 11407 (-8.43%)
PreSGPRs: 28315 -> 27046 (-4.48%); split: -4.53%, +0.05%
PreVGPRs: 20751 -> 19397 (-6.52%)
VALU: 317224 -> 290151 (-8.53%); split: -8.53%, +0.00%
SALU: 124297 -> 121347 (-2.37%); split: -2.39%, +0.02%
VMEM: 11918 -> 11907 (-0.09%)
SMEM: 27582 -> 26241 (-4.86%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
16db9f79d1
nir/opt_algebraic: remove inexact a * 0.0 patterns
...
We already have some with nnan,nsz.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
e443229644
nir/opt_algebraic: mark newly created fmulz nan/inf preserving
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
b678899ef8
nir/opt_algebraic: use nan/inf/sz preserve flags instead of exact for cmp/min/max replacement
...
And remove some, because they should be covered by the search pattern anyway.
Foz-DB Navi48:
Totals from 560 (0.68% of 82405) affected shaders:
MaxWaves: 11279 -> 11291 (+0.11%)
Instrs: 5214229 -> 5214386 (+0.00%); split: -0.02%, +0.02%
CodeSize: 29613884 -> 29616740 (+0.01%); split: -0.01%, +0.02%
VGPRs: 50400 -> 50328 (-0.14%)
Latency: 36481700 -> 36481157 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7309905 -> 7307905 (-0.03%); split: -0.05%, +0.02%
VClause: 131423 -> 131424 (+0.00%); split: -0.00%, +0.00%
SClause: 111485 -> 111499 (+0.01%); split: -0.00%, +0.01%
Copies: 441899 -> 442029 (+0.03%); split: -0.02%, +0.05%
Branches: 165599 -> 165597 (-0.00%)
PreVGPRs: 43558 -> 43525 (-0.08%)
VALU: 2573609 -> 2573324 (-0.01%); split: -0.03%, +0.02%
SALU: 851172 -> 851271 (+0.01%); split: -0.01%, +0.02%
VOPD: 366409 -> 366934 (+0.14%); split: +0.23%, -0.08%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
45cb1d3b6f
nir/opt_algebraic: remove unpack_half_2x16_split
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511 >
2026-02-06 06:12:36 +00:00
Georg Lehmann
b18d9c1b33
nir/opt_algebraic: optimize unpack_32_2x16 of extract
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511 >
2026-02-06 06:12:36 +00:00
Marek Olšák
a3f022d0a2
nir: reassociate a $op (b ? #c : #d) for div, mod, rem
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This eliminates expensive div, mod, rem opcodes with non-constant src1 being
constant src1 hiding behind bcsel.
gcc and LLVM are missing this.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39560 >
2026-02-02 21:34:48 +00:00
Georg Lehmann
ad6f8291bf
nir/opt_algebraic: rework ignore_exact to work like other internal conditions
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616 >
2026-01-31 15:30:25 +00:00
Georg Lehmann
70f0e75262
nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit
...
Foz-DB Navi48:
Totals from 177 (0.21% of 82405) affected shaders:
Instrs: 326628 -> 325955 (-0.21%); split: -0.21%, +0.00%
CodeSize: 1726720 -> 1722500 (-0.24%); split: -0.24%, +0.00%
Latency: 5076631 -> 5075700 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 596010 -> 595598 (-0.07%); split: -0.07%, +0.00%
VClause: 3613 -> 3616 (+0.08%)
Copies: 24427 -> 24501 (+0.30%); split: -0.06%, +0.36%
VALU: 182468 -> 182029 (-0.24%); split: -0.24%, +0.00%
SALU: 55449 -> 55452 (+0.01%); split: -0.01%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39531 >
2026-01-29 14:44:37 +00:00
Georg Lehmann
c3e12429c5
nir/opt_algebaric: improve a < 0.0 ? 0.0 : sqrt(a) pattern
...
Fix the NaN correctness of the original pattern, and add more variants.
Foz-DB Navi48:
Totals from 372 (0.45% of 82405) affected shaders:
Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03%
CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02%
VGPRs: 19452 -> 19104 (-1.79%)
Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05%
InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19%
VClause: 3695 -> 3704 (+0.24%)
Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19%
VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05%
VOPD: 380 -> 372 (-2.11%)
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507 >
2026-01-29 11:29:48 +00:00
Georg Lehmann
f872c13707
nir/opt_algebraic: use contract instead of inexact for more patterns
...
These use more precise operations, so contract is enough.
Foz-DB Navi48:
Totals from 248 (0.30% of 82405) affected shaders:
Instrs: 284686 -> 284318 (-0.13%); split: -0.14%, +0.01%
CodeSize: 1528856 -> 1527520 (-0.09%); split: -0.10%, +0.01%
Latency: 2368390 -> 2367345 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 346623 -> 346335 (-0.08%); split: -0.09%, +0.01%
SClause: 6752 -> 6756 (+0.06%); split: -0.12%, +0.18%
Copies: 14685 -> 14694 (+0.06%); split: -0.01%, +0.07%
VALU: 179922 -> 179727 (-0.11%); split: -0.11%, +0.01%
SALU: 28706 -> 28707 (+0.00%)
VOPD: 1196 -> 1198 (+0.17%)
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507 >
2026-01-29 11:29:48 +00:00
Georg Lehmann
d8ef28671d
nir/opt_algebraic: use correct syntax to create exact fsat
...
Fixes: 3b06824e4c ("nir/opt_algebraic: optimize some post peephole select patterns")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586 >
2026-01-28 18:46:22 +00:00
Georg Lehmann
b2d9615000
nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412 >
2026-01-26 10:54:20 +00:00
Georg Lehmann
d06b627d23
nir/opt_algebraic: optimize f2f16_rtz of bcsel with constants
...
Foz-DB Navi48:
Totals from 145 (0.18% of 82405) affected shaders:
Instrs: 1706001 -> 1705669 (-0.02%); split: -0.03%, +0.01%
CodeSize: 9621036 -> 9620784 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 711 -> 726 (+2.11%); split: -0.56%, +2.67%
Latency: 20066360 -> 20066193 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 4326789 -> 4326763 (-0.00%); split: -0.00%, +0.00%
Copies: 192041 -> 191995 (-0.02%); split: -0.03%, +0.01%
Branches: 75673 -> 75675 (+0.00%); split: -0.00%, +0.01%
VALU: 765163 -> 764835 (-0.04%); split: -0.05%, +0.00%
SALU: 351758 -> 351715 (-0.01%); split: -0.01%, +0.00%
VOPD: 65236 -> 65282 (+0.07%); split: +0.17%, -0.10%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412 >
2026-01-26 10:54:20 +00:00
Georg Lehmann
ee5492e6dd
nir/opt_algebraic: remove f2f16 roundtrip conversions
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412 >
2026-01-26 10:54:20 +00:00
Georg Lehmann
592b6579da
nir/opt_algebraic: optimize f2f16_rtz(min/max)
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412 >
2026-01-26 10:54:18 +00:00
Georg Lehmann
2b92c0f06e
nir/opt_algebraic: optimize f2f16_rtz(b2f(a))
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412 >
2026-01-26 10:54:18 +00:00
Emma Anholt
cdec063d37
nir/opt_algebraic: Fix a bit of imad24_ir3's optimization.
...
The mul is 24-bit sign-extended, so in simplifying we should retain that.
If nothing else, this keeps us on the happy path of mul24s.
I didn't fix the other broken pattern, since it's not really part of this
MR.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:41 +00:00
Emma Anholt
e5a9eae2b5
nir/opt_algebraic_tests: Fix fuzzing levels for multi-component inputs.
...
We were enumerating enough for a single component, but not all the
combinations. This helps show that our fdots fail pretty consistently.
And triggers more skipping from the fany_equal16s thanks to varied inputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:41 +00:00
Emma Anholt
7fd0287a89
nir/opt_algebraic_tests: Test !nir_fp_preserve_signed_zero behavior.
...
Iterate over a set of sign-flips for 0.0s to see if we can find a set that
makes the search and replace sides match.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:41 +00:00
Emma Anholt
68f5bc4f12
nir/opt_algebraic_tests: Rename and use the enum result type more.
...
As I introduced another layer of iteration for signed zero testing, the
former logic got unwieldy. In fact, it was already unwieldy enough that I
forgot to clear all_skipped when the assert failed, allowing a failing
test to be marked UNSUPPORTED instead of XFAIL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:40 +00:00
Emma Anholt
a90163a15a
nir/opt_algebraic_tests: Add support for expression swizzles.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:40 +00:00
Emma Anholt
c30c383d4d
nir/opt_algebraic_tests: Allow testing of fdot*_replicated opcodes.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:40 +00:00
Emma Anholt
173295adf4
nir/opt_algebraic_tests: Allow testing udiv_aligned_4.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:40 +00:00
Emma Anholt
94237c3ea3
nir/opt_algebraic_tests: Allow testing mul/mad_relaxed opcodes.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:40 +00:00
Emma Anholt
fd7754fba1
nir/opt_algebraic_tests: Allow testing imad24_ir3.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369 >
2026-01-26 05:39:39 +00:00
Georg Lehmann
442daeb54a
nir/opt_algebraic: use fcanonicalize
...
Mostly optimizations, some minor fixes but I don't
think they are worth backporting.
Foz-DB Navi21:
Totals from 7570 (9.21% of 82151) affected shaders:
MaxWaves: 204288 -> 204476 (+0.09%); split: +0.09%, -0.00%
Instrs: 4511439 -> 4500261 (-0.25%); split: -0.25%, +0.00%
CodeSize: 23727088 -> 23644388 (-0.35%); split: -0.35%, +0.00%
VGPRs: 290944 -> 290616 (-0.11%); split: -0.12%, +0.01%
SpillSGPRs: 1256 -> 1251 (-0.40%)
Latency: 16738072 -> 16726717 (-0.07%); split: -0.10%, +0.04%
InvThroughput: 3736856 -> 3716631 (-0.54%); split: -0.55%, +0.01%
VClause: 66150 -> 66156 (+0.01%); split: -0.05%, +0.06%
SClause: 93644 -> 93631 (-0.01%); split: -0.02%, +0.01%
Copies: 448816 -> 458584 (+2.18%); split: -0.05%, +2.22%
Branches: 139817 -> 139775 (-0.03%); split: -0.03%, +0.00%
PreSGPRs: 321922 -> 321900 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 239709 -> 238856 (-0.36%); split: -0.39%, +0.03%
VALU: 2595164 -> 2584250 (-0.42%); split: -0.43%, +0.01%
SALU: 839038 -> 838965 (-0.01%); split: -0.02%, +0.01%
VMEM: 137584 -> 137583 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180 >
2026-01-19 16:11:29 +00:00
Rhys Perry
625afb0d29
nir: add fcanonicalize
...
v2(Georg Lehmann):
Always remove fcanonicalize if denorms must be neither flushed nor preserved.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180 >
2026-01-19 16:11:29 +00:00
Eric Engestrom
30c2e6dbf2
nir/meson: drop redundant --build-tests in favour of just checking if --out-tests is set
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39350 >
2026-01-16 16:55:21 +00:00
Eric Engestrom
246095da49
nir/meson: only try to generate the nir_opt_algebraic tests when requested
...
Anything listed in a meson target's `output` is expected to exist once
the command has run. If it's missing, meson/ninja will run the command
again to try to generate it, resulting in a ton of files getting
re-generated/re-compiled for no reason.
Fixes: 4c30c44b75 ("nir: Generate unit tests for nir_opt_algebraic")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14667
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39350 >
2026-01-16 16:55:21 +00:00
Konstantin Seurer
4c30c44b75
nir: Generate unit tests for nir_opt_algebraic
...
This catches a number of bugs in the current NIR algebraic optimizations
or opcodes implementations (as fixed in this series, or documented in the
XFAIL tests), and should prevent many future bugs from landing.
This required bumping the test timeout, because s390x is very slow to
emulate in CI.
Closes : #3338
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076 >
2026-01-15 19:09:43 +00:00
Emma Anholt
df215cc3cd
nir/opt_algebraic_tests: Mark patterns as unsupported or xfails.
...
This way as a pattern author/editor you can immediately see whether it's
getting test coverage and if there are known issues with the pattern.
This will also give us clear outcomes from testing as we fix failing
patterns.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076 >
2026-01-15 19:09:43 +00:00
Georg Lehmann
93d05cdfd8
nir/opt_algebraic: move fsat last for fsqrt(fsat(a))
...
This should be exact, even for all special values:
fsqrt(NaN) -> NaN
fsqrt(-0.0) -> 0.0
fsqrt(-Inf) -> NaN
fsqrt(negative finite) -> NaN
So all of these get saturated to +0.0
All numbers >= 1.0 will have a square root >= 1.0,
which will be saturate to 1.0
Moving the fsat guarantees that it can use an output modifier
for hardware that has those, and shouldn't harm other hardware either.
Foz-DB Navi21:
Totals from 255 (0.31% of 82151) affected shaders:
Instrs: 664906 -> 664194 (-0.11%)
CodeSize: 3623500 -> 3619188 (-0.12%)
Latency: 11336397 -> 11335688 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 2716430 -> 2715726 (-0.03%); split: -0.03%, +0.00%
VALU: 442603 -> 441891 (-0.16%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39202 >
2026-01-09 07:34:46 +00:00
Ian Romanick
d4a87e85b3
nir/algebraic: Add missing f on F-strings
...
Without this, nir_algebraic.py was treating "f2i{int_sz}_sat" as the
literal opcode name when it should have been "f2i8_sat" or similar.
Fixes: c49d6e0480 ("nir/algebraic: Elide range clamping of f2u sources")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39031 >
2026-01-08 13:19:35 -08:00
Konstantin Seurer
a8224e3e00
nir/opt_algebraic: Do not emit patterns for 64bit booleans
...
Avoids assertion failures trying to constant-evaluate the pattern with the
new nir_opt_algebraic_pattern_tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184 >
2026-01-06 21:27:48 +00:00
Konstantin Seurer
211c7db8e3
nir/opt_algebraic: Remove a pattern for 8bit floats
...
Avoids assertion failures trying to constant-evaluate the pattern with the
new nir_opt_algebraic_pattern_tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184 >
2026-01-06 21:27:48 +00:00
Emma Anholt
afece95101
nir/opt_algebraic: Fix return type of fdot(vec(a, 0.0, ...), b).
...
The replace pattern was generating a vector when it should have been
scalar. Fixes validation failures with the new algebraic unit tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184 >
2026-01-06 21:27:47 +00:00
Georg Lehmann
c8ce0df2d2
nir/opt_algebraic: replace is_negative_zero with constant -0.0
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Now that nir_search respects the sign of zero, we don't need
a manual helper for this.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123 >
2026-01-03 12:42:23 +00:00
Georg Lehmann
7d2a946730
nir/opt_algebraic: canonicalize scmp with -0.0
...
We already do this for non fused comparisons.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123 >
2026-01-03 12:42:23 +00:00