Commit graph

732 commits

Author SHA1 Message Date
Georg Lehmann
ef6f5377da nir/opt_algebraic: remove fcmp+fneg patterns that are cleaned up earlier
No Foz-DB changes, as expected.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
a5334ec239 nir/opt_algebraic: generalize late fcmp(fneg(a), const) patterns
No reason to do this just for 1.0.

Foz-DB Navi48:
Totals from 44 (0.04% of 114655) affected shaders:
CodeSize: 111620 -> 111476 (-0.13%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:35 +00:00
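The generalization in a5334ec239 rests on negation being exact, so the fneg can move from the variable onto the constant for any constant, not just 1.0. A minimal sketch, assuming plain Python floats stand in for NIR values; the helper names are made up for illustration and are not Mesa code:

```python
import itertools
import math

# Hypothetical stand-ins for NIR float comparisons; not Mesa code.
def flt(x, y): return x < y
def fge(x, y): return x >= y
def feq(x, y): return x == y
def fneu(x, y): return x != y

SAMPLES = [0.0, -0.0, 1.0, -1.0, 2.5, -2.5, math.inf, -math.inf, math.nan]

def negated_compare_matches(a, c):
    """Compare fcmp(fneg(a), c) with the form that negates the constant."""
    return (
        flt(-a, c) == flt(-c, a)        # -a <  c  <=>  -c <  a
        and fge(-a, c) == fge(-c, a)    # -a >= c  <=>  -c >= a
        and feq(-a, c) == feq(a, -c)    # equality keeps its operand order
        and fneu(-a, c) == fneu(a, -c)
    )

all_match = all(negated_compare_matches(a, c)
                for a, c in itertools.product(SAMPLES, SAMPLES))
```

Note that the ordered comparisons swap their operands once the negation moves, while feq/fneu do not need to.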
Georg Lehmann
6b464785b9 nir/opt_algebraic: optimize d3d9 iand(a, inot(b))
Foz-DB GFX1201:
Totals from 24 (0.02% of 112525) affected shaders:
Instrs: 15598 -> 15426 (-1.10%); split: -1.17%, +0.06%
CodeSize: 88716 -> 88260 (-0.51%); split: -0.98%, +0.46%
Latency: 54419 -> 53965 (-0.83%); split: -0.91%, +0.08%
InvThroughput: 10294 -> 10166 (-1.24%); split: -1.28%, +0.04%
VClause: 302 -> 300 (-0.66%)
SClause: 367 -> 363 (-1.09%); split: -1.63%, +0.54%
Copies: 712 -> 705 (-0.98%); split: -3.09%, +2.11%
PreSGPRs: 1402 -> 1424 (+1.57%); split: -0.14%, +1.71%
PreVGPRs: 850 -> 848 (-0.24%)
VALU: 9730 -> 9591 (-1.43%)
SALU: 1579 -> 1649 (+4.43%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104>
2026-02-26 14:44:01 +00:00
Georg Lehmann
a3f9c347bf nir/opt_algebraic: optimize b2f(a) - 1.0 to -b2f(a)
Foz-DB GFX1201:
Totals from 81 (0.07% of 112525) affected shaders:
Instrs: 95048 -> 94965 (-0.09%); split: -0.13%, +0.05%
CodeSize: 532148 -> 531864 (-0.05%); split: -0.09%, +0.04%
SpillSGPRs: 122 -> 125 (+2.46%)
Latency: 440372 -> 440402 (+0.01%); split: -0.02%, +0.03%
InvThroughput: 296078 -> 296173 (+0.03%); split: -0.03%, +0.06%
VClause: 1449 -> 1456 (+0.48%); split: -0.21%, +0.69%
SClause: 2249 -> 2256 (+0.31%); split: -0.09%, +0.40%
Copies: 3956 -> 3965 (+0.23%); split: -0.10%, +0.33%
PreVGPRs: 2900 -> 2899 (-0.03%)
VALU: 61212 -> 61098 (-0.19%); split: -0.19%, +0.01%
SALU: 6970 -> 6981 (+0.16%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104>
2026-02-26 14:44:01 +00:00
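Read literally, `-b2f(a)` does not have the same value as `b2f(a) - 1.0`, so the sketch below assumes the replacement in a3f9c347bf negates the boolean first, i.e. something like `fneg(b2f(inot(a)))`. That is an assumption, not taken from the commit. It also shows that the two forms agree in value but not in the sign of zero; `b2f` and `is_negative` are illustrative helpers:

```python
import math

def b2f(a: bool) -> float:
    """Illustrative model of NIR b2f: true -> 1.0, false -> 0.0."""
    return 1.0 if a else 0.0

def is_negative(x: float) -> bool:
    """True if the sign bit is set; this distinguishes -0.0 from +0.0."""
    return math.copysign(1.0, x) < 0.0

# Map each boolean input to (original, assumed replacement).
rows = {a: (b2f(a) - 1.0, -b2f(not a)) for a in (False, True)}

values_equal = all(orig == repl for orig, repl in rows.values())
zero_sign_differs = is_negative(rows[True][0]) != is_negative(rows[True][1])
```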
Alyssa Rosenzweig
42c4f7935a nir: optimize u2u32(unpack_32_2x16_split_*)
Noticed while playing with pixel coord things.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40056>
2026-02-24 19:16:56 +00:00
Georg Lehmann
5d5f99bfe8 nir/opt_algebraic: create more b2f if sign of zero doesn't matter
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
2026-02-19 15:21:27 +00:00
Georg Lehmann
d87943ad3d nir/opt_algebraic: preserve signed zero when creating new b2f
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
2026-02-19 15:21:27 +00:00
Rob Clark
8cc99edb7b nir: Fill in missing conversion opts
I noticed we were missing:

    (('u2f16', ('u2u64', 'a@32')), ('u2f16', a))

This was due to coupling the u2f/i2f opts with the i2i/u2u opts in the same
loop (with different positionals).  The `if B <= S: continue` check doesn't
apply to the second part.  So just split these into two loops.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14848
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39899>
2026-02-18 15:13:21 +00:00
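The loop problem described in 8cc99edb7b can be sketched as follows. This is a reconstruction from the commit message only: the size lists, the function names, and the exact loop shape are assumptions, and the real generator in Mesa may look different. `B` is the destination size, `M` the intermediate size, and `S` the source size.

```python
import itertools

INT_SIZES = (8, 16, 32, 64)
FLOAT_SIZES = (16, 32, 64)

def int_pattern(B, M, S):
    return ((f'u2u{B}', (f'u2u{M}', f'a@{S}')), (f'u2u{B}', 'a'))

def float_pattern(B, M, S):
    return ((f'u2f{B}', (f'u2u{M}', f'a@{S}')), (f'u2f{B}', 'a'))

def coupled():
    """One loop: the `B <= S` skip also suppresses float patterns."""
    pats = []
    for B, M, S in itertools.product(INT_SIZES, INT_SIZES, INT_SIZES):
        if M <= S:
            continue  # the intermediate conversion must widen
        if B <= S:
            continue  # only meaningful for the integer chain
        pats.append(int_pattern(B, M, S))
        if B in FLOAT_SIZES:
            pats.append(float_pattern(B, M, S))
    return pats

def split():
    """Two loops: the float patterns no longer inherit the `B <= S` skip."""
    pats = []
    for B, M, S in itertools.product(INT_SIZES, INT_SIZES, INT_SIZES):
        if M <= S or B <= S:
            continue
        pats.append(int_pattern(B, M, S))
    for B, M, S in itertools.product(FLOAT_SIZES, INT_SIZES, INT_SIZES):
        if M <= S:
            continue
        pats.append(float_pattern(B, M, S))
    return pats

missing = (('u2f16', ('u2u64', 'a@32')), ('u2f16', 'a'))
```

In this model, `missing` is absent from `coupled()` and present in `split()`, which mirrors the pattern quoted in the commit message.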
Rhys Perry
fd22c48b2a nir/algebraic: remove ignore_exact
This was used because the exact bit meant something different for
comparisons than it did for the replacement expression, but that isn't the
case anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809>
2026-02-18 14:04:22 +00:00
Georg Lehmann
6a662a59b7 nir/opt_algebraic: optimize 1.0 - b2f(a) to b2f(inot(a))
Which can then be cleaned up further.

Foz-DB Navi48:
Totals from 4156 (3.62% of 114655) affected shaders:
MaxWaves: 102580 -> 102620 (+0.04%)
Instrs: 11696222 -> 11679986 (-0.14%); split: -0.16%, +0.02%
CodeSize: 64452544 -> 64379204 (-0.11%); split: -0.13%, +0.02%
VGPRs: 288256 -> 288172 (-0.03%)
SpillSGPRs: 7290 -> 7297 (+0.10%)
Latency: 160690992 -> 160643825 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 26869332 -> 26849963 (-0.07%); split: -0.09%, +0.02%
VClause: 237078 -> 237003 (-0.03%); split: -0.04%, +0.01%
SClause: 270560 -> 270564 (+0.00%); split: -0.01%, +0.01%
Copies: 936165 -> 937970 (+0.19%); split: -0.07%, +0.26%
Branches: 302981 -> 302992 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 244967 -> 245303 (+0.14%)
PreVGPRs: 232930 -> 232886 (-0.02%); split: -0.02%, +0.00%
VALU: 6200283 -> 6187264 (-0.21%); split: -0.23%, +0.02%
SALU: 1759176 -> 1760275 (+0.06%); split: -0.10%, +0.16%
VOPD: 447502 -> 446194 (-0.29%); split: +0.14%, -0.43%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39917>
2026-02-17 10:01:21 +00:00
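The rewrite in 6a662a59b7 can be checked exhaustively, since a boolean has only two values. A small sketch, assuming `b2f` maps true to 1.0 and false to +0.0; the helpers are illustrative and not Mesa code:

```python
import math

def b2f(a: bool) -> float:
    """Illustrative model of NIR b2f: true -> 1.0, false -> 0.0."""
    return 1.0 if a else 0.0

def same_value_and_sign(x: float, y: float) -> bool:
    """Equality that also tells -0.0 and +0.0 apart."""
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)

# 1.0 - b2f(a) versus b2f(inot(a)) for both inputs.
exact = all(same_value_and_sign(1.0 - b2f(a), b2f(not a))
            for a in (False, True))
```

Both sides produce +0.0 for a true input, so under this model the replacement matches down to the sign of zero.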
Georg Lehmann
f7222d6939 nir/opt_algebraic: remove few uses of integer nir_analyze_range
Surprisingly, this has an effect on GFX1201:

Totals from 66 (0.08% of 82405) affected shaders:
Instrs: 200725 -> 201517 (+0.39%)
CodeSize: 978676 -> 981488 (+0.29%)
Latency: 291736 -> 291760 (+0.01%)
InvThroughput: 31556 -> 31604 (+0.15%)
Copies: 11928 -> 12588 (+5.53%)
Branches: 14850 -> 15048 (+1.33%)
SALU: 68981 -> 69509 (+0.77%)

I say surprisingly, because nir_analyze_range handles nothing but
constants and bcsel for integers. Maybe rdr2 is actually
hitting some weird bcsel(a, #b, #c) == 0 case where b and c are not 0?
No, I looked at a few of those shaders, and it's just noise from changed
instruction order.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756>
2026-02-16 18:08:53 +00:00
Georg Lehmann
4e2f1345d8 nir/opt_algebraic: make fcmp(a+b, 0.0) -> fcmp(a, -b) exact using ninf
And remove some cases that never happen because we remove fneg on compare with constants.

Foz-DB Navi48:
Totals from 1305 (1.58% of 82405) affected shaders:
MaxWaves: 32872 -> 32854 (-0.05%)
Instrs: 4554013 -> 4551638 (-0.05%); split: -0.06%, +0.01%
CodeSize: 25269108 -> 25255428 (-0.05%); split: -0.06%, +0.00%
VGPRs: 87660 -> 87732 (+0.08%)
Latency: 33291152 -> 33285023 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 8965288 -> 8963071 (-0.02%); split: -0.03%, +0.00%
VClause: 104008 -> 103947 (-0.06%); split: -0.09%, +0.03%
SClause: 97577 -> 97574 (-0.00%); split: -0.01%, +0.00%
Copies: 372741 -> 372628 (-0.03%); split: -0.05%, +0.02%
Branches: 134076 -> 134072 (-0.00%)
PreSGPRs: 65109 -> 65110 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 68911 -> 68968 (+0.08%); split: -0.01%, +0.10%
VALU: 2247091 -> 2245815 (-0.06%); split: -0.07%, +0.01%
SALU: 810190 -> 810001 (-0.02%); split: -0.02%, +0.00%
VOPD: 205075 -> 205016 (-0.03%); split: +0.04%, -0.07%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
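One way to see why 4e2f1345d8 ties this rewrite to the absence of infinities: with infinite inputs the sum can become NaN while the rewritten comparison sees two ordinary infinities. A sketch using Python floats and `fneu` as the example comparison; which comparison opcodes the commit actually covers is not listed here, so that choice is an assumption:

```python
import math

def fneu(x, y):
    return x != y

def before(a, b):
    """fneu(a + b, 0.0)"""
    return fneu(a + b, 0.0)

def after(a, b):
    """fneu(a, -b)"""
    return fneu(a, -b)

FINITE_CASES = [(1.5, -1.5), (1.5, 2.0), (-0.0, 0.0), (3.0, -1.0)]
finite_ok = all(before(a, b) == after(a, b) for a, b in FINITE_CASES)

# inf + (-inf) is NaN, and NaN != 0.0 is true, but inf != inf is false.
inf_mismatch = before(math.inf, -math.inf) != after(math.inf, -math.inf)
```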
Georg Lehmann
ef7dd040d9 nir/opt_algebraic: make a < 0.0 ? -a : a exact using search helpers
Foz-DB Navi21:
Totals from 104 (0.13% of 82405) affected shaders:
Instrs: 175964 -> 175514 (-0.26%); split: -0.26%, +0.00%
CodeSize: 909008 -> 908744 (-0.03%); split: -0.05%, +0.02%
Latency: 1515203 -> 1514560 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 308751 -> 308573 (-0.06%); split: -0.06%, +0.00%
Copies: 10318 -> 10315 (-0.03%); split: -0.06%, +0.03%
PreVGPRs: 5767 -> 5755 (-0.21%)
VALU: 108151 -> 107745 (-0.38%)
VOPD: 738 -> 737 (-0.14%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
0474ad1504 nir/opt_algebraic: make ffract(is_integral) exact using nnan
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
b8d1763e0a nir/opt_algebraic: make some more fcmp patterns exact using nnan
No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
8d52c59505 nir/opt_algebraic: make some fmin/fmax/fsat patterns exact using nsz/nnan
Foz-DB Navi48:
Totals from 90 (0.11% of 82405) affected shaders:
Instrs: 52109 -> 52032 (-0.15%); split: -0.16%, +0.01%
CodeSize: 263916 -> 263900 (-0.01%); split: -0.05%, +0.05%
Latency: 504693 -> 504775 (+0.02%); split: -0.01%, +0.03%
InvThroughput: 81444 -> 81157 (-0.35%)
Copies: 2894 -> 2895 (+0.03%)
VALU: 30097 -> 29991 (-0.35%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
486ea54184 nir/opt_algebraic: make bcsel(fcmp(b, a), b, a) -> fmin/fmax patterns exact
These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.

It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.

Otherwise, the only thing that matters is whether a is NaN.

Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
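The NaN and signed-zero remarks in 486ea54184 can be illustrated with a small model. The `fmin` below is an assumption: it drops a NaN operand and treats -0.0 as smaller than +0.0, which is one permitted behaviour rather than a statement about any particular hardware. All names are illustrative.

```python
import math

def bcsel(cond, x, y):
    return x if cond else y

def fmin(x, y):
    """Assumed minimum: ignores a NaN operand and orders -0.0 below +0.0."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    if x == y:
        return x if math.copysign(1.0, x) < 0.0 else y
    return x if x < y else y

def select_min(b, a):
    """bcsel(flt(b, a), b, a)"""
    return bcsel(b < a, b, a)

# a is NaN: the select returns NaN, fmin returns b.
a_nan = (select_min(1.0, math.nan), fmin(1.0, math.nan))
# b is NaN: both return a.
b_nan = (select_min(math.nan, 1.0), fmin(math.nan, 1.0))
# Signed zeros: the select returns +0.0, this fmin returns -0.0.
zeros = (select_min(-0.0, 0.0), fmin(-0.0, 0.0))
```

In this model the two forms disagree only when `a` is NaN or when the operands are zeros of opposite sign, which lines up with the conditions the commit message lists.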
Georg Lehmann
f55668bb50 nir/opt_algebraic: update flt -> fneu patterns
And remove the ones that are redundant because we already move the fneg to
the constant source.

No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
15b13d5fd4 nir/opt_algebraic: optimize flt/fge(#c, fadd(a, #b))
I guess these were missing because the author forgot flt/fge aren't commutative.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
2355b63cb5 nir/opt_algebraic: use better float control for some fcmp patterns
Foz-DB Navi48:
Totals from 1084 (1.32% of 82405) affected shaders:
Instrs: 1969973 -> 1968947 (-0.05%); split: -0.08%, +0.02%
CodeSize: 11349704 -> 11344884 (-0.04%); split: -0.06%, +0.02%
VGPRs: 59076 -> 59064 (-0.02%); split: -0.06%, +0.04%
Latency: 20766031 -> 20755032 (-0.05%); split: -0.07%, +0.01%
InvThroughput: 2849402 -> 2846733 (-0.09%); split: -0.10%, +0.01%
VClause: 40736 -> 40740 (+0.01%)
SClause: 91835 -> 91832 (-0.00%)
Copies: 217961 -> 217868 (-0.04%); split: -0.07%, +0.02%
Branches: 60045 -> 60031 (-0.02%)
PreSGPRs: 50639 -> 50618 (-0.04%); split: -0.06%, +0.02%
PreVGPRs: 39593 -> 39590 (-0.01%); split: -0.01%, +0.01%
VALU: 960270 -> 959524 (-0.08%); split: -0.10%, +0.02%
SALU: 326638 -> 326680 (+0.01%); split: -0.04%, +0.06%
VOPD: 23963 -> 23929 (-0.14%); split: +0.04%, -0.18%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
7238888d93 nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
We already have patterns to move the negation to the constant.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
03c497f236 nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
Foz-DB Navi48:
Totals from 50 (0.06% of 82405) affected shaders:
CodeSize: 137072 -> 137456 (+0.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
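A sketch of why 03c497f236 needs the input to be free of NaN. It assumes `fsat` behaves like `fmin(fmax(x, 0.0), 1.0)` with NaN-dropping min/max, so that `fsat(NaN)` is 0.0; that is an assumption about the semantics, and the helpers are illustrative:

```python
import math

def fmax(x, y):
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x > y else y

def fmin(x, y):
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x < y else y

def fsat(x):
    """Assumed saturate: clamps to [0, 1] and turns NaN into 0.0."""
    return fmin(fmax(x, 0.0), 1.0)

def before(a):
    return 1.0 - fsat(a)

def after(a):
    return fsat(1.0 - a)

NON_NAN = [-2.0, -0.0, 0.0, 1e-20, 0.25, 1.0, 1.5, math.inf, -math.inf]
non_nan_ok = all(before(a) == after(a) for a in NON_NAN)

nan_before = before(math.nan)
nan_after = after(math.nan)
```

Under these assumptions the two forms agree on every sampled non-NaN input and differ for a NaN input.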
Georg Lehmann
79e4530a9b nir/opt_algebraic: make pattern pushing fmul into bcsel exact
The only special case here is d == -0.0.

Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
a3bc94a3d0 nir/opt_algebraic: remove inexact from floor->trunc pattern
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.

No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
da7abb1337 nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
ea87f1f9bc nir/opt_algebraic: add a - a with nnan
Foz-DB Navi48:
Totals from 576 (0.70% of 82405) affected shaders:
MaxWaves: 16706 -> 16726 (+0.12%)
Instrs: 618677 -> 580965 (-6.10%); split: -6.10%, +0.00%
CodeSize: 3022552 -> 2861612 (-5.32%); split: -5.33%, +0.00%
VGPRs: 28008 -> 28860 (+3.04%); split: -0.51%, +3.56%
Latency: 2689318 -> 2655887 (-1.24%); split: -1.25%, +0.01%
InvThroughput: 403512 -> 393404 (-2.51%); split: -2.51%, +0.00%
VClause: 7584 -> 7577 (-0.09%); split: -0.17%, +0.08%
SClause: 19974 -> 19086 (-4.45%); split: -4.48%, +0.03%
Copies: 43862 -> 40888 (-6.78%); split: -6.87%, +0.09%
Branches: 12457 -> 11407 (-8.43%)
PreSGPRs: 28315 -> 27046 (-4.48%); split: -4.53%, +0.05%
PreVGPRs: 20751 -> 19397 (-6.52%)
VALU: 317224 -> 290151 (-8.53%); split: -8.53%, +0.00%
SALU: 124297 -> 121347 (-2.37%); split: -2.39%, +0.02%
VMEM: 11918 -> 11907 (-0.09%)
SMEM: 27582 -> 26241 (-4.86%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
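The restriction in ea87f1f9bc follows from plain IEEE arithmetic: `a - a` is zero for every finite input but NaN when `a` is NaN or infinite. A minimal illustration with Python floats; how the NIR flag accounts for the infinite case is not stated in the commit message and is not modelled here:

```python
import math

def fsub_self(a: float) -> float:
    """a - a, the expression the pattern folds to 0.0."""
    return a - a

finite_is_zero = all(fsub_self(a) == 0.0
                     for a in (0.0, -0.0, 1.5, -3.0e38, 1e-310))
nan_stays_nan = math.isnan(fsub_self(math.nan))
inf_gives_nan = math.isnan(fsub_self(math.inf))
```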
Georg Lehmann
16db9f79d1 nir/opt_algebraic: remove inexact a * 0.0 patterns
We already have some with nnan,nsz.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
e443229644 nir/opt_algebraic: mark newly created fmulz nan/inf preserving
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
b678899ef8 nir/opt_algebraic: use nan/inf/sz preserve flags instead of exact for cmp/min/max replacement
And remove some, because they should be covered by the search pattern anyway.

Foz-DB Navi48:
Totals from 560 (0.68% of 82405) affected shaders:
MaxWaves: 11279 -> 11291 (+0.11%)
Instrs: 5214229 -> 5214386 (+0.00%); split: -0.02%, +0.02%
CodeSize: 29613884 -> 29616740 (+0.01%); split: -0.01%, +0.02%
VGPRs: 50400 -> 50328 (-0.14%)
Latency: 36481700 -> 36481157 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7309905 -> 7307905 (-0.03%); split: -0.05%, +0.02%
VClause: 131423 -> 131424 (+0.00%); split: -0.00%, +0.00%
SClause: 111485 -> 111499 (+0.01%); split: -0.00%, +0.01%
Copies: 441899 -> 442029 (+0.03%); split: -0.02%, +0.05%
Branches: 165599 -> 165597 (-0.00%)
PreVGPRs: 43558 -> 43525 (-0.08%)
VALU: 2573609 -> 2573324 (-0.01%); split: -0.03%, +0.02%
SALU: 851172 -> 851271 (+0.01%); split: -0.01%, +0.02%
VOPD: 366409 -> 366934 (+0.14%); split: +0.23%, -0.08%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
45cb1d3b6f nir/opt_algebraic: remove unpack_half_2x16_split
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
b18d9c1b33 nir/opt_algebraic: optimize unpack_32_2x16 of extract
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Marek Olšák
a3f022d0a2 nir: reassociate a $op (b ? #c : #d) for div, mod, rem
This eliminates expensive div, mod, and rem opcodes whose src1 looks
non-constant but is really a pair of constants hiding behind a bcsel.

gcc and LLVM are missing this.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39560>
2026-02-02 21:34:48 +00:00
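The reassociation in a3f022d0a2 can be sketched with unsigned integers. The function names and the constants 8 and 16 are chosen for illustration only. Once the operation is pushed into both arms of the select, each arm divides by a constant and can be lowered on its own:

```python
def bcsel(cond, x, y):
    return x if cond else y

def udiv_before(a, b, c, d):
    """a / (b ? c : d): the divisor is not a constant."""
    return a // bcsel(b, c, d)

def udiv_after(a, b, c, d):
    """b ? (a / c) : (a / d): each arm divides by a constant."""
    return bcsel(b, a // c, a // d)

def udiv_after_pow2(a, b):
    """Same as udiv_after with c = 8 and d = 16, lowered to shifts."""
    return bcsel(b, a >> 3, a >> 4)

def umod_before(a, b, c, d):
    return a % bcsel(b, c, d)

def umod_after(a, b, c, d):
    return bcsel(b, a % c, a % d)

reassociation_holds = all(
    udiv_before(a, b, 8, 16) == udiv_after(a, b, 8, 16) == udiv_after_pow2(a, b)
    and umod_before(a, b, 8, 16) == umod_after(a, b, 8, 16)
    for a in range(1000)
    for b in (False, True)
)
```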
Georg Lehmann
ad6f8291bf nir/opt_algebraic: rework ignore_exact to work like other internal conditions
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Georg Lehmann
70f0e75262 nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit
Foz-DB Navi48:
Totals from 177 (0.21% of 82405) affected shaders:
Instrs: 326628 -> 325955 (-0.21%); split: -0.21%, +0.00%
CodeSize: 1726720 -> 1722500 (-0.24%); split: -0.24%, +0.00%
Latency: 5076631 -> 5075700 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 596010 -> 595598 (-0.07%); split: -0.07%, +0.00%
VClause: 3613 -> 3616 (+0.08%)
Copies: 24427 -> 24501 (+0.30%); split: -0.06%, +0.36%
VALU: 182468 -> 182029 (-0.24%); split: -0.24%, +0.00%
SALU: 55449 -> 55452 (+0.01%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39531>
2026-01-29 14:44:37 +00:00
Georg Lehmann
c3e12429c5 nir/opt_algebraic: improve a < 0.0 ? 0.0 : sqrt(a) pattern
Fix the NaN correctness of the original pattern, and add more variants.

Foz-DB Navi48:
Totals from 372 (0.45% of 82405) affected shaders:
Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03%
CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02%
VGPRs: 19452 -> 19104 (-1.79%)
Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05%
InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19%
VClause: 3695 -> 3704 (+0.24%)
Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19%
VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05%
VOPD: 380 -> 372 (-2.11%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
Georg Lehmann
f872c13707 nir/opt_algebraic: use contract instead of inexact for more patterns
These use more precise operations, so contract is enough.

Foz-DB Navi48:
Totals from 248 (0.30% of 82405) affected shaders:
Instrs: 284686 -> 284318 (-0.13%); split: -0.14%, +0.01%
CodeSize: 1528856 -> 1527520 (-0.09%); split: -0.10%, +0.01%
Latency: 2368390 -> 2367345 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 346623 -> 346335 (-0.08%); split: -0.09%, +0.01%
SClause: 6752 -> 6756 (+0.06%); split: -0.12%, +0.18%
Copies: 14685 -> 14694 (+0.06%); split: -0.01%, +0.07%
VALU: 179922 -> 179727 (-0.11%); split: -0.11%, +0.01%
SALU: 28706 -> 28707 (+0.00%)
VOPD: 1196 -> 1198 (+0.17%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
Georg Lehmann
d8ef28671d nir/opt_algebraic: use correct syntax to create exact fsat
Fixes: 3b06824e4c ("nir/opt_algebraic: optimize some post peephole select patterns")

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
2026-01-28 18:46:22 +00:00
Georg Lehmann
b2d9615000 nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00
Georg Lehmann
d06b627d23 nir/opt_algebraic: optimize f2f16_rtz of bcsel with constants
Foz-DB Navi48:
Totals from 145 (0.18% of 82405) affected shaders:
Instrs: 1706001 -> 1705669 (-0.02%); split: -0.03%, +0.01%
CodeSize: 9621036 -> 9620784 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 711 -> 726 (+2.11%); split: -0.56%, +2.67%
Latency: 20066360 -> 20066193 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 4326789 -> 4326763 (-0.00%); split: -0.00%, +0.00%
Copies: 192041 -> 191995 (-0.02%); split: -0.03%, +0.01%
Branches: 75673 -> 75675 (+0.00%); split: -0.00%, +0.01%
VALU: 765163 -> 764835 (-0.04%); split: -0.05%, +0.00%
SALU: 351758 -> 351715 (-0.01%); split: -0.01%, +0.00%
VOPD: 65236 -> 65282 (+0.07%); split: +0.17%, -0.10%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00
Georg Lehmann
ee5492e6dd nir/opt_algebraic: remove f2f16 roundtrip conversions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00
Georg Lehmann
592b6579da nir/opt_algebraic: optimize f2f16_rtz(min/max)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:18 +00:00
Georg Lehmann
2b92c0f06e nir/opt_algebraic: optimize f2f16_rtz(b2f(a))
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:18 +00:00
Emma Anholt
cdec063d37 nir/opt_algebraic: Fix a bit of imad24_ir3's optimization.
The mul is 24-bit sign-extended, so in simplifying we should retain that.
If nothing else, this keeps us on the happy path of mul24s.

I didn't fix the other broken pattern, since it's not really part of this
MR.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:41 +00:00
Emma Anholt
e5a9eae2b5 nir/opt_algebraic_tests: Fix fuzzing levels for multi-component inputs.
We were enumerating enough for a single component, but not all the
combinations.  This helps show that our fdots fail pretty consistently.
And triggers more skipping from the fany_equal16s thanks to varied inputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:41 +00:00
Emma Anholt
7fd0287a89 nir/opt_algebraic_tests: Test !nir_fp_preserve_signed_zero behavior.
Iterate over a set of sign-flips for 0.0s to see if we can find a set that
makes the search and replace sides match.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:41 +00:00
Emma Anholt
68f5bc4f12 nir/opt_algebraic_tests: Rename and use the enum result type more.
As I introduced another layer of iteration for signed zero testing, the
former logic got unwieldy.  In fact, it was already unwieldy enough that I
forgot to clear all_skipped when the assert failed, allowing a failing
test to be marked UNSUPPORTED instead of XFAIL.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
a90163a15a nir/opt_algebraic_tests: Add support for expression swizzles.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
c30c383d4d nir/opt_algebraic_tests: Allow testing of fdot*_replicated opcodes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
173295adf4 nir/opt_algebraic_tests: Allow testing udiv_aligned_4.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00
Emma Anholt
94237c3ea3 nir/opt_algebraic_tests: Allow testing mul/mad_relaxed opcodes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:40 +00:00