nir/opt_algebraic: improve a < 0.0 ? 0.0 : sqrt(a) pattern

Fix the NaN correctness of the original pattern, and add more variants.
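
A minimal sketch, in plain Python rather than NIR, of why the variants in this change seem to need different fast-math flags. It compares each select form against sqrt(max(a, 0.0)) on a few special inputs. The helpers fsqrt, fmax, and mismatches are hypothetical stand-ins, and the fmax model used here (a NaN operand is ignored, -0.0 is ordered below +0.0) is an assumption made for illustration, not a statement about NIR's exact semantics.

```python
import math

def fsqrt(x):
    # IEEE-style square root: NaN for NaN or negative input, zeros keep their sign.
    if math.isnan(x) or x < 0.0:
        return math.nan
    if x == 0.0:
        return x
    return math.sqrt(x)

def fmax(x, y):
    # Assumed model: ignore a NaN operand, order -0.0 below +0.0.
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    if x == y:
        return x if math.copysign(1.0, x) > 0 else y
    return x if x > y else y

def replacement(a):
    return fsqrt(fmax(a, 0.0))

# The four select forms, written with Python comparisons (false on NaN).
VARIANTS = {
    "flt(a, 0.0) ? 0.0 : fsqrt(a)": lambda a: 0.0 if a < 0.0 else fsqrt(a),
    "fge(0.0, a) ? 0.0 : fsqrt(a)": lambda a: 0.0 if 0.0 >= a else fsqrt(a),
    "flt(0.0, a) ? fsqrt(a) : 0.0": lambda a: fsqrt(a) if 0.0 < a else 0.0,
    "fge(a, 0.0) ? fsqrt(a) : 0.0": lambda a: fsqrt(a) if a >= 0.0 else 0.0,
}

INPUTS = {"NaN": math.nan, "-0.0": -0.0, "+0.0": 0.0, "-1.0": -1.0,
          "4.0": 4.0, "+inf": math.inf, "-inf": -math.inf}

def same(x, y):
    # Equal including NaN-ness and the sign of zero.
    if math.isnan(x) or math.isnan(y):
        return math.isnan(x) and math.isnan(y)
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)

def mismatches(original):
    # Labels of the inputs where the select form and the replacement disagree.
    return [name for name, a in INPUTS.items()
            if not same(original(a), replacement(a))]

for text, fn in VARIANTS.items():
    print(text, "->", mismatches(fn))
```

Under this model the first form disagrees on NaN and -0.0, the second only on NaN, the third on nothing, and the fourth only on -0.0, which lines up with the nnan and nsz requirements attached to the corresponding patterns.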

Foz-DB Navi48:
Totals from 372 (0.45% of 82405) affected shaders:
Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03%
CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02%
VGPRs: 19452 -> 19104 (-1.79%)
Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05%
InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19%
VClause: 3695 -> 3704 (+0.24%)
Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19%
VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05%
VOPD: 380 -> 372 (-2.11%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
Georg Lehmann 2026-01-24 10:45:35 +01:00 committed by Marge Bot
parent f872c13707
commit c3e12429c5

@@ -1854,7 +1854,10 @@ optimizations.extend([
 (('flog2(contract)', ('frsq', a)), ('fmul', -0.5, ('flog2', a))),
 (('flog2(contract)', ('fpow', a, b)), ('fmul', b, ('flog2', a))),
 (('~fmul', ('fexp2(is_used_once)', a), ('fexp2(is_used_once)', b)), ('fexp2', ('fadd', a, b))),
-(('bcsel', ('flt', a, 0.0), 0.0, ('fsqrt', a)), ('fsqrt', ('fmax', a, 0.0)), 'true', TestStatus.XFAIL), # XFAIL is that bcsel(flt(NaN, 0), 0, fsqrt(NaN)) produces 0.0 instead of NaN.
+(('bcsel', ('flt', a, 0.0), 0.0, ('fsqrt(nnan,nsz)', a)), ('fsqrt', ('fmax', a, 0.0))),
+(('bcsel', ('fge', 0.0, a), 0.0, ('fsqrt(nnan)', a)), ('fsqrt', ('fmax', a, 0.0))),
+(('bcsel', ('flt', 0.0, a), ('fsqrt', a), 0.0), ('fsqrt', ('fmax', a, 0.0))),
+(('bcsel', ('fge', a, 0.0), ('fsqrt(nsz)', a), 0.0), ('fsqrt', ('fmax', a, 0.0))),
 (('fmul(contract)', ('fsqrt', a), ('fsqrt', a)), ('fabs',a)),
 (('fmulz(contract)', ('fsqrt', a), ('fsqrt', a)), ('fabs', a)),
 # Division and reciprocal