Georg Lehmann
0066328cf1
nir/opt_algebraic: create more 64bit bit test
...
Foz-DB GFX1201:
Totals from 2 (0.00% of 205032) affected shaders:
Instrs: 3429 -> 3425 (-0.12%)
CodeSize: 19580 -> 19568 (-0.06%)
Latency: 13629 -> 13628 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 1853 -> 1847 (-0.32%)
Copies: 235 -> 237 (+0.85%)
VALU: 1901 -> 1898 (-0.16%)
SALU: 381 -> 380 (-0.26%)
VOPD: 307 -> 309 (+0.65%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40705 >
2026-04-08 08:44:20 +00:00
Rhys Perry
b30c0d8264
nir/algebraic: optimize exact f2u32(fmul(unpack_norm))
...
fossil-db (navi21):
Totals from 16 (0.01% of 202427) affected shaders:
Instrs: 17730 -> 17226 (-2.84%)
CodeSize: 97500 -> 95708 (-1.84%)
InvThroughput: 44437 -> 44419 (-0.04%)
Copies: 1502 -> 1446 (-3.73%)
VALU: 9973 -> 9525 (-4.49%)
SALU: 3509 -> 3453 (-1.60%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40740 >
2026-04-08 07:10:26 +00:00
Rhys Perry
f52dace6e8
nir/tests: fix NaN/inf checks in skip_test()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40740 >
2026-04-08 07:10:26 +00:00
Georg Lehmann
8730c039bf
nir/opt_algebraic: move some lower_lerp patterns
...
If we rely on the pattern order here, we don't need to duplicate per bit size.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40730 >
2026-04-04 10:44:09 +00:00
Georg Lehmann
f192fe99eb
nir/opt_algebraic: update open coded flerp(..., b2f(c)) to bcsel patterns
...
We remove 1.0 - b2f(a) since 6a662a59b7.
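To illustrate why a lerp whose weight is a converted boolean can collapse into a select, here is a minimal Python sketch. The helpers `b2f`, `flerp` and `bcsel` are stand-ins named after the NIR opcodes, and the open-coded form `a * (1 - t) + b * t` is an assumption; the actual patterns touched by this commit are not shown in the log and may differ. The check only covers finite inputs.

```python
def b2f(c: bool) -> float:
    """Stand-in for NIR b2f: false -> 0.0, true -> 1.0."""
    return 1.0 if c else 0.0


def flerp(a: float, b: float, t: float) -> float:
    """Assumed open-coded linear interpolation."""
    return a * (1.0 - t) + b * t


def bcsel(c: bool, x: float, y: float) -> float:
    """Stand-in for NIR bcsel: pick x when c is true, otherwise y."""
    return x if c else y


# With a weight of exactly 0.0 or 1.0 the lerp just picks one input.
for a, b in [(2.0, 5.0), (-1.5, 0.25), (0.0, 3.0)]:
    for c in (False, True):
        assert flerp(a, b, b2f(c)) == bcsel(c, b, a)
```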
Foz-DB Navi48:
Totals from 200 (0.10% of 205032) affected shaders:
Instrs: 410309 -> 409750 (-0.14%); split: -0.18%, +0.05%
CodeSize: 2140424 -> 2136956 (-0.16%); split: -0.21%, +0.05%
Latency: 5834394 -> 5834042 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 906879 -> 906374 (-0.06%); split: -0.06%, +0.01%
VClause: 8247 -> 8244 (-0.04%)
SClause: 7721 -> 7723 (+0.03%); split: -0.03%, +0.05%
Copies: 20515 -> 20487 (-0.14%); split: -0.29%, +0.16%
PreVGPRs: 14510 -> 14481 (-0.20%)
VALU: 228703 -> 228235 (-0.20%); split: -0.28%, +0.07%
SALU: 62832 -> 62914 (+0.13%); split: -0.18%, +0.31%
VOPD: 929 -> 927 (-0.22%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40730 >
2026-04-04 10:44:09 +00:00
Georg Lehmann
eff9f00533
nir/search: remove matching variable type
...
Now unused, and if you really need it use a search helper.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40713 >
2026-04-01 09:52:45 +00:00
Georg Lehmann
5b1405dcbf
nir/opt_algebraic: remove a few non 1bit bool patterns
...
We almost exclusively optimize 1-bit booleans nowadays,
so I think these shouldn't be needed.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40713 >
2026-04-01 09:52:45 +00:00
Karol Herbst
5bb3c9f69c
nir: rename fsin_amd and fcos_amd to a more generic name
...
Nvidia implements both the same way as AMD does, so it makes sense to
allow for code sharing here.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40541 >
2026-03-31 01:47:29 +02:00
Georg Lehmann
643dd510d4
nir/opt_algebraic: optimize b2f(a) * b
...
When the multiplication is only used by fadd, it's not a clear win
because of potential fma fusion.
Totals from 8015 (6.99% of 114655) affected shaders:
MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01%
Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04%
CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06%
VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01%
SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50%
Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04%
InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04%
VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09%
SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01%
Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15%
Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07%
PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02%
VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02%
SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16%
VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
d2b37b667e
nir/opt_algebraic: optimize more fmulz(1.0, a) remains
...
If dxvk's opencoded fmulz gets partially constant folded,
it leaves this mess behind.
It's important to do this before the more general fmul+b2f patterns added
in the next commit, because they change the signed zero behavior in a way
that can't be optimized back.
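The signed zero concern can be seen with plain IEEE arithmetic. The sketch below is only an illustration: it assumes the general fmul+b2f rewrite turns the multiplication into a select against +0.0, which is a guess at the shape of the replacement, and `b2f` and `bcsel` are stand-ins named after the NIR opcodes.

```python
import math


def b2f(c: bool) -> float:
    """Stand-in for NIR b2f: false -> 0.0, true -> 1.0."""
    return 1.0 if c else 0.0


def bcsel(c: bool, x: float, y: float) -> float:
    """Stand-in for NIR bcsel."""
    return x if c else y


def sign_of(x: float) -> float:
    """Return +1.0 or -1.0, distinguishing +0.0 from -0.0."""
    return math.copysign(1.0, x)


b = -3.0
product = b2f(False) * b          # 0.0 * -3.0 is -0.0
selected = bcsel(False, b, 0.0)   # assumed replacement yields +0.0

assert product == selected        # equal as values...
assert sign_of(product) == -1.0   # ...but the zero signs differ
assert sign_of(selected) == 1.0
assert math.isnan(b2f(False) * math.inf)  # 0.0 * inf is NaN, not 0.0
```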
Foz-DB Navi48:
Totals from 36 (0.03% of 114655) affected shaders:
Instrs: 16513 -> 15706 (-4.89%)
CodeSize: 99756 -> 95760 (-4.01%)
Latency: 45165 -> 44151 (-2.25%)
InvThroughput: 8344 -> 7886 (-5.49%)
VClause: 395 -> 401 (+1.52%)
Copies: 639 -> 634 (-0.78%)
PreSGPRs: 1158 -> 1154 (-0.35%)
PreVGPRs: 1227 -> 1225 (-0.16%)
VALU: 11310 -> 10769 (-4.78%)
SALU: 813 -> 809 (-0.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
b96c42c916
nir/opt_algebraic: optimize more near useless bcsel
...
Foz-DB Navi48:
Totals from 327 (0.29% of 114655) affected shaders:
Instrs: 732971 -> 731642 (-0.18%); split: -0.19%, +0.01%
CodeSize: 3696020 -> 3689824 (-0.17%); split: -0.17%, +0.00%
Latency: 4405319 -> 4403413 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 650209 -> 649659 (-0.08%); split: -0.10%, +0.01%
Copies: 53872 -> 53736 (-0.25%); split: -0.27%, +0.02%
Branches: 15598 -> 15571 (-0.17%)
VALU: 262391 -> 261969 (-0.16%)
SALU: 268112 -> 267699 (-0.15%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
6cfe6eaa79
nir/opt_algebraic: create ldexp from exp2
...
ldexp uses the full width VALU path, exp2 the transcendental SIMD8.
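As a small numeric illustration of the equivalence this relies on, the Python sketch below compares scaling by a power of two against `math.ldexp`. It assumes the exponent is an integer and that nothing overflows or underflows; the actual NIR pattern and its conditions are not shown in the log, and the function names are hypothetical.

```python
import math


def scale_via_exp2(a: float, n: int) -> float:
    """Multiply by 2**n, as a shader doing a * exp2(n) would."""
    return a * 2.0 ** n


def scale_via_ldexp(a: float, n: int) -> float:
    """Adjust the exponent directly with ldexp."""
    return math.ldexp(a, n)


# For integer exponents in a safe range both forms agree exactly.
for a in (1.0, 3.0, -0.75, 1e-3):
    for n in range(-20, 21):
        assert scale_via_exp2(a, n) == scale_via_ldexp(a, n)
```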
Foz-DB Navi21:
Totals from 729 (0.64% of 114627) affected shaders:
MaxWaves: 20071 -> 20103 (+0.16%); split: +0.18%, -0.02%
Instrs: 869129 -> 867654 (-0.17%); split: -0.17%, +0.00%
CodeSize: 4709000 -> 4708460 (-0.01%); split: -0.02%, +0.00%
VGPRs: 31184 -> 31128 (-0.18%); split: -0.23%, +0.05%
Latency: 7610726 -> 7597238 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 1822323 -> 1819815 (-0.14%); split: -0.14%, +0.00%
VClause: 22494 -> 22493 (-0.00%); split: -0.03%, +0.02%
SClause: 20520 -> 20509 (-0.05%)
Copies: 72025 -> 72024 (-0.00%); split: -0.01%, +0.01%
Branches: 22028 -> 22029 (+0.00%)
PreVGPRs: 21601 -> 21602 (+0.00%)
VALU: 604821 -> 603339 (-0.25%); split: -0.25%, +0.00%
SALU: 114258 -> 114262 (+0.00%); split: -0.00%, +0.01%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900 >
2026-03-20 08:15:08 +00:00
Georg Lehmann
ec331cc48a
nir: replace lower_ldexp with has_ldexp
...
I can't be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900 >
2026-03-20 08:15:08 +00:00
Georg Lehmann
98ff0a394a
nir/opt_algebraic: move some fsat patterns next to the other fsat patterns
...
I almost missed that they already exist multiple times.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389 >
2026-03-16 13:03:50 +00:00
Georg Lehmann
607f26814f
nir/opt_algebraic: remove manual patterns that optimizes flt([0.0, 1.0], 0.0)
...
Range analysis can figure this out.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389 >
2026-03-16 13:03:50 +00:00
Georg Lehmann
530bb4278c
nir/opt_algebraic: remove manual pattern that removes fmax(..., 0.0)
...
Range analysis will figure this out.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389 >
2026-03-16 13:03:50 +00:00
Georg Lehmann
4d176c8ea5
nir/opt_algebraic: turn fabs(a) into fneg(a) if a is not positive
...
fneg is usually more optimizable.
Foz-DB Navi48:
Totals from 214 (0.19% of 114655) affected shaders:
Instrs: 694279 -> 694155 (-0.02%); split: -0.02%, +0.00%
CodeSize: 3749268 -> 3748024 (-0.03%); split: -0.03%, +0.00%
VGPRs: 18252 -> 18264 (+0.07%)
Latency: 5453691 -> 5453503 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1024436 -> 1024314 (-0.01%); split: -0.01%, +0.00%
VALU: 453136 -> 453041 (-0.02%); split: -0.02%, +0.00%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389 >
2026-03-16 13:03:50 +00:00
Georg Lehmann
d77c2a1ece
nir/opt_algebraic: take advantage of range helpers including nnan
...
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40389 >
2026-03-16 13:03:49 +00:00
Georg Lehmann
aad2b9bfc7
nir/opt_algebraic: be more strict when optimizing fcmp(a + #b, #c)
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291 >
2026-03-13 07:13:10 +00:00
Georg Lehmann
624313d35d
nir/opt_algebraic: lower ninf fisfinite correctly
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291 >
2026-03-13 07:13:09 +00:00
Georg Lehmann
aa831b6690
nir/opt_algebraic: skip more redundant alignment iand
...
Useful for smaller/larger loads. Also there is no reason to be bitsize
specific here if we use a signed constant.
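The point about signed constants can be sketched in Python: a negative constant such as -4 sign-extends to the right alignment mask at any bit size, so one constant can serve every width. The helper names below are hypothetical and the snippet is not the pattern from the commit.

```python
def as_unsigned(value: int, bits: int) -> int:
    """Two's complement bit pattern of value at the given bit size."""
    return value & ((1 << bits) - 1)


def align_down(x: int, alignment: int, bits: int) -> int:
    """Clear the low bits of x using the mask -alignment."""
    return x & as_unsigned(-alignment, bits)


# The same signed constant gives the matching mask for each width.
assert as_unsigned(-4, 16) == 0xFFFC
assert as_unsigned(-4, 32) == 0xFFFFFFFC
assert as_unsigned(-4, 64) == 0xFFFFFFFFFFFFFFFC
assert align_down(0x1237, 4, 32) == 0x1234
```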
Foz-DB Navi48:
Totals from 8 (0.01% of 114655) affected shaders:
Instrs: 7629 -> 7612 (-0.22%)
CodeSize: 40772 -> 40692 (-0.20%)
Latency: 54880 -> 54944 (+0.12%)
InvThroughput: 8879 -> 8880 (+0.01%); split: -0.08%, +0.09%
VALU: 4029 -> 4027 (-0.05%); split: -0.15%, +0.10%
SALU: 1260 -> 1249 (-0.87%)
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40292 >
2026-03-10 06:57:50 +00:00
Georg Lehmann
6936282bd3
nir/opt_algebraic: remove min(a, >= 1.0) before fsat
...
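A quick way to see why the inner minimum is redundant: clamping to [0, 1] already caps the value at 1.0, so a minimum against any constant of at least 1.0 cannot change the result. The Python sketch below checks this for a few non-NaN values; `fsat` and `fmin` are stand-ins named after the NIR opcodes, and NaN handling is deliberately left out.

```python
def fsat(x: float) -> float:
    """Stand-in for NIR fsat: clamp to the range [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)


def fmin(a: float, b: float) -> float:
    """Stand-in for NIR fmin on non-NaN inputs."""
    return min(a, b)


# Any bound >= 1.0 is already implied by the saturate.
for bound in (1.0, 1.5, 100.0):
    for a in (-2.0, 0.0, 0.3, 1.0, 7.0, 200.0):
        assert fsat(fmin(a, bound)) == fsat(a)
```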
Foz-DB Navi48:
Totals from 86 (0.08% of 114655) affected shaders:
Instrs: 217553 -> 217408 (-0.07%); split: -0.07%, +0.01%
CodeSize: 1159992 -> 1159380 (-0.05%); split: -0.06%, +0.01%
Latency: 1657600 -> 1657533 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 203205 -> 203178 (-0.01%); split: -0.02%, +0.00%
SClause: 5245 -> 5244 (-0.02%)
Copies: 13726 -> 13716 (-0.07%); split: -0.14%, +0.07%
VALU: 130151 -> 130039 (-0.09%); split: -0.09%, +0.00%
SALU: 26476 -> 26474 (-0.01%); split: -0.02%, +0.01%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281 >
2026-03-09 21:11:25 +00:00
Georg Lehmann
108a4d4341
nir: create more fsat using range analysis
...
Foz-DB Navi48:
Totals from 5922 (5.17% of 114655) affected shaders:
Instrs: 5188307 -> 5184193 (-0.08%); split: -0.09%, +0.01%
CodeSize: 27852544 -> 27843252 (-0.03%); split: -0.05%, +0.01%
Latency: 28723967 -> 28714268 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 4745002 -> 4742298 (-0.06%); split: -0.07%, +0.01%
VClause: 68649 -> 68650 (+0.00%)
SClause: 103932 -> 103917 (-0.01%); split: -0.02%, +0.00%
Copies: 244683 -> 244706 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 272361 -> 272362 (+0.00%); split: -0.00%, +0.00%
VALU: 3248960 -> 3245520 (-0.11%); split: -0.11%, +0.00%
SALU: 516784 -> 516796 (+0.00%); split: -0.01%, +0.01%
VOPD: 8910 -> 8895 (-0.17%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281 >
2026-03-09 21:11:25 +00:00
Georg Lehmann
4885e5cf3a
nir: remove more fsat using range analysis
...
Foz-DB Navi48:
Totals from 3018 (3.65% of 82636) affected shaders:
MaxWaves: 69274 -> 69280 (+0.01%)
Instrs: 7165414 -> 7157581 (-0.11%); split: -0.12%, +0.01%
CodeSize: 38890212 -> 38823132 (-0.17%); split: -0.18%, +0.00%
VGPRs: 228672 -> 228624 (-0.02%)
Latency: 64789026 -> 64784877 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 11805156 -> 11802642 (-0.02%); split: -0.02%, +0.00%
VClause: 136900 -> 136886 (-0.01%); split: -0.03%, +0.02%
SClause: 150135 -> 150130 (-0.00%); split: -0.01%, +0.01%
Copies: 574690 -> 574894 (+0.04%); split: -0.03%, +0.06%
Branches: 187169 -> 187086 (-0.04%); split: -0.04%, +0.00%
PreSGPRs: 190074 -> 190067 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 189564 -> 189538 (-0.01%); split: -0.02%, +0.00%
VALU: 3955188 -> 3949411 (-0.15%); split: -0.15%, +0.00%
SALU: 1114659 -> 1114729 (+0.01%); split: -0.02%, +0.03%
SMEM: 231080 -> 231077 (-0.00%); split: -0.00%, +0.00%
VOPD: 116150 -> 116180 (+0.03%); split: +0.04%, -0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
506bb5a609
nir/search_helpers: use fp class analysis more
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
32b5719a9f
nir/opt_algebraic: add is_not_uint_zero for b2i16(uge) pattern
...
More fallout from f2a59fdea6.
is_not_zero now always returns whether the result is a floating point zero.
When combined with the fp denorm handling that will be added to
floating point range analysis, this is false for many sensible integer values.
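To make the denorm remark concrete, the Python sketch below reinterprets a few small unsigned integers as 32-bit float bit patterns. All of them land in the denormal range, so an analysis that allows denormals to be flushed to zero could not call them non-zero floats even though they are clearly non-zero integers. The helper name is hypothetical.

```python
import struct

# Smallest positive normal 32-bit float; anything below is a denormal.
FLT_MIN = 2.0 ** -126


def uint_bits_as_f32(u: int) -> float:
    """Reinterpret a 32-bit unsigned integer bit pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


for u in (1, 2, 255, 65535):
    f = uint_bits_as_f32(u)
    assert u != 0               # non-zero as an integer
    assert 0.0 < f < FLT_MIN    # but a denormal as a float
```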
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
ab773fc5d4
nir/opt_algebraic: fix frsq clamp pattern
...
The pattern is not NaN-correct.
Also make it 32-bit only, because the constant is a hard-coded FLT_MAX.
Fixes: 780b5c1037 ("nir/algebraic: Simplify some Inf and NaN avoidance code")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:42 +00:00
Georg Lehmann
ba30de1f97
nir/opt_algebraic: remove pattern that skips iabs with range analysis
...
Fixes: f2a59fdea6 ("nir: remove non float nir_analyse_range support")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:41 +00:00
Caio Oliveira
da57fbfb07
nir: Fix constant folding for iadd_sat
...
Use INT_MIN instead of INT_MAX for underflow.
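For reference, a saturating signed add clamps to the maximum on overflow and to the minimum on underflow. The Python sketch below models that behaviour with unbounded integers; it is a reference model under that definition, not the constant-folding code from nir_opcodes.

```python
def iadd_sat(a: int, b: int, bits: int = 32) -> int:
    """Signed saturating add for the given bit size."""
    int_min = -(1 << (bits - 1))
    int_max = (1 << (bits - 1)) - 1
    total = a + b  # Python integers do not wrap
    if total > int_max:
        return int_max  # overflow clamps to INT_MAX
    if total < int_min:
        return int_min  # underflow clamps to INT_MIN
    return total


assert iadd_sat(2**31 - 1, 1) == 2**31 - 1
assert iadd_sat(-(2**31), -1) == -(2**31)
assert iadd_sat(5, -7) == -2
```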
Fixes: cc4b50b023 ("nir/opcodes: use u_overflow to fix incorrect checks")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40252 >
2026-03-06 22:26:07 +00:00
Georg Lehmann
6a218e346d
nir: remove lower_vector_cmp
...
Use nir_lower_alu_width or nir_lower_alu_to_scalar instead.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197 >
2026-03-04 19:50:28 +00:00
Georg Lehmann
3e6e1e213c
nir: remove fall_equal/fany_nequal opcodes
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197 >
2026-03-04 19:50:27 +00:00
Georg Lehmann
7194dfcc2c
nir/opt_algebraic: optimize b2i(a) * b to bcsel
...
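The identity behind this rewrite is easy to check for integers, where there is no signed zero or NaN to worry about. In the sketch below, `b2i` and `bcsel` are stand-ins named after the NIR opcodes, and the select against 0 is the replacement implied by the commit title.

```python
def b2i(c: bool) -> int:
    """Stand-in for NIR b2i: false -> 0, true -> 1."""
    return 1 if c else 0


def bcsel(c: bool, x: int, y: int) -> int:
    """Stand-in for NIR bcsel: pick x when c is true, otherwise y."""
    return x if c else y


# Multiplying by 0 or 1 is the same as selecting between 0 and b.
for a in (False, True):
    for b in (-7, 0, 1, 12345):
        assert b2i(a) * b == bcsel(a, b, 0)
```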
Foz-DB Navi48:
Totals from 3180 (2.77% of 114655) affected shaders:
MaxWaves: 85526 -> 85446 (-0.09%)
Instrs: 2681446 -> 2678641 (-0.10%); split: -0.17%, +0.07%
CodeSize: 14295536 -> 14284628 (-0.08%); split: -0.13%, +0.05%
VGPRs: 174792 -> 174636 (-0.09%); split: -0.16%, +0.07%
SpillSGPRs: 306 -> 308 (+0.65%)
Latency: 14078973 -> 14070122 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 2774242 -> 2764051 (-0.37%); split: -0.37%, +0.00%
VClause: 41744 -> 41734 (-0.02%); split: -0.10%, +0.07%
SClause: 58176 -> 58154 (-0.04%); split: -0.05%, +0.01%
Copies: 222967 -> 223108 (+0.06%); split: -0.14%, +0.20%
Branches: 57317 -> 57322 (+0.01%)
PreSGPRs: 140454 -> 140451 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 131649 -> 131540 (-0.08%); split: -0.09%, +0.01%
VALU: 1509318 -> 1505443 (-0.26%); split: -0.26%, +0.00%
SALU: 384419 -> 385838 (+0.37%); split: -0.01%, +0.38%
VOPD: 13272 -> 13286 (+0.11%); split: +0.14%, -0.03%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40160 >
2026-03-02 15:58:30 +00:00
Georg Lehmann
3d304d5647
nir/opt_algebraic: remove is_used_once on outer instruction
...
This just prevents useful optimizations.
is_used_once only makes sense on inner instructions, to prevent
creating more new instructions than will be removed.
Foz-DB Navi48:
Totals from 16989 (14.82% of 114655) affected shaders:
MaxWaves: 434379 -> 434353 (-0.01%); split: +0.01%, -0.01%
Instrs: 29030794 -> 29022514 (-0.03%); split: -0.07%, +0.04%
CodeSize: 155293092 -> 155262816 (-0.02%); split: -0.05%, +0.03%
VGPRs: 1093980 -> 1094088 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 9801 -> 9803 (+0.02%); split: -0.03%, +0.05%
Latency: 356327270 -> 356283384 (-0.01%); split: -0.03%, +0.02%
InvThroughput: 58239439 -> 58229374 (-0.02%); split: -0.03%, +0.01%
VClause: 451716 -> 451815 (+0.02%); split: -0.07%, +0.09%
SClause: 654614 -> 654556 (-0.01%); split: -0.03%, +0.03%
Copies: 1809805 -> 1809297 (-0.03%); split: -0.20%, +0.17%
Branches: 552382 -> 552384 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 947188 -> 947224 (+0.00%); split: -0.01%, +0.02%
PreVGPRs: 879583 -> 880173 (+0.07%); split: -0.01%, +0.08%
VALU: 16317859 -> 16309975 (-0.05%); split: -0.07%, +0.02%
SALU: 4256121 -> 4259315 (+0.08%); split: -0.05%, +0.12%
SMEM: 1067069 -> 1067070 (+0.00%)
VOPD: 440855 -> 440792 (-0.01%); split: +0.05%, -0.07%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
41878e5714
nir_opt_algebraic: remove unneeded is_not_const
...
These were needed when we didn't constant fold inside nir_search,
to prevent infinite loops.
But now all they do is slow down pattern matching.
Foz-DB Navi48:
Totals from 107 (0.09% of 114655) affected shaders:
Instrs: 162439 -> 162481 (+0.03%); split: -0.01%, +0.03%
CodeSize: 943056 -> 942988 (-0.01%); split: -0.03%, +0.02%
Latency: 971667 -> 970865 (-0.08%); split: -0.09%, +0.00%
InvThroughput: 164452 -> 164521 (+0.04%); split: -0.02%, +0.07%
Copies: 7980 -> 7982 (+0.03%)
VALU: 103572 -> 103566 (-0.01%); split: -0.05%, +0.04%
SALU: 12825 -> 12878 (+0.41%)
VOPD: 5235 -> 5190 (-0.86%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
374cbc17a4
nir_opt_algebraic: reassociate fadd into ffma where one factor is a constant
...
This restriction doesn't really make sense, probably an accident.
Foz-DB Navi48:
Totals from 2290 (2.00% of 114655) affected shaders:
MaxWaves: 57496 -> 57510 (+0.02%); split: +0.06%, -0.03%
Instrs: 2817419 -> 2816209 (-0.04%); split: -0.12%, +0.08%
CodeSize: 15218816 -> 15220576 (+0.01%); split: -0.09%, +0.10%
VGPRs: 147456 -> 147384 (-0.05%); split: -0.07%, +0.02%
Latency: 13757114 -> 13751833 (-0.04%); split: -0.13%, +0.09%
InvThroughput: 2463343 -> 2462482 (-0.03%); split: -0.07%, +0.04%
VClause: 40137 -> 40153 (+0.04%); split: -0.07%, +0.11%
SClause: 57351 -> 57385 (+0.06%); split: -0.12%, +0.18%
Copies: 135482 -> 136258 (+0.57%); split: -0.22%, +0.79%
Branches: 30886 -> 30894 (+0.03%)
PreSGPRs: 113470 -> 113462 (-0.01%); split: -0.03%, +0.02%
PreVGPRs: 117554 -> 117591 (+0.03%); split: -0.01%, +0.04%
VALU: 1682734 -> 1681557 (-0.07%); split: -0.10%, +0.03%
SALU: 390685 -> 391301 (+0.16%); split: -0.07%, +0.22%
VOPD: 6159 -> 6254 (+1.54%); split: +1.72%, -0.18%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
b949122908
nir/opt_algebraic: remove loops for b2f/b2i equality handling
...
The feq/fneu patterns already existed, and there is no reason to use bit size based
loops here.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
83091276f8
nir_opt_algebraic: remove more specific cmp+bcsel opts
...
Only some minimal difference from pattern ordering:
Foz-DB Navi48:
Totals from 3 (0.00% of 114655) affected shaders:
Instrs: 4556 -> 4533 (-0.50%)
CodeSize: 23716 -> 23608 (-0.46%)
Latency: 27424 -> 26336 (-3.97%)
InvThroughput: 4674 -> 4672 (-0.04%)
SClause: 107 -> 105 (-1.87%)
Copies: 351 -> 346 (-1.42%)
Branches: 130 -> 126 (-3.08%)
VALU: 2598 -> 2595 (-0.12%)
SALU: 561 -> 555 (-1.07%)
SMEM: 169 -> 167 (-1.18%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
4190241795
nir/opt_algebraic: optimize all comparisons of b2f/b2i with constants
...
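Because b2f(a) can only be 0.0 or 1.0, comparing it with a constant has just four possible outcomes: always true, always false, a, or inot(a). The Python sketch below derives the outcome by evaluating both cases. The `simplify` helper and its string results are hypothetical and only cover the `op(b2f(a), const)` operand order; the real patterns are not shown in the log.

```python
import operator

# A few NIR-style float comparisons mapped to Python operators.
OPS = {
    "flt": operator.lt,
    "fge": operator.ge,
    "feq": operator.eq,
    "fneu": operator.ne,
}


def simplify(op: str, const: float) -> str:
    """Describe what op(b2f(a), const) reduces to."""
    when_false = OPS[op](0.0, const)  # result if a is false
    when_true = OPS[op](1.0, const)   # result if a is true
    if when_true and when_false:
        return "true"
    if not when_true and not when_false:
        return "false"
    return "a" if when_true else "inot(a)"


assert simplify("flt", 0.5) == "inot(a)"
assert simplify("fge", 1.0) == "a"
assert simplify("feq", 2.0) == "false"
assert simplify("fneu", 2.0) == "true"
```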
Foz-DB Navi48:
Totals from 857 (0.75% of 114655) affected shaders:
Instrs: 1136993 -> 1132422 (-0.40%); split: -0.48%, +0.08%
CodeSize: 6096636 -> 6070832 (-0.42%); split: -0.48%, +0.06%
VGPRs: 49668 -> 49620 (-0.10%)
Latency: 24014661 -> 24044601 (+0.12%); split: -0.04%, +0.16%
InvThroughput: 4182482 -> 4183708 (+0.03%); split: -0.12%, +0.15%
VClause: 17698 -> 17695 (-0.02%)
SClause: 25214 -> 25213 (-0.00%)
Copies: 81474 -> 81396 (-0.10%); split: -0.79%, +0.69%
Branches: 24722 -> 24650 (-0.29%); split: -0.36%, +0.07%
PreSGPRs: 43338 -> 43291 (-0.11%); split: -0.22%, +0.11%
VALU: 652975 -> 649760 (-0.49%); split: -0.50%, +0.00%
SALU: 153961 -> 153797 (-0.11%); split: -0.72%, +0.61%
VOPD: 10650 -> 10684 (+0.32%); split: +0.38%, -0.07%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
ef6f5377da
nir/opt_algebraic: remove fcmp+fneg patterns that are cleaned up earlier
...
No Foz-DB changes, as expected.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
a5334ec239
nir/opt_algebraic: generalize late fcmp(fneg(a), const) patterns
...
No reason just to do this for 1.0.
Foz-DB Navi48:
Totals from 44 (0.04% of 114655) affected shaders:
CodeSize: 111620 -> 111476 (-0.13%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:35 +00:00
Georg Lehmann
6b464785b9
nir/opt_algebraic: optimize d3d9 iand(a, inot(b))
...
Foz-DB GFX1201:
Totals from 24 (0.02% of 112525) affected shaders:
Instrs: 15598 -> 15426 (-1.10%); split: -1.17%, +0.06%
CodeSize: 88716 -> 88260 (-0.51%); split: -0.98%, +0.46%
Latency: 54419 -> 53965 (-0.83%); split: -0.91%, +0.08%
InvThroughput: 10294 -> 10166 (-1.24%); split: -1.28%, +0.04%
VClause: 302 -> 300 (-0.66%)
SClause: 367 -> 363 (-1.09%); split: -1.63%, +0.54%
Copies: 712 -> 705 (-0.98%); split: -3.09%, +2.11%
PreSGPRs: 1402 -> 1424 (+1.57%); split: -0.14%, +1.71%
PreVGPRs: 850 -> 848 (-0.24%)
VALU: 9730 -> 9591 (-1.43%)
SALU: 1579 -> 1649 (+4.43%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104 >
2026-02-26 14:44:01 +00:00
Georg Lehmann
a3f9c347bf
nir/opt_algebraic: optimize b2f(a) - 1.0 to -b2f(a)
...
Foz-DB GFX1201:
Totals from 81 (0.07% of 112525) affected shaders:
Instrs: 95048 -> 94965 (-0.09%); split: -0.13%, +0.05%
CodeSize: 532148 -> 531864 (-0.05%); split: -0.09%, +0.04%
SpillSGPRs: 122 -> 125 (+2.46%)
Latency: 440372 -> 440402 (+0.01%); split: -0.02%, +0.03%
InvThroughput: 296078 -> 296173 (+0.03%); split: -0.03%, +0.06%
VClause: 1449 -> 1456 (+0.48%); split: -0.21%, +0.69%
SClause: 2249 -> 2256 (+0.31%); split: -0.09%, +0.40%
Copies: 3956 -> 3965 (+0.23%); split: -0.10%, +0.33%
PreVGPRs: 2900 -> 2899 (-0.03%)
VALU: 61212 -> 61098 (-0.19%); split: -0.19%, +0.01%
SALU: 6970 -> 6981 (+0.16%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104 >
2026-02-26 14:44:01 +00:00
Alyssa Rosenzweig
42c4f7935a
nir: optimize u2u32(unpack_32_2x16_split_*)
...
Noticed while playing with pixel coord things.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40056 >
2026-02-24 19:16:56 +00:00
Georg Lehmann
5d5f99bfe8
nir/opt_algebraic: create more b2f if sign of zero doesn't matter
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966 >
2026-02-19 15:21:27 +00:00
Georg Lehmann
d87943ad3d
nir/opt_algebraic: preserve signed zero when creating new b2f
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966 >
2026-02-19 15:21:27 +00:00
Rob Clark
8cc99edb7b
nir: Fill in missing conversion opts
...
I noticed we were missing:
(('u2f16', ('u2u64', 'a@32')), ('u2f16', a))
This was due to coupling the u2f/i2f opts with i2i/u2u in the same loop
(with different positionals). The `if B <= S\ncontinue` doesn't apply
to the second part. So just split these into two loops.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14848
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39899 >
2026-02-18 15:13:21 +00:00
Rhys Perry
fd22c48b2a
nir/algebraic: remove ignore_exact
...
This was used because the exact bit meant something different for
comparisons than it did for the replacement expression, but that isn't the
case anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39809 >
2026-02-18 14:04:22 +00:00
Georg Lehmann
6a662a59b7
nir/opt_algebraic: optimize 1.0 - b2f(a) to b2f(inot(a))
...
Which can then be cleaned up further.
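The identity is simple to verify over both boolean values. In this Python sketch, `b2f` and `inot` are stand-ins named after the NIR opcodes.

```python
def b2f(c: bool) -> float:
    """Stand-in for NIR b2f: false -> 0.0, true -> 1.0."""
    return 1.0 if c else 0.0


def inot(c: bool) -> bool:
    """Stand-in for NIR inot on a 1-bit boolean."""
    return not c


# Subtracting from 1.0 flips between 0.0 and 1.0, like negating a.
for a in (False, True):
    assert 1.0 - b2f(a) == b2f(inot(a))
```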
Foz-DB Navi48:
Totals from 4156 (3.62% of 114655) affected shaders:
MaxWaves: 102580 -> 102620 (+0.04%)
Instrs: 11696222 -> 11679986 (-0.14%); split: -0.16%, +0.02%
CodeSize: 64452544 -> 64379204 (-0.11%); split: -0.13%, +0.02%
VGPRs: 288256 -> 288172 (-0.03%)
SpillSGPRs: 7290 -> 7297 (+0.10%)
Latency: 160690992 -> 160643825 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 26869332 -> 26849963 (-0.07%); split: -0.09%, +0.02%
VClause: 237078 -> 237003 (-0.03%); split: -0.04%, +0.01%
SClause: 270560 -> 270564 (+0.00%); split: -0.01%, +0.01%
Copies: 936165 -> 937970 (+0.19%); split: -0.07%, +0.26%
Branches: 302981 -> 302992 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 244967 -> 245303 (+0.14%)
PreVGPRs: 232930 -> 232886 (-0.02%); split: -0.02%, +0.00%
VALU: 6200283 -> 6187264 (-0.21%); split: -0.23%, +0.02%
SALU: 1759176 -> 1760275 (+0.06%); split: -0.10%, +0.16%
VOPD: 447502 -> 446194 (-0.29%); split: +0.14%, -0.43%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39917 >
2026-02-17 10:01:21 +00:00
Georg Lehmann
f7222d6939
nir/opt_algebraic: remove few uses of integer nir_analyze_range
...
Surprisingly, this has an effect on GFX1201:
Totals from 66 (0.08% of 82405) affected shaders:
Instrs: 200725 -> 201517 (+0.39%)
CodeSize: 978676 -> 981488 (+0.29%)
Latency: 291736 -> 291760 (+0.01%)
InvThroughput: 31556 -> 31604 (+0.15%)
Copies: 11928 -> 12588 (+5.53%)
Branches: 14850 -> 15048 (+1.33%)
SALU: 68981 -> 69509 (+0.77%)
I say surprisingly, because nir_analyze_range handles nothing but
constants and bcsel for integers. Maybe rdr2 is actually
hitting some weird bcsel(a, #b, #c) == 0 case where b and c are not 0?
No, I looked at a few of those shaders, and it's just noise from changed
instruction order.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
4e2f1345d8
nir/opt_algebraic: make fcmp(a+b, 0.0) -> fcmp(a, -b) exact using ninf
...
And remove some cases that never happen because we remove fneg on compare with constants.
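Infinities are what break this rewrite in the general case, which is presumably why ninf is required. The Python sketch below uses equality as a representative comparison; the function names are hypothetical and the snippet is not the pattern from the commit.

```python
import math


def cmp_sum_with_zero(a: float, b: float) -> bool:
    """Original form: compare a + b against 0.0."""
    return a + b == 0.0


def cmp_with_negated(a: float, b: float) -> bool:
    """Rewritten form: compare a against -b."""
    return a == -b


# Finite inputs agree.
for a, b in [(1.5, -1.5), (2.0, 3.0), (0.0, 0.0)]:
    assert cmp_sum_with_zero(a, b) == cmp_with_negated(a, b)

# inf + -inf is NaN, so the original is false while the rewrite is true.
assert math.isnan(math.inf + -math.inf)
assert cmp_sum_with_zero(math.inf, -math.inf) is False
assert cmp_with_negated(math.inf, -math.inf) is True
```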
Foz-DB Navi48:
Totals from 1305 (1.58% of 82405) affected shaders:
MaxWaves: 32872 -> 32854 (-0.05%)
Instrs: 4554013 -> 4551638 (-0.05%); split: -0.06%, +0.01%
CodeSize: 25269108 -> 25255428 (-0.05%); split: -0.06%, +0.00%
VGPRs: 87660 -> 87732 (+0.08%)
Latency: 33291152 -> 33285023 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 8965288 -> 8963071 (-0.02%); split: -0.03%, +0.00%
VClause: 104008 -> 103947 (-0.06%); split: -0.09%, +0.03%
SClause: 97577 -> 97574 (-0.00%); split: -0.01%, +0.00%
Copies: 372741 -> 372628 (-0.03%); split: -0.05%, +0.02%
Branches: 134076 -> 134072 (-0.00%)
PreSGPRs: 65109 -> 65110 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 68911 -> 68968 (+0.08%); split: -0.01%, +0.10%
VALU: 2247091 -> 2245815 (-0.06%); split: -0.07%, +0.01%
SALU: 810190 -> 810001 (-0.02%); split: -0.02%, +0.00%
VOPD: 205075 -> 205016 (-0.03%); split: +0.04%, -0.07%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00