Alyssa Rosenzweig
f55e87db93
nir: add missing ssbo atomics to nir_get_io_index_src_number
...
Match other SSBO intrinsics and other atomics.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39895 >
2026-02-17 15:42:36 +00:00
Georg Lehmann
6a662a59b7
nir/opt_algebraic: optimize 1.0 - b2f(a) to b2f(inot(a))
...
Which can then be cleaned up further.
Foz-DB Navi48:
Totals from 4156 (3.62% of 114655) affected shaders:
MaxWaves: 102580 -> 102620 (+0.04%)
Instrs: 11696222 -> 11679986 (-0.14%); split: -0.16%, +0.02%
CodeSize: 64452544 -> 64379204 (-0.11%); split: -0.13%, +0.02%
VGPRs: 288256 -> 288172 (-0.03%)
SpillSGPRs: 7290 -> 7297 (+0.10%)
Latency: 160690992 -> 160643825 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 26869332 -> 26849963 (-0.07%); split: -0.09%, +0.02%
VClause: 237078 -> 237003 (-0.03%); split: -0.04%, +0.01%
SClause: 270560 -> 270564 (+0.00%); split: -0.01%, +0.01%
Copies: 936165 -> 937970 (+0.19%); split: -0.07%, +0.26%
Branches: 302981 -> 302992 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 244967 -> 245303 (+0.14%)
PreVGPRs: 232930 -> 232886 (-0.02%); split: -0.02%, +0.00%
VALU: 6200283 -> 6187264 (-0.21%); split: -0.23%, +0.02%
SALU: 1759176 -> 1760275 (+0.06%); split: -0.10%, +0.16%
VOPD: 447502 -> 446194 (-0.29%); split: +0.14%, -0.43%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39917 >
2026-02-17 10:01:21 +00:00
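A minimal sketch of the identity behind 6a662a59b7, assuming scalar semantics. The helpers `b2f` and `inot` below are hypothetical Python stand-ins for the NIR opcodes of the same name, not Mesa code.

```python
def b2f(a: bool) -> float:
    """Model of NIR b2f: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0


def inot(a: bool) -> bool:
    """Model of NIR inot on a 1-bit boolean."""
    return not a


def before(a: bool) -> float:
    """The matched expression: 1.0 - b2f(a)."""
    return 1.0 - b2f(a)


def after(a: bool) -> float:
    """The replacement: b2f(inot(a))."""
    return b2f(inot(a))


# Only two inputs exist, so the check is exhaustive.
for a in (False, True):
    assert before(a) == after(a)
```

The replacement drops the float subtraction, and the remaining inot can often be folded into whatever produced the boolean.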
Rhys Perry
c0143829f9
nir/opt_intrinsics: optimize inot(inverse_ballot(const))
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
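A rough model of why c0143829f9 is possible, assuming that the pass folds the negation into the constant mask (the commit text does not spell out the replacement). `SUBGROUP_SIZE`, `inverse_ballot`, `inot_lanes` and `folded` are hypothetical names for illustration.

```python
SUBGROUP_SIZE = 32  # assumed subgroup size, for illustration only
FULL_MASK = (1 << SUBGROUP_SIZE) - 1


def inverse_ballot(mask):
    """Per-lane boolean: lane i is True when bit i of mask is set."""
    return [bool((mask >> lane) & 1) for lane in range(SUBGROUP_SIZE)]


def inot_lanes(lanes):
    """Per-lane boolean negation."""
    return [not v for v in lanes]


def folded(mask):
    """Negate the constant once instead of negating every lane."""
    return inverse_ballot(~mask & FULL_MASK)


for const in (0x0, 0x1, 0xDEADBEEF, FULL_MASK):
    assert inot_lanes(inverse_ballot(const)) == folded(const)
```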
Georg Lehmann
bca5aab2be
nir: let nir_analyze_fp_range take a nir_def
...
This is mildly worse for vector constants, but so much simpler.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
474af815ff
nir: rename nir_analyze_range because it's float only
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
f2a59fdea6
nir: remove non-float nir_analyze_range support
...
This was always unused/unfinished.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Georg Lehmann
f7222d6939
nir/opt_algebraic: remove few uses of integer nir_analyze_range
...
Surprisingly, this has an effect on GFX1201:
Totals from 66 (0.08% of 82405) affected shaders:
Instrs: 200725 -> 201517 (+0.39%)
CodeSize: 978676 -> 981488 (+0.29%)
Latency: 291736 -> 291760 (+0.01%)
InvThroughput: 31556 -> 31604 (+0.15%)
Copies: 11928 -> 12588 (+5.53%)
Branches: 14850 -> 15048 (+1.33%)
SALU: 68981 -> 69509 (+0.77%)
I say surprisingly, because nir_analyze_range handles nothing but
constants and bcsel for integers. Maybe rdr2 is actually
hitting some weird bcsel(a, #b, #c) == 0 case where b and c are not 0?
No, I looked at a few of those shaders, and it's just noise from changed
instruction order.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39756 >
2026-02-16 18:08:53 +00:00
Marek Olšák
aa92b464f3
nir/opt_non_uniform_access: use new query flags
...
NFC for drivers
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
61a96be494
nir/lower_non_uniform_access: add an option not to lower tex & image queries
...
AMD can do non-uniform queries. The RADV change will be in a separate commit.
NFC for drivers.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
a9df891bc6
nir: allow get_ssbo_size to return a 64-bit result
...
to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
c151402f35
nir: add ACCESS to get_ubo_size
...
so that we can set NON_UNIFORM
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
1d09a975bf
nir: handle get_ubo_size as a resource query in nir_shader_gather_info
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Ian Romanick
9017d37e84
nir: Use STACK_ARRAY instead of NIR_VLA
...
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866 >
2026-02-14 01:19:27 +00:00
Marek Olšák
0a9bdcac79
ac: lower load_workgroup_ids for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Daniel Schürmann
88b4221519
nir/clone: Fix cloning indirect call instructions
...
Fixes: bb40284f76 ('nir: Add indirect calls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39844 >
2026-02-13 11:27:59 +00:00
Sagar Ghuge
1fb8435b77
nir: Add nir_resource_intel_internal entry
...
Will use the load/store_ssbo with nir_resource_intel_internal later in
this series.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160 >
2026-02-12 16:45:22 +00:00
Rhys Perry
fa5d4174c4
nir/search: use memcmp/memcpy/memset
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39808 >
2026-02-12 14:47:06 +00:00
Rhys Perry
5d92942241
nir/search: remove creation of swizzle
...
match_expression() only accesses the first instr->def.num_components
elements, so we don't need to ensure the rest are zero.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39808 >
2026-02-12 14:47:06 +00:00
jiajia Qian
f16d17a454
nir/opt_phi_precision: Fix bit size mismatch when moving widening conversions
...
Add a check to ensure that when load_const can be narrowed, the bit size
from other widening conversion sources must be 16-bit to maintain
consistency across all phi sources.
Signed-off-by: jiajia Qian <jiajia.qian@nxp.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39773 >
2026-02-12 12:27:55 +00:00
Karol Herbst
a274b9c6a8
nak: Fold constant ishl into shared ld/st/atoms
...
Totals:
CodeSize: 9459006048 -> 9458124656 (-0.01%); split: -0.01%, +0.00%
Number of GPRs: 47358402 -> 47358138 (-0.00%)
SLM Size: 5409064 -> 5409024 (-0.00%)
Static cycle count: 6129914910 -> 6129436959 (-0.01%); split: -0.01%, +0.00%
Spills to memory: 44471 -> 44453 (-0.04%)
Fills from memory: 44471 -> 44453 (-0.04%)
Spills to reg: 186364 -> 186365 (+0.00%); split: -0.00%, +0.00%
Fills from reg: 226975 -> 226976 (+0.00%); split: -0.00%, +0.00%
Max warps/SM: 50638680 -> 50638804 (+0.00%)
Totals from 9700 (0.83% of 1163204) affected shaders:
CodeSize: 234188480 -> 233307088 (-0.38%); split: -0.43%, +0.05%
Number of GPRs: 567950 -> 567686 (-0.05%)
SLM Size: 39952 -> 39912 (-0.10%)
Static cycle count: 225267269 -> 224789318 (-0.21%); split: -0.26%, +0.05%
Spills to memory: 4792 -> 4774 (-0.38%)
Fills from memory: 4792 -> 4774 (-0.38%)
Spills to reg: 33250 -> 33251 (+0.00%); split: -0.00%, +0.01%
Fills from reg: 27531 -> 27532 (+0.00%); split: -0.00%, +0.01%
Max warps/SM: 349200 -> 349324 (+0.04%)
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39709 >
2026-02-11 03:42:05 +01:00
Karol Herbst
18bf6fb96d
nir: add NVIDIA's shared memory non-uniform address shift
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39709 >
2026-02-11 03:41:23 +01:00
Georg Lehmann
fbc0562203
nir/algebraic: allow inexact optimizations with sz/inf/nan preserve
...
Vulkan says these options only apply after possible contract/reassoc/transform
optimizations using real number rules.
Foz-DB Navi48:
Totals from 3923 (4.76% of 82405) affected shaders:
MaxWaves: 113159 -> 113121 (-0.03%); split: +0.01%, -0.05%
Instrs: 6946272 -> 6933510 (-0.18%); split: -0.22%, +0.03%
CodeSize: 38894140 -> 38844432 (-0.13%); split: -0.16%, +0.03%
VGPRs: 206280 -> 206412 (+0.06%); split: -0.06%, +0.12%
Latency: 45991075 -> 45964455 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8555282 -> 8546561 (-0.10%); split: -0.15%, +0.05%
VClause: 159765 -> 159745 (-0.01%); split: -0.05%, +0.04%
SClause: 160199 -> 160263 (+0.04%); split: -0.07%, +0.11%
Copies: 550751 -> 550432 (-0.06%); split: -0.17%, +0.11%
Branches: 192949 -> 192960 (+0.01%)
PreSGPRs: 189198 -> 189314 (+0.06%); split: -0.07%, +0.13%
PreVGPRs: 142732 -> 142544 (-0.13%); split: -0.33%, +0.20%
VALU: 3579904 -> 3569665 (-0.29%); split: -0.34%, +0.05%
SALU: 1072897 -> 1072440 (-0.04%); split: -0.18%, +0.14%
VMEM: 262759 -> 262791 (+0.01%)
SMEM: 246224 -> 246230 (+0.00%)
VOPD: 369734 -> 369207 (-0.14%); split: +0.08%, -0.23%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
4e2f1345d8
nir/opt_algebraic: make fcmp(a+b, 0.0) -> fcmp(a, -b) exact using ninf
...
And remove some cases that never happen because we remove fneg on compare with constants.
Foz-DB Navi48:
Totals from 1305 (1.58% of 82405) affected shaders:
MaxWaves: 32872 -> 32854 (-0.05%)
Instrs: 4554013 -> 4551638 (-0.05%); split: -0.06%, +0.01%
CodeSize: 25269108 -> 25255428 (-0.05%); split: -0.06%, +0.00%
VGPRs: 87660 -> 87732 (+0.08%)
Latency: 33291152 -> 33285023 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 8965288 -> 8963071 (-0.02%); split: -0.03%, +0.00%
VClause: 104008 -> 103947 (-0.06%); split: -0.09%, +0.03%
SClause: 97577 -> 97574 (-0.00%); split: -0.01%, +0.00%
Copies: 372741 -> 372628 (-0.03%); split: -0.05%, +0.02%
Branches: 134076 -> 134072 (-0.00%)
PreSGPRs: 65109 -> 65110 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 68911 -> 68968 (+0.08%); split: -0.01%, +0.10%
VALU: 2247091 -> 2245815 (-0.06%); split: -0.07%, +0.01%
SALU: 810190 -> 810001 (-0.02%); split: -0.02%, +0.00%
VOPD: 205075 -> 205016 (-0.03%); split: +0.04%, -0.07%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
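A small sketch of why 4e2f1345d8 needs the no-infinities assumption, using fge as the comparison. The function names are hypothetical, and Python doubles stand in for shader floats.

```python
import math
import random


def fge_sum(a, b):
    """The matched form: fge(a + b, 0.0)."""
    return (a + b) >= 0.0


def fge_moved(a, b):
    """The rewritten form: fge(a, -b)."""
    return a >= -b


# For finite inputs the rounded sum keeps the sign of the exact sum,
# so both forms agree.
random.seed(0)
for _ in range(10000):
    a = random.uniform(-1e6, 1e6)
    b = random.uniform(-1e6, 1e6)
    assert fge_sum(a, b) == fge_moved(a, b)

# With infinities, inf + (-inf) is NaN and the comparison result changes.
assert fge_sum(math.inf, -math.inf) is False
assert fge_moved(math.inf, -math.inf) is True
```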
Georg Lehmann
ef7dd040d9
nir/opt_algebraic: make a < 0.0 ? -a : a exact using search helpers
...
Foz-DB Navi21:
Totals from 104 (0.13% of 82405) affected shaders:
Instrs: 175964 -> 175514 (-0.26%); split: -0.26%, +0.00%
CodeSize: 909008 -> 908744 (-0.03%); split: -0.05%, +0.02%
Latency: 1515203 -> 1514560 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 308751 -> 308573 (-0.06%); split: -0.06%, +0.00%
Copies: 10318 -> 10315 (-0.03%); split: -0.06%, +0.03%
PreVGPRs: 5767 -> 5755 (-0.21%)
VALU: 108151 -> 107745 (-0.38%)
VOPD: 738 -> 737 (-0.14%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
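To illustrate ef7dd040d9: assuming the select is replaced by an absolute value (an assumption, the commit subject only names the matched side), the two forms differ in the sign of zero, which is why the pattern needs conditions to be exact. `select_abs` is a hypothetical helper.

```python
import math


def select_abs(a):
    """The matched expression: a < 0.0 ? -a : a."""
    return -a if a < 0.0 else a


def is_negative(x):
    """True when the sign bit is set, including for -0.0."""
    return math.copysign(1.0, x) < 0.0


# Ordinary values agree with abs().
for v in (-2.5, 0.0, 7.0):
    assert select_abs(v) == abs(v)

# -0.0 is not less than 0.0, so the select keeps the sign bit,
# while abs() clears it.
assert is_negative(select_abs(-0.0))
assert not is_negative(abs(-0.0))
```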
Georg Lehmann
0474ad1504
nir/opt_algebraic: make ffract(is_integral) exact using nnan
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
b8d1763e0a
nir/opt_algebraic: make some more fcmp patterns exact using nnan
...
No Foz-DB changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
8d52c59505
nir/opt_algebraic: make some fmin/fmax/fsat patterns exact using nsz/nnan
...
Foz-DB Navi48:
Totals from 90 (0.11% of 82405) affected shaders:
Instrs: 52109 -> 52032 (-0.15%); split: -0.16%, +0.01%
CodeSize: 263916 -> 263900 (-0.01%); split: -0.05%, +0.05%
Latency: 504693 -> 504775 (+0.02%); split: -0.01%, +0.03%
InvThroughput: 81444 -> 81157 (-0.35%)
Copies: 2894 -> 2895 (+0.03%)
VALU: 30097 -> 29991 (-0.35%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
486ea54184
nir/opt_algebraic: make bcsel(fcmp(b, a), b, a) -> fmin/fmax patterns exact
...
These patterns need is_only_used_as_float because fmin/fmax might change NaN
patterns, while bcsel is bit exact. For the same reason, the replacement
must not add undefined results, so make the replacement NaN/inf preserving.
It's impossible to make them signed zero correct (-0.0 == +0.0),
so it's also important that the user alu doesn't care.
Otherwise, the only thing that matters is whether a is NaN.
Foz-DB Navi48:
Totals from 453 (0.55% of 82405) affected shaders:
MaxWaves: 8242 -> 8270 (+0.34%)
Instrs: 2382059 -> 2380094 (-0.08%); split: -0.09%, +0.00%
CodeSize: 13197208 -> 13179488 (-0.13%); split: -0.14%, +0.00%
VGPRs: 44688 -> 44604 (-0.19%)
Latency: 22839894 -> 22838985 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 4873352 -> 4872924 (-0.01%)
VClause: 50862 -> 50883 (+0.04%); split: -0.02%, +0.06%
SClause: 54000 -> 53993 (-0.01%)
Copies: 250215 -> 250233 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 39694 -> 39620 (-0.19%)
VALU: 1116881 -> 1116073 (-0.07%); split: -0.07%, +0.00%
SALU: 492799 -> 492139 (-0.13%); split: -0.14%, +0.00%
VOPD: 85457 -> 85461 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
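The NaN and signed-zero remarks in 486ea54184 can be sketched as follows. `fmin` here is a hypothetical model that returns the non-NaN operand when exactly one input is NaN; real hardware may pick either zero for fmin(+0.0, -0.0), and the model simply picks one.

```python
import math


def fmin(a, b):
    """Model of fmin: if exactly one operand is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


def bcsel_min(a, b):
    """The matched expression: bcsel(flt(b, a), b, a)."""
    return b if b < a else a


nan = math.nan

# b is NaN: the compare is false, bcsel yields a, and so does fmin.
assert bcsel_min(1.0, nan) == fmin(1.0, nan) == 1.0

# a is NaN: bcsel yields a (NaN), but fmin yields b.
assert math.isnan(bcsel_min(nan, 1.0))
assert fmin(nan, 1.0) == 1.0

# Signed zeros: bcsel is bit exact, this fmin model returns the other zero.
assert math.copysign(1.0, bcsel_min(0.0, -0.0)) == 1.0
assert math.copysign(1.0, fmin(0.0, -0.0)) == -1.0
```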
Georg Lehmann
aa78083477
nir: make alu fp_math_ctrl helpers const
...
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
f55668bb50
nir/opt_algebraic: update flt -> fneu patterns
...
And remove the ones that are redundant because we already move the fneg to
the constant source.
No Foz-DB changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
15b13d5fd4
nir/opt_algebraic: optimize flt/fge(#c, fadd(a, #b))
...
I guess these were missing because the author forgot flt/fge aren't commutative.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
2355b63cb5
nir/opt_algebraic: use better float control for some fcmp patterns
...
Foz-DB Navi48:
Totals from 1084 (1.32% of 82405) affected shaders:
Instrs: 1969973 -> 1968947 (-0.05%); split: -0.08%, +0.02%
CodeSize: 11349704 -> 11344884 (-0.04%); split: -0.06%, +0.02%
VGPRs: 59076 -> 59064 (-0.02%); split: -0.06%, +0.04%
Latency: 20766031 -> 20755032 (-0.05%); split: -0.07%, +0.01%
InvThroughput: 2849402 -> 2846733 (-0.09%); split: -0.10%, +0.01%
VClause: 40736 -> 40740 (+0.01%)
SClause: 91835 -> 91832 (-0.00%)
Copies: 217961 -> 217868 (-0.04%); split: -0.07%, +0.02%
Branches: 60045 -> 60031 (-0.02%)
PreSGPRs: 50639 -> 50618 (-0.04%); split: -0.06%, +0.02%
PreVGPRs: 39593 -> 39590 (-0.01%); split: -0.01%, +0.01%
VALU: 960270 -> 959524 (-0.08%); split: -0.10%, +0.02%
SALU: 326638 -> 326680 (+0.01%); split: -0.04%, +0.06%
VOPD: 23963 -> 23929 (-0.14%); split: +0.04%, -0.18%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
7238888d93
nir/opt_algebraic: remove redundant patterns with fcmp(fneg(...), #c)
...
We already have patterns to move the negation to the constant.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
03c497f236
nir/opt_algebraic: make 1.0 - fsat(a) -> fsat(1.0 - a) pattern exact using nnan
...
Foz-DB Navi48:
Totals from 50 (0.06% of 82405) affected shaders:
CodeSize: 137072 -> 137456 (+0.28%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
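A sketch of why 03c497f236 only needs nnan. The `fsat` model below clamps to [0, 1] and maps NaN to 0.0; the NaN behaviour is an assumption made for illustration.

```python
import math
import random


def fsat(x):
    """Model of fsat: clamp to [0, 1]; NaN -> 0.0 (assumed)."""
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)


def before(a):
    """The matched expression: 1.0 - fsat(a)."""
    return 1.0 - fsat(a)


def after(a):
    """The replacement: fsat(1.0 - a)."""
    return fsat(1.0 - a)


# Non-NaN inputs, including infinities, give the same result.
random.seed(0)
samples = [random.uniform(-2.0, 2.0) for _ in range(10000)]
samples += [0.0, -0.0, 1.0, math.inf, -math.inf]
for a in samples:
    assert before(a) == after(a)

# NaN is the only input where the two forms disagree in this model.
assert before(math.nan) == 1.0
assert after(math.nan) == 0.0
```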
Georg Lehmann
79e4530a9b
nir/opt_algebraic: make pattern pushing fmul into bcsel exact
...
The only special case here is d == -0.0.
Foz-DB Navi48:
Totals from 3 (0.00% of 82405) affected shaders:
CodeSize: 29140 -> 29188 (+0.16%)
InvThroughput: 2945 -> 2951 (+0.20%)
VALU: 3217 -> 3223 (+0.19%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
a3bc94a3d0
nir/opt_algebraic: remove inexact from floor->trunc pattern
...
This was marked inexact because of me in !21475, but I don't see why now,
even after checking all the special values.
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
da7abb1337
nir/opt_algebraic: mark fmulz(finite, finite) -> fmul pattern as nsz
...
No Foz-DB changes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
ea87f1f9bc
nir/opt_algebraic: add a - a with nnan
...
Foz-DB Navi48:
Totals from 576 (0.70% of 82405) affected shaders:
MaxWaves: 16706 -> 16726 (+0.12%)
Instrs: 618677 -> 580965 (-6.10%); split: -6.10%, +0.00%
CodeSize: 3022552 -> 2861612 (-5.32%); split: -5.33%, +0.00%
VGPRs: 28008 -> 28860 (+3.04%); split: -0.51%, +3.56%
Latency: 2689318 -> 2655887 (-1.24%); split: -1.25%, +0.01%
InvThroughput: 403512 -> 393404 (-2.51%); split: -2.51%, +0.00%
VClause: 7584 -> 7577 (-0.09%); split: -0.17%, +0.08%
SClause: 19974 -> 19086 (-4.45%); split: -4.48%, +0.03%
Copies: 43862 -> 40888 (-6.78%); split: -6.87%, +0.09%
Branches: 12457 -> 11407 (-8.43%)
PreSGPRs: 28315 -> 27046 (-4.48%); split: -4.53%, +0.05%
PreVGPRs: 20751 -> 19397 (-6.52%)
VALU: 317224 -> 290151 (-8.53%); split: -8.53%, +0.00%
SALU: 124297 -> 121347 (-2.37%); split: -2.39%, +0.02%
VMEM: 11918 -> 11907 (-0.09%)
SMEM: 27582 -> 26241 (-4.86%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
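For ea87f1f9bc, a short sketch of why a - a can only be folded to 0.0 when NaN results need not be preserved. Python doubles stand in for shader floats and `sub_self` is a hypothetical helper.

```python
import math


def sub_self(a):
    """The matched expression: a - a."""
    return a - a


# Every finite value gives exactly +0.0.
for v in (0.0, -0.0, 1.5, -3e300, 5e-324):
    assert sub_self(v) == 0.0
    assert math.copysign(1.0, sub_self(v)) == 1.0

# Infinities and NaN give NaN, which a constant 0.0 would not reproduce.
assert math.isnan(sub_self(math.inf))
assert math.isnan(sub_self(math.nan))
```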
Georg Lehmann
16db9f79d1
nir/opt_algebraic: remove inexact a * 0.0 patterns
...
We already have some with nnan,nsz.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
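Likewise for 16db9f79d1, a sketch of why folding a * 0.0 to 0.0 relies on both nnan and nsz. `mul_zero` and `is_negative` are hypothetical helpers.

```python
import math


def mul_zero(a):
    """The matched expression: a * 0.0."""
    return a * 0.0


def is_negative(x):
    """True when the sign bit is set, including for -0.0."""
    return math.copysign(1.0, x) < 0.0


# A positive finite input gives +0.0.
assert mul_zero(3.0) == 0.0 and not is_negative(mul_zero(3.0))

# A negative finite input gives -0.0, so the fold needs nsz.
assert mul_zero(-3.0) == 0.0 and is_negative(mul_zero(-3.0))

# Infinity or NaN gives NaN, so the fold needs nnan.
assert math.isnan(mul_zero(math.inf))
assert math.isnan(mul_zero(math.nan))
```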
Georg Lehmann
63d199a01e
nir: remove special fp_math_ctrl rules
...
All opcodes should now respect the nan/inf/sz preserving flags.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
e443229644
nir/opt_algebraic: mark newly created fmulz nan/inf preserving
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
b678899ef8
nir/opt_algebraic: use nan/inf/sz preserve flags instead of exact for cmp/min/max replacement
...
And remove some, because they should be covered by the search pattern anyway.
Foz-DB Navi48:
Totals from 560 (0.68% of 82405) affected shaders:
MaxWaves: 11279 -> 11291 (+0.11%)
Instrs: 5214229 -> 5214386 (+0.00%); split: -0.02%, +0.02%
CodeSize: 29613884 -> 29616740 (+0.01%); split: -0.01%, +0.02%
VGPRs: 50400 -> 50328 (-0.14%)
Latency: 36481700 -> 36481157 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7309905 -> 7307905 (-0.03%); split: -0.05%, +0.02%
VClause: 131423 -> 131424 (+0.00%); split: -0.00%, +0.00%
SClause: 111485 -> 111499 (+0.01%); split: -0.00%, +0.01%
Copies: 441899 -> 442029 (+0.03%); split: -0.02%, +0.05%
Branches: 165599 -> 165597 (-0.00%)
PreVGPRs: 43558 -> 43525 (-0.08%)
VALU: 2573609 -> 2573324 (-0.01%); split: -0.03%, +0.02%
SALU: 851172 -> 851271 (+0.01%); split: -0.01%, +0.02%
VOPD: 366409 -> 366934 (+0.14%); split: +0.23%, -0.08%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
a8ad72b912
nir/search: add option to set nan/inf/sz preserve on replacement patterns
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
52eab085e6
nir/lower_uniform_subgroup: use nan/inf preserve instead of exact for feq
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
30da75e8b1
nir/lower_double_ops: don't create more exact ops than the input requires
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
e2301164c7
nir/format_convert: use nan/inf preserve flag for fmax instead of exact
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Daniel Schürmann
e362011cca
nir/loop_analyze: also set force_unroll if the array_size is larger than max_trip_count
...
Loop peeling can reduce the trip_count. It is also not
necessary that the array_size exactly matches the trip_count.
Totals from 54 (0.06% of 84383) affected shaders: (Navi48)
MaxWaves: 758 -> 884 (+16.62%)
Instrs: 284511 -> 343292 (+20.66%)
CodeSize: 1524940 -> 1837996 (+20.53%)
VGPRs: 5904 -> 5544 (-6.10%)
Scratch: 18432 -> 0 (-inf%)
Latency: 7317179 -> 7186789 (-1.78%); split: -1.80%, +0.02%
InvThroughput: 1646024 -> 1545357 (-6.12%); split: -6.19%, +0.08%
VClause: 5840 -> 6867 (+17.59%); split: -1.92%, +19.50%
SClause: 6959 -> 7935 (+14.03%)
Copies: 25516 -> 31310 (+22.71%); split: -4.87%, +27.58%
Branches: 9205 -> 10571 (+14.84%); split: -3.25%, +18.09%
PreSGPRs: 5586 -> 5394 (-3.44%); split: -3.67%, +0.23%
PreVGPRs: 5087 -> 4674 (-8.12%); split: -8.18%, +0.06%
VALU: 145243 -> 174719 (+20.29%)
SALU: 53128 -> 67594 (+27.23%); split: -0.00%, +27.23%
VMEM: 8911 -> 10221 (+14.70%); split: -1.41%, +16.11%
SMEM: 8519 -> 9509 (+11.62%)
VOPD: 419 -> 796 (+89.98%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39778 >
2026-02-10 09:24:23 +00:00
Daniel Schürmann
b5439c4fbf
nir/opt_loop_unroll: Always unroll loops with a known trip-count of 0
...
Loop peeling decrements the calculated trip count, which might
result in a known trip-count of 0 for single-iteration loops.
Thus, also unroll loops if max_trip_count == 0 and exact_trip_count_known.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39778 >
2026-02-10 09:24:23 +00:00
Faith Ekstrand
02bade5cfa
nir/lower_bool_to_bit_size: Make smarter canonicalization choices
...
Instead of blindly taking the first source, take the first source that
isn't a constant. That way we won't accidentally expand things to
32-bit just because a constant came first.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39725 >
2026-02-09 18:16:40 +00:00
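The selection rule described in 02bade5cfa can be sketched roughly as below. `Src`, `pick_first` and `pick_first_non_const` are hypothetical names, and falling back to the first source when every source is constant is an assumption rather than something the commit states.

```python
from typing import NamedTuple


class Src(NamedTuple):
    """Hypothetical stand-in for an ALU source."""
    is_const: bool
    bit_size: int


def pick_first(srcs):
    """Old behaviour as described: blindly take the first source."""
    return srcs[0].bit_size


def pick_first_non_const(srcs):
    """New behaviour as described: take the first non-constant source."""
    for src in srcs:
        if not src.is_const:
            return src.bit_size
    return srcs[0].bit_size  # assumed fallback when all sources are constant


# A 32-bit constant ahead of a 16-bit value no longer forces 32 bits.
srcs = [Src(is_const=True, bit_size=32), Src(is_const=False, bit_size=16)]
assert pick_first(srcs) == 32
assert pick_first_non_const(srcs) == 16
```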
Faith Ekstrand
711b3358a8
nir/lower_bool_to_bit_size: Use the correct num_components for conversions
...
There's a nice little comment here saying we use the same write mask (an
out of date term in NIR) and swizzle but we're no longer actually doing
that. Depending on nir_builder magic, we may actually generate a scalar
when we really want a vector. The fix is to use more builder helpers
and just eat the potential copy.
Fixes: 3180656bbc ("nir: don't use nir_build_alu() with incomplete sources")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39725 >
2026-02-09 18:16:40 +00:00