Georg Lehmann
6936282bd3
nir/opt_algebraic: remove min(a, >= 1.0) before fsat
...
Foz-DB Navi48:
Totals from 86 (0.08% of 114655) affected shaders:
Instrs: 217553 -> 217408 (-0.07%); split: -0.07%, +0.01%
CodeSize: 1159992 -> 1159380 (-0.05%); split: -0.06%, +0.01%
Latency: 1657600 -> 1657533 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 203205 -> 203178 (-0.01%); split: -0.02%, +0.00%
SClause: 5245 -> 5244 (-0.02%)
Copies: 13726 -> 13716 (-0.07%); split: -0.14%, +0.07%
VALU: 130151 -> 130039 (-0.09%); split: -0.09%, +0.00%
SALU: 26476 -> 26474 (-0.01%); split: -0.02%, +0.01%
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281 >
2026-03-09 21:11:25 +00:00
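The idea behind this rewrite can be checked numerically. The following is a minimal sketch, not the actual nir_opt_algebraic pattern: it assumes non-NaN inputs, and `fsat_of_fmin`, `SAMPLES` and `rewrite_is_safe` are hypothetical names used only for illustration.

```python
import math

def fsat(x: float) -> float:
    """Clamp x to [0.0, 1.0]. NaN inputs are out of scope for this sketch."""
    return min(max(x, 0.0), 1.0)

def fsat_of_fmin(a: float, c: float) -> float:
    """The original expression: fsat(fmin(a, c))."""
    return fsat(min(a, c))

# A few representative non-NaN inputs, including both zeros and infinities.
SAMPLES = [-math.inf, -2.5, -0.0, 0.0, 0.25, 1.0, 1.5, 1e30, math.inf]

def rewrite_is_safe(c: float) -> bool:
    """True if dropping the fmin changes no sample result for this bound c."""
    return all(fsat_of_fmin(a, c) == fsat(a) for a in SAMPLES)
```

For any bound of at least 1.0 the fmin is redundant, because fsat already clamps to 1.0; a bound below 1.0 changes the result and must keep the fmin.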
Georg Lehmann
108a4d4341
nir: create more fsat using range analysis
...
Foz-DB Navi48:
Totals from 5922 (5.17% of 114655) affected shaders:
Instrs: 5188307 -> 5184193 (-0.08%); split: -0.09%, +0.01%
CodeSize: 27852544 -> 27843252 (-0.03%); split: -0.05%, +0.01%
Latency: 28723967 -> 28714268 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 4745002 -> 4742298 (-0.06%); split: -0.07%, +0.01%
VClause: 68649 -> 68650 (+0.00%)
SClause: 103932 -> 103917 (-0.01%); split: -0.02%, +0.00%
Copies: 244683 -> 244706 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 272361 -> 272362 (+0.00%); split: -0.00%, +0.00%
VALU: 3248960 -> 3245520 (-0.11%); split: -0.11%, +0.00%
SALU: 516784 -> 516796 (+0.00%); split: -0.01%, +0.01%
VOPD: 8910 -> 8895 (-0.17%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281 >
2026-03-09 21:11:25 +00:00
Alyssa Rosenzweig
edccd06a0b
nir/lower_subgroups: fix boolean clustered reductions
...
It is legal to have a cluster size larger than the subgroup/ballot size,
but our lowering would blow up in this case due to the nir_ishl_imm
overflowing in the lowering. Fortunately, this is easy to handle.
Fixes sub_group_clustered_reduce_logical_and()
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40224 >
2026-03-09 14:50:37 +00:00
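The overflow described above can be illustrated with a small model. This is a hedged sketch rather than the lowering in nir_lower_subgroups: it assumes a 32-bit ballot with all lanes active, a power-of-two cluster size, and a shift that keeps only the low five bits of its count. All function names are hypothetical.

```python
BALLOT_BITS = 32  # assumed ballot width for this sketch
MASK32 = 0xFFFFFFFF

def ishl32(value: int, shift: int) -> int:
    """32-bit shift left that keeps only the low five bits of the shift count."""
    return (value << (shift & 31)) & MASK32

def naive_cluster_mask(cluster_size: int) -> int:
    """(1 << cluster_size) - 1; wraps once cluster_size >= 32 (32 yields 0)."""
    return (ishl32(1, cluster_size) - 1) & MASK32

def clamped_cluster_mask(cluster_size: int) -> int:
    """Same mask, but a cluster covering the whole ballot selects every lane."""
    if cluster_size >= BALLOT_BITS:
        return MASK32
    return naive_cluster_mask(cluster_size)

def clustered_and(ballot: int, lane: int, cluster_size: int) -> bool:
    """Logical AND over the cluster containing `lane`, given a ballot bitmask."""
    size = min(cluster_size, BALLOT_BITS)
    base = lane - lane % size  # first lane of this cluster
    mask = (clamped_cluster_mask(size) << base) & MASK32
    return (ballot & mask) == mask
```

With the naive mask, a cluster size of 32 or 64 selects no lanes at all, so every reduction would trivially succeed; clamping to the ballot width restores the expected result.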
Kenneth Graunke
952bf55483
nir: Fix divergence of Intel URB input/output handle intrinsics
...
Tessellation evaluation shaders have a single convergent URB handle
(for the common patch data) used by all lanes. Every other stage's
IO handles have separate handles in each lane.
Thanks to Alyssa Rosenzweig for catching this bug.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40280 >
2026-03-09 02:38:59 +00:00
Georg Lehmann
7c217e540c
nir: add a pass to optimize fp_math_ctrl
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40098 >
2026-03-07 08:16:27 +01:00
Georg Lehmann
f474e9853e
nir: add fp class analysis tests
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
4885e5cf3a
nir: remove more fsat using range analysis
...
Foz-DB Navi48:
Totals from 3018 (3.65% of 82636) affected shaders:
MaxWaves: 69274 -> 69280 (+0.01%)
Instrs: 7165414 -> 7157581 (-0.11%); split: -0.12%, +0.01%
CodeSize: 38890212 -> 38823132 (-0.17%); split: -0.18%, +0.00%
VGPRs: 228672 -> 228624 (-0.02%)
Latency: 64789026 -> 64784877 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 11805156 -> 11802642 (-0.02%); split: -0.02%, +0.00%
VClause: 136900 -> 136886 (-0.01%); split: -0.03%, +0.02%
SClause: 150135 -> 150130 (-0.00%); split: -0.01%, +0.01%
Copies: 574690 -> 574894 (+0.04%); split: -0.03%, +0.06%
Branches: 187169 -> 187086 (-0.04%); split: -0.04%, +0.00%
PreSGPRs: 190074 -> 190067 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 189564 -> 189538 (-0.01%); split: -0.02%, +0.00%
VALU: 3955188 -> 3949411 (-0.15%); split: -0.15%, +0.00%
SALU: 1114659 -> 1114729 (+0.01%); split: -0.02%, +0.03%
SMEM: 231080 -> 231077 (-0.00%); split: -0.00%, +0.00%
VOPD: 116150 -> 116180 (+0.03%); split: +0.04%, -0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
506bb5a609
nir/search_helpers: use fp class analysis more
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:45 +00:00
Georg Lehmann
a9e75d8ee4
nir: remove nir_analyze_fp_range
...
Use fp class analysis instead.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
eb431efc19
nir/search_helpers: switch to fp class analysis
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
58799c4e7c
nir/gather_tcs_info: use nir_analyze_fp_class directly
...
The information around positive one helps in theory.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
0ecf2c322e
nir: add fp class analysis for fround_even
...
Foz-DB Navi48:
Totals from 383 (0.33% of 114655) affected shaders:
MaxWaves: 9806 -> 9808 (+0.02%)
Instrs: 502508 -> 501762 (-0.15%); split: -0.16%, +0.01%
CodeSize: 2711404 -> 2707604 (-0.14%); split: -0.15%, +0.01%
VGPRs: 24360 -> 24348 (-0.05%)
Latency: 2068105 -> 2066817 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 370962 -> 370081 (-0.24%)
VClause: 7045 -> 7041 (-0.06%)
SClause: 10551 -> 10559 (+0.08%); split: -0.08%, +0.15%
Copies: 29135 -> 29117 (-0.06%); split: -0.12%, +0.05%
Branches: 17333 -> 17328 (-0.03%)
PreSGPRs: 21511 -> 21510 (-0.00%)
PreVGPRs: 18555 -> 18545 (-0.05%)
VALU: 274445 -> 273874 (-0.21%); split: -0.21%, +0.00%
SALU: 78819 -> 78779 (-0.05%); split: -0.07%, +0.02%
VMEM: 10918 -> 10913 (-0.05%)
SMEM: 17662 -> 17656 (-0.03%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
7509b4a199
nir: add fp class analysis for fsub
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
d8734e5453
nir: add fp class analysis for shadow compare
...
Foz-DB Navi48:
Totals from 145 (0.18% of 82636) affected shaders:
Instrs: 280871 -> 280729 (-0.05%)
CodeSize: 1545724 -> 1545488 (-0.02%); split: -0.02%, +0.00%
Latency: 10840265 -> 10840216 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2093707 -> 2093646 (-0.00%)
SClause: 4483 -> 4481 (-0.04%)
VALU: 188142 -> 188039 (-0.05%)
SALU: 22238 -> 22236 (-0.01%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
6d3a279a3b
nir: add fp class analysis for some intrinsics
...
I also tried ddx/ddy, but that was not worth it.
Foz-DB Navi48:
Totals from 1019 (1.23% of 82636) affected shaders:
Instrs: 516459 -> 515700 (-0.15%); split: -0.17%, +0.02%
CodeSize: 2712428 -> 2707008 (-0.20%); split: -0.21%, +0.01%
VGPRs: 70152 -> 70140 (-0.02%)
Latency: 1799198 -> 1795926 (-0.18%); split: -0.19%, +0.00%
InvThroughput: 233497 -> 232628 (-0.37%); split: -0.37%, +0.00%
VClause: 15315 -> 15346 (+0.20%); split: -0.11%, +0.31%
Copies: 30009 -> 30035 (+0.09%); split: -0.06%, +0.14%
VALU: 305519 -> 304727 (-0.26%); split: -0.27%, +0.01%
SALU: 45855 -> 45854 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
73bce23f65
nir: add fp class analysis for flog2
...
Foz-DB Navi48:
Totals from 230 (0.28% of 82636) affected shaders:
Instrs: 599005 -> 598615 (-0.07%); split: -0.09%, +0.02%
CodeSize: 3110528 -> 3103136 (-0.24%); split: -0.24%, +0.00%
Latency: 3661526 -> 3663241 (+0.05%); split: -0.01%, +0.05%
InvThroughput: 526561 -> 526487 (-0.01%); split: -0.01%, +0.00%
Copies: 33735 -> 33820 (+0.25%); split: -0.06%, +0.31%
VALU: 378034 -> 377904 (-0.03%); split: -0.03%, +0.00%
SALU: 65156 -> 65045 (-0.17%); split: -0.19%, +0.02%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
81e272aa1d
nir: add fp class analysis for sin/cos
...
Foz-DB Navi48:
Totals from 264 (0.32% of 82636) affected shaders:
CodeSize: 1688676 -> 1688672 (-0.00%)
Latency: 510773 -> 510772 (-0.00%)
InvThroughput: 138569 -> 138568 (-0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
5a298f3560
nir: rewrite fp range analysis as a fp class analysis
...
Knowing if a value is not larger than one helps proving finite
results of fmul/fadd and will allow skipping/creating more fsat.
Knowing that a value is larger than one helps proving non zero
results of fmul.
Separating positive and negative zero also has advantages when
signed zero correctness is required.
Foz-DB Navi48:
Totals from 1344 (1.63% of 82636) affected shaders:
Instrs: 5319389 -> 5312280 (-0.13%); split: -0.14%, +0.01%
CodeSize: 29702516 -> 29665684 (-0.12%); split: -0.13%, +0.01%
Latency: 40694344 -> 40694545 (+0.00%); split: -0.01%, +0.02%
InvThroughput: 7481192 -> 7480403 (-0.01%); split: -0.02%, +0.01%
VClause: 121947 -> 121946 (-0.00%); split: -0.00%, +0.00%
SClause: 104972 -> 104923 (-0.05%); split: -0.05%, +0.00%
Copies: 371098 -> 371092 (-0.00%); split: -0.02%, +0.02%
Branches: 122929 -> 122919 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 82506 -> 82510 (+0.00%); split: -0.00%, +0.01%
PreVGPRs: 79175 -> 79168 (-0.01%)
VALU: 2906718 -> 2904777 (-0.07%); split: -0.07%, +0.00%
SALU: 726256 -> 723454 (-0.39%); split: -0.39%, +0.00%
VMEM: 205021 -> 205016 (-0.00%)
SMEM: 163972 -> 163916 (-0.03%)
VOPD: 303354 -> 303298 (-0.02%); split: +0.02%, -0.04%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
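The reasoning about values above and below one can be sketched as a tiny class lattice. This is an illustration under simplifying assumptions, not the nir_analyze_fp_class implementation: sign and signed zero are ignored, denormal flushing is not modelled, and `FpClass`, `fmul_is_finite` and `fmul_is_nonzero` are hypothetical names.

```python
from enum import Flag, auto

class FpClass(Flag):
    """Hypothetical magnitude classes; sign and signed zero are ignored here."""
    ZERO = auto()    # x == 0
    LE_ONE = auto()  # 0 < |x| <= 1
    GT_ONE = auto()  # 1 < |x| < inf
    INF = auto()
    NAN = auto()

def fmul_is_finite(a: FpClass, b: FpClass) -> bool:
    """a * b cannot overflow if both are finite and at most one exceeds 1 in magnitude."""
    if (a | b) & (FpClass.INF | FpClass.NAN):
        return False
    return not (a & FpClass.GT_ONE and b & FpClass.GT_ONE)

def fmul_is_nonzero(a: FpClass, b: FpClass) -> bool:
    """a * b cannot be zero (or NaN) if both magnitudes are known to exceed 1."""
    big = FpClass.GT_ONE | FpClass.INF
    return (a | big) == big and (b | big) == big
```

Each argument is the set of classes a value may belong to, so the predicates answer conservatively: they return True only when the property holds for every combination of classes.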
Georg Lehmann
32b5719a9f
nir/opt_algebraic: add is_not_uint_zero for b2i16(uge) pattern
...
More fallout from f2a59fdea6 .
is_not_zero now always returns whether the result is a floating point zero.
When combined with the fp denorm handling that will be added to
floating point range analysis, this is false for many sensible integer values.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Georg Lehmann
ab773fc5d4
nir/opt_algebraic: fix frsq clamp pattern
...
This is not NaN correct.
And also make the pattern 32bit only because the constant is hard coded
FLT_MAX.
Fixes: 780b5c1037 ("nir/algebraic: Simplify some Inf and NaN avoidance code")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:42 +00:00
Georg Lehmann
ba30de1f97
nir/opt_algebraic: remove pattern that skips iabs with range analysis
...
Fixes: f2a59fdea6 ("nir: remove non float nir_analyse_range support")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:41 +00:00
Caio Oliveira
da57fbfb07
nir: Fix constant folding for iadd_sat
...
Use INT_MIN instead of INT_MAX for underflow.
Fixes: cc4b50b023 ("nir/opcodes: use u_overflow to fix incorrect checks")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40252 >
2026-03-06 22:26:07 +00:00
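For reference, signed saturating addition clamps in both directions. The sketch below models the intended semantics with Python integers; it is not the constant-folding expression from nir_opcodes.py, and the `bits` parameter is an illustrative assumption.

```python
def iadd_sat(a: int, b: int, bits: int = 32) -> int:
    """Signed saturating add: clamp the exact sum to the representable range."""
    int_min = -(1 << (bits - 1))
    int_max = (1 << (bits - 1)) - 1
    total = a + b  # Python integers do not overflow, so this sum is exact
    if total > int_max:
        return int_max  # overflow saturates to INT_MAX
    if total < int_min:
        return int_min  # underflow saturates to INT_MIN, not INT_MAX
    return total
```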
Emma Anholt
2ec8ecd7de
nir: Do NIR_DEBUG=print under a lock.
...
With most Vulkan engines doing multithreaded compiles, NIR_DEBUG=print has
been a frustrating racy mess. Take a lock when we're doing per-pass
printing, so that the output is coherent. This unfortunately
single-threads the compiler process itself in that case, but when you're
NIR_DEBUG=printing, that's probably not a big deal.
An assert is introduced to make sure that nobody nests NIR_PASS() in a way
that would break printing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40126 >
2026-03-06 19:50:38 +00:00
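The locking scheme can be modelled in a few lines. This is a sketch of the general technique only, with hypothetical names (`print_lock`, `print_pass`, `compile_all`); it does not reflect how the NIR_PASS() macro is written.

```python
import io
import threading

print_lock = threading.Lock()  # one lock shared by every compile thread

def print_pass(out: io.StringIO, shader: str) -> None:
    """Emit one shader's dump as a single uninterrupted block."""
    with print_lock:
        out.write(f"begin {shader}\n")
        for _ in range(3):
            out.write(f"  {shader} pass output\n")
        out.write(f"end {shader}\n")

def compile_all(shaders: list[str]) -> str:
    """Run print_pass for each shader on its own thread and return the combined output."""
    out = io.StringIO()
    threads = [threading.Thread(target=print_pass, args=(out, s)) for s in shaders]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.getvalue()
```

Because `threading.Lock` is not reentrant, a nested call that tried to take the lock again would deadlock, which is the kind of nesting the assert mentioned above is meant to catch.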
Alyssa Rosenzweig
1c1c119d7b
nir/lower_io: handle Intel URB intrinsics
...
Useful to query these too; they're kind of like load_ssbo/store_ssbo.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40181 >
2026-03-06 13:28:32 +00:00
Lionel Landwerlin
e14d6b535c
brw/nir: add new intrinsics to load data from the indirect address
...
This address is delivered on Gfx12.5+ in compute/mesh/task shaders
from the command stream instruction.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174 >
2026-03-06 06:34:43 +00:00
Lionel Landwerlin
7b1533414a
brw/nir: enable constant offsets for global_constant_uniform_block_intel
...
Will be useful to retain the base offset added in 0e9453291c ("brw:
improve push constant loading using base offsets") once we move push
constant data loading into NIR.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174 >
2026-03-06 06:34:43 +00:00
Rhys Perry
e43caba5f4
nir/range_analysis: use sparse array for float analysis
...
This seems to be faster.
ministat (nir_analyze_fp_range):
Difference at 95.0% confidence
-592900 +/- 2302.24
-27.6432% +/- 0.0998961%
(Student's t, pooled s = 2719.05)
ministat (overall):
Difference at 95.0% confidence
-76.8333 +/- 27.2345
-0.632558% +/- 0.223407%
(Student's t, pooled s = 46.867)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190 >
2026-03-05 11:26:25 +00:00
Rhys Perry
aecbb2a903
nir/range_analysis: use function pointers for lookup
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190 >
2026-03-05 11:26:25 +00:00
Rhys Perry
2731c34891
nir/range_analysis: use SSA index for hash table keys
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190 >
2026-03-05 11:26:25 +00:00
Rhys Perry
5e376e3ed2
nir: add nir_fp_analysis_state
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190 >
2026-03-05 11:26:25 +00:00
Rhys Perry
c0079e09ca
nir/range_analysis: set deleted key
...
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190 >
2026-03-05 11:26:25 +00:00
Georg Lehmann
6a218e346d
nir: remove lower_vector_cmp
...
Use nir_lower_alu_width or nir_lower_alu_to_scalar instead.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197 >
2026-03-04 19:50:28 +00:00
Georg Lehmann
3e6e1e213c
nir: remove fall_equal/fany_nequal opcodes
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197 >
2026-03-04 19:50:27 +00:00
Georg Lehmann
d6977adc09
nir/lower_bool_to_float: assert that vector comparisons were lowered
...
There are no backends that handle the vector comparisons with float result.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197 >
2026-03-04 19:50:27 +00:00
Karol Herbst
e1ed7de274
nir: fix nir_round_int_to_float for fp16
...
fp16 has quite a limited value range, and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't, depending on
the rounding mode.
Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163 >
2026-03-04 14:32:35 +00:00
Karol Herbst
8e8fb2ebaa
nir: fix nir_alu_type_range_contains_type_range for fp16 to int
...
The special value "Inf" doesn't fit into an int and therefore we have to
clamp regardless of whether all the other values would fit. And because
f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to
clamp.
This change should have no effect for non saturating conversions.
Fixes "conversions long_sat_*half" CL CTS tests
Cc: mesa-stable
Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163 >
2026-03-04 14:32:35 +00:00
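The point about Inf can be made concrete. The sketch below decodes the largest finite fp16 value and fp16 infinity from their bit patterns and shows a saturating conversion; `f2i_sat` is a hypothetical helper, and mapping NaN to 0 is an assumption borrowed from OpenCL saturating conversions.

```python
import math
import struct

# Decode fp16 bit patterns: 0x7BFF is the largest finite value, 0x7C00 is +Inf.
FP16_MAX = struct.unpack("<e", struct.pack("<H", 0x7BFF))[0]
FP16_INF = struct.unpack("<e", struct.pack("<H", 0x7C00))[0]

def f2i_sat(x: float, bits: int = 32) -> int:
    """Saturating float-to-signed-int conversion (NaN -> 0 is an assumption)."""
    int_min = -(1 << (bits - 1))
    int_max = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x >= int_max:
        return int_max  # also catches +Inf
    if x <= int_min:
        return int_min  # also catches -Inf
    return int(x)
```

Every finite fp16 value fits in a 32-bit integer, so a range check that only considers finite values would conclude that no clamp is needed; the infinities are what make the clamp necessary.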
Daniel Schürmann
56f5e35d95
nir/opt_remove_phis: recursively check loop header phis for triviality
...
This only checks for one level of nested phis
as the potential cost of recursive checks outweighs
the rare cases.
Totals from 393 (0.35% of 112055) affected shaders: (Navi48)
Instrs: 920765 -> 915832 (-0.54%); split: -0.54%, +0.00%
CodeSize: 4887052 -> 4867876 (-0.39%); split: -0.39%, +0.00%
SpillSGPRs: 464 -> 411 (-11.42%)
Latency: 6868149 -> 6856413 (-0.17%); split: -0.21%, +0.04%
InvThroughput: 841067 -> 839821 (-0.15%); split: -0.17%, +0.02%
Copies: 73573 -> 72021 (-2.11%)
Branches: 25973 -> 25343 (-2.43%)
PreSGPRs: 34110 -> 33454 (-1.92%)
PreVGPRs: 24594 -> 24593 (-0.00%)
VALU: 513068 -> 512816 (-0.05%); split: -0.05%, +0.00%
SALU: 133157 -> 130038 (-2.34%)
VOPD: 9773 -> 9673 (-1.02%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40165 >
2026-03-04 14:03:40 +00:00
Rob Clark
dfaa4375c3
rusticl: Let backend control convert_alu_types lowering
...
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40179 >
2026-03-03 12:13:04 -08:00
Georg Lehmann
7194dfcc2c
nir/opt_algebraic: optimize b2i(a) * b to bcsel
...
Foz-DB Navi48:
Totals from 3180 (2.77% of 114655) affected shaders:
MaxWaves: 85526 -> 85446 (-0.09%)
Instrs: 2681446 -> 2678641 (-0.10%); split: -0.17%, +0.07%
CodeSize: 14295536 -> 14284628 (-0.08%); split: -0.13%, +0.05%
VGPRs: 174792 -> 174636 (-0.09%); split: -0.16%, +0.07%
SpillSGPRs: 306 -> 308 (+0.65%)
Latency: 14078973 -> 14070122 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 2774242 -> 2764051 (-0.37%); split: -0.37%, +0.00%
VClause: 41744 -> 41734 (-0.02%); split: -0.10%, +0.07%
SClause: 58176 -> 58154 (-0.04%); split: -0.05%, +0.01%
Copies: 222967 -> 223108 (+0.06%); split: -0.14%, +0.20%
Branches: 57317 -> 57322 (+0.01%)
PreSGPRs: 140454 -> 140451 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 131649 -> 131540 (-0.08%); split: -0.09%, +0.01%
VALU: 1509318 -> 1505443 (-0.26%); split: -0.26%, +0.00%
SALU: 384419 -> 385838 (+0.37%); split: -0.01%, +0.38%
VOPD: 13272 -> 13286 (+0.11%); split: +0.14%, -0.03%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40160 >
2026-03-02 15:58:30 +00:00
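The equivalence behind this rewrite is easy to check on 32-bit values. This is a minimal model with hypothetical helper names, not the pattern as written in nir_opt_algebraic.py.

```python
MASK32 = 0xFFFFFFFF

def b2i(a: bool) -> int:
    """Boolean to integer: True -> 1, False -> 0."""
    return 1 if a else 0

def imul32(x: int, y: int) -> int:
    """32-bit wrapping multiply."""
    return (x * y) & MASK32

def bcsel(cond: bool, if_true: int, if_false: int) -> int:
    """Select between two values based on a boolean."""
    return if_true if cond else if_false

def original(a: bool, b: int) -> int:
    return imul32(b2i(a), b)

def rewritten(a: bool, b: int) -> int:
    return bcsel(a, b, 0)
```

Multiplying by 0 or 1 only ever yields 0 or `b`, so a select expresses the same result without a multiply.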
Georg Lehmann
3d304d5647
nir/opt_algebraic: remove is_used_once on outer instruction
...
This just prevents useful optimizations.
is_used_once only makes sense on inner instructions, to prevent
creating more new instructions than will be removed.
Foz-DB Navi48:
Totals from 16989 (14.82% of 114655) affected shaders:
MaxWaves: 434379 -> 434353 (-0.01%); split: +0.01%, -0.01%
Instrs: 29030794 -> 29022514 (-0.03%); split: -0.07%, +0.04%
CodeSize: 155293092 -> 155262816 (-0.02%); split: -0.05%, +0.03%
VGPRs: 1093980 -> 1094088 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 9801 -> 9803 (+0.02%); split: -0.03%, +0.05%
Latency: 356327270 -> 356283384 (-0.01%); split: -0.03%, +0.02%
InvThroughput: 58239439 -> 58229374 (-0.02%); split: -0.03%, +0.01%
VClause: 451716 -> 451815 (+0.02%); split: -0.07%, +0.09%
SClause: 654614 -> 654556 (-0.01%); split: -0.03%, +0.03%
Copies: 1809805 -> 1809297 (-0.03%); split: -0.20%, +0.17%
Branches: 552382 -> 552384 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 947188 -> 947224 (+0.00%); split: -0.01%, +0.02%
PreVGPRs: 879583 -> 880173 (+0.07%); split: -0.01%, +0.08%
VALU: 16317859 -> 16309975 (-0.05%); split: -0.07%, +0.02%
SALU: 4256121 -> 4259315 (+0.08%); split: -0.05%, +0.12%
SMEM: 1067069 -> 1067070 (+0.00%)
VOPD: 440855 -> 440792 (-0.01%); split: +0.05%, -0.07%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
41878e5714
nir_opt_algebraic: remove unneeded is_not_const
...
These were needed when we didn't constant fold inside nir_search,
to prevent infinite loops.
But now all they do is slow down pattern matching.
Foz-DB Navi48:
Totals from 107 (0.09% of 114655) affected shaders:
Instrs: 162439 -> 162481 (+0.03%); split: -0.01%, +0.03%
CodeSize: 943056 -> 942988 (-0.01%); split: -0.03%, +0.02%
Latency: 971667 -> 970865 (-0.08%); split: -0.09%, +0.00%
InvThroughput: 164452 -> 164521 (+0.04%); split: -0.02%, +0.07%
Copies: 7980 -> 7982 (+0.03%)
VALU: 103572 -> 103566 (-0.01%); split: -0.05%, +0.04%
SALU: 12825 -> 12878 (+0.41%)
VOPD: 5235 -> 5190 (-0.86%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
374cbc17a4
nir_opt_algebraic: reassociate fadd into ffma where one factor is a constant
...
This restriction doesn't really make sense, probably an accident.
Foz-DB Navi48:
Totals from 2290 (2.00% of 114655) affected shaders:
MaxWaves: 57496 -> 57510 (+0.02%); split: +0.06%, -0.03%
Instrs: 2817419 -> 2816209 (-0.04%); split: -0.12%, +0.08%
CodeSize: 15218816 -> 15220576 (+0.01%); split: -0.09%, +0.10%
VGPRs: 147456 -> 147384 (-0.05%); split: -0.07%, +0.02%
Latency: 13757114 -> 13751833 (-0.04%); split: -0.13%, +0.09%
InvThroughput: 2463343 -> 2462482 (-0.03%); split: -0.07%, +0.04%
VClause: 40137 -> 40153 (+0.04%); split: -0.07%, +0.11%
SClause: 57351 -> 57385 (+0.06%); split: -0.12%, +0.18%
Copies: 135482 -> 136258 (+0.57%); split: -0.22%, +0.79%
Branches: 30886 -> 30894 (+0.03%)
PreSGPRs: 113470 -> 113462 (-0.01%); split: -0.03%, +0.02%
PreVGPRs: 117554 -> 117591 (+0.03%); split: -0.01%, +0.04%
VALU: 1682734 -> 1681557 (-0.07%); split: -0.10%, +0.03%
SALU: 390685 -> 391301 (+0.16%); split: -0.07%, +0.22%
VOPD: 6159 -> 6254 (+1.54%); split: +1.72%, -0.18%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
b949122908
nir/opt_algebraic: remove loops for b2f/b2i equality handling
...
The feq/fneu patterns already existed, and there is no reason to use bit size based
loops here.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
83091276f8
nir_opt_algebraic: remove more specific cmp+bcsel opts
...
Only some minimal difference from pattern ordering:
Foz-DB Navi48:
Totals from 3 (0.00% of 114655) affected shaders:
Instrs: 4556 -> 4533 (-0.50%)
CodeSize: 23716 -> 23608 (-0.46%)
Latency: 27424 -> 26336 (-3.97%)
InvThroughput: 4674 -> 4672 (-0.04%)
SClause: 107 -> 105 (-1.87%)
Copies: 351 -> 346 (-1.42%)
Branches: 130 -> 126 (-3.08%)
VALU: 2598 -> 2595 (-0.12%)
SALU: 561 -> 555 (-1.07%)
SMEM: 169 -> 167 (-1.18%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
4190241795
nir/opt_algebraic: optimize all comparisons of b2f/b2i with constants
...
Foz-DB Navi48:
Totals from 857 (0.75% of 114655) affected shaders:
Instrs: 1136993 -> 1132422 (-0.40%); split: -0.48%, +0.08%
CodeSize: 6096636 -> 6070832 (-0.42%); split: -0.48%, +0.06%
VGPRs: 49668 -> 49620 (-0.10%)
Latency: 24014661 -> 24044601 (+0.12%); split: -0.04%, +0.16%
InvThroughput: 4182482 -> 4183708 (+0.03%); split: -0.12%, +0.15%
VClause: 17698 -> 17695 (-0.02%)
SClause: 25214 -> 25213 (-0.00%)
Copies: 81474 -> 81396 (-0.10%); split: -0.79%, +0.69%
Branches: 24722 -> 24650 (-0.29%); split: -0.36%, +0.07%
PreSGPRs: 43338 -> 43291 (-0.11%); split: -0.22%, +0.11%
VALU: 652975 -> 649760 (-0.49%); split: -0.50%, +0.00%
SALU: 153961 -> 153797 (-0.11%); split: -0.72%, +0.61%
VOPD: 10650 -> 10684 (+0.32%); split: +0.38%, -0.07%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
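Since b2f(a) can only be 0.0 or 1.0, comparing it with a constant always reduces to one of four outcomes. The sketch below shows that reasoning by evaluating both cases; `fold_b2f_compare` and the string results are hypothetical, and the actual patterns in nir_opt_algebraic.py may be expressed differently.

```python
import operator

# Comparison opcodes modelled here; fneu is "unordered or not equal".
OPS = {
    "flt": operator.lt,
    "fge": operator.ge,
    "feq": operator.eq,
    "fneu": operator.ne,
}

def fold_b2f_compare(op: str, const: float) -> str:
    """Reduce op(b2f(a), const) to 'true', 'false', 'a' or '!a'."""
    when_false = OPS[op](0.0, const)  # result if a is false
    when_true = OPS[op](1.0, const)   # result if a is true
    if when_false == when_true:
        return "true" if when_true else "false"
    return "a" if when_true else "!a"
```

For example, `b2f(a) < 0.5` holds exactly when `a` is false, so it folds to `!a`, while `b2f(a) == 2.0` can never hold and folds to a constant.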
Georg Lehmann
ef6f5377da
nir/opt_algebraic: remove fcmp+fneg patterns that are cleaned up earlier
...
No Foz-DB changes, as expected.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:36 +00:00
Georg Lehmann
a5334ec239
nir/opt_algebraic: generalize late fcmp(fneg(a), const) patterns
...
No reason just to do this for 1.0.
Foz-DB Navi48:
Totals from 44 (0.04% of 114655) affected shaders:
CodeSize: 111620 -> 111476 (-0.13%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138 >
2026-03-02 15:24:35 +00:00
Alyssa Rosenzweig
e88346330e
nir/lower_io: remove incorrect Intel _block cases
...
These should be handled like their non-_block counterparts - there is no i/o
index for them.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40096 >
2026-02-28 16:32:14 +00:00
Georg Lehmann
6b464785b9
nir/opt_algebraic: optimize d3d9 iand(a, inot(b))
...
Foz-DB GFX1201:
Totals from 24 (0.02% of 112525) affected shaders:
Instrs: 15598 -> 15426 (-1.10%); split: -1.17%, +0.06%
CodeSize: 88716 -> 88260 (-0.51%); split: -0.98%, +0.46%
Latency: 54419 -> 53965 (-0.83%); split: -0.91%, +0.08%
InvThroughput: 10294 -> 10166 (-1.24%); split: -1.28%, +0.04%
VClause: 302 -> 300 (-0.66%)
SClause: 367 -> 363 (-1.09%); split: -1.63%, +0.54%
Copies: 712 -> 705 (-0.98%); split: -3.09%, +2.11%
PreSGPRs: 1402 -> 1424 (+1.57%); split: -0.14%, +1.71%
PreVGPRs: 850 -> 848 (-0.24%)
VALU: 9730 -> 9591 (-1.43%)
SALU: 1579 -> 1649 (+4.43%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104 >
2026-02-26 14:44:01 +00:00
Georg Lehmann
a3f9c347bf
nir/opt_algebraic: optimize b2f(a) - 1.0 to -b2f(!a)
...
Foz-DB GFX1201:
Totals from 81 (0.07% of 112525) affected shaders:
Instrs: 95048 -> 94965 (-0.09%); split: -0.13%, +0.05%
CodeSize: 532148 -> 531864 (-0.05%); split: -0.09%, +0.04%
SpillSGPRs: 122 -> 125 (+2.46%)
Latency: 440372 -> 440402 (+0.01%); split: -0.02%, +0.03%
InvThroughput: 296078 -> 296173 (+0.03%); split: -0.03%, +0.06%
VClause: 1449 -> 1456 (+0.48%); split: -0.21%, +0.69%
SClause: 2249 -> 2256 (+0.31%); split: -0.09%, +0.40%
Copies: 3956 -> 3965 (+0.23%); split: -0.10%, +0.33%
PreVGPRs: 2900 -> 2899 (-0.03%)
VALU: 61212 -> 61098 (-0.19%); split: -0.19%, +0.01%
SALU: 6970 -> 6981 (+0.16%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104 >
2026-02-26 14:44:01 +00:00
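Numerically, b2f(a) - 1.0 equals the negation of b2f applied to the inverted boolean. The sketch below checks that identity and shows where the sign of zero differs; the function names are hypothetical, and whether the real pattern is restricted by signed-zero requirements is not stated in the log.

```python
import math

def b2f(a: bool) -> float:
    """Boolean to float: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0

def original(a: bool) -> float:
    return b2f(a) - 1.0

def rewritten(a: bool) -> float:
    return -b2f(not a)

def sign_of(x: float) -> float:
    """Return 1.0 or -1.0 depending on the sign bit, so -0.0 and 0.0 differ."""
    return math.copysign(1.0, x)
```

Both forms compare equal for either input, but for a true input the subtraction produces +0.0 while the negation produces -0.0.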