Commit graph

7178 commits

Author SHA1 Message Date
Lionel Landwerlin
e14d6b535c brw/nir: add new intrinsics to load data from the indirect address
This address is delivered on Gfx12.5+ in compute/mesh/task shaders
from the command stream instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
2026-03-06 06:34:43 +00:00
Lionel Landwerlin
7b1533414a brw/nir: enable constant offsets for global_constant_uniform_block_intel
Will be useful to retain the base offset added in 0e9453291c ("brw:
improve push constant loading using base offsets") once we move push
constant data loading into NIR.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>
2026-03-06 06:34:43 +00:00
Rhys Perry
e43caba5f4 nir/range_analysis: use sparse array for float analysis
This seems to be faster.

ministat (nir_analyze_fp_range):
Difference at 95.0% confidence
        -592900 +/- 2302.24
        -27.6432% +/- 0.0998961%
        (Student's t, pooled s = 2719.05)

ministat (overall):
Difference at 95.0% confidence
        -76.8333 +/- 27.2345
        -0.632558% +/- 0.223407%
        (Student's t, pooled s = 46.867)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
2026-03-05 11:26:25 +00:00
Rhys Perry
aecbb2a903 nir/range_analysis: use function pointers for lookup
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
2026-03-05 11:26:25 +00:00
Rhys Perry
2731c34891 nir/range_analysis: use SSA index for hash table keys
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
2026-03-05 11:26:25 +00:00
Rhys Perry
5e376e3ed2 nir: add nir_fp_analysis_state
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
2026-03-05 11:26:25 +00:00
Rhys Perry
c0079e09ca nir/range_analysis: set deleted key
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>
2026-03-05 11:26:25 +00:00
Georg Lehmann
6a218e346d nir: remove lower_vector_cmp
Use nir_lower_alu_width or nir_lower_alu_to_scalar instead.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>
2026-03-04 19:50:28 +00:00
Georg Lehmann
3e6e1e213c nir: remove fall_equal/fany_nequal opcodes
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>
2026-03-04 19:50:27 +00:00
Georg Lehmann
d6977adc09 nir/lower_bool_to_float: assert that vector comparisons were lowered
There are no backends that handle the vector comparisons with float result.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>
2026-03-04 19:50:27 +00:00
Karol Herbst
e1ed7de274 nir: fix nir_round_int_to_float for fp16
fp16 has quite the limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.

Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.

Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
2026-03-04 14:32:35 +00:00
Karol Herbst
8e8fb2ebaa nir: fix nir_alu_type_range_contains_type_range for fp16 to int
The special value "Inf" doesn't fit into an int and therefore we have to
clamp regardless of whether all the other values would fit. And because
f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to
clamp.

This change should have no effect for non saturating conversions.

Fixes "conversions long_sat_*half" CL CTS tests

Cc: mesa-stable
Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>
2026-03-04 14:32:35 +00:00
Daniel Schürmann
56f5e35d95 nir/opt_remove_phis: recursively check loop header phis for triviality
This only checks for one level of nested phis
as the potential cost of recursive checks outweighs
the rare cases.

Totals from 393 (0.35% of 112055) affected shaders: (Navi48)

Instrs: 920765 -> 915832 (-0.54%); split: -0.54%, +0.00%
CodeSize: 4887052 -> 4867876 (-0.39%); split: -0.39%, +0.00%
SpillSGPRs: 464 -> 411 (-11.42%)
Latency: 6868149 -> 6856413 (-0.17%); split: -0.21%, +0.04%
InvThroughput: 841067 -> 839821 (-0.15%); split: -0.17%, +0.02%
Copies: 73573 -> 72021 (-2.11%)
Branches: 25973 -> 25343 (-2.43%)
PreSGPRs: 34110 -> 33454 (-1.92%)
PreVGPRs: 24594 -> 24593 (-0.00%)
VALU: 513068 -> 512816 (-0.05%); split: -0.05%, +0.00%
SALU: 133157 -> 130038 (-2.34%)
VOPD: 9773 -> 9673 (-1.02%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40165>
2026-03-04 14:03:40 +00:00
Rob Clark
dfaa4375c3 rusticl: Let backend control convert_alu_types lowering
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40179>
2026-03-03 12:13:04 -08:00
Georg Lehmann
7194dfcc2c nir/opt_algebraic: optimize b2i(a) * b to bcsel
Foz-DB Navi48:
Totals from 3180 (2.77% of 114655) affected shaders:
MaxWaves: 85526 -> 85446 (-0.09%)
Instrs: 2681446 -> 2678641 (-0.10%); split: -0.17%, +0.07%
CodeSize: 14295536 -> 14284628 (-0.08%); split: -0.13%, +0.05%
VGPRs: 174792 -> 174636 (-0.09%); split: -0.16%, +0.07%
SpillSGPRs: 306 -> 308 (+0.65%)
Latency: 14078973 -> 14070122 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 2774242 -> 2764051 (-0.37%); split: -0.37%, +0.00%
VClause: 41744 -> 41734 (-0.02%); split: -0.10%, +0.07%
SClause: 58176 -> 58154 (-0.04%); split: -0.05%, +0.01%
Copies: 222967 -> 223108 (+0.06%); split: -0.14%, +0.20%
Branches: 57317 -> 57322 (+0.01%)
PreSGPRs: 140454 -> 140451 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 131649 -> 131540 (-0.08%); split: -0.09%, +0.01%
VALU: 1509318 -> 1505443 (-0.26%); split: -0.26%, +0.00%
SALU: 384419 -> 385838 (+0.37%); split: -0.01%, +0.38%
VOPD: 13272 -> 13286 (+0.11%); split: +0.14%, -0.03%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40160>
2026-03-02 15:58:30 +00:00
Georg Lehmann
3d304d5647 nir/opt_algebraic: remove is_used_once on outer instruction
This just prevents useful optimizations.
is_used_once only makes sense on inner instructions, to prevent
creating more new instructions than will be removed.

Foz-DB Navi48:
Totals from 16989 (14.82% of 114655) affected shaders:
MaxWaves: 434379 -> 434353 (-0.01%); split: +0.01%, -0.01%
Instrs: 29030794 -> 29022514 (-0.03%); split: -0.07%, +0.04%
CodeSize: 155293092 -> 155262816 (-0.02%); split: -0.05%, +0.03%
VGPRs: 1093980 -> 1094088 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 9801 -> 9803 (+0.02%); split: -0.03%, +0.05%
Latency: 356327270 -> 356283384 (-0.01%); split: -0.03%, +0.02%
InvThroughput: 58239439 -> 58229374 (-0.02%); split: -0.03%, +0.01%
VClause: 451716 -> 451815 (+0.02%); split: -0.07%, +0.09%
SClause: 654614 -> 654556 (-0.01%); split: -0.03%, +0.03%
Copies: 1809805 -> 1809297 (-0.03%); split: -0.20%, +0.17%
Branches: 552382 -> 552384 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 947188 -> 947224 (+0.00%); split: -0.01%, +0.02%
PreVGPRs: 879583 -> 880173 (+0.07%); split: -0.01%, +0.08%
VALU: 16317859 -> 16309975 (-0.05%); split: -0.07%, +0.02%
SALU: 4256121 -> 4259315 (+0.08%); split: -0.05%, +0.12%
SMEM: 1067069 -> 1067070 (+0.00%)
VOPD: 440855 -> 440792 (-0.01%); split: +0.05%, -0.07%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
41878e5714 nir_opt_algebraic: remove unneeded is_not_const
These were needed when we didn't constant fold inside nir_search,
to prevent infinite loops.
But now all they do is slow down pattern matching.

Foz-DB Navi48:
Totals from 107 (0.09% of 114655) affected shaders:
Instrs: 162439 -> 162481 (+0.03%); split: -0.01%, +0.03%
CodeSize: 943056 -> 942988 (-0.01%); split: -0.03%, +0.02%
Latency: 971667 -> 970865 (-0.08%); split: -0.09%, +0.00%
InvThroughput: 164452 -> 164521 (+0.04%); split: -0.02%, +0.07%
Copies: 7980 -> 7982 (+0.03%)
VALU: 103572 -> 103566 (-0.01%); split: -0.05%, +0.04%
SALU: 12825 -> 12878 (+0.41%)
VOPD: 5235 -> 5190 (-0.86%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
374cbc17a4 nir_opt_algebraic: reassociate fadd into ffma where one factor is a constant
This restriction doesn't really make sense, probably an accident.

Foz-DB Navi48:
Totals from 2290 (2.00% of 114655) affected shaders:
MaxWaves: 57496 -> 57510 (+0.02%); split: +0.06%, -0.03%
Instrs: 2817419 -> 2816209 (-0.04%); split: -0.12%, +0.08%
CodeSize: 15218816 -> 15220576 (+0.01%); split: -0.09%, +0.10%
VGPRs: 147456 -> 147384 (-0.05%); split: -0.07%, +0.02%
Latency: 13757114 -> 13751833 (-0.04%); split: -0.13%, +0.09%
InvThroughput: 2463343 -> 2462482 (-0.03%); split: -0.07%, +0.04%
VClause: 40137 -> 40153 (+0.04%); split: -0.07%, +0.11%
SClause: 57351 -> 57385 (+0.06%); split: -0.12%, +0.18%
Copies: 135482 -> 136258 (+0.57%); split: -0.22%, +0.79%
Branches: 30886 -> 30894 (+0.03%)
PreSGPRs: 113470 -> 113462 (-0.01%); split: -0.03%, +0.02%
PreVGPRs: 117554 -> 117591 (+0.03%); split: -0.01%, +0.04%
VALU: 1682734 -> 1681557 (-0.07%); split: -0.10%, +0.03%
SALU: 390685 -> 391301 (+0.16%); split: -0.07%, +0.22%
VOPD: 6159 -> 6254 (+1.54%); split: +1.72%, -0.18%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
b949122908 nir/opt_algebraic: remove loops for b2f/b2i equality handling
The feq/fneu patterns already existed, and there is no reason to use bit size based
loops here.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
83091276f8 nir_opt_algebraic: remove more specific cmp+bcsel opts
Only some minimal difference from pattern ordering:

Foz-DB Navi48:
Totals from 3 (0.00% of 114655) affected shaders:
Instrs: 4556 -> 4533 (-0.50%)
CodeSize: 23716 -> 23608 (-0.46%)
Latency: 27424 -> 26336 (-3.97%)
InvThroughput: 4674 -> 4672 (-0.04%)
SClause: 107 -> 105 (-1.87%)
Copies: 351 -> 346 (-1.42%)
Branches: 130 -> 126 (-3.08%)
VALU: 2598 -> 2595 (-0.12%)
SALU: 561 -> 555 (-1.07%)
SMEM: 169 -> 167 (-1.18%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
4190241795 nir/opt_algebraic: optimize all comparisons of b2f/b2i with constants
Foz-DB Navi48:
Totals from 857 (0.75% of 114655) affected shaders:
Instrs: 1136993 -> 1132422 (-0.40%); split: -0.48%, +0.08%
CodeSize: 6096636 -> 6070832 (-0.42%); split: -0.48%, +0.06%
VGPRs: 49668 -> 49620 (-0.10%)
Latency: 24014661 -> 24044601 (+0.12%); split: -0.04%, +0.16%
InvThroughput: 4182482 -> 4183708 (+0.03%); split: -0.12%, +0.15%
VClause: 17698 -> 17695 (-0.02%)
SClause: 25214 -> 25213 (-0.00%)
Copies: 81474 -> 81396 (-0.10%); split: -0.79%, +0.69%
Branches: 24722 -> 24650 (-0.29%); split: -0.36%, +0.07%
PreSGPRs: 43338 -> 43291 (-0.11%); split: -0.22%, +0.11%
VALU: 652975 -> 649760 (-0.49%); split: -0.50%, +0.00%
SALU: 153961 -> 153797 (-0.11%); split: -0.72%, +0.61%
VOPD: 10650 -> 10684 (+0.32%); split: +0.38%, -0.07%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
ef6f5377da nir/opt_algebraic: remove fcmp+fneg patterns that are cleaned up earlier
No Foz-DB changes, as expected.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:36 +00:00
Georg Lehmann
a5334ec239 nir/opt_algebraic: generalize late fcmp(fneg(a), const) patterns
No reason just to do this for 1.0.

Foz-DB Navi48:
Totals from 44 (0.04% of 114655) affected shaders:
CodeSize: 111620 -> 111476 (-0.13%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
2026-03-02 15:24:35 +00:00
Alyssa Rosenzweig
e88346330e nir/lower_io: remove incorrect Intel _block cases
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
These should be handled like their non-_block counterparts - there is no i/o
index for them.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40096>
2026-02-28 16:32:14 +00:00
Georg Lehmann
6b464785b9 nir/opt_algebraic: optimize d3d9 iand(a, inot(b))
Foz-DB GFX1201:
Totals from 24 (0.02% of 112525) affected shaders:
Instrs: 15598 -> 15426 (-1.10%); split: -1.17%, +0.06%
CodeSize: 88716 -> 88260 (-0.51%); split: -0.98%, +0.46%
Latency: 54419 -> 53965 (-0.83%); split: -0.91%, +0.08%
InvThroughput: 10294 -> 10166 (-1.24%); split: -1.28%, +0.04%
VClause: 302 -> 300 (-0.66%)
SClause: 367 -> 363 (-1.09%); split: -1.63%, +0.54%
Copies: 712 -> 705 (-0.98%); split: -3.09%, +2.11%
PreSGPRs: 1402 -> 1424 (+1.57%); split: -0.14%, +1.71%
PreVGPRs: 850 -> 848 (-0.24%)
VALU: 9730 -> 9591 (-1.43%)
SALU: 1579 -> 1649 (+4.43%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104>
2026-02-26 14:44:01 +00:00
Georg Lehmann
a3f9c347bf nir/opt_algebraic: optimize b2f(a) - 1.0 to -b2f(a)
Foz-DB GFX1201:
Totals from 81 (0.07% of 112525) affected shaders:
Instrs: 95048 -> 94965 (-0.09%); split: -0.13%, +0.05%
CodeSize: 532148 -> 531864 (-0.05%); split: -0.09%, +0.04%
SpillSGPRs: 122 -> 125 (+2.46%)
Latency: 440372 -> 440402 (+0.01%); split: -0.02%, +0.03%
InvThroughput: 296078 -> 296173 (+0.03%); split: -0.03%, +0.06%
VClause: 1449 -> 1456 (+0.48%); split: -0.21%, +0.69%
SClause: 2249 -> 2256 (+0.31%); split: -0.09%, +0.40%
Copies: 3956 -> 3965 (+0.23%); split: -0.10%, +0.33%
PreVGPRs: 2900 -> 2899 (-0.03%)
VALU: 61212 -> 61098 (-0.19%); split: -0.19%, +0.01%
SALU: 6970 -> 6981 (+0.16%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40104>
2026-02-26 14:44:01 +00:00
Georg Lehmann
5b974a922a nir: print all fp_math_ctrl bits
Examples:
div 32    %338 = ffma %89, %328.z, %335 // exact, preserve:sz,inf,nan
con 32    %28 = fmul %17.y, %27 (2.000000) // preserve:sz,inf,nan

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40093>
2026-02-26 14:14:26 +00:00
Alyssa Rosenzweig
8a450fb0ff nir/lower_subgroups: generalize vote lowering
We currently have code to lower quad votes to a ballot. The same idea works for
subgroup votes. Generalize the quad vote code and use it to lower
vote_all/vote_eq for backends setting a new lower_vote option.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40074>
2026-02-25 17:29:29 +00:00
Lionel Landwerlin
7f19814414 brw/nir: handle inline_data_intel more like push_data_intel
It's pretty much the same mechanism, except it's a different register
location.

With this change we gain indirect loading support.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
2026-02-25 10:44:09 +00:00
Alyssa Rosenzweig
42c4f7935a nir: optimize u2u32(unpack_32_2x16_split_*)
Noticed while playing with pixel coord things.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40056>
2026-02-24 19:16:56 +00:00
Georg Lehmann
07260dc210 nir/lower_subgroups: lower shuffles and bitwise reduce to 32bit before scalarizing
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Pack/unpack should be a lot faster than duplicating the subgroup op.

No fossil-db changes, but multiple people complained about this to me.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40024>
2026-02-24 13:48:35 +00:00
Georg Lehmann
0d6fe16ce8 nir: add mixed float dot opcodes
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003>
2026-02-24 08:55:52 +00:00
Rob Clark
ff8b688fc7 nir: Fix validation error after nir_round_int_to_float()
CL CTS test_conversions hits a nir_validate assert than ufind_msb is 32b
or 64b:

    16     %61 = @load_global (%185) (access=none, align_mul=2, align_offset=0)
    32    %240 = ufind_msb %61 error: src_bit_size == 32 || src_bit_size == 64 (../src/compiler/nir/nir_validate.c:273)

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40054>
2026-02-23 22:06:50 +00:00
Faith Ekstrand
e3dc3dccd6 pan/fb: Add a common FB load shader builder
One of the advantages to this new FB load shader, apart from it being
common, is that it's able to properly handle partial tile loads.
Instead of doing the force_preload/clear dance that PanVK is currently
doing, these shaders are clever enough to detect whether or not they're
inside the Vulkan render area and clear the inside while loading the
border pixels.

In order for this to work, there are two new intrinsics which provide
the framebuffer bounding box and the clear values.  We need this in
order to handle partial loads correctly.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
2026-02-23 21:00:01 +00:00
Faith Ekstrand
88ad8bc75d nir/gather_info: Add support for panfrost tile load/store intrinsics
Fixes: 6fc1030e4f ("nir: Add some new panfrost fragment shader intrinsics")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
2026-02-23 21:00:01 +00:00
Faith Ekstrand
ae901f6175 nir/print: Add panfrost blend intrinsics
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39759>
2026-02-23 21:00:01 +00:00
Caio Oliveira
4207cc673d nir: Handle nir_instr_type_cmat_call in more places
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Prefer to be explicit when handling it, like is done for regular
nir_instr_type_call.

Even though functions called by cmat_call have restrictions on them ("no
tangled instructions" for example), which could allow a couple of passes
to treat them differently, there's no tracking of what functions are
used only in such cases, so being conservative here should be safe.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39903>
2026-02-20 13:09:45 -08:00
Daniel Schürmann
f4e3ab5266 nir/divergence: Ignore divergent_loop_{continue|break} for nir_block::divergent
This is already implicitly accounted for.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39934>
2026-02-19 19:55:33 +00:00
Daniel Schürmann
eabd7cc22c nir/divergence: Fix nir_block::divergent in presence of divergent breaks
If no second pass is necessary, we might miss setting nir_block::divergent
to true, if a loop has a divergent break.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39934>
2026-02-19 19:55:31 +00:00
Daniel Schürmann
a57b900a59 nir/divergence: rename divergent_loop_cf to divergent_cf
in order to better reflect the actual semantics.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39934>
2026-02-19 19:55:31 +00:00
Georg Lehmann
5d5f99bfe8 nir/opt_algebraic: create more b2f if sign of zero doesn't matter
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
2026-02-19 15:21:27 +00:00
Georg Lehmann
d87943ad3d nir/opt_algebraic: preserve signed zero when creating new b2f
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
2026-02-19 15:21:27 +00:00
Georg Lehmann
5e544ecd08 nir/opcodes: remove valid_fp_math_ctrl bits from some opcodes
This is mostly about conversions.

Conversions from float to int don't care about signed zero
and in the case of plain f2u/f2i, nan and inf are always
undefined too.

Conversions for int to float can't create nan, so they don't
need preserve_nan.

b2f only cares about preserve_sz, and nothing else.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
2026-02-19 15:21:27 +00:00
Georg Lehmann
62f3be87c4 nir/serialize: omit serializing fp_math_ctrl if it has to be 0
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39966>
2026-02-19 15:21:27 +00:00
Alyssa Rosenzweig
5d5c2a6430 nir/opt_intrinsics: use data helpers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
84f3849688 nir/opt_fragdepth: use data helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
9da61b3ea5 nir/opt_uniform_atomics: use data helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
76d5436f04 nir/lower_atomics: use data helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
2026-02-19 14:47:11 +00:00
Alyssa Rosenzweig
8fb1d65426 nir: add nir_get_io_data_src
This complements our existing nir_get_io_index_src helper. Most, but annoyingly
not all, stores put their data source in source 0. Having a helper for this lets
us reduce special casing in a bunch of random places.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
2026-02-19 14:47:11 +00:00
Maíra Canal
5a1e0112a9 nir: add load_texture_scale intrinsic
Add load_texture_scale to the list of intrinsics whose divergence
depends on their sources. This is needed to support running divergence
analysis on VC4.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
2026-02-19 09:57:05 +00:00