Faith Ekstrand
fb2c44bc51
compiler/rust/smallvec: Implement Extend<T> for SmallVec<T>
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41941 >
2026-06-03 18:50:41 +00:00
Faith Ekstrand
1a34d1ed3a
compiler/rust/smallvec: Implement Deref[Mut]<Target = [T]>
...
We now get last_mut() for free since it's part of `&mut [T]`.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41941 >
2026-06-03 18:50:41 +00:00
Faith Ekstrand
ab017fd8fc
compiler/rust/smallvec: Add a push_mut() method
...
This is analogous to `Vec::push_mut()`, which was stabilized in Rust
1.95.0. Since we can't use that Rust version yet, we internally
implement it as `push()` followed by `last_mut().unwrap()`.
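A minimal sketch of that fallback, written against a plain `Vec<T>` rather than the actual SmallVec type. The free function name `push_mut_fallback` is hypothetical and only illustrates the push-then-last_mut idea.

```rust
/// Hypothetical stand-in for the push_mut() fallback: append the value, then
/// hand back a mutable reference to the element that was just appended.
pub fn push_mut_fallback<T>(v: &mut Vec<T>, value: T) -> &mut T {
    v.push(value);
    // The vector cannot be empty right after a push, so unwrap() cannot fail.
    v.last_mut().unwrap()
}
```

The returned reference lets a caller initialize or tweak the new element in place without a second lookup.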
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41941 >
2026-06-03 18:50:41 +00:00
Faith Ekstrand
1eaee3b619
compiler/rust/smallvec: Implement Clone, Default, and new()
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41941 >
2026-06-03 18:50:41 +00:00
Kenneth Graunke
d89a0b486a
jay: Implement dual color blending (but require SIMD16)
...
It's mildly tempting to reuse the src0_alpha source for color1 since
the two features should never overlap, but for now we add an extra
optional source.
We require SIMD16 for now as we only have SIMD16 messages. Eventually,
we're likely to want to support SIMD32 with 2x16 sends, but this gets
us going for now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41872 >
2026-06-03 15:23:20 +00:00
Mel Henning
e68c9b791c
compiler/rust: Fix inline wrapper include dir
...
This follows the same pattern rusticl uses to handle this. See 36a18208f7
Fixes: b60694b91e ("compiler/rust: Add a float16 wrapper")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15586
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41964 >
2026-06-02 22:53:26 +00:00
Faith Ekstrand
d88b53075a
panfrost: Add the basis for the new Kraid compiler
...
This adds some mostly empty rust files, bindings, meson bits, and a call
into kraid from the bifrost compiler, guarded by PAN_USE_KRAID=1.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41841 >
2026-06-02 21:19:25 +00:00
Marek Olšák
e5723a61f2
ac/nir: add a new pass ac_nir_lower_sample_mask_in
...
This covers all the optimal lowering cases of sample_mask_in.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41768 >
2026-06-02 20:38:05 +00:00
Ian Romanick
dcfc90a8fc
nir/algebraic: Convert bcsel of addition to addition of b2i or b2f
...
Recent changes to continue handling in loops result in many cases of
    loop {
        ...
        if (...) {
            do_continue = true; // was continue;
        }
        i = do_continue ? i : i + 1;
    }
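As a rough illustration of the integer case only (the actual NIR patterns are not reproduced here, and the function names are made up), the select on the last line of the loop computes the same value as adding a boolean converted to an integer:

```rust
/// The form produced by continue lowering: select between i and i + 1.
pub fn step_with_select(i: u32, do_continue: bool) -> u32 {
    if do_continue { i } else { i.wrapping_add(1) }
}

/// The same value written as an addition of a bool-to-int conversion
/// (what NIR calls b2i), with the condition inverted.
pub fn step_with_b2i(i: u32, do_continue: bool) -> u32 {
    i.wrapping_add(u32::from(!do_continue))
}
```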
I noticed this while investigating mesa#15154. Unfortunately, this
doesn't fix the performance regressions noted in that issue.
One fragment shader in XCOM: Enemy Unknown doesn't like this change. :(
v2: Drop _nsz from a couple bcsel patterns where it is not needed.
Suggested by Georg.
v3: Drop ~ from the last two fadd patterns. Suggested by Georg. Update
expected checksum for plot3d-v2.trace on many platforms.
shader-db:
All Iris platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17089936 -> 17086837 (-0.02%)
instructions in affected programs: 864928 -> 861829 (-0.36%)
helped: 696 / HURT: 110
total cycles in shared programs: 864096306 -> 863913752 (-0.02%)
cycles in affected programs: 345726340 -> 345543786 (-0.05%)
helped: 620 / HURT: 196
total spills in shared programs: 3318 -> 3319 (0.03%)
spills in affected programs: 14 -> 15 (7.14%)
helped: 0 / HURT: 1
total fills in shared programs: 1604 -> 1606 (0.12%)
fills in affected programs: 28 -> 30 (7.14%)
helped: 0 / HURT: 1
total sends in shared programs: 876852 -> 876850 (<.01%)
sends in affected programs: 6 -> 4 (-33.33%)
helped: 2 / HURT: 0
fossil-db:
Lunar Lake
Totals:
Instrs: 914468779 -> 914215874 (-0.03%); split: -0.03%, +0.00%
CodeSize: 12885732160 -> 12881939568 (-0.03%); split: -0.04%, +0.01%
Cycle count: 100100279922 -> 100096866800 (-0.00%); split: -0.05%, +0.04%
Spill count: 3459786 -> 3459693 (-0.00%); split: -0.01%, +0.01%
Fill count: 4909835 -> 4909177 (-0.01%); split: -0.04%, +0.03%
Max live registers: 191819298 -> 191822052 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 48511264 -> 48510608 (-0.00%); split: +0.00%, -0.00%
Non SSA regs after NIR: 136334891 -> 136301926 (-0.02%); split: -0.03%, +0.00%
Totals from 37416 (1.87% of 2003390) affected shaders:
Instrs: 53346249 -> 53093344 (-0.47%); split: -0.48%, +0.01%
CodeSize: 775396384 -> 771603792 (-0.49%); split: -0.60%, +0.11%
Cycle count: 32275003526 -> 32271590404 (-0.01%); split: -0.14%, +0.13%
Spill count: 569304 -> 569211 (-0.02%); split: -0.05%, +0.03%
Fill count: 620240 -> 619582 (-0.11%); split: -0.31%, +0.21%
Max live registers: 6712048 -> 6714802 (+0.04%); split: -0.01%, +0.05%
Max dispatch width: 893344 -> 892688 (-0.07%); split: +0.10%, -0.17%
Non SSA regs after NIR: 7191473 -> 7158508 (-0.46%); split: -0.49%, +0.03%
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 985625036 -> 985366432 (-0.03%); split: -0.03%, +0.00%
CodeSize: 16446268768 -> 16442606864 (-0.02%); split: -0.03%, +0.01%
Cycle count: 91278956920 -> 91272371300 (-0.01%); split: -0.07%, +0.06%
Spill count: 3713935 -> 3714003 (+0.00%); split: -0.00%, +0.00%
Fill count: 5001514 -> 5001259 (-0.01%); split: -0.03%, +0.02%
Max live registers: 120736970 -> 120738919 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 37827808 -> 37829472 (+0.00%); split: +0.01%, -0.00%
Non SSA regs after NIR: 160606595 -> 160573270 (-0.02%); split: -0.02%, +0.00%
Totals from 38664 (1.71% of 2265137) affected shaders:
Instrs: 53621392 -> 53362788 (-0.48%); split: -0.49%, +0.01%
CodeSize: 932994544 -> 929332640 (-0.39%); split: -0.52%, +0.13%
Cycle count: 24442489628 -> 24435904008 (-0.03%); split: -0.25%, +0.22%
Spill count: 550952 -> 551020 (+0.01%); split: -0.02%, +0.03%
Fill count: 525010 -> 524755 (-0.05%); split: -0.27%, +0.23%
Max live registers: 3594805 -> 3596754 (+0.05%); split: -0.01%, +0.07%
Max dispatch width: 510928 -> 512592 (+0.33%); split: +0.47%, -0.14%
Non SSA regs after NIR: 7652247 -> 7618922 (-0.44%); split: -0.46%, +0.03%
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 997905938 -> 997771670 (-0.01%); split: -0.01%, +0.00%
CodeSize: 13990460928 -> 13988346016 (-0.02%); split: -0.02%, +0.00%
Cycle count: 83465002175 -> 83456829524 (-0.01%); split: -0.02%, +0.01%
Spill count: 3815020 -> 3814879 (-0.00%); split: -0.01%, +0.00%
Fill count: 6561078 -> 6560768 (-0.00%); split: -0.01%, +0.00%
Max live registers: 121468149 -> 121468160 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 37914400 -> 37914624 (+0.00%); split: +0.00%, -0.00%
Non SSA regs after NIR: 155941530 -> 155944033 (+0.00%); split: -0.00%, +0.00%
Totals from 27771 (1.22% of 2273117) affected shaders:
Instrs: 31224666 -> 31090398 (-0.43%); split: -0.44%, +0.01%
CodeSize: 450250800 -> 448135888 (-0.47%); split: -0.57%, +0.10%
Cycle count: 15045135658 -> 15036963007 (-0.05%); split: -0.13%, +0.08%
Spill count: 406812 -> 406671 (-0.03%); split: -0.05%, +0.01%
Fill count: 391210 -> 390900 (-0.08%); split: -0.10%, +0.02%
Max live registers: 2592759 -> 2592770 (+0.00%); split: -0.02%, +0.02%
Max dispatch width: 383888 -> 384112 (+0.06%); split: +0.23%, -0.17%
Non SSA regs after NIR: 4221402 -> 4223905 (+0.06%); split: -0.01%, +0.07%
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871 >
2026-06-02 17:44:14 +00:00
Ian Romanick
daa38c1972
nir/opt_if: Merge if-statements with inverted conditions
...
Cases like
    if (x) {
        ...
    } else {
        ...
    }
    if (!x) {
        ...
    } else {
        ...
    }
should be merged.
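A small sketch of the transformation's effect, assuming x is not modified between the two if-statements. The function names and the string labels are placeholders, not anything from nir_opt_if:

```rust
/// Two back-to-back if-statements whose conditions are logical inverses.
pub fn unmerged(x: bool, log: &mut Vec<&'static str>) {
    if x { log.push("then A"); } else { log.push("else A"); }
    if !x { log.push("then B"); } else { log.push("else B"); }
}

/// The merged form: the else-branch of the second statement moves into the
/// then-branch of the first, and vice versa.
pub fn merged(x: bool, log: &mut Vec<&'static str>) {
    if x {
        log.push("then A");
        log.push("else B");
    } else {
        log.push("else A");
        log.push("then B");
    }
}
```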
I don't know why Ice Lake is affected differently by this commit.
v2: Add implementation of srcs_equal_or_logical_inverse after bad
rebase. That's what I get for rushing out an MR right before lunch.
Noticed by Georg.
shader-db:
Lunar Lake
No changes.
All other Iris platforms had similar results. (Meteor Lake shown)
total cycles in shared programs: 882310108 -> 882311504 (<.01%)
cycles in affected programs: 74306 -> 75702 (1.88%)
helped: 4
HURT: 2
helped stats (abs) min: 2.0 max: 38.0 x̄: 11.00 x̃: 2
helped stats (rel) min: 0.02% max: 0.29% x̄: 0.09% x̃: 0.02%
HURT stats (abs) min: 720.0 max: 720.0 x̄: 720.00 x̃: 720
HURT stats (rel) min: 5.27% max: 5.27% x̄: 5.27% x̃: 5.27%
95% mean confidence interval for cycles value: -163.75 629.08
95% mean confidence interval for cycles %-change: -1.21% 4.61%
Inconclusive result (value mean confidence interval includes 0).
fossil-db:
All Intel platforms except Ice Lake had similar results. (Lunar Lake shown)
Totals:
Instrs: 914554534 -> 914546744 (-0.00%); split: -0.00%, +0.00%
CodeSize: 12887129264 -> 12886823808 (-0.00%); split: -0.00%, +0.00%
Send messages: 40220826 -> 40219429 (-0.00%); split: -0.00%, +0.00%
Cycle count: 100101810976 -> 100101804762 (-0.00%); split: -0.00%, +0.00%
Spill count: 3459811 -> 3459786 (-0.00%); split: -0.00%, +0.00%
Fill count: 4909877 -> 4909835 (-0.00%); split: -0.00%, +0.00%
Max live registers: 191837229 -> 191838000 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 48514400 -> 48514336 (-0.00%)
Non SSA regs after NIR: 136346777 -> 136343948 (-0.00%); split: -0.00%, +0.00%
Totals from 1937 (0.10% of 2003486) affected shaders:
Instrs: 3013550 -> 3005760 (-0.26%); split: -0.39%, +0.13%
CodeSize: 43169072 -> 42863616 (-0.71%); split: -0.81%, +0.10%
Send messages: 183171 -> 181774 (-0.76%); split: -0.82%, +0.06%
Cycle count: 126864798 -> 126858584 (-0.00%); split: -0.67%, +0.67%
Spill count: 7354 -> 7329 (-0.34%); split: -0.45%, +0.11%
Fill count: 5547 -> 5505 (-0.76%); split: -0.88%, +0.13%
Max live registers: 296895 -> 297666 (+0.26%); split: -0.04%, +0.30%
Max dispatch width: 41856 -> 41792 (-0.15%)
Non SSA regs after NIR: 545672 -> 542843 (-0.52%); split: -1.15%, +0.63%
Ice Lake
Totals:
Instrs: 996341606 -> 996312120 (-0.00%); split: -0.00%, +0.00%
CodeSize: 12563695936 -> 12563195200 (-0.00%); split: -0.00%, +0.00%
Send messages: 45911343 -> 45909063 (-0.00%); split: -0.00%, +0.00%
Cycle count: 82819362995 -> 82818778468 (-0.00%); split: -0.00%, +0.00%
Spill count: 2935451 -> 2935452 (+0.00%); split: -0.00%, +0.00%
Fill count: 5034267 -> 5034281 (+0.00%); split: -0.00%, +0.00%
Max live registers: 124672355 -> 124672961 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 41330808 -> 41330672 (-0.00%)
Non SSA regs after NIR: 160790466 -> 160785863 (-0.00%); split: -0.01%, +0.00%
Totals from 2163 (0.09% of 2327905) affected shaders:
Instrs: 4164788 -> 4135302 (-0.71%); split: -0.80%, +0.09%
CodeSize: 53351344 -> 52850608 (-0.94%); split: -0.95%, +0.01%
Send messages: 271164 -> 268884 (-0.84%); split: -0.84%, +0.00%
Cycle count: 145818114 -> 145233587 (-0.40%); split: -0.66%, +0.26%
Spill count: 7819 -> 7820 (+0.01%); split: -0.32%, +0.33%
Fill count: 7191 -> 7205 (+0.19%); split: -0.57%, +0.76%
Max live registers: 192403 -> 193009 (+0.31%); split: -0.08%, +0.40%
Max dispatch width: 34728 -> 34592 (-0.39%)
Non SSA regs after NIR: 570874 -> 566271 (-0.81%); split: -1.49%, +0.68%
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871 >
2026-06-02 17:44:14 +00:00
Ian Romanick
e8cef4725d
nir/opt_if: use nir_def_replace() instead of nir_def_rewrite_uses()
...
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871 >
2026-06-02 17:44:13 +00:00
Ian Romanick
4a37fda884
nir: Use nir_instr_remove_v in nir_def_replace
...
The non _v version sets up and returns a nir_cursor that isn't
used. Skip that work by calling nir_instr_remove_v directly.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871 >
2026-06-02 17:44:13 +00:00
Pavel Ondračka
f6b06ea3de
nir/algebraic: prevent ffract optimization on lowered ffloor
...
ffloor(a) is lowered as a - ffract(a). dEQP expects, for example, that
ffloor(a) == 1.0 for every a between 1.0 and 2.0. This worked fine,
but the new ffract(a + b(is_integral)) -> ffract(a) rule broke this.
Specifically, dEQP-GLES2.functional.shaders.struct.uniform.equal_fragment
checks that ffloor(a + 1.0) == 1.0 for every a between 0.0 and 1.0.
However this is not exactly true once the ffract(a + 1.0) is lowered
to ffract(a).
Prevent this by marking ffract from ffloor lowering as exact so that
the recently introduced ffract(a + b(is_integral)) -> ffract(a) rule
does not trigger.
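A sketch of the precision problem in 32-bit floats. The helper names and the input value 3 * 2^-26 are chosen for illustration and are not taken from dEQP; `ffract` here is only valid for non-negative inputs.

```rust
/// Fractional part for non-negative inputs (f32::fract() is x - trunc(x)).
pub fn ffract(x: f32) -> f32 {
    x.fract()
}

/// ffloor(x) lowered as x - ffract(x).
pub fn ffloor_lowered(x: f32) -> f32 {
    x - ffract(x)
}

/// ffloor(a + 1.0) after ffract(a + 1.0) has been rewritten to ffract(a).
pub fn ffloor_after_rewrite(a: f32) -> f32 {
    (a + 1.0) - ffract(a)
}
```

With a = 3.0 / 67108864.0, a + 1.0 rounds to exactly 1.0, so the lowered form returns 1.0, while the rewritten form subtracts the tiny fraction from 1.0 and lands just below it.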
Fixes: c6aaafa3 ("nir: add lowering for ffloor")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15562
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41882 >
2026-06-02 12:03:09 +00:00
Mary Guillemard
b95dbc64bf
nir,nak: Add match_any_nv
...
NVIDIA hardware has an instruction that lets you retrieve the mask
of active threads matching the same source value as the current
invocation.
This is going to be used by shared memory lowering for mesh / task
stages on NVK.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Tested-by: Thomas H.P. Andersen <phomes@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196 >
2026-06-02 10:34:31 +00:00
Mary Guillemard
90d963d353
nir/nir_format_convert: Add missing u2f32 in nir_format_unpack_r9g9b9e5
...
Fix
"dEQP-VK.api.copy_and_blit.*.image_to_image.all_formats.color.2d_to_1d.*.e5b9g9r9_ufloat_pack32.*"
on HK.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 5f5f4474f6 ("nir: Add a format unpack helper and tests")
Reviewed-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41929 >
2026-06-01 20:28:44 +00:00
Faith Ekstrand
3f18c81d4f
compiler/rust: Add an EnumAsU8 trait
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:17 +00:00
Faith Ekstrand
fdc5d446ee
compiler/rust/bitset: Add a new ConstBitSet type
...
Unlike BitSet, which is backed by a Vec<u32>, this is backed by a
fixed-length array and is therefore Copy. It's also mostly const so it can
be constructed and used from const contexts. Because of the const
rules, it's a bit more rigid and can only really accept keys which are
unsigned integer types.
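A minimal sketch of what an array-backed, const-friendly bit set can look like. The type name `SketchConstBitSet`, its methods, and the `u32` key type are assumptions for illustration and do not reflect the actual ConstBitSet API.

```rust
/// Hypothetical fixed-size bit set backed by an array, so it can be Copy.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SketchConstBitSet<const WORDS: usize> {
    words: [u32; WORDS],
}

impl<const WORDS: usize> SketchConstBitSet<WORDS> {
    /// Creates an empty set; usable in const contexts.
    pub const fn new() -> Self {
        Self { words: [0; WORDS] }
    }

    /// Returns a copy of the set with `bit` added. Taking and returning
    /// `self` by value avoids needing mutable references in const fns.
    pub const fn with(mut self, bit: u32) -> Self {
        self.words[(bit / 32) as usize] |= 1u32 << (bit % 32);
        self
    }

    /// Returns true if `bit` is in the set; out-of-range bits are not.
    pub const fn contains(&self, bit: u32) -> bool {
        let w = (bit / 32) as usize;
        w < WORDS && (self.words[w] & (1u32 << (bit % 32))) != 0
    }
}

/// Built entirely at compile time.
pub const SKETCH_SET: SketchConstBitSet<2> =
    SketchConstBitSet::new().with(3).with(40);
```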
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:17 +00:00
Faith Ekstrand
76e3ecd97a
compiler/rust/bitset: Implement Into/FromBitIndex for more types
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:17 +00:00
Faith Ekstrand
63d2ccd64b
compiler/rust/bitset: Generalize BitSetIterator
...
It now iterates over a slice instead of a BitSet
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:17 +00:00
Faith Ekstrand
f43e57b3c0
compiler/rust/bitset: Add find_next_[un]set() helpers
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:17 +00:00
Faith Ekstrand
ffe6cdd52d
compiler/rust/bitset: Don't reserve space in remove()
...
If the requested bit is past the end of the set, we can just return
false. We don't have to grow the bitset.
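A sketch of the idea using a hypothetical `SketchBitSet` type, not the real BitSet implementation: removal only looks at storage that already exists.

```rust
/// Hypothetical growable bit set backed by a Vec<u32>.
pub struct SketchBitSet {
    words: Vec<u32>,
}

impl SketchBitSet {
    pub fn new() -> Self {
        Self { words: Vec::new() }
    }

    /// Number of u32 words currently allocated for the set.
    pub fn word_count(&self) -> usize {
        self.words.len()
    }

    /// Inserting may need to grow the backing storage.
    pub fn insert(&mut self, bit: usize) {
        let w = bit / 32;
        if w >= self.words.len() {
            self.words.resize(w + 1, 0);
        }
        self.words[w] |= 1u32 << (bit % 32);
    }

    /// Removing never grows: a bit past the end was never set.
    pub fn remove(&mut self, bit: usize) -> bool {
        match self.words.get_mut(bit / 32) {
            None => false,
            Some(word) => {
                let mask = 1u32 << (bit % 32);
                let was_set = (*word & mask) != 0;
                *word &= !mask;
                was_set
            }
        }
    }
}
```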
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:16 +00:00
Faith Ekstrand
81c9eddb69
compiler/rust/bitset: Add a BitIndex helper struct
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41915 >
2026-06-01 19:51:16 +00:00
Karol Herbst
d25e7e330f
nir/lower_alu: fix lower_fminmax_signed_zero for denorms
...
When both inputs are denorms, the bcsel picks the integer min/max result,
which does not flush denorms and therefore might return the wrong result.
Fixes OpenCL fmin/fmax on asahi.
Fixes: d238d766c6 ("nir: add lower_fminmax_signed_zero")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41386 >
2026-06-01 13:43:01 +00:00
Faith Ekstrand
fd9c2ce73d
compiler/rust/nir: Add helpers for getting ALU input/output types
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41896 >
2026-06-01 03:24:17 +00:00
Faith Ekstrand
52e2439973
compiler/rust/nir: Add structured block iterators
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41896 >
2026-06-01 03:24:17 +00:00
Faith Ekstrand
4d129b10ac
compiler/rust: Add a nir_shader::to_string()
...
Annoyingly, this has to take a &mut nir_shader because printing
re-indexes SSA values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41896 >
2026-06-01 03:24:16 +00:00
Faith Ekstrand
5922db15bf
compiler/rust: Add a nir_shader::get_entrypoint() helper
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41896 >
2026-06-01 03:24:16 +00:00
Faith Ekstrand
8b71407672
compiler/rust/bindings: Add util_dyarray
...
NIR uses dynarray for predecessors now so we should include it in the
bindings to keep other users from pulling in a duplicate.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41896 >
2026-06-01 03:24:16 +00:00
Faith Ekstrand
0d8ca7cc6c
meson: Suffix compiler/rust bindings with _compiler_rs_extern
...
This prevents symbol collisions with other crates in Mesa.
Fixes: b60694b91e ("compiler/rust: Add a float16 wrapper")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41896 >
2026-06-01 03:24:16 +00:00
Konstantin Seurer
f48f681fb5
nir: Duplicate the name in nir_def_set_name
...
nir_sweep expects that nir_instr_debug_info::variable_name is owned by
nir_instr_debug_info.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40706 >
2026-05-31 13:31:55 +02:00
Faith Ekstrand
e6bc41ed44
compiler/rust: Add LowerBoundedU32[Array] types
...
This is a generalization of NAK's SSAValue and SSAValueArray structs.
But instead of depending on NAK's bespoke invariants, this depends on
something far simpler: A lower bound on the u32. As long as you can
guarantee that the maximum array length is strictly less than the
minimum U32 value, we can pull the same trick as NAK and generalize it
into a LowerBoundedU32Array type.
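One way such a trick can work, sketched under loud assumptions: the constants, the type name `SketchBoundedArray`, and the choice to keep the length in the last slot are all made up for illustration and may not match NAK or the new types. Because every stored value is at least `MIN_VALUE` and every possible length is smaller than that, the last slot can hold either a value or the length without ambiguity.

```rust
/// Hypothetical lower bound on every stored value.
pub const MIN_VALUE: u32 = 16;
/// Hypothetical capacity; must be strictly less than MIN_VALUE.
pub const MAX_LEN: usize = 4;

/// Up to MAX_LEN values packed in MAX_LEN words with no separate length field.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SketchBoundedArray {
    slots: [u32; MAX_LEN],
}

impl SketchBoundedArray {
    /// Returns None if there are too many values or one is below the bound.
    pub fn from_slice(values: &[u32]) -> Option<Self> {
        if values.len() > MAX_LEN || values.iter().any(|&v| v < MIN_VALUE) {
            return None;
        }
        let mut slots = [0u32; MAX_LEN];
        slots[..values.len()].copy_from_slice(values);
        if values.len() < MAX_LEN {
            // The last slot is unused, so it stores the length instead.
            slots[MAX_LEN - 1] = values.len() as u32;
        }
        Some(Self { slots })
    }

    /// A last slot at or above the bound is a real value, so the array is full.
    pub fn len(&self) -> usize {
        let last = self.slots[MAX_LEN - 1];
        if last >= MIN_VALUE { MAX_LEN } else { last as usize }
    }

    pub fn as_slice(&self) -> &[u32] {
        &self.slots[..self.len()]
    }
}
```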
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41462 >
2026-05-30 01:20:10 +00:00
Karol Herbst
87b5340831
nir/opt_dead_write_vars: cache is_entrypoint of the function
...
ends_program calls into nir_cf_node_get_function repeatedly to fetch the
same function and to check whether we are inside an entry point or not.
But we already got the information higher up the chain so use that
instead.
nir_cf_node_get_function is quite expensive, because it follows pointers
through the tree.
Speeds up compilation of more complex shaders by quite a bit. I am seeing
a 66% cut of compilation time spent in e.g. llama-bench.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41891 >
2026-05-29 22:58:00 +00:00
Paulo Zanoni
a84addd941
libcl/vk: add VkCopyMemoryToImageIndirectCommandKHR and its members
...
The members are all naturally aligned to 4, but other
naturally-aligned-to-4 structs in this file still have the attribute
declared (such as VkDispatchIndirectCommand), so I'm adding the
attributes to these as well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338 >
2026-05-29 18:12:37 +00:00
Paulo Zanoni
d3c87303da
libcl/vk: add aligned(4) to VkCopyMemoryIndirectCommandKHR
...
This structure, despite containing 8-byte members, can be 4-byte
aligned:
"VUID-VkCopyMemoryIndirectInfoKHR-copyAddressRange-10942
copyAddressRange.address must be 4 byte aligned"
So do it like we do with the other structures.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338 >
2026-05-29 18:12:37 +00:00
Faith Ekstrand
932ee0f806
etnaviv: Remove f32_to_f16_fallback() in favor of float16::F16
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41375 >
2026-05-29 05:13:24 +00:00
Faith Ekstrand
b60694b91e
compiler/rust: Add a float16 wrapper
...
This adds an F16 struct which provides a 16-bit float type using Mesa's
existing half-precision support internally. Right now, it only contains
the basics but it could be expanded if needed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41375 >
2026-05-29 05:13:24 +00:00
Georg Lehmann
dea444f80f
nir/deref: consider atomics that store derefs as complex use
...
src[1] or src[2] would mean that the atomic uses the deref as data for the
op; we only want to allow address source uses.
Fixes: bb311ce370 ("nir: Allow atomics as non-complex uses for var-splitting passes")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41818 >
2026-05-28 18:58:33 +00:00
Karol Herbst
8dc4e8094e
nir/opt_algebraic: add missing fmadz lowering for lower_fmulz_with_abs_min
...
Fixes: 32e91a7467 ("nir: add new float multiply-add opcodes")
Suggested-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41723 >
2026-05-27 16:28:48 +00:00
Rhys Perry
b1429caab3
nir,ac/nir,aco: add load_global_tr_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653 >
2026-05-27 14:44:59 +00:00
Rhys Perry
b982e71084
nir: add load_global_transpose_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653 >
2026-05-27 14:44:59 +00:00
Rhys Perry
57498eca83
nir: add load_deref_transpose_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653 >
2026-05-27 14:44:59 +00:00
Rhys Perry
6229e89fa8
nir: make cmat_muladd_amd a subgroup intrinsic
...
It's a subgroup op.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653 >
2026-05-27 14:44:59 +00:00
Rhys Perry
81925d7f41
nir/algebraic: optimize ishl(iadd(iadd(iadd(a, #b), c), d), #e)
...
This improves combining of constant offsets into memory accesses in
dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.mul.float16_float16.buffer.colmajor.linear
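The exact replacement expression is not shown here, so the following is only a sketch of the wrapping-arithmetic identity such a rule can rely on: the constant term can be pulled out of the add chain and shifted separately, which leaves a standalone constant to fold into an access offset. Function names are hypothetical.

```rust
/// ishl(iadd(iadd(iadd(a, b), c), d), e) with 32-bit wrapping arithmetic.
pub fn shifted_sum_before(a: u32, b: u32, c: u32, d: u32, e: u32) -> u32 {
    a.wrapping_add(b).wrapping_add(c).wrapping_add(d).wrapping_shl(e)
}

/// The same value with the constant b shifted on its own: b << e is itself
/// a constant when b and e are constants.
pub fn shifted_sum_after(a: u32, b: u32, c: u32, d: u32, e: u32) -> u32 {
    a.wrapping_add(c)
        .wrapping_add(d)
        .wrapping_shl(e)
        .wrapping_add(b.wrapping_shl(e))
}
```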
fossil-db (gfx1201):
Totals from 121 (0.06% of 208640) affected shaders:
Instrs: 204278 -> 204199 (-0.04%); split: -0.06%, +0.03%
CodeSize: 1110856 -> 1110076 (-0.07%); split: -0.10%, +0.03%
VGPRs: 7620 -> 7680 (+0.79%); split: -0.16%, +0.94%
Latency: 1225169 -> 1225067 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 191629 -> 191580 (-0.03%); split: -0.03%, +0.01%
SClause: 5732 -> 5731 (-0.02%)
Copies: 16358 -> 16356 (-0.01%); split: -0.02%, +0.01%
PreSGPRs: 5715 -> 5711 (-0.07%)
PreVGPRs: 5907 -> 5905 (-0.03%)
VALU: 112808 -> 112742 (-0.06%); split: -0.06%, +0.00%
SALU: 27121 -> 27113 (-0.03%)
fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*):
Totals from 198 (12.20% of 1623) affected shaders:
Instrs: 13011 -> 11584 (-10.97%)
CodeSize: 90188 -> 77920 (-13.60%)
VGPRs: 3456 -> 2724 (-21.18%)
Latency: 144421 -> 142553 (-1.29%)
InvThroughput: 11158 -> 10608 (-4.93%)
Copies: 1119 -> 1117 (-0.18%)
PreSGPRs: 1954 -> 1857 (-4.96%)
PreVGPRs: 1675 -> 1354 (-19.16%)
VALU: 4894 -> 3476 (-28.97%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653 >
2026-05-27 14:44:59 +00:00
Rhys Perry
c3db34a525
nir/algebraic: optimize ishl(iadd(ishl, ishl))
...
This reduces arithmetic for cooperative matrix loads:
    v_mbcnt_lo_u32_b32 v0, -1, 0
    v_and_b32_e32 v1, 15, v0
    v_lshrrev_b32_e32 v0, 4, v0
    v_lshlrev_b32_e32 v1, 4, v1
    v_lshl_add_u32 v0, v0, 3, v1
    v_lshlrev_b32_e32 v0, 1, v0
->
    v_mbcnt_lo_u32_b32 v0, -1, 0
    v_and_b32_e32 v1, -16, v0
    v_and_b32_e32 v0, 15, v0
    v_lshl_add_u32 v0, v0, 5, v1
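The two instruction sequences can be transcribed into plain integer arithmetic to check that they agree. This is a sketch: `lane` stands for the v_mbcnt result and the function names are made up.

```rust
/// The original sequence: and, shift right, shift left, shift-add, shift.
pub fn offset_before(lane: u32) -> u32 {
    let v1 = (lane & 15) << 4;
    let v0 = ((lane >> 4) << 3).wrapping_add(v1);
    v0 << 1
}

/// The optimized sequence: two ands and one shift-add.
pub fn offset_after(lane: u32) -> u32 {
    let v1 = lane & 0xffff_fff0; // -16 as an unsigned 32-bit mask
    let v0 = lane & 15;
    (v0 << 5).wrapping_add(v1)
}
```

The final shift by 1 is folded into both inner shifts, and (lane >> 4) << 4 is the same as masking off the low four bits.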
fossil-db (gfx1201):
Totals from 38 (0.02% of 208640) affected shaders:
Instrs: 42234 -> 42181 (-0.13%)
CodeSize: 232656 -> 232384 (-0.12%)
Latency: 128807 -> 128759 (-0.04%)
InvThroughput: 20860 -> 20850 (-0.05%)
VALU: 23035 -> 23013 (-0.10%)
SALU: 4790 -> 4784 (-0.13%)
fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*):
Totals from 44 (2.71% of 1623) affected shaders:
Instrs: 46834 -> 46802 (-0.07%)
CodeSize: 287536 -> 287272 (-0.09%)
Latency: 100960 -> 100918 (-0.04%); split: -0.10%, +0.06%
InvThroughput: 21808 -> 21796 (-0.06%)
VALU: 19336 -> 19328 (-0.04%)
SALU: 10790 -> 10782 (-0.07%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653 >
2026-05-27 14:44:59 +00:00
Samuel Pitoiset
8c9995e7fa
nir: add nir_lower_abort
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651 >
2026-05-27 06:37:03 +00:00
Samuel Pitoiset
88fb73c883
spirv: implement SPV_KHR_abort
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651 >
2026-05-27 06:37:03 +00:00
Samuel Pitoiset
f431d6bc87
nir: add new intrinsics for SPV_KHR_abort
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651 >
2026-05-27 06:37:03 +00:00
Marek Olšák
7f2130c86e
nir/opt_algebraic: add more ffract/ffloor/ftrunc/f2u/f2i patterns
...
Totals from 1390 (0.69% of 202429) affected shaders:
MaxWaves: 33336 -> 33348 (+0.04%)
Instrs: 4101809 -> 4095218 (-0.16%); split: -0.17%, +0.01%
CodeSize: 22973700 -> 22944812 (-0.13%); split: -0.13%, +0.00%
VGPRs: 95592 -> 95460 (-0.14%); split: -0.15%, +0.01%
SpillSGPRs: 2910 -> 2913 (+0.10%)
Latency: 27815305 -> 27807064 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 4563067 -> 4555622 (-0.16%); split: -0.18%, +0.02%
VClause: 98544 -> 98570 (+0.03%); split: -0.04%, +0.06%
SClause: 91148 -> 91149 (+0.00%); split: -0.00%, +0.01%
Copies: 324008 -> 324028 (+0.01%); split: -0.10%, +0.10%
Branches: 99085 -> 99084 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 70920 -> 70734 (-0.26%); split: -0.27%, +0.00%
PreVGPRs: 78288 -> 78190 (-0.13%); split: -0.15%, +0.03%
VALU: 2123606 -> 2117766 (-0.28%); split: -0.28%, +0.00%
SALU: 621757 -> 621671 (-0.01%); split: -0.02%, +0.00%
VMEM: 163395 -> 163387 (-0.00%); split: -0.01%, +0.00%
SMEM: 140374 -> 140376 (+0.00%)
VOPD: 258332 -> 258264 (-0.03%); split: +0.04%, -0.07%
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41455 >
2026-05-25 20:02:30 +00:00
Thong Thai
931dba218e
nir: Only build NIR headers when with_gfx_compute is false
...
Signed-off-by: Thong Thai <thong.thai@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41493 >
2026-05-25 15:44:12 +00:00
Marek Olšák
1b45a8aee2
radv: select frag_coord_xy and pixel_coord conditionally based on dynamic state
...
the code explains it
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> (shader parts)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41689 >
2026-05-25 13:38:08 +00:00