Commit graph

10679 commits

Author SHA1 Message Date
Alyssa Rosenzweig
67237b6f1b treewide: use nir_break_if
Via Coccinelle patch:

    @@
    expression builder, condition;
    @@

    -nir_push_if(builder, condition);
    -{
    -nir_jump(builder, nir_jump_break);
    -}
    -nir_pop_if(builder, NULL);
    +nir_break_if(builder, condition);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>
2025-06-30 14:51:24 -04:00
Karol Herbst
b3c245ecf2 clc: add support for cl_ext_image_unorm_int_2_101010
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35469>
2025-06-30 18:04:59 +00:00
Alyssa Rosenzweig
7fd7b18b38 nir: rename AGX geom/tess intrinsics
to the new common code name.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig
d13b321201 nir/lower_gs_intrinsics: drop stuff added for AGX
AGX now vendors a significantly different version of this pass, so the common
one doesn't need the stuff added for AGX.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig
16b53d356a nir: add rasterization_stream sysval
for plumbing transformFeedbackRasterizationStreamSelect (in turn for exercising
more CTS and proving out my design).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:06 +00:00
Alyssa Rosenzweig
805ef6cc17 nir: add intrinsics for geometry shader lowering
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:05 +00:00
Alyssa Rosenzweig
4f7cae5e61 nir/opt_algebraic: add trichotomy identity
In https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802 we will
significantly rework geometry shaders & transform feedback. In the new approach,
transform feedback is executed as part of the hardware vertex shader, meaning
the vertex shader needs to write out all the "copies" of the same value into
different parts of the XFB buffer. In the general case of a GS writing triangle
strips, we get 0-3 copies. This is good and lets us parallelize XFB better with
GS.

In the case of a VS alone with XFB, we insert a passthrough GS. In that special
case, we can only get at most 1 copy, so if we can prove the length of
the output strip is 3 we can delete 2/3 of the shader.

Anyway, the only thing preventing NIR from doing that optimization is failing to
see through some conditionals, fixed by optimizing with the law of trichotomy.
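
The law of trichotomy says that for any two integers exactly one of `a < b`, `a == b`, `a > b` holds. The sketch below checks that exhaustively on a small range and shows one conditional that collapses as a result. The exact pattern added to nir_opt_algebraic is not quoted in this message, so `original` and `simplified` are hypothetical stand-ins, not the real rule.

```python
from itertools import product


def trichotomy_holds(a: int, b: int) -> bool:
    # Exactly one of the three comparisons is true for any pair of integers.
    return [a < b, a == b, a > b].count(True) == 1


def original(a: int, b: int) -> bool:
    # Hypothetical unsimplified form: "neither less than nor equal".
    return (not (a < b)) and (not (a == b))


def simplified(a: int, b: int) -> bool:
    # By trichotomy, the form above is just "greater than".
    return a > b


values = range(-4, 5)
pairs = list(product(values, values))
assert all(trichotomy_holds(a, b) for a, b in pairs)
assert all(original(a, b) == simplified(a, b) for a, b in pairs)
```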

We could add other variants of this pattern (signed vs unsigned, iand vs
ior/ixor) if we expect anything else to hit this other than my boutique use
case.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:04 +00:00
Robert Mader
a166d7609f gles: Add support for 10/12/16 bit SW decoder YCbCr formats
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Co-Authored-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303>
2025-06-30 11:56:23 +00:00
Rhys Perry
7b291a33d4 nir/search: fix dumping of conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>
2025-06-30 10:41:39 +00:00
Rhys Perry
08859cbe50 nir/lower_bit_size: fix bitz/bitnz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 6585209cdd ("nir/lower_bit_size: mask bitz/bitnz src1 like shifts")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>
2025-06-30 10:41:39 +00:00
Mel Henning
8795006994 nir/opt_uniform_subgroup: Handle vote_feq
Brings the vertex shader in
dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_dvec4_vertex
from 234 to 169 instructions on NAK.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
70fccc59fc nir/opt_uniform_subgroup: Handle vote_ieq
No shader-db changes here, but it does improve some cts shaders, eg. the
vertex shader in
dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_i64vec4_vertex
goes from 80 to 56 instructions with NAK

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
10acb44c64 nir: Split lower_vote_eq into int/float versions
Recent nvidia hardware has a native instruction for
nir_intrinsic_vote_ieq but not for nir_intrinsic_vote_feq. So, split
this boolean into two so we can control the lowering separately for each
instruction.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Lionel Landwerlin
fcf4401824 brw: handle wa_18019110168 with independent shader compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Matt Turner
102d7409ef nir: Add convert_cmat_intel intrinsic
This intrinsic will be used to implement matrix type and layout
conversions in the backend compiler.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
James Price
10ae673368 spirv: Fix cooperative matrix in OpVariable initializer
Check for cooperative matrix types first in the
nir_lower_variable_initializers pass, since they are also considered
to be scalar types.
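
Why the check order matters can be sketched as follows. This is an illustration only: `Type`, `classify_scalar_first` and `classify_cmat_first` are hypothetical names, not the NIR or GLSL type API, and the only fact taken from this message is that cooperative matrix types also count as scalar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Type:
    # Hypothetical type descriptor; a cooperative matrix sets both flags.
    is_scalar: bool
    is_cmat: bool


def classify_scalar_first(t: Type) -> str:
    # Checking "scalar" first swallows cooperative matrices.
    if t.is_scalar:
        return "scalar"
    if t.is_cmat:
        return "cmat"
    return "other"


def classify_cmat_first(t: Type) -> str:
    # Checking the more specific case first routes them correctly.
    if t.is_cmat:
        return "cmat"
    if t.is_scalar:
        return "scalar"
    return "other"


cmat = Type(is_scalar=True, is_cmat=True)
plain = Type(is_scalar=True, is_cmat=False)
assert classify_scalar_first(cmat) == "scalar"  # misclassified
assert classify_cmat_first(cmat) == "cmat"
assert classify_cmat_first(plain) == "scalar"
```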

Fixes: 7e6cd395c7 ("nir: Handle cmat types in lower_variable_initializers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13388
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35668>
2025-06-26 22:24:31 +00:00
Konstantin Seurer
aacfc663cb nir: Add nir_lower_halt_to_return
This is a lowering pass that was implemented by multiple drivers.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33003>
2025-06-26 20:12:12 +00:00
Marek Olšák
1754507d49 nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
1e03827c77 nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements
same for *_no_indirects

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
3713e2d580 nir: rename nir_lower_clip_cull_distance_arrays -> nir_lower_clip_cull_distance_array_vars
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:53 +00:00
Marek Olšák
adb17a8609 nir: move nir_recompute_io_bases into its own file
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:53 +00:00
Marek Olšák
97743980ce nir: remove unused nir_force_mediump_io & nir_unpack_16bit_varying_slots
I think I added these.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:52 +00:00
Marek Olšák
aefea49dad nir: move lots of code from nir_lower_io.c into new nir_lower_explicit_io.c
nir_lower_io is just for regular inputs/outputs.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:52 +00:00
Marek Olšák
5bd3e0c08c nir: move nir_assign_var_locations to freedreno (its only use)
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:52 +00:00
Marek Olšák
c8cda0dc1a nir: move nir_io_add_const_offset_to_base into its own file
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:51 +00:00
Marek Olšák
d78070ded5 nir: move nir_io_add_intrinsic_xfb_info into its own file
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:51 +00:00
Marek Olšák
12df9b3def nir: rename nir_vectorize_tess_levels -> nir_lower_tess_level_array_vars_to_vec
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
2aa94caf82 nir: rename nir_lower_io_to_vector -> nir_opt_vectorize_io_vars
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
944f8f6db2 nir: move nir_lower_io_vars_to_scalar into its own file
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:49 +00:00
Marek Olšák
439d805291 nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:49 +00:00
Alyssa Rosenzweig
6efe557718 nir/search_helpers: add has_multiple_uses helper
heuristic for the next patch.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>
2025-06-26 16:41:55 +00:00
Alyssa Rosenzweig
63ce73a601 nir,hk: sink lowered UBOs
this is better than doing it once we've lowered to hardware ops which makes it
more challenging to sink since then we'd have to sink the whole tree instead of
a single intrinsic.

Totals from 17617 (32.81% of 53701) affected shaders:
MaxWaves: 16863872 -> 16901504 (+0.22%); split: +0.24%, -0.02%
Instrs: 12406405 -> 12430375 (+0.19%); split: -0.15%, +0.35%
CodeSize: 87055248 -> 87180802 (+0.14%); split: -0.18%, +0.33%
Spills: 10350 -> 9301 (-10.14%); split: -11.57%, +1.43%
Fills: 5215 -> 3733 (-28.42%); split: -31.49%, +3.07%
Scratch: 113164 -> 110472 (-2.38%); split: -2.63%, +0.25%
ALU: 9552550 -> 9558513 (+0.06%); split: -0.22%, +0.28%
FSCIB: 9552545 -> 9558508 (+0.06%); split: -0.22%, +0.28%
IC: 2874032 -> 2876442 (+0.08%); split: -0.00%, +0.09%
GPRs: 1470040 -> 1459283 (-0.73%); split: -1.00%, +0.27%
Uniforms: 5113254 -> 5115158 (+0.04%); split: -0.82%, +0.85%
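
The headline percentages in the table above are plain relative changes and can be reproduced from the before/after totals. A minimal sketch, assuming two-decimal rounding; `percent_change` is a hypothetical helper, not part of any shader-db tooling, and the "split" columns cannot be recomputed because they need per-shader data.

```python
def percent_change(before: int, after: int) -> str:
    # Relative change of the total, formatted with an explicit sign.
    if before == 0:
        # Guard the division; growth from zero has no finite percentage.
        return "+inf%" if after > 0 else "+0.00%"
    return f"{(after - before) / before * 100:+.2f}%"


assert percent_change(10350, 9301) == "-10.14%"        # Spills
assert percent_change(5215, 3733) == "-28.42%"         # Fills
assert percent_change(12406405, 12430375) == "+0.19%"  # Instrs
```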

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Job Noorman <job@noorman.info> [NIR]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>
2025-06-26 16:41:55 +00:00
Alyssa Rosenzweig
caa0854da8 nir: plumb load_global_bounded
this lets the backend implement bounded loads (i.e. robust SSBOs) in a way
that's more clever than a full branch. similar idea to
load_global_constant_bound which should eventually be merged into this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35720>
2025-06-26 16:41:53 +00:00
Georg Lehmann
7de352e99e nir,radv: add an option to not move 8/16bit vecs
ACO will overestimate the register demand of the sources, so we don't
want to create the vector later.

Foz-DB Navi48:
Totals from 240 (0.30% of 80265) affected shaders:
MaxWaves: 6429 -> 6435 (+0.09%)
Instrs: 3406069 -> 3406646 (+0.02%); split: -0.01%, +0.03%
CodeSize: 18231596 -> 18233288 (+0.01%); split: -0.01%, +0.02%
VGPRs: 14768 -> 14732 (-0.24%)
Latency: 18981274 -> 18979170 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 4247331 -> 4246634 (-0.02%); split: -0.02%, +0.01%
VClause: 85453 -> 85458 (+0.01%); split: -0.01%, +0.01%
Copies: 262046 -> 261971 (-0.03%); split: -0.06%, +0.03%
PreVGPRs: 10899 -> 10775 (-1.14%)
VALU: 1923441 -> 1923485 (+0.00%); split: -0.01%, +0.01%
SALU: 457983 -> 457982 (-0.00%)
VOPD: 4980 -> 4861 (-2.39%); split: +0.48%, -2.87%

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35729>
2025-06-26 09:29:43 +00:00
Georg Lehmann
7ac9a87572 nir/opt_sink: don't assume moving conversion can't increase register pressure
Foz-DB Navi48:
Totals from 11311 (14.09% of 80265) affected shaders:
MaxWaves: 337664 -> 337648 (-0.00%); split: +0.00%, -0.01%
Instrs: 10102221 -> 10101625 (-0.01%); split: -0.05%, +0.04%
CodeSize: 55000184 -> 54999292 (-0.00%); split: -0.04%, +0.03%
VGPRs: 571052 -> 571064 (+0.00%); split: -0.03%, +0.03%
Latency: 59247189 -> 59204726 (-0.07%); split: -0.13%, +0.06%
InvThroughput: 10236407 -> 10215675 (-0.20%); split: -0.26%, +0.06%
VClause: 211730 -> 211677 (-0.03%); split: -0.07%, +0.04%
SClause: 284802 -> 284762 (-0.01%); split: -0.07%, +0.06%
Copies: 702890 -> 702539 (-0.05%); split: -0.18%, +0.13%
Branches: 205117 -> 205112 (-0.00%)
PreSGPRs: 475898 -> 475825 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 366318 -> 366449 (+0.04%); split: -0.14%, +0.17%
VALU: 5764791 -> 5764349 (-0.01%); split: -0.02%, +0.01%
SALU: 1259529 -> 1259517 (-0.00%); split: -0.04%, +0.04%
VOPD: 5854 -> 5724 (-2.22%); split: +0.70%, -2.92%

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35729>
2025-06-26 09:29:43 +00:00
Konstantin Seurer
42c2ccbfb2 spirv: Move the shader_call_data workaround above nir_validate_shader
Prevents validation failures.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35532>
2025-06-26 06:30:44 +00:00
Rob Clark
6f5ff6be44 nir: Fix lower_readonly_images_to_tex bitsize
The txf instruction could be returning something smaller than 32b.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35758>
2025-06-26 02:48:16 +00:00
Georg Lehmann
e6d208b1f9 nir/opt_shrink_vectors: also split vecs into distinct smaller vecs if possible
Foz-DB Navi48:
Totals from 17 (0.02% of 80265) affected shaders:
Instrs: 75085 -> 74912 (-0.23%); split: -0.23%, +0.00%
CodeSize: 428968 -> 427028 (-0.45%); split: -0.45%, +0.00%
Latency: 1306841 -> 1306080 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 598998 -> 598719 (-0.05%)
Copies: 15733 -> 15561 (-1.09%)
Branches: 2435 -> 2422 (-0.53%)
PreVGPRs: 1723 -> 1721 (-0.12%)
VALU: 43019 -> 42847 (-0.40%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35676>
2025-06-25 05:34:48 +00:00
Georg Lehmann
22d7dd69b2 nir/shrink_vectors: shrink larger vectors too
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35676>
2025-06-25 05:34:48 +00:00
Matt Turner
6100dbc3d0 compiler: Generate files with newline at end
These generator scripts use the `write` function that, unlike `print`,
doesn't print a trailing newline. So let's add one to the template.
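
The difference between the two calls is easy to see in isolation. The sketch below uses an in-memory buffer and a made-up one-line template. It is not the actual generator code.

```python
import io

# write() emits exactly the string it is given.
buf_write = io.StringIO()
buf_write.write("/* generated */")

# print() appends a newline by default.
buf_print = io.StringIO()
print("/* generated */", file=buf_print)

assert buf_write.getvalue() == "/* generated */"
assert buf_print.getvalue() == "/* generated */\n"

# The fix described here: put the newline in the template itself.
template = "/* generated */\n"
buf_fixed = io.StringIO()
buf_fixed.write(template)
assert buf_fixed.getvalue().endswith("\n")
```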

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35697>
2025-06-24 14:01:04 +00:00
Ashley Smith
2ce201707e mesa: Add support for GL_EXT_shader_clock
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35630>
2025-06-24 13:21:28 +00:00
Georg Lehmann
b729ad1742 nir/loop_analyze: consider movs/vecs free
They are free more likely than not.

Foz-DB Navi31:
Totals from 462 (0.58% of 80251) affected shaders:
Instrs: 1464013 -> 1868466 (+27.63%)
CodeSize: 8476352 -> 10745544 (+26.77%)
VGPRs: 27412 -> 27560 (+0.54%)
SpillSGPRs: 0 -> 16 (+inf%)
SpillVGPRs: 83 -> 76 (-8.43%)
Scratch: 6072832 -> 6071808 (-0.02%)
Latency: 19282476 -> 19552323 (+1.40%); split: -0.11%, +1.51%
InvThroughput: 2198357 -> 2258490 (+2.74%); split: -0.47%, +3.21%
VClause: 32986 -> 43491 (+31.85%)
SClause: 72760 -> 126112 (+73.33%)
Copies: 165286 -> 223368 (+35.14%)
Branches: 60530 -> 79743 (+31.74%); split: -0.03%, +31.77%
PreSGPRs: 24885 -> 25077 (+0.77%)
PreVGPRs: 23004 -> 22494 (-2.22%); split: -2.26%, +0.04%
VALU: 760978 -> 898136 (+18.02%)
SALU: 187786 -> 252995 (+34.73%); split: -0.03%, +34.75%
VMEM: 58469 -> 69164 (+18.29%); split: -0.07%, +18.36%
SMEM: 87926 -> 158175 (+79.90%); split: -0.00%, +79.90%
VOPD: 580 -> 732 (+26.21%); split: +31.38%, -5.17%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>
2025-06-24 12:18:47 +00:00
Georg Lehmann
b1290fdf20 nir/loop_analyze: handle vector selections properly
Consider all conditions, not just the first.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>
2025-06-24 12:18:47 +00:00
Georg Lehmann
47aba15489 nir/loop_analyze: always consider comparisons between induction var and constant free
There is no reason why this should be restricted to single uses.

Foz-DB Navi31:
Totals from 21 (0.03% of 80251) affected shaders:
Instrs: 54424 -> 65851 (+21.00%)
CodeSize: 286688 -> 346896 (+21.00%)
Latency: 2980310 -> 2959904 (-0.68%)
InvThroughput: 403744 -> 400782 (-0.73%)
VClause: 923 -> 1316 (+42.58%)
SClause: 1217 -> 1705 (+40.10%)
Copies: 3226 -> 3393 (+5.18%); split: -0.87%, +6.04%
Branches: 1014 -> 1130 (+11.44%); split: -0.39%, +11.83%
PreSGPRs: 1327 -> 1306 (-1.58%)
PreVGPRs: 1896 -> 1868 (-1.48%)
VALU: 36083 -> 43560 (+20.72%)
SALU: 4471 -> 4708 (+5.30%); split: -2.75%, +8.05%
VMEM: 2225 -> 2743 (+23.28%)
SMEM: 1662 -> 2273 (+36.76%); split: -0.06%, +36.82%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>
2025-06-24 12:18:47 +00:00
Georg Lehmann
bdd2c7b9f2 spirv: implement CooperativeMatrixConversionsNV
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34793>
2025-06-24 07:14:34 +00:00
Georg Lehmann
8c4225b99b nir: add cmat_transpose
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34793>
2025-06-24 07:14:34 +00:00
David Neto
e9b9f1f764 spirv: spirv-to-c-array: use '-' to specify stdin
The spirv-to-c-array.py script assembles a SPIR-V module,
then disassembles it, capturing that text, then re-assembles
that text, providing it on stdin. But this last invocation of
spirv-as must use '-' to specify that the text input appears
on stdin.

Currently it always errors out, complaining that there must
be exactly one input file.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35691>
2025-06-23 21:24:56 +00:00
Alyssa Rosenzweig
d37bf148d2 nir/lower_blend: fix snorm factor clamping
The spec says (emphasis mine):

  If the color attachment is fixed-point, the components of the source and
  destination values **AND BLEND FACTORS** are each clamped to [0,1] or [-1,1]
  respectively for an unsigned normalized or signed normalized color attachment
  prior to evaluating the blend operations. If the color attachment is
  floating-point, no clamping occurs.

However, neither the CTS nor any hardware implement this semantic.

For unsigned normalized formats, the definitions are roughly equivalent (except
perhaps around constant colours). 0 <= x <= 1 implies that 0 <= 1 - x <= 1.
Therefore if the source/destination colours are clamped to [0, 1], then their
complements are also in [0, 1], so clamping any blend factor (except constant
colour) has no effect if the source/dest were already clamped.

For signed normalized formats, however, this difference matters. -1 <= x <= 1
implies that 0 <= 1 - x <= 2... so to implement the spec text faithfully, we
would need to clamp again the complemented colour blend factors to return back
to signed normalized range. Software blending implementations can of course do
that... but doing so causes CTS fails, as the CTS reference renderer does not do
this.
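
The range argument in the two paragraphs above can be checked numerically. This is a sketch over a handful of sample values, with `complement` and `clamp` as illustrative helpers. It is not code from nir_lower_blend.

```python
def complement(x: float) -> float:
    # The "one minus colour" blend factor.
    return 1.0 - x


def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


unorm_samples = [0.0, 0.25, 0.5, 1.0]
snorm_samples = [-1.0, -0.5, 0.0, 0.5, 1.0]

# unorm: the complement of a value in [0, 1] is still in [0, 1],
# so clamping the factor changes nothing.
assert all(clamp(complement(x), 0.0, 1.0) == complement(x)
           for x in unorm_samples)

# snorm: the complement of a value in [-1, 1] lies in [0, 2].
snorm_factors = [complement(x) for x in snorm_samples]
assert min(snorm_factors) == 0.0 and max(snorm_factors) == 2.0

# Clamping the factor to [-1, 1] therefore alters every negative input.
changed = [x for x in snorm_samples
           if clamp(complement(x), -1.0, 1.0) != complement(x)]
assert changed == [-1.0, -0.5]
```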

This commit adjusts nir_lower_blend to match what actual hardware does, what CTS
requires, and what the spec should have said.

See https://gitlab.khronos.org/vulkan/vulkan/-/issues/4293 for the spec
resolution.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35519>
2025-06-23 19:38:27 +00:00
Emma Anholt
bc8994cb48 nir: Add a pass to reassociate multiplication of mat*mat*vec.
The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but
mat4*(mat4*vec4) is only 32.
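
The 80 versus 32 figures follow from counting scalar multiplies: a mat4*mat4 product costs 4*4*4 = 64 and a mat4*vec4 product costs 4*4 = 16, so (M*N)*v is 64 + 16 = 80 while M*(N*v) is 16 + 16 = 32. A small sketch with a counting matrix multiply (plain Python lists and arbitrary integer entries, unrelated to the NIR pass itself) confirms both the counts and that the results agree:

```python
def matmul(a, b, count):
    # Naive product of an (n x m) and an (m x p) matrix.
    # count is a one-element list used as a mutable multiply counter.
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                out[i][j] += a[i][k] * b[k][j]
                count[0] += 1
    return out


A = [[i + 4 * j + 1 for i in range(4)] for j in range(4)]
B = [[(i * j) % 5 - 2 for i in range(4)] for j in range(4)]
v = [[1], [2], [3], [4]]  # vec4 as a 4x1 column

left = [0]
left_result = matmul(matmul(A, B, left), v, left)    # (A*B)*v

right = [0]
right_result = matmul(A, matmul(B, v, right), right)  # A*(B*v)

assert left_result == right_result
assert left[0] == 80
assert right[0] == 32
```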

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35622>
2025-06-23 17:49:51 +00:00
Timothy Arceri
21ea8c205f nir: raise NIR_SEARCH_MAX_VARIABLES limit to 24
This is required to process the pattern in the following patch.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35622>
2025-06-23 17:49:51 +00:00