Commit graph

12149 commits

Author SHA1 Message Date
Marek Olšák
24e0226058 nir/tests: don't leave "namespace {" unclosed in nir_opt_varyings_tests.h
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41329>
2026-05-19 04:26:55 +00:00
Marek Olšák
ca137e545c nir/opt_varyings: rewrite elimination of duplicated outputs
to strengthen it and fix a rare case of incorrect behavior.

This is the incorrect behavior that's fixed:

    if (invocation_id == 0)
       patch_output[0] = invocation_id;
    else if (invocation_id == 1)
       patch_output[1] = invocation_id;
    else
       patch_output[2] = invocation_id;

The stored SSA defs are the same, but each invocation writes a different
value to different outputs, so they are not duplicated, but the code
incorrectly treated them as duplicated and removed [1] and [2].

The requirement for correctness is that if an output is duplicated,
it must have stores in the same blocks as the output it duplicates.

To fix the incorrect behavior, awareness of blocks is added to the algorithm,
which leads to the following new cases that are optimized:

    1. Different blocks store different SSA defs, but equal in each block:
        if (..) {
           output[0] = a;
           output[1] = a; // eliminated
        } else {
           output[0] = b;
           output[1] = b; // eliminated
        }

    2. Different GS emit sections store different SSA defs, but equal in each
       section:
        output[0] = a;
        output[1] = a; // eliminated
        emit_vertex;
        output[0] = b;
        output[1] = b; // eliminated
        emit_vertex;

    3. A weird case that could be duplicated but is left alone due to
       the counterexample above:
        if (..)
           output[0] = a;
        else
           output[1] = a;
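
The block-awareness rule above can be modelled in a few lines. This is only a sketch, not the code from nir_opt_varyings: the function name `find_duplicates`, the `(block, output, ssa_def)` tuples, and the string labels are all hypothetical, and it assumes at most one store per output per block (or per GS emit section, which can be used as the "block" key).

```python
from collections import defaultdict


def find_duplicates(stores):
    """Toy model: stores is an iterable of (block, output, ssa_def) tuples.

    Returns {duplicate_output: original_output}. An output only counts as
    a duplicate if it stores the same SSA def in exactly the same blocks.
    """
    # Signature of an output: which SSA def it stores in which block.
    per_output = defaultdict(dict)
    for block, output, ssa_def in stores:
        per_output[output][block] = ssa_def

    originals = {}   # signature -> first output seen with that signature
    duplicates = {}
    for output, by_block in per_output.items():
        signature = frozenset(by_block.items())
        if signature in originals:
            duplicates[output] = originals[signature]
        else:
            originals[signature] = output
    return duplicates


# The counterexample: same SSA def, but stored in different blocks.
counterexample = [
    ("if", "patch_output[0]", "invocation_id"),
    ("elif", "patch_output[1]", "invocation_id"),
    ("else", "patch_output[2]", "invocation_id"),
]

# Case 1: different defs per block, but equal within each block.
case1 = [
    ("then", "output[0]", "a"), ("then", "output[1]", "a"),
    ("else", "output[0]", "b"), ("else", "output[1]", "b"),
]

# Case 3: left alone because the stores are in different blocks.
case3 = [("then", "output[0]", "a"), ("else", "output[1]", "a")]
```

In this model, comparing only the stored SSA defs would merge all three outputs of the counterexample, while comparing per-block signatures keeps them apart and still catches case 1.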

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41329>
2026-05-19 04:26:54 +00:00
Simon Perretta
1821309b54 pvr, pco: add support for spilling shared memory to global memory
Add a PCO_DEBUG option to force spilling to begin with.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41545>
2026-05-18 12:43:51 +00:00
Daniel Schürmann
3749db73f4 nir/lower_bit_size: skip conversion for more opcodes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41388>
2026-05-18 11:01:06 +00:00
Daniel Schürmann
01c9311cfb nir/lower_bit_size: use nir_def_replace() instead of nir_def_rewrite_uses()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41388>
2026-05-18 11:01:06 +00:00
Daniel Schürmann
16bf0a2047 nir/lower_bit_size: use nir_builder::constant_fold_alu
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41388>
2026-05-18 11:01:06 +00:00
Daniel Schürmann
bc941eb4ff nir/builder: constant-fold nir_mov_alu() if requested
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41388>
2026-05-18 11:01:06 +00:00
Calder Young
768f4782e3 spirv: Fix debugPrintfEXT not working with multiple arguments
The struct field offsets weren't getting initialized.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41590>
2026-05-18 08:21:19 +00:00
Ian Romanick
e76abd1e3a nir/opt_constant_folding: Don't fight with nir_lower_bit_size
Intel uses nir_lower_bit_size to convert 8-bit integer values to 16-bit
for most instructions. By constant folding u2u8 or i2i8 through a bcsel,
this lowering is undone.

Fixes assertion failure in fossils/parallel-rdp/small_subgroup.foz.

fossilize-replay: src/intel/compiler/brw/brw_from_nir.cpp:852: void brw_from_nir_emit_alu(nir_to_brw_state&, nir_alu_instr*, bool): Assertion `brw_type_size_bytes(op[i].type) > 1' failed.

v2: Reject all integer conversions. Suggested by Daniel Schürmann.

Fixes: f4812dc11d ("nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41412>
2026-05-15 23:36:25 +00:00
Karol Herbst
c3832060a4 clc: do not use std::filesystem
It seems like DaVinci Resolve conflicts on those symbols and we got
regressions from our static libstdc++ linking workaround.

Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41488>
2026-05-15 22:59:58 +00:00
Caio Oliveira
0281eb2e98 nir/instr_set: Fix multi-slot intrinsic index equality
nir_intrinsic_index_size() expects a nir_intrinsic_index_flag, not
the position in the intrinsic's index list.  This could cause
part of a multi-slot index to be ignored.
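
As an illustration of this class of bug, here is a toy model. It is a sketch only and does not mirror the real NIR data structures: `IndexFlag`, its members, the slot sizes, and `indices_equal` are all hypothetical. It shows how looking up a size by list position instead of by flag can skip part of a multi-slot index during an equality check.

```python
from enum import IntEnum


class IndexFlag(IntEnum):
    # Hypothetical flags; WIDE stands in for an index that takes two slots.
    BASE = 0
    RANGE = 1
    WIDE = 2


SLOTS = {IndexFlag.BASE: 1, IndexFlag.RANGE: 1, IndexFlag.WIDE: 2}


def index_size(flag):
    """Number of slots used by an index; expects a flag, not a position."""
    return SLOTS[IndexFlag(flag)]


def first_slots(index_list):
    """Starting slot of each index in the list."""
    starts, next_slot = [], 0
    for flag in index_list:
        starts.append(next_slot)
        next_slot += index_size(flag)
    return starts


def indices_equal(index_list, a, b, *, buggy=False):
    """Compare the slot values of two intrinsics, index by index."""
    starts = first_slots(index_list)
    for pos, (flag, start) in enumerate(zip(index_list, starts)):
        # The buggy variant passes the list position where a flag is expected.
        size = index_size(pos if buggy else flag)
        if a[start:start + size] != b[start:start + size]:
            return False
    return True


index_list = [IndexFlag.WIDE]
a, b = [1, 2], [1, 3]   # the two intrinsics differ only in the second slot
```

With `buggy=True`, position 0 is read as `BASE`, so only the first slot is compared and the two intrinsics wrongly look equal.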

Fixes: b2bc57551a ("nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41593>
2026-05-15 18:13:15 +00:00
Georg Lehmann
0be2d71ad1 nir/opt_uniform_subgroup: preserve divergence during optimization
nir_def_init sets divergent = true, which means that for something like
reduce(reduce(convergent)) we previously only optimized the inner
reduce.
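
A minimal model of why this happens, assuming a pass that walks instructions in order and replaces a reduce of a convergent source with a freshly created def. The `Def` class, the `uniform_op` name, and `optimize` are hypothetical and are not the NIR API.

```python
from dataclasses import dataclass


@dataclass
class Def:
    op: str
    src: "Def | None" = None
    # Fresh defs start out divergent, as the message says of nir_def_init.
    divergent: bool = True


def optimize(instrs, preserve_divergence):
    """Return how many reduce instructions get optimized."""
    replaced = {}   # id(old def) -> replacement def
    count = 0
    for instr in instrs:
        if instr.op != "reduce":
            continue
        # Look through earlier replacements to find the current source.
        src = replaced.get(id(instr.src), instr.src)
        if src.divergent:
            continue   # only a reduce of a convergent value is optimized
        new = Def("uniform_op", src)
        if preserve_divergence:
            new.divergent = False   # the result of the rewrite is convergent
        replaced[id(instr)] = new
        count += 1
    return count


x = Def("load", divergent=False)
inner = Def("reduce", x)
outer = Def("reduce", inner)
program = [x, inner, outer]
```

Without preserving divergence, the replacement for the inner reduce looks divergent, so the outer reduce is skipped.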

No fossil changes at the moment, but I hit this when trying to
optimize shared memory to subgroup operations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41542>
2026-05-15 08:42:18 +00:00
Samuel Pitoiset
dc398afb27 nir: fix shuffling local IDs for quad derivatives with larger workgroup sizes
This was fundamentally broken for workgroup sizes >= 8x8.

This fixes new VKCTS coverage
dEQP-VK.glsl.texture_functions.texture.*_compute, and also a few tests
from the vkd3d-proton testsuite (note that quad derivatives are
currently disabled for < GFX12).

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41483>
2026-05-15 07:04:32 +00:00
Marek Olšák
e966c1bdec nir/opt_varyings: use workgroup divergence to identify convergent mesh outputs
It turns out we do have workgroup divergence, hidden behind
nir_divergence_across_subgroups, if I understand it correctly.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41225>
2026-05-15 00:09:42 +00:00
Marek Olšák
edb60c76e2 nir: generalize nir_vertex_divergence_analysis -> nir_custom_divergence_analysis
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41225>
2026-05-15 00:09:42 +00:00
Marek Olšák
3831935818 nir/opt_move_to_top: add an option to exclude moving at_offset/at_sample loads
This splits the nir_move_to_top_input_loads option into 2 options. The latter
option is mainly for at_offset/at_sample loads. Then it updates most places to
use only the first option.

The rationale is that moving at_sample loads makes Control (game) shaders
worse, as per the code comment.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41167>
2026-05-14 16:48:39 -04:00
Arzaq Naufail Khan
e2cd37e422 spirv: fix resource leak in spirv shader replacement
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39512>
2026-05-14 17:09:28 +00:00
Lionel Landwerlin
df5a6d7b87 brw/jay: move some coarse lowering to NIR
We add a pass to allow testing partially known fs config bits (main
user is DX11 always disabling VRS/coarse).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41529>
2026-05-14 14:05:06 +00:00
David Airlie
6a7a46ac21 nir/coopmat: rename the box split variables.
When we add workgroup support we have to add inner box splits as well
so name the outer splits correctly.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41500>
2026-05-14 02:15:05 +00:00
David Airlie
6062bcde56 nir/coopmat: move the row/col into a box and add some helpers.
This makes adding workgroup scope easier. It just creates the
split_box, moves things into it, and adds some helpers.

This also rewrites some loops over r/c into loops over i that calculate r/c.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41500>
2026-05-14 02:15:05 +00:00
David Airlie
eaf6207e06 nir/coopmat: refactor the split vars to clean it up
This just moves to using the split_info to store all the info,
and updates the hash table inside the split function.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41500>
2026-05-14 02:15:04 +00:00
Kenneth Graunke
f2c5410caf nir: Lower SSBO helper writes too
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41535>
2026-05-13 23:03:14 +00:00
Kenneth Graunke
9f56e4679e nir: Allow bias for nir_texop_sparse_residency_intel
Fixes validation errors in tests like:
dEQP-VK.glsl.texture_functions.textureoffset.clamp_to_edge.sparse_isampler2d_bias_fragment

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41535>
2026-05-13 23:03:14 +00:00
Christian Gmeiner
e201d4fa77 compiler/rust: Move VecPair from NAK to shared compiler crate
Move the VecPair<A, B> data structure from NAK's ir.rs to the shared
compiler Rust crate so it can be reused by other backends.

The fields are private, and NAK's ir.rs (now in a different crate)
needs to read and mutate the inner Vecs. Add a_as_slice(..),
a_as_mut_slice(..), b_as_slice(..) and b_as_mut_slice(..), and update
NAK's SrcsAsSlice and DstsAsSlice impls to call them. Returning slices
keeps callers from changing the length of one side without the other,
which is what VecPair is built to prevent.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41435>
2026-05-13 22:32:44 +00:00
Mary Guillemard
dd97257209 nir/lower_bit_size: Preserve float controls when lowering alu ops
fp_math_ctrl should be preserved when recreating alu operations.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15455
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41511>
2026-05-13 14:53:01 +00:00
Jesse Natalie
23f8e75f8c nir_lower_non_uniform_access: Add ASSERTED for assert-only var
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41445>
2026-05-12 15:44:25 +00:00
Marek Olšák
933b25b0b6 nir: add an option to ignore INTERP_MODE_NONE in nir_shader_gather_info
Color interpolation (INTERP_MODE_NONE) has unknown barycentrics
and it could be flat shading at runtime.

It's a problem when shader_info is expected to match what's actually
used.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
2026-05-12 14:13:45 +00:00
Marek Olšák
feeaae1c28 nir/licm: allow speculative hoisting across terminate if the filter is set
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41453>
2026-05-12 13:41:06 +00:00
Marek Olšák
3c2786b76a nir/tests: add nir_opt_licm tests
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41453>
2026-05-12 13:41:06 +00:00
Marek Olšák
1dfc0e3c30 nir/opt_licm: add filter callback
Speculative hoisting is only possible with the filter callback.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41453>
2026-05-12 13:41:06 +00:00
Samuel Pitoiset
975ae01208 spirv: preserve the explicit stride for untyped pointers with matrices
Make sure to cast with the deref type that contains more information
than the returned type because it's valid in SPIR-V.

This fixes dEQP-VK.binding_model.descriptor_heap.graphics.*_vectors and
also the PositiveGpuAV.HeapWithUntypedPointers VVL test.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41469>
2026-05-12 06:15:45 +00:00
Iván Briano
fea8830946 intel/brw: add load_frag_shading_rate_intel
Add a new intrinsic to read the raw shading rate provided to the FS
payload, and lower load_frag_shading_rate in NIR using it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Caleb Callaway <caleb.callaway@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38879>
2026-05-11 18:15:49 +00:00
Iván Briano
5383afadbf intel/brw: add load_msaa_rate_intel intrinsic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Caleb Callaway <caleb.callaway@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38879>
2026-05-11 18:15:49 +00:00
Iván Briano
3448f3ce4a intel/brw: add load_coverage_mask_intel intrinsic
We'll need the raw coverage mask provided to the fragment shader in a
future patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Caleb Callaway <caleb.callaway@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38879>
2026-05-11 18:15:49 +00:00
squidbus
1591e67a93 kk: Split per-draw data to separate binding
Instead of dirtying the root buffer and re-uploading the whole thing for each
draw where a per-draw value like the draw ID is changed, use a smaller
secondary buffer for per-draw data. We can also skip flushing state for every
individual batched draw and just flush once for the whole draw command.

This may also be useful in the future for handling how sized index buffers from
maintenance5 and null index buffers from maintenance6 work with robustness2,
allowing us to pass through indexed draw parameters and lower the index buffer
read into the shader with bounds checks.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41399>
2026-05-11 11:27:02 +00:00
squidbus
5b34d1ff34 nir: Only attempt subgroups lower_boolean_reduce for single component.
lower_boolean_reduce only works if the number of components is 1, and even
asserts on this in its prologue. Otherwise, given a boolean vector type, it
may produce output using ballot/vote with a boolean vector input.

Acked-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41186>
2026-05-11 09:50:27 +00:00
Daniel Schürmann
0832f3251c nir/opt_algebraic: extend some extract_u8 pattern to extract_i8
and remove some duplicate extract pattern.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41385>
2026-05-09 21:23:40 +00:00
Daniel Schürmann
9895b5e5da nir/opt_algebraic: optimize downcast followed by upcast to extract
Totals from 217 (0.10% of 208640) affected shaders: (Navi48)

Instrs: 283561 -> 282870 (-0.24%)
CodeSize: 1604864 -> 1601136 (-0.23%); split: -0.24%, +0.01%
Latency: 2992301 -> 2990107 (-0.07%); split: -0.09%, +0.02%
InvThroughput: 602722 -> 601316 (-0.23%); split: -0.23%, +0.00%
Copies: 26490 -> 26471 (-0.07%); split: -0.10%, +0.03%
VALU: 147735 -> 147176 (-0.38%)
SALU: 51545 -> 51541 (-0.01%)
VOPD: 11140 -> 11204 (+0.57%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41385>
2026-05-09 21:23:40 +00:00
Georg Lehmann
1716cbff37 nir,amd: reassociate fadd to create more fma/mad
ACO's backend fusing is quite competent, but it cannot reorder adds.
This adds a simple algebraic pass to do that for us.
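
To see why reordering adds can help, here is a toy expression-tree model. It is a sketch under assumptions: the tuple encoding, `fuse`, `reassociate`, and `count_ops` are hypothetical, the single rewrite shown is not necessarily one of the patterns added by this commit, and it ignores that floating-point addition is not associative, so a real rewrite like this is only valid where reordering is permitted.

```python
def is_fmul(expr):
    return isinstance(expr, tuple) and expr[0] == "fmul"


def fuse(expr):
    """Bottom-up fusing: fadd(fmul(a, b), c) -> ffma(a, b, c).

    Like the backend described above, it never reorders adds.
    """
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [fuse(arg) for arg in args]
    if op == "fadd":
        for i, arg in enumerate(args):
            if is_fmul(arg):
                return ("ffma", arg[1], arg[2], args[1 - i])
    return (op, *args)


def reassociate(expr):
    """(x + y) + z -> x + (y + z) when x and y are both multiplies."""
    if isinstance(expr, tuple) and expr[0] == "fadd":
        lhs, rhs = expr[1], expr[2]
        if (isinstance(lhs, tuple) and lhs[0] == "fadd"
                and is_fmul(lhs[1]) and is_fmul(lhs[2]) and not is_fmul(rhs)):
            return ("fadd", lhs[1], ("fadd", lhs[2], rhs))
    return expr


def count_ops(expr):
    """Number of ALU operations in the tree."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + sum(count_ops(arg) for arg in expr[1:])


# (a*b + c*d) + e
expr = ("fadd", ("fadd", ("fmul", "a", "b"), ("fmul", "c", "d")), "e")
```

Fusing `expr` as-is leaves an fmul, an ffma, and an fadd. After `reassociate`, both multiplies sit next to an add they can fuse with, leaving two ffma operations.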

Foz-DB Navi10:
Totals from 13568 (18.76% of 72319) affected shaders:
MaxWaves: 304722 -> 304004 (-0.24%); split: +0.10%, -0.33%
Instrs: 15084252 -> 14993010 (-0.60%); split: -0.61%, +0.00%
CodeSize: 81480188 -> 81372600 (-0.13%); split: -0.17%, +0.04%
VGPRs: 741580 -> 743680 (+0.28%); split: -0.10%, +0.38%
SpillSGPRs: 9418 -> 9434 (+0.17%)
Latency: 154602014 -> 154312940 (-0.19%); split: -0.29%, +0.10%
InvThroughput: 44628554 -> 44442595 (-0.42%); split: -0.47%, +0.05%
VClause: 300035 -> 300054 (+0.01%); split: -0.31%, +0.31%
SClause: 370992 -> 370640 (-0.09%); split: -0.15%, +0.06%
Copies: 1162401 -> 1162800 (+0.03%); split: -0.30%, +0.33%
Branches: 300646 -> 300654 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 673675 -> 675057 (+0.21%); split: -0.00%, +0.21%
PreVGPRs: 633017 -> 634768 (+0.28%); split: -0.29%, +0.57%
VALU: 10800351 -> 10712041 (-0.82%); split: -0.82%, +0.00%
SALU: 1752917 -> 1753203 (+0.02%); split: -0.04%, +0.06%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41348>
2026-05-08 11:49:43 +00:00
Georg Lehmann
9e87090db4 nir/loop_analyze: do not count fmul towards the limit when only used by fadd
As always with loop unrolling, don't look too closely at stats, but they
confirm more loops are now unrolled.

Foz-DB Navi10:
Totals from 66 (0.09% of 72319) affected shaders:
MaxWaves: 1464 -> 1424 (-2.73%); split: +0.82%, -3.55%
Instrs: 101778 -> 173128 (+70.10%)
CodeSize: 544148 -> 905392 (+66.39%)
VGPRs: 3652 -> 3788 (+3.72%); split: -0.77%, +4.49%
SpillSGPRs: 105 -> 75 (-28.57%)
Latency: 1197088 -> 1033471 (-13.67%); split: -17.08%, +3.41%
InvThroughput: 315257 -> 293245 (-6.98%); split: -13.29%, +6.31%
VClause: 1663 -> 3057 (+83.82%); split: -0.12%, +83.94%
SClause: 2797 -> 4496 (+60.74%); split: -0.21%, +60.96%
Copies: 6472 -> 11219 (+73.35%); split: -0.08%, +73.42%
Branches: 2695 -> 4697 (+74.29%); split: -0.56%, +74.84%
PreSGPRs: 3418 -> 3619 (+5.88%); split: -0.79%, +6.67%
PreVGPRs: 3305 -> 3423 (+3.57%); split: -1.06%, +4.63%
VALU: 73061 -> 124934 (+71.00%)
SALU: 11775 -> 20803 (+76.67%); split: -0.99%, +77.66%
VMEM: 2729 -> 4627 (+69.55%)
SMEM: 3796 -> 5869 (+54.61%); split: -0.18%, +54.79%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41348>
2026-05-08 11:49:43 +00:00
Georg Lehmann
25add9cbd1 nir/opt_peephole_select: do not count fmul towards the limit when only used by fadd
Foz-DB Navi10:
Totals from 4077 (5.64% of 72319) affected shaders:
MaxWaves: 84057 -> 83325 (-0.87%); split: +0.07%, -0.94%
Instrs: 6019711 -> 6007338 (-0.21%); split: -0.27%, +0.07%
CodeSize: 32373984 -> 32356152 (-0.06%); split: -0.18%, +0.13%
VGPRs: 236588 -> 238172 (+0.67%); split: -0.05%, +0.72%
SpillSGPRs: 7341 -> 7367 (+0.35%); split: -0.65%, +1.01%
Latency: 61833147 -> 61386674 (-0.72%); split: -0.91%, +0.19%
InvThroughput: 22328993 -> 22364077 (+0.16%); split: -0.16%, +0.32%
VClause: 97803 -> 97832 (+0.03%); split: -0.29%, +0.32%
SClause: 147544 -> 146274 (-0.86%); split: -1.19%, +0.33%
Copies: 606083 -> 593887 (-2.01%); split: -2.27%, +0.26%
Branches: 171344 -> 164203 (-4.17%); split: -4.17%, +0.00%
PreSGPRs: 234116 -> 234922 (+0.34%); split: -0.17%, +0.52%
PreVGPRs: 211250 -> 211374 (+0.06%); split: -0.00%, +0.06%
VALU: 4130666 -> 4132669 (+0.05%); split: -0.11%, +0.16%
SALU: 854007 -> 852585 (-0.17%); split: -0.77%, +0.61%
VMEM: 162718 -> 162755 (+0.02%); split: -0.00%, +0.03%
SMEM: 237856 -> 236323 (-0.64%); split: -0.65%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41348>
2026-05-08 11:49:43 +00:00
Georg Lehmann
0dd50a426e nir: fix fp_math_ctrl in fisnan
Otherwise, nir_opt_algebraic will replace it with false.

Fixes: 63d199a01e ("nir: remove special fp_math_ctrl rules")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41420>
2026-05-08 08:20:16 +00:00
Faith Ekstrand
ccdcbde6dd nak,compiler: Move FromVariants to common code
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41410>
2026-05-08 00:29:09 +00:00
Faith Ekstrand
14fcd9b2f8 nak,compiler: Move AttrList into NAK
AttrList never really made sense as part of AsSlice.  The trait returns
slices so it makes sense that it would return a slice of attributes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41410>
2026-05-08 00:29:08 +00:00
Kenneth Graunke
4018aea9fa nir: Set FRAG_RESULT_DUAL_SRC_BLEND in outputs_written when lowering
Detecting dual source blending is currently annoying: you can either
look at info->fs.color_is_dual_source, or FRAG_RESULT_DUAL_SRC_BLEND
being in the info->outputs_written bitfield.

The former is only set if nir_shader_gather_info runs prior to
nir_lower_io lowering it to FRAG_RESULT_DUAL_SRC_BLEND.

The latter is only set if nir_shader_gather_info runs after the
nir_lower_io lowering.

Just make the IO lowering also set the outputs_written flag so if
you're trying to use FRAG_RESULT_DUAL_SRC_BLEND, you can always
check outputs_written without worrying about pass ordering.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>
2026-05-07 08:29:40 +00:00
Karol Herbst
3df48dec23 nir/lower_cl_images: call nir_progress on every function
llvmpipe supports real function calls, so we need to call nir_progress on
every function, not just the entry point.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41404>
2026-05-07 05:35:12 +00:00
Pavel Ondračka
0f75fa5bfd nir/tests: add partial unroll OOB tests
Assisted-by: OpenAI Codex (GPT-5.5)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41203>
2026-05-06 20:08:13 +00:00
Pavel Ondračka
e517e3da0b nir/tests: add helpers for counting used/unused instructions
Assisted-by: OpenAI Codex (GPT-5.5)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41203>
2026-05-06 20:08:12 +00:00
Pavel Ondračka
959f59b3f0 nir: fix partial loop unroll OOB check for loops not starting at 0
is_access_out_of_bounds() decides whether the residual loop (created
by partial_unroll) will access arrays out of bounds by checking whether
array_length is less than or equal to trip_count. That assumes the
induction variable starts at 0. For example, in the glamor gradient shader
shader-db/shaders/glamor/4.shader_test:

   uniform float stops[18];
   for (i = 1; i < n_stop; i++)
      if (stop_len < stops[i]) break;

trip_count is guessed as 17 from the array indexing, so the residual
loop's index begins at 18, out of bounds for the 18-element array, yet
18 <= 17 is false, so the OOB removal is skipped and the residual loop
is not eliminated.

Correctly consider the start value for the OOB check. This lets glamor
gradient shaders with loops starting at i=1 unroll the same way as i=0
loops.
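
The arithmetic can be checked with a small model. This is a sketch, not the code in is_access_out_of_bounds(): the function names are hypothetical, and it assumes the fix amounts to adding the start value to trip_count before comparing against the array length.

```python
def residual_is_oob_old(array_length, trip_count):
    """Old check: implicitly assumes the induction variable starts at 0."""
    return array_length <= trip_count


def residual_is_oob_fixed(array_length, trip_count, start):
    """Assumed fix: the residual loop's first index is start + trip_count."""
    return array_length <= start + trip_count


# Numbers from the glamor example: stops[18], i starts at 1, trip_count 17.
ARRAY_LENGTH, TRIP_COUNT, START = 18, 17, 1
```

With these numbers the old check evaluates 18 <= 17 and misses the out-of-bounds access, while the start-aware check evaluates 18 <= 18 and detects it. For loops starting at 0 the two checks agree.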

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41203>
2026-05-06 20:08:12 +00:00
Rhys Perry
ec59b59b97 nir: rename nir_src_parent_instr to nir_src_use_instr
sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f`
sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f`
sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f`

There are two kinds of "parent" in relation to a src/def:
- the instruction where the def or src's def is defined
- the instruction which the src is a part of and where the def is used

Clarify that the parent here is where the src's def is used, not where
it's defined.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>
2026-05-06 17:09:22 +00:00