11302 commits

Author SHA1 Message Date
Natalie Vock
d563415100 radv: Add traversal stack size to cache
We just... didn't do this at all??? I have no idea how this didn't blow
up before, given that plenty of apps should generate a traversal shader
that spills (and thus has a large stack size), but it did finally blow
up in function-call related work.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
2026-01-14 14:19:05 +00:00
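
The failure mode described in the commit above (a field that is never written to the cache reads back as a default) can be illustrated with a minimal sketch. The struct, field names, and functions below are hypothetical and are not RADV's actual cache layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-shader metadata; not RADV's real structure. */
struct sketch_shader_info {
   uint32_t code_size;
   uint32_t stack_size; /* stack needed by a traversal shader that spills */
};

/* Serialize every field that consumers read back later. A field that is
 * skipped here would come back as zero from cache_deserialize(). */
static size_t
cache_serialize(const struct sketch_shader_info *info, uint8_t *blob)
{
   memcpy(blob, &info->code_size, sizeof(info->code_size));
   memcpy(blob + sizeof(info->code_size), &info->stack_size,
          sizeof(info->stack_size));
   return sizeof(info->code_size) + sizeof(info->stack_size);
}

static void
cache_deserialize(const uint8_t *blob, struct sketch_shader_info *info)
{
   memset(info, 0, sizeof(*info));
   memcpy(&info->code_size, blob, sizeof(info->code_size));
   memcpy(&info->stack_size, blob + sizeof(info->code_size),
          sizeof(info->stack_size));
}
```

Round-tripping a struct through the two functions is a cheap way to notice a field that was forgotten on one side.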
Alyssa Rosenzweig
e98728de3c radv: cleanup texture builder
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39271>
2026-01-14 08:18:15 +00:00
Samuel Pitoiset
14deea2633 radv: enable SPM for GFX11.5
This adds support for performance counters with RGP on GFX11.5.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39270>
2026-01-13 22:16:40 +00:00
Konstantin Seurer
58a35647e1 radv: Fix crash if proceed comes before initialize
"initialize" can be NULL if the rq_proceed was visited before
rq_initialize.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14626
Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39251>
2026-01-12 22:34:32 +00:00
Samuel Pitoiset
b65cc9d587 ac,radv: sample and set correct shader/memory clocks for RGP
These clocks need to be the clocks at trace time. This shouldn't fix
anything given that RADV sets profile_peak when SQTT is enabled but
better to report it correctly anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39208>
2026-01-12 11:58:43 +00:00
David Rosca
0518784b62 radv/amdgpu: Only wait on queue syncobj when needed
This would always wait on the queue syncobj if there was any other
wait syncobj, but it should only wait after zero submit.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39193>
2026-01-12 10:59:03 +00:00
Dave Airlie
ab9e904f24 radv/coopmat: fix deref stride
This at least fixes the nir debug output to have correct values.

Fixes: 48fc8c8d1c ("radv/nir/lower_cmat: set optimal load/store alignment")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39256>
2026-01-12 10:39:05 +00:00
David Rosca
df4220d500 radv/video: Use different dpb swizzle mode for 10 bit encode
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39189>
2026-01-12 10:18:18 +00:00
David Rosca
587a7aa510 radv: Enable DCC modifiers for multi plane formats on GFX12
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39190>
2026-01-12 09:57:56 +00:00
Samuel Pitoiset
5bcca4a832 radv/spm: use a staging buffer for faster reads on dGPUs
This allows us to move the SPM buffer to VRAM because I think it must
be in VRAM too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195>
2026-01-12 09:35:37 +00:00
Samuel Pitoiset
6863a90486 radv/spm: rework allocating the SPM buffer
For using a staging buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195>
2026-01-12 09:35:37 +00:00
Samuel Pitoiset
c7d0aa6671 radv/sqtt: use a staging buffer for faster reads on dGPUs
This is way faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195>
2026-01-12 09:35:36 +00:00
Samuel Pitoiset
5d430940d2 radv/sqtt: rework allocating the SQTT buffer
For using a staging buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195>
2026-01-12 09:35:36 +00:00
Samuel Pitoiset
1c611c2dac radv/sqtt: use VkCommandBuffer objects for SQTT start/stop sequences
For using a staging buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195>
2026-01-12 09:35:35 +00:00
Samuel Pitoiset
6722a6332a ac,radv,radeonsi: rename num_spm_counters to num_spm_modules
A module can have a different number of counters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199>
2026-01-12 08:10:32 +00:00
Samuel Pitoiset
db02077c8a radv: remove extra instructions after UNREACHABLE
Minor cleanups.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39237>
2026-01-12 07:41:08 +00:00
Samuel Pitoiset
e1e2517664 radv: use UNREACHABLE for illegal texture filter
Found this with a broken CTS test, way easier to crash for isolating
the test case.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39237>
2026-01-12 07:41:08 +00:00
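
As a rough illustration of the pattern in the commit above, the sketch below aborts on an enum value that should never occur instead of silently falling back. The macro and enum names are local stand-ins invented for this example, not Mesa's definitions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Local stand-in for an "unreachable" marker: report and abort. */
#define SKETCH_UNREACHABLE(msg)                                              \
   do {                                                                      \
      fprintf(stderr, "unreachable: %s\n", msg);                             \
      abort();                                                               \
   } while (0)

/* Hypothetical API-side and hardware-side filter values. */
enum sketch_filter { SKETCH_FILTER_NEAREST, SKETCH_FILTER_LINEAR };
enum sketch_hw_filter { SKETCH_HW_POINT = 0, SKETCH_HW_BILINEAR = 1 };

static enum sketch_hw_filter
translate_filter(enum sketch_filter filter)
{
   switch (filter) {
   case SKETCH_FILTER_NEAREST:
      return SKETCH_HW_POINT;
   case SKETCH_FILTER_LINEAR:
      return SKETCH_HW_BILINEAR;
   default:
      /* Crash loudly so a broken test case is easy to isolate. */
      SKETCH_UNREACHABLE("illegal texture filter");
   }
}
```

Aborting in the default case turns an invalid input into an immediate, attributable crash, which matches the motivation given in the commit message.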
Samuel Pitoiset
91e0f8f1e5 radv/rt: fix a compilation warning about uninitialized fields
Just zero-initialize the layout struct to fix the following warning
because radv_use_bvh8() might return FALSE.

../src/amd/vulkan/radv_acceleration_structure.c: In function ‘radv_update_as_gfx12’:
../src/amd/vulkan/radv_acceleration_structure.c:873:70: warning: ‘layout.bounds_offsets’ may be used uninitialized [-Wmaybe-uninitialized]
  873 |       .bounds = state->build_info->scratchData.deviceAddress + layout.bounds_offsets,
      |                                                                ~~~~~~^~~~~~~~~~~~~~~
../src/amd/vulkan/radv_acceleration_structure.c:866:33: note: ‘layout.bounds_offsets’ was declared here
  866 |    struct update_scratch_layout layout;
      |                                 ^~~~~~

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39228>
2026-01-12 07:18:50 +00:00
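
The fix described above amounts to zero-initializing a struct that is only filled on some paths. A minimal sketch, assuming a hypothetical layout struct and fill helper (neither is the real RADV code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scratch layout named in the warning. */
struct sketch_layout {
   uint32_t bounds_offsets;
   uint32_t size;
};

/* Fills the layout on one path only, so the compiler cannot prove that
 * the fields are written before the caller reads them. */
static void
sketch_fill_layout(bool use_bvh8, struct sketch_layout *layout)
{
   if (!use_bvh8)
      return;
   layout->bounds_offsets = 64;
   layout->size = 128;
}

static uint32_t
sketch_bounds_offset(bool use_bvh8)
{
   /* Zero-initialize so every field has a defined value on all paths;
    * this also silences -Wmaybe-uninitialized. */
   struct sketch_layout layout = {0};
   sketch_fill_layout(use_bvh8, &layout);
   return layout.bounds_offsets;
}
```

With the `= {0}` initializer removed, reading `layout.bounds_offsets` on the early-return path would use an indeterminate value.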
Konstantin Seurer
077292f65b radv/bvh: Use box16 nodes when bvh8 is not used
Using box16 nodes trades bvh quality for memory bandwidth which seems to
be roughly equal in performance.

Stats assuming box16 nodes are as expensive as box32 nodes:
Totals from 7668 (79.68% of 9624) affected BVHs:
compacted_size: 951666944 -> 742347648 (-22.00%)
max_depth: 57606 -> 57615 (+0.02%)
sah: 129114796242 -> 129998517775 (+0.68%); split: -0.00%, +0.68%
scene_sah: 188564162 -> 192063633 (+1.86%); split: -0.02%, +1.88%
box16_node_count: 0 -> 3270600 (+inf%)
box32_node_count: 3365707 -> 95100 (-97.17%)

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:36:28 +01:00
Konstantin Seurer
543a88af99 radv/bvh: Add radv_aabb16 and use it for box16 nodes
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:36:19 +01:00
Konstantin Seurer
fefdad9249 radv/rra: Count box16 nodes properly
Otherwise rra won't allocate memory when loading the capture.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:34:18 +01:00
Benjamin Cheng
499d9e2e98 radv/video: Allow aliasing of video images
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39109>
2026-01-09 13:52:56 +00:00
Samuel Pitoiset
edb730f647 radv: fix flushing gang semaphore with SDMA/ACE
If the main CS is SDMA and the gang CS is ACE, this would emit a
SDMA_FENCE packet on ACE which just hangs.

Fixes: b1938901d0 ("radv: Use SDMA fence packet when flushing gang semaphores")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39211>
2026-01-09 09:07:45 +00:00
Natalie Vock
1f6ac3fa93 radv/rt,aco: Always dispatch 1D workgroups for RT
We will swizzle the workgroups ourselves in the next commit.
Removes the need for 1D dispatch workarounds.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>
2026-01-08 19:49:54 +01:00
Natalie Vock
8baa95e4aa radv/rt: Use subgroup invocation for stack index
Workgroup == subgroup anyway, and we don't have the workgroup thread IDs
in RT shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>
2026-01-08 19:49:45 +01:00
Georg Lehmann
a706769a0b nir: move exact bit to nir_fp_math_control
Unifies nir per instruction float control.

In the future this can be split into contract/reassoc/transform
like SPIR-V.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (except SPIR-V)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:57 +00:00
Marek Olšák
1912a00a91 ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers
Only build_id is switched to use the literal 20 instead of SHA1_DIGEST_LENGTH,
because we will increase SHA1_DIGEST_LENGTH to BLAKE3_KEY_LEN.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>
2026-01-07 08:32:33 +00:00
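
To illustrate why a named constant is preferable to a hardcoded digest size, here is a small sketch. The macro is defined locally so the example is self-contained, and the helper names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Defined locally for this sketch; a SHA-1 digest is 20 bytes. */
#define SHA1_DIGEST_LENGTH 20

/* Two hex characters per byte plus the terminating NUL. */
#define SKETCH_SHA1_HEX_LENGTH (SHA1_DIGEST_LENGTH * 2 + 1)

/* Compare two digests without repeating the literal size. */
static bool
sketch_digest_equal(const uint8_t a[SHA1_DIGEST_LENGTH],
                    const uint8_t b[SHA1_DIGEST_LENGTH])
{
   return memcmp(a, b, SHA1_DIGEST_LENGTH) == 0;
}
```

If the constant later changes, every buffer and comparison sized through it follows automatically. Code that genuinely needs 20 bytes has to spell out the literal instead, and the commit message says that is done for build_id.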
Samuel Pitoiset
9f5dd888b6 radv/sqtt: add a comment about the allocation strategy of the SQTT BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39172>
2026-01-07 06:57:29 +00:00
Samuel Pitoiset
ffa343ed05 Revert "radv: allocate the SQTT BO in GTT for faster readback"
This reverts commit da07f1ef3f.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14591
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39172>
2026-01-07 06:57:29 +00:00
Samuel Pitoiset
59dc20262c ac/perfcounter: rename ac_pc_block::num_instances to num_scoped_instances
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39155>
2026-01-06 11:43:21 +00:00
Konstantin Seurer
405c93c665 radv: Optimize BVH4 acceleration structure updates
It is more efficient to compute the child index of the current node
inside the parent node and write the bounds when available. The previous
code could load up to 16 AABBs to compute the new ones. The new code
also only needs 1/7 of the previously used scratch memory. The new code
seems to be around 30% faster (0.5ms) in GOTG on a 6700XT.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39139>
2026-01-05 15:24:54 +00:00
Timur Kristóf
c05d276473 radv: Mitigate GFX6-7 SMEM bug for robust OOB access
Implement a mitigation for VM faults caused by SMEM reading
out of bounds when using robust buffer access.

- Pad uniform and storage buffer allocations with a readonly VM page
- Clamp SMEM offsets that can potentially read past the next page

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>
2026-01-02 23:42:16 +00:00
Timur Kristóf
f866bed0db radv: Mitigate GFX6-7 SMEM bug for NULL and mutable descriptors
Implement a mitigation for VM faults caused by SMEM reading
from NULL descriptors.

In order to satisfy VKD3D-Proton's expectations on mutable
descriptors, we must do this in shader code, it is not
sufficient to use the address of a mapped BO when writing
null descriptors. It is not feasible to mitigate this
in VKD3D-Proton.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>
2026-01-02 23:42:16 +00:00
Timur Kristóf
10a5e5e4f3 radv/amdgpu: Add ability to pad BOs with a read-only VM page
Map the first page of the same BO as read-only after the BO itself
in order to pad each BO with an extra page. This doesn't require
us to allocate any memory.

This is going to be used for a HW bug mitigation.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>
2026-01-02 23:42:16 +00:00
Marek Olšák
bd9206192d radv: use ac_set_sx_downconvert_state_for_mrt
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39093>
2026-01-02 16:46:20 +00:00
Marek Olšák
e64e41f69e radv: fix halved pixel throughput for a few non-blended 16bpp/32bpp formats
Fixed formats:
* R16_SFLOAT
* R16G16_SFLOAT
* R5G5B5A1_UNORM
* A2B10G10R10_UINT

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39093>
2026-01-02 16:46:20 +00:00
Samuel Pitoiset
489550d380 radv: add new drirc radv_prefer_2d_swizzle_for_3d_storage
Because some games perform much better with 2D swizzle for 3D storage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38084>
2026-01-02 09:54:29 +00:00
Samuel Pitoiset
ae99082f96 radv: use 2D swizzle modes for 3D CB render targets when optimal
Much faster because CB is optimal with 2D swizzle modes. This isn't
applied for storage images because it depends on the access pattern,
and benchmark results are very different.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38084>
2026-01-02 09:54:29 +00:00
Samuel Pitoiset
7ee931b98f radv: increase the reserved CS space size for SPM
Because there are many more instances per SPM counter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39099>
2026-01-02 09:27:20 +01:00
Timur Kristóf
3d803d7a2e radv: Use compute copy for emulated formats
These aren't supported by the hardware, so better to use the
compute copy implementation with these formats.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25594>
2026-01-02 04:32:06 +00:00
Timur Kristóf
0638fa5156 radv: Use compute for transfer operations unsupported by SDMA
For transfer queue operations that aren't supported by SDMA,
implement them with ACE (Async Compute Engine) using the pre-
existing compute copy functions.

Add a helper radv_get_pm4_cs that returns the ACE gang CS for
transfer command buffers and the main CS for graphics/compute
command buffers. Use radv_get_pm4_cs to make sure to emit the
compute commands to the correct command stream.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25594>
2026-01-02 04:32:06 +00:00
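
The selection logic the commit attributes to radv_get_pm4_cs can be sketched as below. The types, field names, and the function name are hypothetical stand-ins, and the real helper may differ in signature and details.

```c
#include <stddef.h>

/* Hypothetical queue types and command stream handle. */
enum sketch_queue_type {
   SKETCH_QUEUE_GRAPHICS,
   SKETCH_QUEUE_COMPUTE,
   SKETCH_QUEUE_TRANSFER,
};

struct sketch_cs {
   int id;
};

struct sketch_cmd_buffer {
   enum sketch_queue_type queue_type;
   struct sketch_cs *main_cs; /* SDMA stream on transfer command buffers */
   struct sketch_cs *gang_cs; /* ACE stream that accepts compute packets */
};

/* Pick the stream that can take compute (PM4) commands: the ACE gang CS
 * for transfer command buffers, the main CS otherwise. */
static struct sketch_cs *
sketch_get_pm4_cs(const struct sketch_cmd_buffer *cmd_buffer)
{
   if (cmd_buffer->queue_type == SKETCH_QUEUE_TRANSFER)
      return cmd_buffer->gang_cs;
   return cmd_buffer->main_cs;
}
```

Routing every compute emission through one helper keeps call sites from writing compute packets into the SDMA stream by mistake.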
Timur Kristóf
aa6c8b8953 radv: Add layout argument to transfer_copy_buffer_image.
This argument will be used with gang submit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25594>
2026-01-02 04:32:06 +00:00
Timur Kristóf
72ac874ba6 radv: Remove radv_remove_varyings.
Not needed anymore, since we are now doing this on lowered I/O.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33928>
2026-01-01 21:26:26 -06:00
Timur Kristóf
473ef0b6fb radv: Use nir_remove_outputs with the noop FS.
As opposed to radv_remove_varyings, this one works fine with
mesh shaders as well.

This commit helps depth-only rendering with mesh shaders.

No Fossil DB changes.
(Possibly there are no applicable fossils in our DB.)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33928>
2026-01-01 21:26:04 -06:00
Timur Kristóf
24e0e8980f radv: Don't call nir_link_opt_varyings anymore
The old nir_link_opt_varyings pass is superseded by the
new nir_opt_varyings pass.

Fossil DB stats on Strix Halo (GFX11.5):

Totals from 1291 (1.62% of 79825) affected shaders:
MaxWaves: 37070 -> 37078 (+0.02%)
Instrs: 985094 -> 985326 (+0.02%); split: -0.07%, +0.09%
CodeSize: 5144668 -> 5145384 (+0.01%); split: -0.06%, +0.07%
VGPRs: 68040 -> 68160 (+0.18%); split: -0.12%, +0.30%
Latency: 7923260 -> 7921208 (-0.03%); split: -0.04%, +0.02%
InvThroughput: 1291120 -> 1291008 (-0.01%); split: -0.05%, +0.05%
VClause: 16590 -> 16580 (-0.06%)
SClause: 27360 -> 27376 (+0.06%); split: -0.08%, +0.14%
Copies: 68767 -> 69041 (+0.40%); split: -0.51%, +0.91%
Branches: 19431 -> 19449 (+0.09%)
PreSGPRs: 55679 -> 55704 (+0.04%); split: -0.01%, +0.05%
PreVGPRs: 47787 -> 47926 (+0.29%); split: -0.00%, +0.30%
VALU: 572252 -> 572489 (+0.04%); split: -0.10%, +0.14%
SALU: 139916 -> 139845 (-0.05%); split: -0.10%, +0.05%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33979>
2026-01-01 18:10:05 -06:00
Timur Kristóf
7ee52b7066 radv: Don't call nir_remove_unused_varyings anymore
The nir_remove_unused_varyings pass is not necessary anymore,
because nir_opt_varyings already does the same.

Fossil DB stats on Strix Halo (GFX11.5):

Totals from 3085 (3.86% of 79825) affected shaders:
MaxWaves: 91286 -> 91290 (+0.00%)
Instrs: 1337749 -> 1335687 (-0.15%); split: -0.39%, +0.24%
CodeSize: 6625244 -> 6618148 (-0.11%); split: -0.38%, +0.27%
VGPRs: 140424 -> 140352 (-0.05%); split: -0.07%, +0.02%
Latency: 5028592 -> 5021465 (-0.14%); split: -0.26%, +0.12%
InvThroughput: 669773 -> 671718 (+0.29%); split: -0.24%, +0.53%
VClause: 24431 -> 24407 (-0.10%); split: -0.17%, +0.07%
SClause: 30114 -> 29435 (-2.25%); split: -2.28%, +0.03%
Copies: 99243 -> 101319 (+2.09%); split: -1.32%, +3.41%
Branches: 27445 -> 27599 (+0.56%)
PreSGPRs: 119444 -> 119472 (+0.02%); split: -0.67%, +0.69%
PreVGPRs: 96667 -> 96688 (+0.02%); split: -0.00%, +0.02%
VALU: 741846 -> 744017 (+0.29%); split: -0.14%, +0.44%
SALU: 197068 -> 195256 (-0.92%); split: -0.96%, +0.05%
VMEM: 54067 -> 54053 (-0.03%); split: -0.03%, +0.00%
SMEM: 56565 -> 55131 (-2.54%); split: -2.59%, +0.05%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33979>
2026-01-01 18:03:22 -06:00
Timur Kristóf
43496a6bf9 radv: Don't call nir_compact_varyings anymore
nir_compact_varyings is not necessary anymore, because everything
that it does, is also done by nir_opt_varyings.

The resulting shader stats are slightly negative because
without nir_compact_varyings, the I/O variables in TCS
are sorted less "fortunately".

After discussing this with the RADV team, we decided that
this is an acceptable loss.

Fossil DB stats on Strix Halo (GFX11.5):

Totals from 4577 (5.73% of 79825) affected shaders:
MaxWaves: 130456 -> 130532 (+0.06%); split: +0.06%, -0.00%
Instrs: 3012724 -> 3014809 (+0.07%); split: -0.06%, +0.13%
CodeSize: 15476368 -> 15484724 (+0.05%); split: -0.05%, +0.10%
VGPRs: 227976 -> 227832 (-0.06%); split: -0.14%, +0.07%
Latency: 13230769 -> 13237431 (+0.05%); split: -0.03%, +0.08%
InvThroughput: 1862029 -> 1864167 (+0.11%); split: -0.07%, +0.19%
VClause: 43128 -> 43123 (-0.01%); split: -0.08%, +0.07%
SClause: 61636 -> 61647 (+0.02%); split: -0.01%, +0.02%
Copies: 178023 -> 180309 (+1.28%); split: -0.80%, +2.09%
PreSGPRs: 195628 -> 195683 (+0.03%)
PreVGPRs: 161817 -> 161749 (-0.04%)
VALU: 1828727 -> 1831037 (+0.13%); split: -0.08%, +0.20%
SALU: 336688 -> 336668 (-0.01%); split: -0.01%, +0.00%
VMEM: 99441 -> 99545 (+0.10%); split: -0.00%, +0.11%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33979>
2026-01-01 18:01:34 -06:00
Timur Kristóf
e2fabb4e4a radv: Don't call nir_opt_combine_stores anymore
Also no need for nir_lower_tess_level_array_vars_to_vec.
These should be now handled by nir_opt_vectorize_io.

Fossil DB stats on Strix Halo (GFX11.5):

Totals from 373 (0.47% of 79825) affected shaders:
Instrs: 381930 -> 380786 (-0.30%); split: -0.30%, +0.00%
CodeSize: 1888160 -> 1883644 (-0.24%); split: -0.24%, +0.01%
Latency: 1008755 -> 1008053 (-0.07%); split: -0.08%, +0.01%
InvThroughput: 156523 -> 155275 (-0.80%); split: -0.81%, +0.01%
Copies: 22357 -> 20812 (-6.91%); split: -6.93%, +0.02%
VALU: 240904 -> 239359 (-0.64%); split: -0.64%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33979>
2026-01-01 17:59:48 -06:00
Timur Kristóf
1106b0a1e2 radv: Only run some optimizations when scalarization made progress
These passes are called to clean up after scalarization, so
only call them when scalarization actually made progress.

No Fossil DB changes on Strix Halo (GFX11.5)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33979>
2026-01-01 17:54:55 -06:00
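
The idea of gating cleanup passes on progress can be sketched as follows. The shader struct and pass functions are invented for illustration and do not use the NIR API.

```c
#include <stdbool.h>

/* Hypothetical shader state: counts stand in for real IR. */
struct sketch_shader {
   int vector_ops;   /* operations still in vector form */
   int cleanup_runs; /* how many times the cleanup pass ran */
};

/* Returns true only if it changed the shader. */
static bool
sketch_scalarize(struct sketch_shader *shader)
{
   bool progress = shader->vector_ops > 0;
   shader->vector_ops = 0;
   return progress;
}

static void
sketch_cleanup(struct sketch_shader *shader)
{
   shader->cleanup_runs++;
}

/* Run the cleanup pass only when scalarization actually did something. */
static void
sketch_optimize(struct sketch_shader *shader)
{
   if (sketch_scalarize(shader))
      sketch_cleanup(shader);
}
```

On a shader that is already scalar, the cleanup pass is skipped entirely, which saves compile time without changing the generated code.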
Timur Kristóf
58020fdc01 radv: Scalarize and re-vectorize unlinked shader I/O
Reasons to do this:
- Optimize VS inputs (always unlinked)
- Allow some optimization on unlinked shaders for GPL/ESO
- Prepare for retiring the old linking passes

Fossil DB stats on Strix Halo (GFX11.5):

Totals from 1814 (2.27% of 79825) affected shaders:
MaxWaves: 51232 -> 51434 (+0.39%)
Instrs: 1213430 -> 1212744 (-0.06%); split: -0.20%, +0.14%
CodeSize: 6124996 -> 6122472 (-0.04%); split: -0.17%, +0.13%
VGPRs: 93336 -> 92988 (-0.37%); split: -0.45%, +0.08%
Latency: 5360820 -> 5357501 (-0.06%); split: -0.29%, +0.23%
InvThroughput: 763087 -> 762937 (-0.02%); split: -0.11%, +0.09%
VClause: 22037 -> 22059 (+0.10%); split: -0.19%, +0.29%
SClause: 30971 -> 30884 (-0.28%); split: -0.46%, +0.17%
Copies: 73139 -> 73294 (+0.21%); split: -0.82%, +1.03%
Branches: 20370 -> 20346 (-0.12%)
PreSGPRs: 77373 -> 77404 (+0.04%)
PreVGPRs: 68218 -> 67093 (-1.65%); split: -1.78%, +0.13%
VALU: 662849 -> 663059 (+0.03%); split: -0.09%, +0.12%
SALU: 206745 -> 206781 (+0.02%); split: -0.06%, +0.08%
VMEM: 34230 -> 34250 (+0.06%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33979>
2026-01-01 17:54:31 -06:00