Commit graph

3848 commits

Author SHA1 Message Date
Timothy Arceri
ca88f851c8 ac/nir/lower_tex_coord: basic lower tex coord test
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Tests issue from: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15494

Assisted-by: ChatGPT (GPT-5.5)
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41666>
2026-05-21 00:54:56 +00:00
David Rosca
2c0caee6db ac/info: Print number of VPE instances
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35431>
2026-05-20 13:36:25 +00:00
David Rosca
c5a9bfc210 ac/info: Remove old video codec caps
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35431>
2026-05-20 13:36:25 +00:00
David Rosca
914b4a700c ac/info: Add video codec caps
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35431>
2026-05-20 13:36:24 +00:00
Pierre-Eric Pelloux-Prayer
0ecc6eb2f5 ac/sqtt: add ac_sqtt_update_bo_size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
buffer_size is uint32_t so we must be careful to not overflow it.
radeonsi had code for this but radv doesn't, which means it will
hang if RADV_THREAD_TRACE_BUFFER_SIZE is too big or if buffer_size
is being doubled up to the point it overflows.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41383>
2026-05-20 08:33:22 +00:00
Thong Thai
d7f72e4e5b amd: Build nir files only when with_gfx_compute
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41494>
2026-05-19 22:14:41 +00:00
Karol Herbst
e9c1cce35f nir: remove ffma_old
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:42 +00:00
Karol Herbst
109d93dd98 nir: update ffma helpers to use new opcodes
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:41 +00:00
Karol Herbst
a80e254fd4 ac: use nir_op_ffma_weak
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:35 +00:00
Karol Herbst
570bfe1ee0 ac: handle new float multadd opcodes
This advertizes both ffma and fmad on a couple of chipsets to support
VK_KHR_shader_fma and to improve OpenCL fma performance, where fma is not
optional and the emulation is more expensive than slow fma.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:35 +00:00
Karol Herbst
a9b18f8607 nir: rename ffma to ffma_old
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
2026-05-19 18:13:27 +00:00
Marek Olšák
8a68ae989d ac,radeonsi: remove all frag_coord_xy code
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is now dead code. This code is not used by RADV.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
2026-05-19 15:03:51 +00:00
Marek Olšák
159f22abbe radeonsi: switch to nir_frag_coord_xy_z_w_separate with w_rcp
FYI, ac_nir_lower_ps_early is only used by radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
2026-05-19 15:03:51 +00:00
Marek Olšák
cb66de3beb radv: switch to nir_frag_coord_xy_z_w_separate with w_rcp
Only Cyberpunk affected:

Totals from 10 (0.00% of 202429) affected shaders:
Latency: 337370 -> 337366 (-0.00%)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41461>
2026-05-19 15:03:50 +00:00
Timothy Arceri
45421292f1 ac/nir/lower_tex_coord: update cursor when moving wqm coordinates
We need to update the build cursor when moving a load otherwise
subsequent loads will be using a stale cursor.

Fixes: 0ff1650662 ("ac/nir/lower_tex_coord: fix moving wqm coordinates")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15494

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41629>
2026-05-18 14:03:20 +00:00
Benjamin Cheng
e44cb38c3b ac/parse_ib: Add parsing for variable slice mode
Fixes: d9ba641e28 ("ac: Add variable slice mode interface")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41140>
2026-05-15 08:12:11 +00:00
Georg Lehmann
d256c1f49e ac/nir/lower_tex_coords: fix optimizing cube txd to tex
We need to remove ddx/ddy before doing the cube lowering,
otherwise we insert instructions that break dominance.

Affects Sable.

Fixes: 7d552d71e9 ("ac/nir: optimize txd(coord, ddx/ddy(coord))")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41489>
2026-05-12 18:17:22 +00:00
Marek Olšák
c488d02c5e ac: add ac_shader_args::line_stipple_tex_ena
for later use

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
2026-05-12 14:13:45 +00:00
Marek Olšák
c9c0dce948 aco,radeonsi: use enums for color barycentrics instead of input VGPR indices
The VGPR indices will be dynamic. This replaces hardcoded VGPR indices
with enums in the PS prolog key.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
2026-05-12 14:13:45 +00:00
Marek Olšák
97660e91b5 ac,radeonsi: add helpers to print SPI_SHADER_COL/Z_FORMAT
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
2026-05-12 14:13:45 +00:00
Marek Olšák
1c76351aee ac,radeonsi: add a helper to print PS input VGPR layout
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
2026-05-12 14:13:45 +00:00
Marek Olšák
23f7c87e3e amd/tools: rewrite ac_print_tiling_layouts to print all layouts, including XORs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Instead of expecting just 1 address bit to be flipped by 1 coordinate bit,
expect any address bits to be flipped by 1 coordinate bit. If multiple
coordinate bits flip the same address bit, that means all those coordinate
bits are XOR'd.

v2: also print 128bpp

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41431>
2026-05-11 13:46:41 +00:00
Peyton Lee
b4dde2ee02 amd: validate and expose VPE 2.0.0
Define VPE 2.0.0 version identifiers in amd_family.h.
In ac_gpu_info.c, assign vpe_version only when the detected version is supported.
This ensures userspace only sees a valid VPE version.

Signed-off-by: Peyton Lee <peytolee@amd.com>
2026-05-08 14:56:45 +08:00
Marek Olšák
c6ddfe1a3b amd: add a tool that prints tiling layouts for all shim devices
This prints the swizzle pattern for all non-XOR tiling modes.
It can be used to determine which GPUs have the same tiling.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41405>
2026-05-07 21:41:01 +00:00
Pierre-Eric Pelloux-Prayer
e3beb262bd amd/virtio: fix amdgpu_sw_info_address_prt_wa_control_bit handling
Fixes: 60b406e233 ("ac/gpu_info: query the PRT workaround control bit from libdrm")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41381>
2026-05-07 07:59:00 +00:00
Pierre-Eric Pelloux-Prayer
760ed3e888 amd/virtio: use AMDGPU_VA_MGR_RESERVE_HALF_VA_FOR_PRT
To match what libdrm_amdgpu does in non-virtualized env.

Fixes: e0b5724e85 ("meson: bump required libdrm to 2.4.133 for AMDGPU")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41381>
2026-05-07 07:59:00 +00:00
Rhys Perry
6e06012825 radv,ac: make rembrandt and vangogh cache compatible
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41340>
2026-05-06 17:41:31 +00:00
Rhys Perry
ec59b59b97 nir: rename nir_src_parent_instr to nir_src_use_instr
sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f`
sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f`
sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f`

There are two kinds of "parent" in relation to a src/def:
- the instruction where the def or src's def is defined
- the instruction which the src is a part of and where the def is used

Clarify that the parent here is where the src's def is used, not where
it's defined.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>
2026-05-06 17:09:22 +00:00
Samuel Pitoiset
6699eecb6f ac/surface: allow to select hybrid/block memcpy path for host copies
Based on my profiling, the hybrid mode performs better (+~20%) with
block compressed formats, so let's use that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41019>
2026-05-05 17:53:17 +00:00
Georg Lehmann
0ff1650662 ac/nir/lower_tex_coord: fix moving wqm coordinates
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Even if they are in the same block, we might still need to move the
source instructions if they are otherwise after our insert location.
This can happen in the case where we insert strict_wqm_coord before
terminate_if.

Fixes: ac33f82d54 ("ac/nir/lower_tex_coords: move input loads instead of cloning them")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41336>
2026-05-04 15:09:46 +00:00
Pierre-Eric Pelloux-Prayer
2267c14803 ac/info: add gfx12.1 identification
Not the full support yet, just the id part so the family/gfx_level
fields are set to the proper values.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41264>
2026-05-04 09:38:31 +02:00
Marek Olšák
f583f6e717 nir: use nir_build_frag_coord everywhere
nir_build_frag_coord generates the correct sysval loads based on NIR
options. nir_load_frag_coord shouldn't be used directly because drivers
don't have to support it.

v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:01 +00:00
Marek Olšák
ac33f82d54 ac/nir/lower_tex_coords: move input loads instead of cloning them
This stops leaving dead input loads behind.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41173>
2026-05-01 02:37:16 +00:00
Marek Olšák
ad4eaaae68 ac/nir: factor out ac_nir_lower_tex_coords from ac_nir_lower_image_tex
This just separates tex coord lowering into a new pass.

The gfx_level parameter is now unused in ac_nir_lower_image_tex, but I'm
keeping it because it will be used in the future.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41173>
2026-05-01 02:37:16 +00:00
Samuel Pitoiset
a4668733e5 ac/nir: add a pass to fixup SMEM loads with NULL PRT pages
Only global/SSBO SMEM loads are considered because for UBOs the "LOW"
VA will be set in descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
60b406e233 ac/gpu_info: query the PRT workaround control bit from libdrm
libdrm splits the HIGH address space in two equal parts for GPUs that
are affected by the SMEM loads with NULL PRT page.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
978605fd06 ac/gpu_info: add has_smem_with_null_prt_bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
ecfda339ca ac/gpu_info: store more addr space info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
xueyuli2
da7ed1c576 amd/virtio: fix bo use-after-free race condition in amdvgpu_bo_free
In amdvgpu_bo_free(), when the reference count drops to 0, vdrm_flush()
is called before removing the bo from the handle_to_vbo hash table.

Since vdrm_flush() is a time-consuming operation and is executed outside
of the handle_to_vbo_mutex lock, another thread calling amdvgpu_bo_import()
can concurrently find this bo in the hash table, increment its refcount,
and attempt to use it. Once vdrm_flush() finishes, amdvgpu_bo_free()
proceeds to remove the bo and call free(), leaving the importing thread
with a dangling pointer, which leads to a use-after-free or double free
crash.

To fix this race condition, we must remove the bo from the hash table
under the lock first. After the bo is safely unlinked and the lock is
released, we can then perform the time-consuming vdrm_flush() and the
actual memory release.

Signed-off-by: zhaqian <zhaqian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41146>
2026-04-30 08:41:50 +00:00
Rhys Perry
0249fcfbb6 radv: assert there is no padding in cache keys
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:08 +00:00
Rhys Perry
5ee0935861 ac: move has_cs_regalloc_hang_bug to ac_compiler_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:08 +00:00
Rhys Perry
e40457b136 ac: move lds_size_per_workgroup to ac_compiler_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:08 +00:00
Samuel Pitoiset
df3de4acbb ac,radv,radeonsi: replace mesh_fast_launch_2 by gfx_level checks
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41204>
2026-04-28 06:50:43 +00:00
Timothy Arceri
a42c55da46 amd/radeonsi: dont clamp packed user varyings
ac_nir_optimize_outputs() might pack user varyings into the color
built-ins. If this happens we skip adding clamping to the
components that contain the user varying.

This change also fixes a second bug where a color built-in can be
packed into a non-color slot and was no longer being clamped.

Fixes: 3777a5d7 ("radeonsi: assign param export indices before compilation")
Closes: #14443

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40594>
2026-04-27 22:59:58 +00:00
Marek Olšák
0684976de8 ac/nir: add ac_nir_assign_fs_input_locations to set PS input locations in stone
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
No intended functional change.

This prevents possible breakage due to DCE removing input loads followed
by nir_shader_gather_info updating input masks and changing the result of
ac_nir_get_io_driver_location after PS input register contents are already
determined.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41175>
2026-04-27 21:05:53 +00:00
Suresh Guttula
71508d90aa ac: Add vcn_5_3_0 support
Enable hardware decode/encode capabilities for VCN 5.3.0 by
configuring the supported codec list. This allows vainfo to
properly enumerate available codec capabilities.

Signed-off-by: Suresh Guttula <suresh.guttula@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41202>
2026-04-27 17:13:18 +00:00
Benjamin Cheng
922d04c9a5 ac/vcn: Rename VCN5 swizzle mode to GFX12
The original naming is inaccurate, it depends on the GFX version, not
VCN.

Signed-off-by: Suresh Guttula <suresh.guttula@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41202>
2026-04-27 17:13:18 +00:00
Julia Zhang
0e36d7112c radv: set TMZ bit in sdma_copy packet
Pass secure and set TMZ bit in sdma_copy packet for protected image

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Marek Olšák
bfb6c41b64 amd: remove unnecessary and transitive #includes
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reported by clang tools.
See: https://clangd.llvm.org/guides/include-cleaner

struct ac_cmdbuf had to be moved to ac_cmdbuf_base.h because we can't
include ac_cmdbuf.h->sid.h->amdgfxregs.h in radeon_winsys.h for r300.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41091>
2026-04-24 21:53:07 +00:00
Rhys Perry
91d555c2cb radv: lower indirect derefs after linking
Scratch access isn't very optimizable, so more stores are optimized away
if we lower indirect derefs after both linking and radv_optimize_nir.

fossil-db (navi21):
Totals from 1264 (0.62% of 202427) affected shaders:
Instrs: 1504703 -> 1504708 (+0.00%); split: -0.02%, +0.02%
CodeSize: 8031388 -> 8031020 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 1865 -> 1869 (+0.21%)
Latency: 12106362 -> 12106464 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 4056269 -> 4056044 (-0.01%); split: -0.01%, +0.00%
VClause: 13927 -> 13940 (+0.09%)
SClause: 32382 -> 32396 (+0.04%); split: -0.03%, +0.08%
Copies: 188004 -> 187897 (-0.06%); split: -0.17%, +0.11%
Branches: 39045 -> 39052 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 79885 -> 79814 (-0.09%); split: -0.11%, +0.02%
VALU: 1072639 -> 1072532 (-0.01%); split: -0.01%, +0.00%
SALU: 187317 -> 187375 (+0.03%); split: -0.11%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31265>
2026-04-24 11:01:03 +00:00