Commit graph

18726 commits

Author SHA1 Message Date
Natalie Vock
9c8a17e172 aco/lower_to_hw_instr: Lower calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
3667a7b687 aco: Add call info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
af812862b7 aco: Add call-related program/block properties
Indicates various properties about calls: Whether a program is an
indirect callee, whether a program or block contains function calls, and
whether registers used by a caller need to be preserved.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
917a98b722 aco: Add ABI and Pseudo CALL format
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
e850650f92 aco: Add function call attributes
ACO needs RADV to set certain attributes on NIR functions to help with
compilation of function calls.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
d18b438832 aco: Add RegisterDemand::operator!=
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
David Rosca
e1ea3f8bbf radv/video: Always use OBU_FRAME in AV1 encode
Saves couple bytes per frame.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37176>
2025-09-15 14:12:17 +00:00
Qiang Yu
310cfda034 ac/surface: add ac_compute_surface_modifier
Used by radeonsi to export existing texture modifier.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31658>
2025-09-15 09:39:19 +00:00
Qiang Yu
98cd68ec05 ac/surface: add radeonsi exported modifiers to supported list
radeonsi will export texture with these modifiers.

piglit tests:
spec@ext_image_dma_buf_import@ext_image_dma_buf_import-export-tex
spec@ext_image_dma_buf_import@ext_image_dma_buf_import-tex-modifier

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31658>
2025-09-15 09:39:19 +00:00
Qiang Yu
1b0ec56c40 ac/surface: refine supported modifier list for multi block size
Reference KMD convert_tiling_flags_to_modifier(). And we are
going to add 4K block size.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31658>
2025-09-15 09:39:19 +00:00
Georg Lehmann
5962659c85 radv: remove uses_rt from radv_shader_info
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann
a2d3cbac2a radv: determine subgroup/wave size early
This means we can actually implement varying subgroup size correctly.
It also means that we implement the implicit SPIR-V 1.6 full subgroups
requirement in compute shaders with cswave32/rtwave32.

In the future it will also allow more optimizations that use the subgroup size.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>

The only somewhat complex case here is GFX10 geometry shaders, if gewave32 is
used. We then only know the subgroup size when is_ngg is decided, as legacy
GS doesn't support wave32.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann
76a502d75a ac/nir: set subgroup size for gs copy shader
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann
c4cdbee2e6 radv: remove unused ballot_bit_size from shader info
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann
2cda56e8b7 ac/llvm: remove unused ballot size
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:20 +00:00
Georg Lehmann
f83a6e6389 radv: add varying subgroup size to shader stage key
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:20 +00:00
Yonggang Luo
7db518cfe4 aco: Fixes warning: function get_branch_target/to_clrx_device_name defined but not used
../../src/amd/compiler/aco_print_asm.cpp:156:1: warning: 'bool aco::{anonymous}::get_branch_target(char**, aco::Program*, const std::vector<bool>&, char**)' defined but not used [-Wunused-function]
  156 | get_branch_target(char** output, Program* program, const std::vector<bool>& referenced_blocks,
      | ^~~~~~~~~~~~~~~~~
../../src/amd/compiler/aco_print_asm.cpp:105:1: warning: 'const char* aco::{anonymous}::to_clrx_device_name(amd_gfx_level, radeon_family)' defined but not used [-Wunused-function]
  105 | to_clrx_device_name(amd_gfx_level gfx_level, radeon_family family)
      | ^~~~~~~~~~~~~~~~~~~

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Timur Kristóf
c183eb5bc8 radv: Flush L2 before CP DMA copy/fill when CP DMA doesn't use L2
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
In case the source or destination were previously written
through L2, we need to writeback L2 to avoid the CP DMA accessing
stale data.

However, as the CP DMA doesn't write L2 either, an invalidation
is also needed to make sure other clients don't access stale data
when they read it through L2 after the CP DMA is complete.

Doing an invalidation before the CP DMA operation should take
care of both.

Additionally, radv_src_access_flush also invalidates L2 before
the copied data can be read.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36820>
2025-09-13 05:18:28 +00:00
Timur Kristóf
8ef2f492e2 radv: Add comment to document CP DMA prefetch
Explain which caches (MALL, L2) the prefetch works with.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36820>
2025-09-13 05:18:28 +00:00
Val Packett
7f556805c1 radv: detect platform:virtio-mmio devices for virtgpu native context
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
VirtIO devices can be configured as platform devices instead of PCI,
which is especially common in microVM projects like libkrun.

Let's allow RADV to probe MMIO virtgpu devices.

Signed-off-by: Val Packett <val@invisiblethingslab.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37281>
2025-09-12 12:56:46 +00:00
Karol Herbst
4f94ca6c96 ac/llvm: fix get_global_address for global atomics
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
They do not have an ACCESS index.

Fixes: 9a33c03654 ("ac/llvm: port load_smem_amd behavior to load_global_amd")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37292>
2025-09-12 08:55:39 +00:00
Samuel Pitoiset
a52483d9e7 radv: fix capture/replay with sampler border color
The border color index must be captured and returned to the app,
otherwise it's broken if samplers aren't replayed in-order.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37291>
2025-09-12 06:51:51 +00:00
Samuel Pitoiset
a658be8a24 radv: get NIR options after initializing the physical device cache key
Otherwise, radv_split_fma isn't considered.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13878
Fixes: 7304423b5c ("radv: mark RADV_DEBUG=splitfma as deprecated")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37293>
2025-09-12 06:18:26 +00:00
Eric Engestrom
6ecb7f11b7 radv/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37302>
2025-09-11 16:02:38 +00:00
Samuel Pitoiset
b00b8c763b radv: disable radv_disable_hiz_his_gfx12 for Mafia Definition Edition
This should be properly fixed now. Also remove this drirc option which
should no longer be needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset
ec09ac1501 radv: switch to the full HiZ workaround by default on GFX12
The full HiZ workaround is the only one that fixes the issue reliably.

Sadly, the performance results are mixed and sometimes it hurts. To
maintain performance, RADV will opt-in by selecting which workarounds
to apply from drirc.

As we can't know the full list of games that will be affected by a
potential performance regression, users are encouraged to try with
RADV_GFX12_HIZ_WA (see Mesa documentation for more explanations).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset
5f8f4686bf radv: replace RADV_GFX12_HIZ_WA by a drirc option
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset
0a2ef363a8 radv: report an message when RADV_GFX12_HIZ_WA value is invalid
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset
ec87f1338f radv: emit more push shader registers on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
They are supposed to be slightly faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37256>
2025-09-11 06:47:40 +00:00
Samuel Pitoiset
9039f33a8d Revert "radv: handle fbfetch output after binding graphics shaders"
This is actually wrong because if radv_handle_fbfetch_output() triggers
a decompression pass and graphics shaders (ESO) are saved/restored
they won't be updated because radv_bind_graphics_shaders() was called
before.

This fixes a very recent regression that I noticed while implementing
a new extension.

This reverts commit 9b912f00c7.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37194>
2025-09-11 06:22:20 +00:00
Samuel Pitoiset
b69b953973 radv: add RADV_DEBUG=bo_history
This dumps the BO history to /tmp/radv_bo_history.log after each BO
operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37062>
2025-09-11 06:03:15 +00:00
Samuel Pitoiset
1da270fb35 radv/amdgpu: add more helpers for managing virtual BOs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
All these new helpers will make the SMEM PRT workaround better
organized.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37193>
2025-09-10 14:50:25 +00:00
Samuel Pitoiset
3c4168a3cc radv/amdgpu: return OOM device when BO mapping fails
It's more appropriate than VK_ERROR_UNKNOWN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37193>
2025-09-10 14:50:24 +00:00
Konstantin Seurer
7c9e945460 radv,vulkan: Avoid a useless barrier in radv_update_bind_pipeline
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36982>
2025-09-10 08:35:50 +00:00
Konstantin Seurer
a35dfab281 radv: Use vk_barrier_compute_w_to_compute_r more
vk_barrier_compute_w_to_compute_r shows up in rgp captures and is less
code.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36982>
2025-09-10 08:35:50 +00:00
Samuel Pitoiset
c739d836f7 radv: exclude dynamic vertex input stride for the late scissor workaround
RADV_DYNAMIC_VERTEX_INPUT_BINDING_STRIDE doesn't emit any context
registers, so it can be excluded for the late scissor workaround to
avoid re-emitting scissors all the time it's dirty.

This fixes a performance regression noticed with Cyberpunk on Vega10,
but other games are likely affected too. The late scissor workaround is
only applied on Raven/Vega10.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13828
Fixes: d7f401c2bb ("radv: bind the vertex binding strides like a normal dynamic state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37252>
2025-09-10 07:09:48 +00:00
abdelhadi
3a41644165 aco, radv: remove line duplicate
Signed-off-by: abdelhadi <abdelhadims@icloud.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37243>
2025-09-10 06:34:43 +00:00
Rhys Perry
e2181744c2 aco/tests: add barrier-to-waitcnt tests
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
0f32b573a4 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%

fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
ac882985c0 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
145b178de2 aco: fix workgroup-scope barrier between vmem and lds
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
02718fd4c5 aco: use a separate event for sendmsg_rtn
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
5812c2ea89 aco: update waitcnt events for exports
Include primitive, dual source blend and POS4 exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
711c023b55 aco: remove waitcnt code for POPS
We now insert barriers around these instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
005694fe1f aco: remove waitcnt code for SMEM stores
These were removed in GFX10.3 and we haven't used them in a while.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
20cd5cf5f7 aco: delay barrier waitcnt until they are needed
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
843acfa50b aco: add a separate barrier_info for release/acquire barriers
These can wait for different sets of accesses.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
6c446c2f83 aco: refactor waitcnt pass to use barrier_info
Currently there's just barrier_info_all, but more will be added later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
21332609b9 aco: don't move acquire barriers before interlock begin
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
0ee1c137f9 aco: don't move release barriers after interlock end
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00