Commit graph

10188 commits

Author SHA1 Message Date
Samuel Pitoiset
2ebfa64be7 radv: add radv_disable_hiz_his_gfx12 and enable for Mafia Definitive Edition
This is a workaround for random GPU hangs with HiZ/HiS on GFX12
because the correct fix is complex and it will take time to be
implemented properly.

Mafia Definitive Edition is the first known game affected by this.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13222
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35182>
2025-05-28 07:20:26 +00:00
Samuel Pitoiset
63758bc093 radv: fix capture/replay with sparse images and descriptor buffer
The sparse image VA needs to be returned to the application for replay.

Reported by Baldur.

VKCTS has coverage but it doesn't verify this yet.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35162>
2025-05-27 19:30:18 +00:00
Konstantin Seurer
36c9b66ee2 radv/bvh: Fix updating empty bvhs
valid_child_count_minus_one is 15 for box nodes without children, so
every child was considered valid, which made the code read invalid data
and use it for addressing.

Fixes: 2d48b2c ("radv: Use subgroup OPs for BVH updates on GFX12")
Closes: #13217
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35119>
2025-05-26 12:03:21 +00:00
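The empty-node case fixed by 36c9b66ee2 can be illustrated with a small sketch. This is not the driver's shader code: the helper name and the constants are hypothetical, the sentinel value 15 is taken from the commit message, and the limit of 8 children per box node is assumed from the "8 invocations for every internal node" wording of 2d48b2cb47.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CHILDREN 8u         /* assumption: 8 children per box node */
#define EMPTY_NODE_SENTINEL 15u /* value quoted in the commit message */

/* Hypothetical helper: decode the number of valid children of a box node.
 * Without the sentinel check, 15 + 1 would exceed MAX_CHILDREN and every
 * child slot would be treated as valid. */
uint32_t valid_child_count(uint32_t valid_child_count_minus_one)
{
   assert(valid_child_count_minus_one <= EMPTY_NODE_SENTINEL);

   if (valid_child_count_minus_one == EMPTY_NODE_SENTINEL)
      return 0; /* empty node: no child may be read */

   uint32_t count = valid_child_count_minus_one + 1;
   return count < MAX_CHILDREN ? count : MAX_CHILDREN;
}
```

A caller would loop over `valid_child_count(...)` children only, so an empty BVH touches no child data at all.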
Lionel Landwerlin
87e57a9bb2 radv: rename radv_lower_terminate_to_discard for wider use
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35111>
2025-05-26 05:52:30 +00:00
Samuel Pitoiset
a4a59a2504 radv: eliminate useless mov(const) after lowering all IO to scalar
This eliminates useless mov copies introduced by nir_lower_io_to_scalar
and this might be useful for nir_opt_varyings which optimizes
constant varyings.

It also uncovers a bug with mesh shaders and constant varyings that is
fixed by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35081.

No fossils-db change on NAV21.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35090>
2025-05-23 05:56:31 +00:00
David Rosca
1608bc20b5 radv/video: Limit 10bit H265 decode support to stoney and newer
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12132
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35105>
2025-05-22 09:20:51 +00:00
David Rosca
1f795ec226 radv/video: Remove carrizo workaround from VCN decode
Carrizo has UVD so this can never be true.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35105>
2025-05-22 09:20:50 +00:00
David Rosca
63e952ff2c radv/video: Support encoding multiple slices
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12285
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35070>
2025-05-22 08:40:17 +00:00
Stéphane Cerveau
72a1c4ffb2 radv/debug: use common path for dmesg and tail
popen does not find the command when the full path
is not specified.

Chose /bin as the main location for both dmesg and tail to keep
compatibility with old distributions and possible embedded rootfs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35087>
2025-05-22 07:05:03 +00:00
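As a rough illustration of 72a1c4ffb2, the sketch below runs dmesg and tail through popen() with absolute /bin paths. The helper names, the buffer sizes and the "-n" line-count argument are assumptions made for the example; the actual command line used by the driver is not shown in the commit message.

```c
#define _POSIX_C_SOURCE 200809L /* for popen()/pclose() */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a shell command that uses absolute paths.
 * Returns the length reported by snprintf(). */
int build_dmesg_tail_cmd(char *buf, size_t size, unsigned lines)
{
   assert(buf && size > 0);
   return snprintf(buf, size, "/bin/dmesg | /bin/tail -n %u", lines);
}

/* Hypothetical helper: copy the last `lines` kernel log lines to `out`.
 * Returns 0 on success and -1 on failure. */
int dump_dmesg_tail(FILE *out, unsigned lines)
{
   char cmd[64];
   int len = build_dmesg_tail_cmd(cmd, sizeof(cmd), lines);
   if (len < 0 || (size_t)len >= sizeof(cmd))
      return -1;

   FILE *p = popen(cmd, "r");
   if (!p)
      return -1;

   char line[256];
   while (fgets(line, sizeof(line), p))
      fputs(line, out);

   return pclose(p) == -1 ? -1 : 0;
}
```

Calling `dump_dmesg_tail(stderr, 60)` would print the last 60 kernel log lines, provided both tools exist under /bin and the kernel log is readable.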
Samuel Pitoiset
25eb836eec radv: fix CP DMA with NULL PRT pages on GFX8-9
On GFX8-9 (starting from Polaris10), CP DMA is broken with NULL PRT
pages. It doesn't read 0 and doesn't discard writes which can cause
GPU hangs.

Fix that by always using the compute path when a BO is sparse.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12828
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35071>
2025-05-21 09:41:23 +00:00
Samuel Pitoiset
6528bb76b1 radv: stop using GDS for emulated prims gen/xfb queries on GFX11-GFX11.5
Use the same path as GFX12 using SSBO atomics because performance
should be equal or slightly better due to less synchronization.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35017>
2025-05-21 08:48:04 +02:00
Samuel Pitoiset
2812efd7ad radv: declare and emit NGG_QUERY_BUF_VA on GFX11-GFX11.5
This user SGPR is used to pass the query buffer VA for emulated queries.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35017>
2025-05-21 08:46:12 +02:00
Samuel Pitoiset
439baafe5e radv: increase size of the buffer for emulated queries on GFX12
This increases this buffer by 20 bytes but it will be re-used for
emulated queries on GFX11-GFX11.5 in order to remove the GDS path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35017>
2025-05-21 08:46:12 +02:00
Samuel Pitoiset
98c1753214 radv: stop reserving NGG streamout counters
NGG streamout counters use GDS_OA, not GDS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35017>
2025-05-21 08:46:12 +02:00
Samuel Pitoiset
3922cc6fbd radv: rename a variable in gfx10_copy_shader_query_ace()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35017>
2025-05-21 08:46:12 +02:00
Samuel Pitoiset
266c3bdeaf radv: adjust comments describing GDS needs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35017>
2025-05-21 08:46:12 +02:00
Samuel Pitoiset
4d1fcd75f9 radv: fix non-indexed draws with primitive restart enable
On GFX11+, DISABLE_FOR_AUTO_INDEX=1 automatically disables primitive
restart enable for non-indexed draws.

On GFX10-GFX10.3 the hw considers primitive restart enable for
non-indexed draws and the driver must disable it explicitly.

GFX9 and older gens aren't affected but applying the change for them
simplifies the implementation.

To fix that, move emitting primitive restart enable to draw time
because it needs to know if the draw is indexed or not.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13037
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34996>
2025-05-20 13:57:35 +00:00
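The draw-time decision described in 4d1fcd75f9 can be modelled roughly as follows. The struct, its fields and the function names are hypothetical and only mirror the logic in the commit message (restart is effective only for indexed draws); register programming is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state tracked per command buffer. */
struct draw_state {
   bool primitive_restart_enable; /* value set through the API */
   int last_emitted;              /* -1: nothing emitted yet, else 0 or 1 */
   unsigned emit_count;           /* stands in for register writes */
};

/* Primitive restart only takes effect for indexed draws. */
bool effective_primitive_restart(const struct draw_state *s, bool indexed)
{
   return s->primitive_restart_enable && indexed;
}

/* Called at draw time, once it is known whether the draw is indexed.
 * Re-emits the state only when the effective value changed. */
bool emit_primitive_restart_at_draw(struct draw_state *s, bool indexed)
{
   assert(s);

   bool enable = effective_primitive_restart(s, indexed);
   if (s->last_emitted != (int)enable) {
      s->last_emitted = enable;
      s->emit_count++;
   }
   return enable;
}
```

Deferring the emit to draw time costs one comparison per draw, and in this model the state is rewritten only when an indexed draw follows a non-indexed one or vice versa.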
Samuel Pitoiset
7ce7009ee4 radv/meta: move and rename get_r32g32b32_format()
For future work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34971>
2025-05-20 13:30:07 +00:00
Samuel Pitoiset
b7ce612743 radv: add vk_format_is_96bit()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34971>
2025-05-20 13:30:07 +00:00
Samuel Pitoiset
c22d86e844 radv: fix missing texel scale for unaligned linear SDMA copies
texel_scale was 0 which caused GPU hangs for unaligned linear copies.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13195
Fixes: 4b73d7e817 ("radv: fix SDMA copies for linear 96-bits formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35047>
2025-05-20 13:06:51 +00:00
Samuel Pitoiset
d7099675b6 radv: expose VK_EXT_zero_initialize_device_memory unconditionally
This extension doesn't require AMDGPU to clear VRAM on allocations by
default. RADEON_FLAG_ZERO_VRAM exists since the beginning.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35021>
2025-05-20 12:43:59 +00:00
Konstantin Seurer
97f71420df radv/bvh: Fix comment
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
Konstantin Seurer
100616859e radv/bvh: Remove some unused variables
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
Konstantin Seurer
f00b25331a radv/bvh: Make sure the AABB is written before internal_ready_count
Otherwise, the next stage can read garbage. Fixes flickering in The
Witcher 3.

Closes: #13145
Closes: #13196
Fixes: 2d48b2c ("radv: Use subgroup OPs for BVH updates on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
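The ordering requirement from f00b25331a (write the AABB first, then bump internal_ready_count) is the usual publish pattern. Below is a CPU-side analogy using C11 atomics, assuming a single writer per node; the real fix lives in a GPU shader with its own memory barriers, and every type and function name here is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct aabb {
   float min[3];
   float max[3];
};

/* Hypothetical node layout for the analogy. */
struct internal_node {
   struct aabb bounds;
   atomic_uint internal_ready_count;
};

void node_init(struct internal_node *n)
{
   assert(n);
   atomic_init(&n->internal_ready_count, 0);
}

/* Writer: store the data first, then publish it with a release increment. */
void publish_bounds(struct internal_node *n, const struct aabb *b)
{
   assert(n && b);
   n->bounds = *b;
   atomic_fetch_add_explicit(&n->internal_ready_count, 1, memory_order_release);
}

/* Reader: the acquire load pairs with the release above, so once the
 * counter reaches `expected` the bounds are guaranteed to be visible. */
bool try_read_bounds(struct internal_node *n, uint32_t expected, struct aabb *out)
{
   assert(n && out);
   if (atomic_load_explicit(&n->internal_ready_count, memory_order_acquire) < expected)
      return false;
   *out = n->bounds;
   return true;
}
```

If the counter were incremented before the store to `bounds`, a reader could pass the check and still copy stale data, which matches the "next stage can read garbage" symptom.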
Konstantin Seurer
f42d52f922 radv: Flush L2 on GFX12 when binding an update pipeline
This is just for completeness since the flush above is probably
sufficient.

Fixes: 2d48b2c ("radv: Use subgroup OPs for BVH updates on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34938>
2025-05-19 14:08:33 +00:00
Hans-Kristian Arntzen
e674823d55 radv: Consider that DGC might need shader reads of predicated data.
As with the indirect draw barrier, conditional rendering access needs
similar fixups.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34956>
2025-05-15 06:14:46 +00:00
Samuel Pitoiset
3ca2f71f3d radv: fix conditional rendering with DGC and non native 32-bit predicate
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.

The problem arises when conditional rendering is enabled in the
stateCommandBuffer, which is used by preprocess() and in which the
execute() is also recorded. If the preprocess() is recorded in a different
cmdbuf which is submitted before the cmdbuf that contains execute(),
the fallback (i.e. alloc + COPY_DATA) will be performed afterwards. This
would cause the predicate value to always be 0.

To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.

This fixes a very weird corner case with vkd3d-proton.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
e2625fa9ca radv: fix fetching conditional rendering state for DGC preprocess
This state must be fetched from the stateCommandBuffer, not from the
current cmdbuf which executes the preprocess().

Partial fix for https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
2025-05-15 05:51:04 +00:00
Samuel Pitoiset
69ff204422 radv: remove the optimization for equal immutable samplers
This optimization used to optimize the allocated space for descriptors
when immutable samplers are equal. However, this was basically broken:

- descriptor copies were broken for combined image sampler (or sampler)
  with equal immutable samplers because 96 bytes were copied instead of
  64 bytes (cf. the linked ticket). This could be fixed but it's not
  worth it.
- the value returned by vkGetDescriptorLayoutSupport() was broken, it
  should have been 96 with no immutable samplers (or when they aren't
  equal)

This optimization was also not applied for descriptor buffers which is
the default for vkd3d-proton and Zink. DXVK doesn't use db but it
doesn't use immutable samplers, so basically only native vulkan games
would be concerned.

Note that immutable samplers would still be inlined in shaders when there
is no indirect access, which should cover 99.9% of use cases.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11165
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34928>
2025-05-13 16:27:22 +00:00
Samuel Pitoiset
9a07ccbc89 radv: fix emitting dynamic viewports/scissors when the count is static
In a scenario where the viewports/scissors are a dynamic state but the
count is static (ie. updated when a graphics pipeline is bound), the
driver wasn't considering that and it was re-emitting the previous
number of viewports/scissors.

This fixes a rendering issue with Blender.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34921>
2025-05-13 16:08:14 +00:00
David Rosca
5fee04bcae radv/video: Use ac_uvd_alloc_stream_handle
ac_uvd_alloc_stream_handle tries to avoid collisions in the case
when PID is not unique (eg. in sandboxes like Flatpak).

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12607
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34807>
2025-05-13 09:36:48 +00:00
Natalie Vock
e32a90b57c radv,driconf: Add radv_force_64k_sparse_alignment config
Needed by DOOM: The Dark Ages.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34944>
2025-05-13 07:58:03 +00:00
Samuel Pitoiset
4b73d7e817 radv: fix SDMA copies for linear 96-bits formats
The hardware requires a power of two bpe. To do that, the driver
needs to adjust the pitch/offset/extent based on a texel scale factor
which only applies to 96-bits formats.

This fixes new VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34927>
2025-05-13 06:15:55 +00:00
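The texel scale idea from 4b73d7e817 can be sketched as below: a 96-bit texel (12 bytes) is treated as three 32-bit elements so that the bytes per element become a power of two. The struct, its fields and the assumption that pitch, offset and width are expressed in texels are all hypothetical; the commit message only says that pitch/offset/extent are adjusted. The assert reflects the follow-up c22d86e844, where a texel_scale of 0 caused GPU hangs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one side of a linear copy.
 * pitch, offset_x and width are in texels (assumption). */
struct linear_copy_params {
   uint32_t bpe; /* bytes per element */
   uint32_t pitch;
   uint32_t offset_x;
   uint32_t width;
};

/* 96-bit formats are split into three 32-bit elements; every other
 * format keeps a scale of 1. The scale must never be 0. */
uint32_t texel_scale_for_bpe(uint32_t bpe)
{
   return bpe == 12 ? 3 : 1;
}

void apply_texel_scale(struct linear_copy_params *p)
{
   assert(p);

   uint32_t scale = texel_scale_for_bpe(p->bpe);
   assert(scale != 0);

   p->bpe /= scale; /* 12 -> 4, a power of two */
   p->pitch *= scale;
   p->offset_x *= scale;
   p->width *= scale;
}
```

For a 12-byte format with a pitch of 100 texels, the copy is then described as 4-byte elements with a pitch of 300 elements, which covers the same bytes in memory.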
Konstantin Seurer
2d48b2cb47 radv: Use subgroup OPs for BVH updates on GFX12
This patch changes the update code to launch 8 invocations for every
internal node. The internal nodes update their child leaf nodes using
the geometry index and primitive index stored inside the primitive node.

Processing 8 child nodes in parallel is faster than looping over them.
Moving to one dispatch that updates all nodes in one go lets us get rid
of atomics and will also enable updatable BVHs to use pair compression.

Improves Elden Ring (high settings, max RT settings, 1080p) by around
10%.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:31 +02:00
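A CPU-side model of the per-node work in 2d48b2cb47 might look like this: eight lanes each hold one child's bounds, and the parent's AABB is the union over the valid lanes. On the GPU this reduction would be done with subgroup operations instead of a loop; the types, the validity mask and the function name are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

#define LANES_PER_NODE 8 /* one invocation per child, as in the commit */

struct aabb {
   float min[3];
   float max[3];
};

/* Emulates a subgroup min/max reduction over the 8 lanes of one internal
 * node. Lanes whose bit is clear in valid_mask contribute nothing. */
struct aabb reduce_child_bounds(const struct aabb child[LANES_PER_NODE],
                                uint32_t valid_mask)
{
   assert(child);

   struct aabb out;
   for (int axis = 0; axis < 3; axis++) {
      out.min[axis] = FLT_MAX;
      out.max[axis] = -FLT_MAX;
   }

   for (uint32_t lane = 0; lane < LANES_PER_NODE; lane++) {
      if (!(valid_mask & (1u << lane)))
         continue;
      for (int axis = 0; axis < 3; axis++) {
         if (child[lane].min[axis] < out.min[axis])
            out.min[axis] = child[lane].min[axis];
         if (child[lane].max[axis] > out.max[axis])
            out.max[axis] = child[lane].max[axis];
      }
   }
   return out;
}
```

With a valid mask of 0 the result stays at the inverted initial bounds, so an empty node never encloses anything.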
Konstantin Seurer
c6fdf11303 radv: Make radv_update_memory non-static
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:25 +02:00
Konstantin Seurer
8157f84246 radv: Refactor the update scratch layout code
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:06 +02:00
Konstantin Seurer
b2aa0647d5 radv: Use a specialized shader for in place updates
If src == dst, we only need to update aabbs for the internal nodes.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:45:00 +02:00
Konstantin Seurer
e1110d20f8 vulkan: Add acceleration structure update keys
The driver can use an optimized shader when src == dst.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34601>
2025-05-12 17:44:56 +02:00
Samuel Pitoiset
219a2b1e32 radv: ignore radv_zero_vram=true if zeroInitialDeviceMemory is enabled
To let applications like vkd3d-proton take full control.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34896>
2025-05-12 06:53:55 +00:00
Samuel Pitoiset
21badbf336 radv: advertise VK_EXT_zero_initialize_device_memory
Only expose this extension when AMDGPU supports zerovram allocations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34896>
2025-05-12 06:53:55 +00:00
Samuel Pitoiset
eaf646d020 radv: implement VK_EXT_zero_initialize_device_memory
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34896>
2025-05-12 06:53:55 +00:00
Daniel Schürmann
fa4eb37bf6 radv: move terminate{_if} out of loops.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33479>
2025-05-09 17:20:29 +00:00
Georg Lehmann
6f4e26e54d radv/gfx12+: enable VK_KHR_shader_bfloat16
GFX11 seems to have precision issues, so don't enable the extension there for now.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
7716e63cd6 radv/nir/lower_cmat: handle bf16 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
78524837c1 radv/nir/opt_cmat: support bfloat16
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
e8f5c335ff radv,aco,nir: keep the A and B base type for cmat_muladd_amd
With bfloat16, and the two fp8 formats in the future, using just the bit size
to identify the types is no longer possible.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Konstantin Seurer
c21e1776b3 radv: Use build flags instead of defines
Using the meta framework makes managing shader variants much easier.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594>
2025-05-09 09:55:32 +00:00
Konstantin Seurer
33ac143779 vulkan: Introduce VK_BUILD_FLAG for specializing BVH build shaders
The advantage of using spec constants is that we do not have to include
multiple spirv binaries for multiple variants of a build stage.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594>
2025-05-09 09:55:32 +00:00
Samuel Pitoiset
ae6d3df139 radv,aco: dump more SQ_WAVE registers from the trap handler on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
0e73c85424 radv: fix configuring TRAP_PRESENT for compute shaders on GFX12
It no longer exists and it's been replaced by DYNAMIC_VGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00