Commit graph

17367 commits

Author SHA1 Message Date
David Rosca
3b3e051ba5 radv/video: Set correct minCodedExtent for encode
Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35261>
(cherry picked from commit 25f7996395)
2025-06-04 15:52:49 +02:00
Samuel Pitoiset
8537a314f6 radv,radeonsi: emit UPDATE_DB_SUMMARIZER_TIMEOUT on GFX12
This try to mitigate the HiZ GPU hang by increasing a timeout. Loosely
based on PAL but I can confirm it delays the hang when
BOTTOM_OF_PIPE_TS is used as a workaround.

This must be emitted when the GFX queue is idle.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35212>
(cherry picked from commit 47f5d25f93)
2025-06-04 15:52:49 +02:00
Samuel Pitoiset
e35c5d643b radv: add radv_disable_hiz_his_gfx12 and enable for Mafia Definitive Edition
This is a workaround for random GPU hangs with HiZ/HiS on GFX12
because the correct fix is complex and it will take time to be
implemented properly.

Mafia Definitive Edition is the first known game affected by this.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13222
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35182>
(cherry picked from commit 2ebfa64be7)
2025-06-04 15:52:47 +02:00
Samuel Pitoiset
5ad7ae003f radv: fix capture/replay with sparse images and descriptor buffer
The sparse image VA needs to be returned to the application for replay.

Reported by Baldur.

VKCTS has coverage but it doesn't verify this yet.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35162>
(cherry picked from commit 63758bc093)
2025-06-04 15:52:47 +02:00
David Rosca
840325c2f7 radv/video: Limit 10bit H265 decode support to stoney and newer
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12132
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35105>
(cherry picked from commit 1608bc20b5)
2025-06-04 15:52:44 +02:00
Samuel Pitoiset
f0f5af73d4 radv: fix non-indexed draws with primitive restart enable
On GFX11+, DISABLE_FOR_AUTO_INDEX=1 automatically disables primitive
restart enable for non-indexed draws.

On GFX10-GFX10.3 the hw considers primitive restart enable for
non-indexed draws and the driver must disable it explicitly.

GFX9 and older gens aren't affected but applying the change for them
simplifies the implementation.

To fix that, move emitting primitive restart enable at draw time
because it needs to know if the draw is indexed or not.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13037
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34996>
(cherry picked from commit 4d1fcd75f9)
2025-05-20 20:18:08 +02:00
Samuel Pitoiset
db7d38ccef radv: fix missing texel scale for unaligned linear SDMA copies
texel_scale was 0 which caused GPU hangs for unaligned linear copies.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13195
Fixes: 4b73d7e817 ("radv: fix SDMA copies for linear 96-bits formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35047>
(cherry picked from commit c22d86e844)
2025-05-20 20:18:08 +02:00
Georg Lehmann
4fa2dbdf55 aco: assume sram ecc is enabled on Vega20
There are D16 load issues on Vega20 that are expected if sram ecc is enabled.
It's a professional class chip and I found mentions of it supporting ecc,
so assume it's enabled.

Maybe this could be improved by querying ecc info from the kernel, but
I'm not sure which query should be used.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13189
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12393
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35045>
(cherry picked from commit 0257644130)
2025-05-20 20:18:08 +02:00
Hans-Kristian Arntzen
912437c1a3 radv: Consider that DGC might need shader reads of predicated data.
Similar to indirect draw barrier, need similar fixups for conditional
rendering access.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34956>
(cherry picked from commit e674823d55)
2025-05-20 20:18:07 +02:00
Samuel Pitoiset
91898fbe37 radv: fix conditional rendering with DGC and non native 32-bit predicate
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.

But when conditional rendering is enabled in the stateCommandBuffer
which is used by preprocess() and the execute() is recorded also in the
stateCommandBuffer. If the preprocess() is recorded in a different
cmdbuf which is submitted before the cmdbuf that contains execute(),
the fallback (ie. alloc + COPY_DATA) will be performed after. This would
cause the predicate value to be always 0.

To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.

This fixes a very weird corner case with vkd3d-proton.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
(cherry picked from commit 3ca2f71f3d)
2025-05-20 20:18:07 +02:00
Samuel Pitoiset
0e15ea9546 radv: fix fetching conditional rendering state for DGC preprocess
This state must be fetched from the stateCommandBuffer, not from the
current cmdbuf which executes the preprocess().

Partial fix for https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
(cherry picked from commit e2625fa9ca)
2025-05-20 20:18:07 +02:00
Rhys Perry
799685659d aco/gfx115: consider point sample acceleration
Like 15428e0d786939a5c7629a9978947c8a9112ce96 in LLVM.

fossil-db (gfx1150):
Totals from 909 (1.14% of 79653) affected shaders:
Instrs: 5840489 -> 5840705 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31133460 -> 31134296 (+0.00%); split: -0.00%, +0.00%
Latency: 52982280 -> 53438577 (+0.86%); split: -0.00%, +0.86%
InvThroughput: 10841454 -> 10942682 (+0.93%); split: -0.00%, +0.93%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935>
(cherry picked from commit 171920ceed)
2025-05-20 20:18:07 +02:00
Samuel Pitoiset
e81572403c radv: remove the optimization for equal immutable samplers
This optimization used to optimize the allocated space for descriptors
when immutable samplers are equal. Though, this was basically broken :

- descriptor copies were broken for combiner image sampler (or sampler)
  with equal immutable samplers because 96 bytes were copied instead of
  64 bytes (cf. the linked ticket). This could be fixed but it's not
  worth it.
- the value returned by vkGetDescriptorLayoutSupport() was broken, it
  should have been 96 with no immutable samplers (or when they aren't
  equal)

This optimization was also not applied for descriptor buffers which is
the default for vkd3d-proton and Zink. DXVK doesn't use db but it
doesn't use immutable samplers, so basically only native vulkan games
would be concerned.

Note that immutable samplers would still be inlined in shaders if no
indirect access which should be 99.9% of the usecase.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11165
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34928>
(cherry picked from commit 69ff204422)
2025-05-20 20:18:06 +02:00
Samuel Pitoiset
1512a1cdd7 radv: fix emitting dynamic viewports/scissors when the count is static
In a scenario where the viewports/scissors are a dynamic state but the
count is static (ie. updated when a graphics pipeline is bound), the
driver wasn't considering that and it was re-emitting the previous
number of viewports/scissors.

This fixes rendering issue with Blender.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34921>
(cherry picked from commit 9a07ccbc89)
2025-05-20 20:18:06 +02:00
David Rosca
22826ec621 radv/video: Use ac_uvd_alloc_stream_handle
ac_uvd_alloc_stream_handle tries to avoid collisions in the case
when PID is not unique (eg. in sandboxes like Flatpak).

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12607
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34807>
(cherry picked from commit 5fee04bcae)
2025-05-20 20:18:06 +02:00
David Rosca
4cfaede767 ac/uvd: Add ac_uvd_alloc_stream_handle
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34807>
(cherry picked from commit 69455e8208)
2025-05-20 20:18:06 +02:00
Natalie Vock
42519ff23a radv,driconf: Add radv_force_64k_sparse_alignment config
Needed by DOOM: The Dark Ages.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34944>
(cherry picked from commit e32a90b57c)
2025-05-20 20:18:06 +02:00
Samuel Pitoiset
6a1a256578 radv: fix SDMA copies for linear 96-bits formats
The hardware requires a power of two bpe. To do that, the driver
needs to adjust the pitch/offset/extent based on a texel scale factor
which only applies to 96-bits formats.

This fixes new VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34927>
(cherry picked from commit 4b73d7e817)
2025-05-20 20:18:06 +02:00
Rhys Perry
55241615c6 ac/llvm: correctly set alignment of vector global load/store
For coherent/volatile access, this would be too high for vector access.

Even when we didn't set the alignment, LLVM seemed to assume too high of
an alignment for 8/16-bit vector access.

Fixes generated_tests/cl/vload/vload-char-constant.cl

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>
(cherry picked from commit d0a09b6ff7)
2025-05-20 20:18:05 +02:00
Rhys Perry
98f96feda8 ac/llvm: correctly split vector 8/16-bit stores
This assumes that the start of the load is 32-bit aligned.

For example, a vec3 16-bit store with align_offset=2 should split off the
first component, not the last.

This probably also fixed splitting with 8-bit stores.

Fixes arb_copy_buffer-overlap

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>
(cherry picked from commit c1ecad2b11)
2025-05-20 20:18:05 +02:00
Samuel Pitoiset
25c188a743 radv: ignore conditional rendering with vkCmdTraceRays*
CmdTraceRays is neither a dispatch or a draw command which means it
shouldn't be affected by conditional rendering.

Fixes recent VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34868>
(cherry picked from commit 4b76d04f7f)
2025-05-20 20:18:05 +02:00
Samuel Pitoiset
b1c7064a68 radv: ignore radv_disable_dcc_stores on GFX12
It's not necessary because DCC is completely transparent to the
userspace driver. Also it's causing issues with scanout.

This fixes rendering issues with scanout in Indiana Jones.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12924
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34859>
(cherry picked from commit b7d2cdd2b4)
2025-05-20 20:18:04 +02:00
Konstantin Seurer
8b8ca028a0 radv: Return VK_ERROR_INCOMPATIBLE_DRIVER for unsupported devices
VK_ERROR_INITIALIZATION_FAILED will fail physical device enumeration.
Returning VK_ERROR_INCOMPATIBLE_DRIVER means that the driver can still
be used on supported GPUs when multiple GPUs are installed.

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34783>
(cherry picked from commit 84b9c281fe)
2025-05-07 09:04:49 +02:00
Rhys Perry
77e7fd0dee aco: swap the correct v_mov_b32 if there are two of them
Previously, this function tried to swap the instruction which is not
v_mov_b32, so that it doesn't introduce any new OPY-only instructions. If
both were v_mov_b32, it swapped Y. Since this makes Y opy-only, this can't
be done if X is also opy-only.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 408fa33c09 ("aco/gfx12: don't use second VALU for VOPD's OPX if there is a WaR")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13101
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34841>
(cherry picked from commit 9ca71b52aa)
2025-05-07 09:04:39 +02:00
Samuel Pitoiset
867fb6756b radv: fix GPU hangs with image copies for ASTC/ETC2 formats on transfer queue
Emitting compute dispatches on SDMA just hangs. It might be needed
to switch to gang submit for these to work but fixing the GPU hang is
more important for now.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34805>
(cherry picked from commit 0684dc5fa8)
2025-05-06 17:24:03 +02:00
Samuel Pitoiset
963b9fc2f3 radv: disable SINGLE clear codes to workaround a hw bug with DCC on GFX11
This fixes a very weird cache-related corruption with DCC on GFX11 due
to a hw bug according to PAL.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12932
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34790>
(cherry picked from commit 1356d20042)
2025-05-06 17:24:02 +02:00
Samuel Pitoiset
49d96917d5 radv: do not clear unwritten color attachments with dual-source blending
This is incorrect because the color format at slot 0 needs to be
replicated to the slot 1. But with dual-source blending the colors
written mask is only 0xf and this was clearing the color format at
slot 1.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13082
Fixes: e1483d022b ("radv: clear unwritten color attachments for monolithic PS earlier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34773>
(cherry picked from commit 55ad0fd35c)
2025-05-06 17:24:01 +02:00
Paul Gofman
234d66a1e2 radv/amdgpu: Fix hash key in radv_amdgpu_winsys_destroy().
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34774>
(cherry picked from commit 96765935e8)
2025-05-03 12:48:02 +02:00
Samuel Pitoiset
03dc23baa2 radv: fix re-emitting VRS state when rendering begins
This state also depends on whether a VRS attachment is used.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11693
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34735>
(cherry picked from commit 1fccc09abe)
2025-04-30 14:15:54 +02:00
Rhys Perry
87902dca71 aco: fix get_temp_reg_changes with clobbered operands
The spiller might have tried to spill a live-through first or second
s_fmac_f32 operand, but this wouldn't have reduced the SGPRs if the third
operand wasn't killed

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13038
Fixes: d6cb45dbb0 ("aco/spill: Allow spilling live-through operands")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34699>
(cherry picked from commit 7fe84024cb)
2025-04-30 14:15:47 +02:00
Rhys Perry
e1f06788f5 aco/gfx11: create waitcnt for workgroup vmem barriers
It seems this is necessary on GFX11.

Similar to 576a2e798c

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34634>
(cherry picked from commit b03e071583)
2025-04-27 11:45:27 +02:00
Timur Kristóf
5c9733618d radv: Clear dirty flag for clip rects state after emitting it.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Fixes: 0ba3a8b3cc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34686>
(cherry picked from commit 3ad385b9cc)
2025-04-27 11:45:24 +02:00
Timur Kristóf
d18a3d5f09 radv: Clear dirty flag for MSAA state after emitting it.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Fixes: 08918f0880
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13022
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34686>
(cherry picked from commit 3a05477ac6)
2025-04-27 11:45:23 +02:00
Georg Lehmann
3d9ac270e2 aco/insert_exec: reset temporary when recreating wqm mask from exact mask
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The old, now incorrect temporary was still used for invert blocks and loop masks.

Foz-DB Navi31:
Totals from 379 (0.48% of 79789) affected shaders:
Instrs: 399471 -> 399897 (+0.11%); split: -0.00%, +0.11%
CodeSize: 2197292 -> 2198908 (+0.07%); split: -0.00%, +0.08%
Latency: 2500636 -> 2500895 (+0.01%); split: -0.00%, +0.01%
SClause: 7912 -> 7918 (+0.08%); split: -0.04%, +0.11%
Copies: 25687 -> 26068 (+1.48%); split: -0.04%, +1.53%
PreSGPRs: 15648 -> 15562 (-0.55%)
SALU: 35125 -> 35517 (+1.12%)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12901
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13019
Fixes: b872ff6ef2 ("aco/insert_exec_mask: if applicable, use s_wqm to restore exec after divergent CF")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34659>
(cherry picked from commit dd3e1190a2)
2025-04-23 12:21:56 +02:00
Georg Lehmann
4fb4880183 aco/insert_exec: only restore wqm mask after control flow if necessary
The next commit will make this not free, so we should avoid it if possible.

Foz-DB Navi31:
Totals from 3933 (4.93% of 79789) affected shaders:
Instrs: 5726914 -> 5727295 (+0.01%); split: -0.00%, +0.01%
CodeSize: 31307100 -> 31308884 (+0.01%); split: -0.00%, +0.01%
SpillSGPRs: 1797 -> 1793 (-0.22%); split: -0.33%, +0.11%
Latency: 58973929 -> 58974343 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 8591893 -> 8591911 (+0.00%); split: -0.00%, +0.00%
SClause: 209074 -> 209115 (+0.02%); split: -0.00%, +0.02%
Copies: 423965 -> 432420 (+1.99%)
Branches: 149976 -> 149979 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 200175 -> 200663 (+0.24%)
VALU: 3440165 -> 3440156 (-0.00%); split: -0.00%, +0.00%
SALU: 555727 -> 556143 (+0.07%); split: -0.00%, +0.08%

Fixes: b872ff6ef2 ("aco/insert_exec_mask: if applicable, use s_wqm to restore exec after divergent CF")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34659>
(cherry picked from commit 13f6be262a)
2025-04-23 12:21:56 +02:00
Georg Lehmann
d3285fe971 aco: set opsel_hi to 1 for WMMA
This is ignored by the hardware but LLVM requires it to disassemble GFX12 WMMA.

Cc: mesa-stable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>
(cherry picked from commit b0c8f31600)
2025-04-23 12:21:56 +02:00
Marek Olšák
39e4fe7ab4 radv: fix incorrect patch_outputs_read for TCS with dynamic state
Fixes: 8c2f9f0665 - radv: switch to the new TCS LDS/offchip size computation

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>
(cherry picked from commit 4a51089f30)
2025-04-22 01:25:00 +02:00
Rhys Perry
76db8496a9 aco: combine VALU lanemask hazard into VALUMaskWriteHazard
This is now basically the same as the original VALUMaskWriteHazard, except
it now considers both VALU and SALU writes.

Now that it's a part of VALUMaskWriteHazard, differences from the original
VALU lanemask workaround are:
- it includes SALU reads after the write
- it includes VALU writes and SALU/VALU reads after the write which are
  not lanemasks
- it combines s_waitcnt_depctr instructions when it's a read after both a
  SALU write and a VALU write
- non-exec VALU SGPR reads reset the SGPRs read by VALU as a lanemask
- exec SGPRs are ignored

resolve_all_gfx11() is also finished.

fossil-db (navi31):
Totals from 21538 (27.13% of 79377) affected shaders:
Instrs: 27628855 -> 27552972 (-0.27%); split: -0.30%, +0.03%
CodeSize: 145968448 -> 145667616 (-0.21%); split: -0.23%, +0.02%
Latency: 209537805 -> 209509519 (-0.01%); split: -0.02%, +0.00%
InvThroughput: 36304270 -> 36301624 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12623
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11480
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34529>
(cherry picked from commit ce2be5ab8e)
2025-04-22 01:24:39 +02:00
Rhys Perry
dd304bfd80 aco/gfx12: don't use second VALU for VOPD's OPX if there is a WaR
fossil-db (gfx1201):
Totals from 38908 (49.02% of 79377) affected shaders:
Instrs: 30268107 -> 30268131 (+0.00%); split: -0.00%, +0.00%
CodeSize: 180843648 -> 180843640 (-0.00%); split: -0.00%, +0.00%
Latency: 224905962 -> 224906072 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 44322988 -> 44323004 (+0.00%)
VALU: 15124145 -> 15124167 (+0.00%)
VOPD: 4018504 -> 4018482 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34246>
(cherry picked from commit 408fa33c09)
2025-04-22 01:24:31 +02:00
Samuel Pitoiset
b4940255ed radv/sdma: add support for compression on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to previous generations that support compression, except that
the driver don't need to configure a meta VA because DCC is completely
transparent to the userspace.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
efa0b16bb2 radv/sdma: add a new flag to know if the surface is compressed
On GFX12, DCC is transparent to the driver and there is no meta VA.
Adding a new flag to know if the SDMA surface is compressed is needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
03671ccf9e radv/sdma: use the correct helper to get the number type field
This wasn't technically incorrect because V_028C70_BU_NUM_xxx values
are similar to V_028C70_NUMBER_xxx but it's better to use the corect
helper.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
b44dc98cde radv/sdma: remove redundant check for compression when getting metadata
It's already checked by the caller.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
d3d5d2fe86 radv/sdma: use SDMA5_DCC_xxx bitfields
It's cleaner.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
f44342199a radv/sdma: simplify configuring the number of uncompressed DCC blocks
SDMA doesn't support MSAA, so the value can be
V_028C78_MAX_BLOCK_SIZE_256B.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
13db408e59 ac/perfcounter: add support for GFX12
Sourced from PAL to add SPM support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34524>
2025-04-16 06:35:33 +00:00
Samuel Pitoiset
c42d43e8eb radv: print more error messages during SPM initialization
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34524>
2025-04-16 06:35:33 +00:00
Marek Olšák
78cacfd9ce ac/surface: select 3D tile mode without overallocating too much for gfx6-8
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
195e7b4f75 ac/surface: make gfx12_estimate_size reusable by gfx6
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
2c122d478b ac/nir: set X=0 for task->mesh shader dispatch when Y or Z is 0
The code set X=0 when Y and Z is 0, not "or".

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00