Commit graph

17328 commits

Author SHA1 Message Date
Samuel Pitoiset
b4940255ed radv/sdma: add support for compression on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to previous generations that support compression, except that
the driver don't need to configure a meta VA because DCC is completely
transparent to the userspace.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
efa0b16bb2 radv/sdma: add a new flag to know if the surface is compressed
On GFX12, DCC is transparent to the driver and there is no meta VA.
Adding a new flag to know if the SDMA surface is compressed is needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
03671ccf9e radv/sdma: use the correct helper to get the number type field
This wasn't technically incorrect because V_028C70_BU_NUM_xxx values
are similar to V_028C70_NUMBER_xxx but it's better to use the corect
helper.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
b44dc98cde radv/sdma: remove redundant check for compression when getting metadata
It's already checked by the caller.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
d3d5d2fe86 radv/sdma: use SDMA5_DCC_xxx bitfields
It's cleaner.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
f44342199a radv/sdma: simplify configuring the number of uncompressed DCC blocks
SDMA doesn't support MSAA, so the value can be
V_028C78_MAX_BLOCK_SIZE_256B.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
13db408e59 ac/perfcounter: add support for GFX12
Sourced from PAL to add SPM support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34524>
2025-04-16 06:35:33 +00:00
Samuel Pitoiset
c42d43e8eb radv: print more error messages during SPM initialization
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34524>
2025-04-16 06:35:33 +00:00
Marek Olšák
78cacfd9ce ac/surface: select 3D tile mode without overallocating too much for gfx6-8
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
195e7b4f75 ac/surface: make gfx12_estimate_size reusable by gfx6
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
2c122d478b ac/nir: set X=0 for task->mesh shader dispatch when Y or Z is 0
The code set X=0 when Y and Z is 0, not "or".

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
963147d7fd ac/gpu_info: add 256 to payload_entry_size to increase future task shader perf
It has no effect because num_entries is 1K, but the table shows a lot of
potential.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
d7c903f258 ac/gpu_info: add payload_entry_size into ac_task_info
to stop causing full RADV recompiles when it's changed.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
0dafd04695 ac/gpu_info: remove has_tmz_support function
It's not needed since:
    8b3056343f - ac/gpu_info: bump required DRM minor version to 3.42.0 (kernel 5.15+)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
0be5a3559a ac/gpu_info: increase the attribute ring size for gfx12
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Eric Engestrom
54bcfb4c1f ci/deqp: fix vulkan video build
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34532>
2025-04-15 17:23:05 +00:00
Samuel Pitoiset
e86e0fc525 radv: allocate the SPM BO in GTT for faster readback
Reading VRAM from CPU is very slow.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34467>
2025-04-15 06:30:38 +00:00
Samuel Pitoiset
8ea46b14fa ci: update VKCTS main to 76c1572eaba42d7ddd9bb8eb5788e52dd932068e
RADV is the only driver using VKCTS main.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34299>
2025-04-14 08:24:14 +00:00
Samuel Pitoiset
410f7f9f6e radv: only enable DCC for invisible VRAM on GFX12
DCC should only be allowed on invisible VRAM, otherwise the CPU could
read the data and it will read garbage if it's compressed.

This also caused GPU hangs after suspend/resume probably because
some buffers were compressed when moved back from GTT to VRAM.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12962
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12922
Fixes: 9af11bf306 ("radv: add initial DCC support on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34347>
2025-04-14 07:39:33 +00:00
Samuel Pitoiset
75be860eec radv: use paired context regs when optimal on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
CP is very slow on GFX12 and parsing the packet header is the main
bottleneck. Using paired context regs reduce the number of packet
headers and it should be more optimal.

It doesn't seem worth when only one context reg is emitted (one packet
header and same number of DWORDS) or when consecutive context regs are
emitted (would increase the number of DWORDS).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34421>
2025-04-14 06:18:13 +00:00
Samuel Pitoiset
f92f50c58a radv: add macros for paired context registers on GFX12
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34421>
2025-04-14 06:18:13 +00:00
Konstantin Seurer
676e26aed5 radv: Fix rayTracingPositionFetch with multiple geometies
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The fix adds more indirections to avoid increasing register pressure by
tracking the primitive address.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34460>
2025-04-11 22:26:08 +00:00
Timur Kristóf
371b1bf789 radv: Don't call nir_opt_varyings a second time when unnecessary.
When nir_opt_varyings doesn't make progress the first time,
it should not be necessary to call it a second time.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
403b3958c1 radv: Move preparation and fixup to separate loops in varying optimization.
This is to stop calling nir_shader_gather_info repeatedly for
some stages, and also as a pre-requisite to the work in the next commits.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
a98186bbf6 radv: Refactor loops in radv_graphics_shaders_link_varyings.
No functional changes, just improved code readability.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
1942227e73 radv: Inline radv_graphics_shaders_link_varyings_{first/second}.
The first step of reorganizing this code.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
412af41258 radv: Add radv_foreach_stage to ForEachMacros again.
This was lost when .clang-format was removed
from the amd folder.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
David Rosca
f1f87d302f radv/video: Always enable B pictures for H264 encode
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We always allocate the extra memory needed for B pictures, so there is
no reason not to also enable B pictures always.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
2025-04-11 11:15:47 +00:00
David Rosca
a1fbaddc9c radv/video: Use ac_vcn_enc_init_cmds
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
2025-04-11 11:15:47 +00:00
David Rosca
7249d9548e radv/video: Fix encode session info for VCN3+
Last dword should be 0.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
2025-04-11 11:15:47 +00:00
David Rosca
34031531fc radv/video: Fix msg header total size
It needs to include also codec msg size.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
2025-04-11 11:15:47 +00:00
Konstantin Seurer
b218c45973 radv: Handle nir_intrinsic_printf
Makes it possible to use printf statements inside glsl meta shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34208>
2025-04-10 19:31:37 +00:00
Samuel Pitoiset
2f00daf67a radv: tidy up radv_emit_hw_ngg()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>
2025-04-10 06:56:25 +00:00
Samuel Pitoiset
1290b38f57 radv: tidy up radv_emit_raster_state()
Better isolation between configuration and emission.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>
2025-04-10 06:56:25 +00:00
Samuel Pitoiset
4b2d119d90 radv: reduce the number of emitted DWORDS for MSAA 8x user sample locs
From 24 DWORDS to 16 DWORDS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>
2025-04-10 06:56:25 +00:00
Samuel Pitoiset
c1ebf82700 radv: track redundant DB_RENDER_OVERRRIDE register writes on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>
2025-04-10 06:56:25 +00:00
Samuel Pitoiset
7f5727b313 radv: use consecutive registers for PA_SC_WINDOW_SCISSOR_{TL,BR}
For less DWORDS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>
2025-04-10 06:56:25 +00:00
Samuel Pitoiset
32ea7df586 radv: move emitting more fb registers when rendering begins
No need to delay the emission of these registers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34420>
2025-04-10 06:56:25 +00:00
Samuel Pitoiset
001fa1cf11 radv: move the disable_trunc_coord drirc at instance/pdev level
It no longer relies on enabled device features.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34425>
2025-04-10 06:36:09 +00:00
Samuel Pitoiset
65d717b45a radv: remove an old workaround for D3D9 with DXVK 2.3.0 and older
Proton 8.x+ uses this DXVK version but Proton 9.x+ is the default now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34425>
2025-04-10 06:36:09 +00:00
Natalie Vock
916d7277c0 radv/ci: Test FP16 for GFX8
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34114>
2025-04-09 14:21:37 +00:00
Natalie Vock
f0f4ae1713 radv: Add radv_enable_float16_gfx8 drirc and enable for Indiana Jones TGC
This is a hard requirement from the game preventing it to start on GFX8.
Adding this allows playing it on GFX8.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34114>
2025-04-09 14:21:37 +00:00
Natalie Vock
e385cb1750 radv: Add radv_emulate_rt drirc and enable for Indiana Jones TGC
There have been various people successfully trying it out on GFX9-GFX10.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34114>
2025-04-09 14:21:37 +00:00
Natalie Vock
3d8db3cbbb aco: Make private_segment_buffer/scratch_offset per-resume
We need different Temps for each resume shader, because registers aren't
preserved across resume boundaries.

This was likely fine in practice because arg registers are the same for
each shader, but resulted in invalid IR and asserts.

Fixes crashes in Indiana Jones RT with assertions enabled on GFX8.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34114>
2025-04-09 14:21:37 +00:00
Natalie Vock
d1ff9e951a aco: Fix RT VGPR limit on Navi31/32, GFX11.5, GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Since 128 is not a multiple of the VGPR allocation granule, we will
actually allocate 134 VGPRs. No reason not to use the extra 6.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34265>
2025-04-09 10:02:52 +00:00
Georg Lehmann
64cae5c48d aco: form mixed MTBUF/MUBUF clauses
This should be one clause (all of the instructions load from the same vertex buffer)

s_clause 0x2                                                ; bfa10002
tbuffer_load_format_xyzw v[8:11], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:36 ; e9c32024 80010805
tbuffer_load_format_xyzw v[12:15], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:16 ; e9c32010 80010c05
tbuffer_load_format_xyzw v[16:19], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:12 ; e9c3200c 80011005
s_clause 0x2                                                ; bfa10002
buffer_load_dwordx3 v[20:22], v5, s[4:7], 0 idxen           ; e03c2000 80011405
buffer_load_dwordx3 v[23:25], v5, s[4:7], 0 idxen offset:20 ; e03c2014 80011705
buffer_load_dwordx4 v[28:31], v5, s[4:7], 0 idxen offset:48 ; e0382030 80011c05
tbuffer_load_format_xy v[0:1], v5, s[4:7], 0 format:[BUF_FMT_8_8_UNORM] idxen offset:32 ; e8712020 80010005

Foz-DB Navi21:
Totals from 5624 (7.08% of 79395) affected shaders:
MaxWaves: 149894 -> 149898 (+0.00%)
Instrs: 3032697 -> 3034853 (+0.07%); split: -0.05%, +0.12%
CodeSize: 15907852 -> 15915752 (+0.05%); split: -0.05%, +0.10%
VGPRs: 216248 -> 216144 (-0.05%)
Latency: 10955137 -> 11008760 (+0.49%); split: -0.22%, +0.70%
InvThroughput: 2032857 -> 2033916 (+0.05%); split: -0.03%, +0.08%
VClause: 50120 -> 41778 (-16.64%); split: -16.66%, +0.02%
SClause: 62034 -> 62004 (-0.05%); split: -0.33%, +0.29%
Copies: 253836 -> 254505 (+0.26%); split: -0.17%, +0.43%
VALU: 1621606 -> 1622274 (+0.04%); split: -0.03%, +0.07%
SALU: 653251 -> 653252 (+0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34379>
2025-04-08 09:22:04 +00:00
Georg Lehmann
babe7f3e12 aco/gfx10: simpler solution to avoid store instructions in clauses
Foz-DB Navi21 has no changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34379>
2025-04-08 09:22:04 +00:00
Samuel Pitoiset
0ba3a8b3cc radv: add clip rects state bit for emitting discard rectangles
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Better match the hw naming.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
08918f0880 radv: regroup emitting all MSAA states in one function
All register writes are optimized out. Also this will allow to use
paired context register writes on GFX12.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
e8d787e1ef radv: track more MSAA related register writes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00