Commit graph

10015 commits

Author SHA1 Message Date
Samuel Pitoiset
64e6e043b3 Revert "radv: program SAMPLE_MASK_TRACKER_WATERMARK optimally for GFX11 APUs"
This reverts commit 96e9c3fe77.

This actually causes random GPU hangs like on Phoenix.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12461
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12426
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12692
Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34306>
2025-04-02 07:10:40 +00:00
Autumn Ashton
ae6d24c4ef radv: Expose VK_SAMPLE_COUNT_1_BIT for sample position on GFX10+
This works on GFX10+.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:34 +01:00
Autumn Ashton
693e3b47f7 radv: Expose EXT_sample_locations everywhere
This works and passes CTS now!

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:31 +01:00
Autumn Ashton
343c434c50 radv: Enable fragmentShadingRateWithCustomSampleLocations
We need to expose this, as we support it.

Otherwise 1x1 is assumed and we fail some CTS.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:28 +01:00
Autumn Ashton
3d75082c02 radv: Fix compute resolve rounding
When we are using compute resolve, we can get
values the CTS does not expect due to the value
we end up writing for UNORM in
`nir_image_deref_store`.

Make the compute resolve rounding path match with
the output of the fragment shader resolve path,
by going through the same FP16 RTZ conversion as
we do for UNORM/SNORM formats.

This is why VK_EXT_sample_locations CTS was
failing on > GFX9.
On <= GFX9, I am assuming we are falling back to
RESOLVE_FRAGMENT, due to DCC stuff, which is why
it works there.

I tested a handful of images from the Vulkan CTS
for the sample locations and resolve tests for
diff UNORM formats from the qpa file forcing
FRAGMENT and with this change.
With this change, we now match on the compute
resolve path the same sha for the ones I compared
with ImageMagick `identify`.

CTS passes for: *resolve*, *image_clearing* and
*sample_locations* on RX 7900XTX.

Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28237>
2025-04-01 21:15:24 +01:00
Samuel Pitoiset
71b49aecdc radv: switch back radeon_cmdbuf to use 32-bit counters
This has been tested again with vkoverhead on 4 different CPUs and
using 32-bit counters is the fastest combination overall.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34229>
2025-04-01 06:18:28 +00:00
Samuel Pitoiset
f0b3a6f9d4 radv: rework command buffer emission with begin/end sequences
A begin/end sequence is something like (it's all macros based):

   radeon_begin(cs);
   radeon_emit(PKT3(PKT3_DRAW_INDEX_AUTO, 1, cmd_buffer->state.predicating));
   radeon_emit(vertex_count);
   radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX | use_opaque);
   radeon_end();

This is loosely based on RadeonSI (see !8653 (a0978fff)) and it seems
indeed faster overall.

The main goal of this rework is to re-use the same logic as RadeonSI
for paired packets on GFX12 (also GFX11 dGPUs) because it's supposed
to be way faster, especially on GFX12 where the CP is slow. The other
goal is to share more cmdbuf emission between both drivers in the near
future.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34229>
2025-04-01 06:18:28 +00:00
Samuel Pitoiset
97e8872f1c radv: only enable HTILE for depth/stencil attachment images
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's really only useful for depth/stencil attachments. vkd3d and DXVK
both always use that usage flag for depth/stencil images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Samuel Pitoiset
ba9988d230 radv: remove useless use of radv_image_use_comp_to_single()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Samuel Pitoiset
5398ec6356 radv: add queue family assertions when doing decompression passes
This is to make sure the previous functions that are supposed to
trigger a decompression pass work as expected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Samuel Pitoiset
086f529bbe radv: do not trigger FCE or FMASK decompress on compute queue
A pipeline barrier which contains an image layout transition like
COLOR_ATTACHMENT_OPTIMAL -> TRANSFER_DST_OPTIMAL on compute queue
would just hang. Such a barrier is useless in practice but it's legal.

Prevent GPU hangs by skipping FCE or FMASK_DECOMPRESS when it's not
on the graphics queue.

Fixes dEQP-VK.synchronization2.layout_transition.compute_transition*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34231>
2025-03-31 11:55:02 +00:00
Natalie Vock
c1e1d86bd1 radv/rt: Flush CP writes from the common BVH framework with INV_L2 on GFX12
a1b05991 ("radv/rt: Flush L2 after writing internal node offset on GFX12")
did this for radv-internal CP writes - we also need to do this for PLOC
sync data initialization which is done in the common framework.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34178>
2025-03-28 23:07:17 +00:00
Dave Airlie
dc8e21ce60 radv: expose VK_KHR_video_mainteance2
Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34204>
2025-03-28 21:18:00 +00:00
Dave Airlie
feef12b2a8 radv/video: convert to using common parameter wrappers.
Reviewed-by: Lynne <dev@lynne.ee>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34204>
2025-03-28 21:18:00 +00:00
Samuel Pitoiset
a7d8e5d4ca ac,radv,radeonsi: use PM4 for shadowed registers
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34228>
2025-03-28 20:50:22 +00:00
Samuel Pitoiset
250742519f radv: disable TC-compatible CMASK with {FMASK,DCC}_DECOMPRESS
Because if FMASK_COMPRESS_1FRAG_ONLY is set, the FMASK decompress
operation actually doesn't occur. Note that DCC_DECOMPRESS implicitly
decompresses FMASK.

This fixes an issue on GFX10-GFX10.3 which is uncovered by enabling
VK_EXT_sample_locations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:07 +00:00
Samuel Pitoiset
8c96b9e306 radv: make sure to always decompress FMASK before expanding it
This is actually required even for TC-compatible CMASK images.

VKCTS coverage is missing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:07 +00:00
Samuel Pitoiset
42b0df447c radv: inline radv_fast_clear_flush_image_inplace()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:07 +00:00
Samuel Pitoiset
09d91837e4 radv: rework radv_handle_color_image_transition()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:07 +00:00
Samuel Pitoiset
7bb3a2363d radv: add radv_fmask_color_expand()
Similar to radv_fmask_decompress()/radv_fast_clear_eliminate() helpers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:06 +00:00
Samuel Pitoiset
aaf634cc24 radv: rework radv_fast_clear_flush_image_inplace()
FMASK_DECOMPRESS also implies FAST_CLEAR_ELIMINATE, so it can run first.
The only exception is fast-clear for color images that have DCC and
FMASK but without comp-to-single (only GFX10) because FMASK_DECOMPRESS
can't eliminate DCC fast-clears.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:06 +00:00
Samuel Pitoiset
a452098791 radv: skip FCE for comp-to-single fast clears with DCC MSAA
comp-to-single supports MSAA since a while and it's useless to perform
a fast clear eliminate for these fast color clears.

Only GFX10-GFX10.3 are affected because these are the only GPUs that
support DCC with MSAA with FMASK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33639>
2025-03-28 19:41:06 +00:00
Samuel Pitoiset
8032f628ad radv: add a helper to emit PM4 commands to a CS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
498fc42fa9 radv: add a helper to emit a PKT3_COPY_DATA with an immediate
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
cd08da2f20 radv/video: slightly change radv_vcn_sq_header()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
a2b6b6f1f9 radv: add more helpers to start/stop perfcounters
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
6d3ee9d8ad radv: use radv_cs_write_data_imm() more
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
7affd623c0 radv: slightly change the COND_EXEC for sampling performance counters
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
8d12578989 radv: add a helper to emit SPM muxsel
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
f12bf800e3 radv: add a helper to emit indirect buffer for draws/dispatches
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
af5cde7107 radv: apply some cosmetic changes for future begin/end CS sequences
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
391da996ed radv: rework the shader pointer emit as macros
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34223>
2025-03-28 07:49:04 +00:00
Samuel Pitoiset
ae8c0b06a7 radv: add radeon_event_write() macros
Similar to RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34145>
2025-03-27 07:09:07 +00:00
Samuel Pitoiset
344aa38925 radv: add new helper to emit PKT3_EVENT_WRITE for sampling queries
Everything in one function is easier to share.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34145>
2025-03-27 07:09:07 +00:00
Samuel Pitoiset
e2e8dca941 radv: rework radeon_set_uconfig_perfctr_reg_seq to use amd_ip_type
To be more generic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34145>
2025-03-27 07:09:07 +00:00
Samuel Pitoiset
88df7e709a radv: move the optimized context reg macros with other similar ones
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34145>
2025-03-27 07:09:07 +00:00
Samuel Pitoiset
30948e63f4 radv: switch all emit helpers to macros
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34145>
2025-03-27 07:09:07 +00:00
Samuel Pitoiset
74a5266d8f radv: replace radeon_set_reg_seq by a macro
To be more close to RadeonSI, other similar functions will be replaced
by macros in the next commits.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34145>
2025-03-27 07:09:07 +00:00
Rhys Perry
0619cc45b7 radv/winsys: set has_distributed_tess for null winsys
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33978>
2025-03-26 20:52:53 +00:00
Rhys Perry
ee0be147b9 radv/winsys: set gart_page_size for null winsys
Fixes assertion failure when initializing memory types for devices without
dedicated vram.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33978>
2025-03-26 20:52:53 +00:00
Rhys Perry
4632ca258b radv/winsys: increase gfx12 vgprs for null winsys
LLVM has Feature1_5xVGPRs for both gfx1200 and gfx1201.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33978>
2025-03-26 20:52:53 +00:00
Samuel Pitoiset
c036736e2e radv/video: rework command buffer emission
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is much closer to RadeonSI and could be shared at some point.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34150>
2025-03-26 14:59:12 +00:00
Samuel Pitoiset
0e0a393a4a radv/video: use a pointer to write the total task size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34150>
2025-03-26 14:59:12 +00:00
Samuel Pitoiset
2c3b9312cc radv/meta: fix color<->depth/stencil image copies
The color format needs to be compatible with depth or stencil. Also
the depth/stencil format was incorrect when it's the source.

Fixes dEQP-VK.api.ds_color_copy.*
and VKD3D_TEST_FILTER=test_copy_texture.

Fixes: d4ff011b12 ("radv: advertise VK_KHR_maintenance8")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34142>
2025-03-26 13:27:03 +00:00
Samuel Pitoiset
ef0a6f59f3 radv: use PM4 for setting specific graphics registers in the preamble
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34172>
2025-03-26 10:14:22 +00:00
Samuel Pitoiset
c5d0764fce radv: remove radv_force_pstate_peak_gfx11_dgpu=true for Helldivers 2
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Our QA team extensively tested Helldivers 2 on AMD RX 7800 XT/RX 7600
with many different presents and didn't get any GPU hangs. Few users
also reported the game being very stable without this workaround.

Few other users reported issues with the workaround itself (like
pstate not correctly restored etc), so let's remove it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34164>
2025-03-26 09:33:19 +00:00
Samuel Pitoiset
4d68875acd radv: cleanup passing the aspect mask for SDMA operations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Less error prone than it used to be.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
2025-03-25 19:13:20 +00:00
Samuel Pitoiset
e60cafa533 radv: remove useless parameter to radv_sdma_get_buf_surf()
Same aspect mask is passed through.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
2025-03-25 19:13:20 +00:00
Samuel Pitoiset
114fbdc534 radv: fix compresed depth/stencil copies on transfer queue
HTILE is always pipe aligned.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
2025-03-25 19:13:20 +00:00
Samuel Pitoiset
7b15e85b95 radv: fix bpe for the stencil aspect of depth/stencil copies on transfer queue
Using the bpe of depth+stencil when copying the stencil aspect only
doesn't work.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34143>
2025-03-25 19:13:20 +00:00