Commit graph

18081 commits

Author SHA1 Message Date
David Rosca
850a3b0cae radv/video: Set correct VP9 decode minCodedExtent
Fixes: b8ac2d47e7 ("radv/video: add KHR_video_decode_vp9 support.")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35997>
2025-07-15 17:44:15 +00:00
David Rosca
50eaa0c19f radv/video: Set correct H264/5 decode minCodedExtent
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35997>
2025-07-15 17:44:15 +00:00
Marek Olšák
6286c1c66f nir/opt_vectorize_io: optionally vectorize loads with holes
e.g. load X; load W; ==> load XYZW. Verified with a shader test.

This will be used by AMD drivers. See the code comments.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36098>
2025-07-15 16:29:30 +00:00
Marek Olšák
b0494f9485 nir/opt_varyings: optimize the consumer after constant propagation and dedupli.
A TF2 shader propagates 0 to the consumer, which eliminates 1 input
if we run algebraic opts and DCE before compaction.

This is a prerequisite for removing all IO var optimizations from the GLSL
linker that are redundant with nir_opt_varyings.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
2025-07-15 13:38:29 +00:00
Samuel Pitoiset
b59895140d radv: add a way to disable the HIZ/HiS events based workaround on GFX12
This workaround doesn't mitigate the issue reliably/completely. An
alternative (but complex) solution also exists.

This introduces a small option that allows to disable the current
workaround as preliminary work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36110>
2025-07-15 10:01:54 +00:00
Pavel Gribov
24cb745460 radv: small fix for sam check
for exact PCIe 3.0 x8 case there will be
pcie_bandwidth_mbps >= bandwidth_mbps_threshold => (8069 >= 8069,12) == false

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36109>
2025-07-15 09:37:32 +00:00
Samuel Pitoiset
d510f95f67 radv/ci: enable RADV_PERFTEST=hic for GFX10+ jobs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
fbea486854 radv: advertise VK_EXT_host_image_copy on GFX10+ behind RADV_PERFTEST=hic
This exposes an experimental implementation of HIC with
RADV_PERFTEST=hic. It's passing 100% of VKCTS but it requires some
benchmarks first to verify if performance is acceptable or not.

No addrlib support for GFX6-9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
ea4ad51eb1 radv: implement vkTransitionImageLayout()
It's a no-op.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
b2e338a9c7 radv: implement vkCopyImageToImageEXT()
Because there is no surface<->surface helper in addrlib, this allocates
a temporary buffer on the CPU to do image->buffer->image. It's a naive
implementation which is probably not the best for performance, but it
works at least.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
c9ea920da0 radv: implement vkCopyMemoryToImageEXT()/vkCopyImageToMemoryEXT()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
4a5370819c radv: do not use MRT counters for host-transfer images
Otherwise, the tile swizzle changes and addrlib is confused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
8d38b25cb3 radv: add support for querying HIC memcpy size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
031843ebb1 radv: add support for querying HIC performance info
On GFX12, everything is compressed with DCC and it's completely
transparent to the userspace driver, so that should be optimal.

On older gens, using HIC disables compression which isn't optimal
for device access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
d89b11011f radv: add support for formats with host-transfer
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
545d5a0675 radv: set RADEON_SURF_HOST_TRANSFER for host-transfer images
To forbid some swizzles on GFX11.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:14 +00:00
Samuel Pitoiset
37f3997edf radv: disable compression for host-transfer images
HIC isn't supposed to have compression.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:14 +00:00
Samuel Pitoiset
afa7509207 radv: map images with host-transfer at bind time
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
d85b7b6c62 radv: only expose host visible memory types for images with host-transfer
Because the memory must be mapped on the CPU.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
a75bd251df ac/surface: add a flag to forbid some swizzles for surface<->memory copies
256KiB (also block variables) aren't supported on GFX11.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
f5f2392cf7 ac/surface: add support for surface<->memory copy using addrlib
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
16be376cc5 ac/surface: constify bpe_to_format()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:12 +00:00
Daniel Schürmann
47ef60cbf1 aco/ra: always use bytes for register stride requirements
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
instead of a mixture between dwords and bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36053>
2025-07-14 08:45:29 +00:00
Valentine Burley
8f1eca471f radv/ci: Lower concurrency of radv-raven-traces-restricted
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36074>
2025-07-14 08:15:25 +00:00
David Rosca
f78222dc29 radv/video: Add support for decode tier3
On VCN5 both distinct and coincide output/dpb are supported. Tier3
(coincide) requires tiling, Tier2 (distinct) also works with linear.
Application can decide which one to use.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:28 +00:00
David Rosca
2d06b43292 radv: Enable tiling for video images on VCN5
All planes must have the same swizzle mode and no tile swizzle.
Only linear decode target requires the custom height alignment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:27 +00:00
David Rosca
e50ee32876 radv: Don't allow linear tiling for video DPB images
We don't support linear DPB images and they will currently always
be tiled internally.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:27 +00:00
David Rosca
e2554c8f51 ac/surface: Support RADEON_SURF_FORCE_SWIZZLE_MODE on gfx12
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:26 +00:00
Samuel Pitoiset
a51afbaff8 radv/sdma: fix unaligned 96-bits copies on GFX9
On SDMA4, when the pitch isn't aligned, the width needs to be scaled
by 3 for 96-bits formats.

On SDMA5+, the pitch is aligned and the driver doesn't need to fallback
to unaligned copies.

CC: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36067>
2025-07-14 06:30:55 +00:00
Yogesh Mohan Marimuthu
0068dbd76b ac: enable kernelq reg shadowing only when userq is disabled
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35106>
2025-07-13 20:05:25 +00:00
Yogesh Mohan Marimuthu
1000ee3d2f ac,radeonsi,radv: rename register_shadowing_required
rename register_shadowing_required to has_kernelq_reg_shadowing

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35106>
2025-07-13 20:05:25 +00:00
Marek Olšák
34580a32ff ac/nir: remove redundant option dont_export_cull_distances
It has the same value as can_cull.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
54c969882b ac/nir: rename ac_nir_get_lds_gs_out_slot_offset -> ac_nir_get_gs_out_lds_offset
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
fde3384cfd ac/nir: remove pack_clip_cull_distances option
it's always true

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
7bbc4ef719 ac/llvm: rename misnamed get_memory_ptr -> get_shared_mem_ptr
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
0fbdefd770 ac/llvm: remove LDS linking code
LDS sizes and offsets from LLVM are no longer used.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
f6aecfb886 ac/llvm: don't declare LDS as an array for HS & GS & CS, use IntToPtr(0)
We don't need all this stuff anymore.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
5ded4f3c7d aco: remove unused aco_symbol_lds_ngg_gs_out_vertex_base
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
24260644e8 radeonsi: remove now unused LLVM LDS logic for NGG
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
65c5ee1628 radeonsi: stop using LLVM LDS linking logic for the GS out LDS offset
This will enable large code removal.

shader->config.lds_size is now always computed the same as ACO except for
compute shaders.

We have to add a new 8-bit user SGPR bitfield called
GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset
that was previously set by the relocation.

Since the offset must be a multiple of 256, we have to add padding
to the LDS size computation to make sure the alignment to 256 for the ESGS
LDS size doesn't cause us to exceed the maximum LDS size.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
c743c3dd1a radeonsi: support 8 non-ClipVertex clip planes instead of 6
If there are more than 6 planes without gl_ClipVertex and gl_ClipDistance,
add "gl_ClipVertex = gl_Position;" to support up to 8 planes.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Georg Lehmann
92d433c54a aco: vectorize conversions from 8bit to 16bit
Massively helps emulated fp8 performance.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:15 +00:00
Georg Lehmann
7fece5592c aco: vectorize 16bit extracts
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:14 +00:00
Georg Lehmann
a045e9a624 ac/nir: lower uniform extract_i8/u8 to 32bit
To prevent vectorizing this later.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:13 +00:00
Georg Lehmann
2cc3e1876c ac/llvm: support vec2 extract
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35854>
2025-07-12 08:39:13 +00:00
Marek Olšák
f8918ed6c6 radv: stop using LLVM LDS linking logic
Not needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:06 +00:00
Marek Olšák
44dd39d121 radv: pack clip and cull distance outputs for both legacy and NGG pipelines
This increases primitive throughput when packing reduces the number
of pos exports due to holes in clip and cull distance arrays that could be
punched out by nir_opt_clip_cull_const. This applies to all chips.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:06 +00:00
Marek Olšák
2751d488ce radv: enable nir_opt_clip_cull_const for GS too
The pass also supports GS now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
bdcfe15457 radv: don't export cull distances if the shader culls against them
This increases primitive throughput for all hw with NGG if the shader
culls and the removal of cull distances reduces the number of position
exports.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
0cce0505cc radv: compute the number of position outputs after compilation
It will be different between NGG and legacy because NGG with culling
will not export cull distances.

The number of position exports could also be gathered from final NIR
to reduce logic duplication.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00