Commit graph

10399 commits

Author SHA1 Message Date
Marek Olšák
6286c1c66f nir/opt_vectorize_io: optionally vectorize loads with holes
e.g. load X; load W; ==> load XYZW. Verified with a shader test.

This will be used by AMD drivers. See the code comments.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36098>
2025-07-15 16:29:30 +00:00
Marek Olšák
b0494f9485 nir/opt_varyings: optimize the consumer after constant propagation and dedupli.
A TF2 shader propagates 0 to the consumer, which eliminates 1 input
if we run algebraic opts and DCE before compaction.

This is a prerequisite for removing all IO var optimizations from the GLSL
linker that are redundant with nir_opt_varyings.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
2025-07-15 13:38:29 +00:00
Samuel Pitoiset
b59895140d radv: add a way to disable the HIZ/HiS events based workaround on GFX12
This workaround doesn't mitigate the issue reliably/completely. An
alternative (but complex) solution also exists.

This introduces a small option that allows to disable the current
workaround as preliminary work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36110>
2025-07-15 10:01:54 +00:00
Pavel Gribov
24cb745460 radv: small fix for sam check
for exact PCIe 3.0 x8 case there will be
pcie_bandwidth_mbps >= bandwidth_mbps_threshold => (8069 >= 8069,12) == false

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36109>
2025-07-15 09:37:32 +00:00
Samuel Pitoiset
fbea486854 radv: advertise VK_EXT_host_image_copy on GFX10+ behind RADV_PERFTEST=hic
This exposes an experimental implementation of HIC with
RADV_PERFTEST=hic. It's passing 100% of VKCTS but it requires some
benchmarks first to verify if performance is acceptable or not.

No addrlib support for GFX6-9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
ea4ad51eb1 radv: implement vkTransitionImageLayout()
It's a no-op.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
b2e338a9c7 radv: implement vkCopyImageToImageEXT()
Because there is no surface<->surface helper in addrlib, this allocates
a temporary buffer on the CPU to do image->buffer->image. It's a naive
implementation which is probably not the best for performance, but it
works at least.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
c9ea920da0 radv: implement vkCopyMemoryToImageEXT()/vkCopyImageToMemoryEXT()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:16 +00:00
Samuel Pitoiset
4a5370819c radv: do not use MRT counters for host-transfer images
Otherwise, the tile swizzle changes and addrlib is confused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
8d38b25cb3 radv: add support for querying HIC memcpy size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
031843ebb1 radv: add support for querying HIC performance info
On GFX12, everything is compressed with DCC and it's completely
transparent to the userspace driver, so that should be optimal.

On older gens, using HIC disables compression which isn't optimal
for device access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
d89b11011f radv: add support for formats with host-transfer
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:15 +00:00
Samuel Pitoiset
545d5a0675 radv: set RADEON_SURF_HOST_TRANSFER for host-transfer images
To forbid some swizzles on GFX11.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:14 +00:00
Samuel Pitoiset
37f3997edf radv: disable compression for host-transfer images
HIC isn't supposed to have compression.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:14 +00:00
Samuel Pitoiset
afa7509207 radv: map images with host-transfer at bind time
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:13 +00:00
Samuel Pitoiset
d85b7b6c62 radv: only expose host visible memory types for images with host-transfer
Because the memory must be mapped on the CPU.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35974>
2025-07-15 09:12:13 +00:00
David Rosca
f78222dc29 radv/video: Add support for decode tier3
On VCN5 both distinct and coincide output/dpb are supported. Tier3
(coincide) requires tiling, Tier2 (distinct) also works with linear.
Application can decide which one to use.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:28 +00:00
David Rosca
2d06b43292 radv: Enable tiling for video images on VCN5
All planes must have the same swizzle mode and no tile swizzle.
Only linear decode target requires the custom height alignment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:27 +00:00
David Rosca
e50ee32876 radv: Don't allow linear tiling for video DPB images
We don't support linear DPB images and they will currently always
be tiled internally.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35878>
2025-07-14 07:42:27 +00:00
Samuel Pitoiset
a51afbaff8 radv/sdma: fix unaligned 96-bits copies on GFX9
On SDMA4, when the pitch isn't aligned, the width needs to be scaled
by 3 for 96-bits formats.

On SDMA5+, the pitch is aligned and the driver doesn't need to fallback
to unaligned copies.

CC: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36067>
2025-07-14 06:30:55 +00:00
Yogesh Mohan Marimuthu
1000ee3d2f ac,radeonsi,radv: rename register_shadowing_required
rename register_shadowing_required to has_kernelq_reg_shadowing

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35106>
2025-07-13 20:05:25 +00:00
Marek Olšák
34580a32ff ac/nir: remove redundant option dont_export_cull_distances
It has the same value as can_cull.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
fde3384cfd ac/nir: remove pack_clip_cull_distances option
it's always true

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
f6aecfb886 ac/llvm: don't declare LDS as an array for HS & GS & CS, use IntToPtr(0)
We don't need all this stuff anymore.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Marek Olšák
65c5ee1628 radeonsi: stop using LLVM LDS linking logic for the GS out LDS offset
This will enable large code removal.

shader->config.lds_size is now always computed the same as ACO except for
compute shaders.

We have to add a new 8-bit user SGPR bitfield called
GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset
that was previously set by the relocation.

Since the offset must be a multiple of 256, we have to add padding
to the LDS size computation to make sure the alignment to 256 for the ESGS
LDS size doesn't cause us to exceed the maximum LDS size.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:20 +00:00
Marek Olšák
f8918ed6c6 radv: stop using LLVM LDS linking logic
Not needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:06 +00:00
Marek Olšák
44dd39d121 radv: pack clip and cull distance outputs for both legacy and NGG pipelines
This increases primitive throughput when packing reduces the number
of pos exports due to holes in clip and cull distance arrays that could be
punched out by nir_opt_clip_cull_const. This applies to all chips.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:06 +00:00
Marek Olšák
2751d488ce radv: enable nir_opt_clip_cull_const for GS too
The pass also supports GS now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
bdcfe15457 radv: don't export cull distances if the shader culls against them
This increases primitive throughput for all hw with NGG if the shader
culls and the removal of cull distances reduces the number of position
exports.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
0cce0505cc radv: compute the number of position outputs after compilation
It will be different between NGG and legacy because NGG with culling
will not export cull distances.

The number of position exports could also be gathered from final NIR
to reduce logic duplication.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:05 +00:00
Marek Olšák
21646b0124 radv: don't include positions exports in pipeline executable stats
It will be different between NGG and legacy because NGG with culling
will not export cull distances.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:04 +00:00
Marek Olšák
88a1c1f881 radv: enable NGG culling for GS
This is very useful for increasing raw primitive throughput for GS
(mostly just RDNA 2), increasing raw primitive throughput with clip
and cull distance outputs when they actually cull anything (RDNA 1-4),
and reducing attribute store bandwidth usage (RDNA 3-4).

It will also replace fixed-func culling against cull distances when
culling in the shader is enabled, which will increase primitive throughput
even further.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:04 +00:00
Marek Olšák
ae4d539540 radv: rework radv_link_shaders_info as as not be called in a loop
It receives all shaders and decides how to link them.

When culling is enabled for GS, we will need ES, GS, and FS in this
function at the same time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:03 +00:00
Marek Olšák
b97c4bfd58 radv: enable W/front/back face NGG culling with multiple viewports
This is supported.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:03 +00:00
Marek Olšák
89e1ec92c5 radv: cull against clip and cull distances in the shader
Clip and cull distance outputs decrease primitive throughput, so culling
against them in the shader has even more benefit than other culling
options.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:03 +00:00
Marek Olšák
65972f2301 ac/nir: return GSVS emit sizes from legacy GS lowering and simplify shader info
This simplifies shader info in drivers by returning GSVS emit sizes from
ac_nir_lower_legacy_gs. The pass knows the sizes, so drivers shouldn't
have to determine them independently.

This also makes the values more accurate because both drivers were
computing the GSVS emit sizes inaccurately and had redundant fields
in shader info. RADV had a lot of redudancy there.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:02 +00:00
Marek Olšák
c1d3108855 radv: call radv_get_legacy_gs_info after ac_nir_lower_legacy_gs
The pass will determíne the GSVS ring size, so radv_get_legacy_gs_info
must be called after that.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:01 +00:00
Marek Olšák
76ce37058d radv: set the maximum possible workgroup size for legacy GS before linking
The optimal workgroup size will be set after lowering.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:00 +00:00
Marek Olšák
d674e97d5c radv: use shared ac_legacy_gs_compute_subgroup_info
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:20:00 +00:00
Marek Olšák
8a1e357f71 radv: use shared ac_ngg_compute_subgroup_info
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12496

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>
2025-07-12 05:19:59 +00:00
David Rosca
95986108a6 radv/video: Add couple missing encode flags and stdSyntaxFlags
Also remove VK_VIDEO_ENCODE_H264_STD_WEIGHTED_BIPRED_IDC_EXPLICIT_BIT_KHR,
only default and implicit are supported.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35972>
2025-07-11 15:21:21 +00:00
Samuel Pitoiset
fa23a50322 radv: simplify creating descriptor sets with variable desciptor count
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36044>
2025-07-11 06:04:41 +00:00
Samuel Pitoiset
b7f4e344bc radv: fix the maximum variable descriptor count with inline uniform blocks
It must not be larger than maxInlineUniformBlockSize.

Fixes recent VKCTS
dEQP-VK.api.maintenance3_check.support_count_inline_uniform_block*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36044>
2025-07-11 06:04:41 +00:00
Samuel Pitoiset
36879c4f99 radv: fix indexing with variable descriptor count
The Vulkan spec says:
    "If bindingCount is zero or if this structure is not included in
     the pNext chain, the VkDescriptorBindingFlags for each descriptor
     set layout binding is considered to be zero. Otherwise, the
     descriptor set layout binding at
     VkDescriptorSetLayoutCreateInfo::pBindings[i] uses the flags in
     pBindingFlags[i]."

Fixes recent VKCTS dEQP-VK.api.maintenance3_check.*.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13503
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36044>
2025-07-11 06:04:40 +00:00
Samuel Pitoiset
01fccec1dc ac/descriptors,radv: move the nbc view param to the gfx10 union
This is only GFX10+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36043>
2025-07-11 05:46:50 +00:00
Samuel Pitoiset
36e4b52e9f radv: do not perform a per-pixel copy for BCn formats with mips on GFX12+
This is unnecessary because GFX12 isn't affected by this clamping issue
when NO_EDGE_CLAMP is correctly configured.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36043>
2025-07-11 05:46:50 +00:00
Qiang Yu
88c79a13b9 ac,radv: move nir_load_ring_mesh_scratch_offset_amd to ac
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
To be shared with radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>
2025-07-11 02:25:51 +00:00
Qiang Yu
5ddbd8c83b ac,radv: move mesh scratch ring constants to ac
To be shared with radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>
2025-07-11 02:25:51 +00:00
Qiang Yu
78fed5fc13 ac,radv: move nir_load_task_ring_entry_amd to ac
To be shared with radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>
2025-07-11 02:25:51 +00:00
Qiang Yu
1dbe7b65a1 radv: change mesh shader gs_vgpr_comp_cnt for gfx11
It should always be 0, but now it's set to be 1 when gfx11
which allocate one unused VGPR.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>
2025-07-11 02:25:51 +00:00