Commit graph

2985 commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
dc83195175 ac/virtio: disable userptr and local buffers
They're not supported yet so let's not pretend they are.

In particular use_local_buffers can cause VM_ALWAYS_VALID to be
used, which then prevents the creation of a dmabuf.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21658>
2025-01-16 12:24:32 +00:00
Pierre-Eric Pelloux-Prayer
22263616ed amd: amdgpu-virtio implementation
Native context support is implemented by diverting the libdrm_amdgpu functions
into new functions that use virtio-gpu.
VA allocations are done directly in the guest, using newly exposed libdrm_amdgpu
helpers (retrieved through dlopen/dlsym).

Guest <-> Host roundtrips can be expensive so we try to avoid them as much as
possible. When possible we also don't wait for the host reply in case where
it's not needed to get correct result.

Implicit sync works because virtio-gpu commands are submitted in order to the
host (there a single queue per device, shared by all the guest processes).

virtio-gpu also only supports one context per file description (but multiple
file descriptions per process) while amdgpu only allows one fd per process,
but multiple contexts per fd. This causes synchronization problems, because
virtio-gpu drops all sync primitive if they belong to the same fd/context/ring:
ie the amdgpu_ctx can't be expressed in virtio-gpu terms.

For now the solution is to only allocate a single amdgpu_ctx per application.

Contrary to radeonsi/radv, amdgpu_virtio can use libdrm_amdgpu directly: the
ones that don't rely on ioctl() are safe to use here.

Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21658>
2025-01-16 12:24:32 +00:00
Pierre-Eric Pelloux-Prayer
a565f2994f amd: move all uses of libdrm_amdgpu to ac_linux_drm
This is required to implement virtio native-context.

In a virtualized environment, most of the functions provided
by libdrm_amdgpu will be implemented using virtio.
This allows to implement efficient virtualization, by
forwarding the kernel API to the host, instead of the GL/VK calls.

Similarly, the raw 'fd' or 'gem_handle' arguments are replaced
by opaque types. This allows to encapsulate all the needed
state in the handle, and use unmodified API between baremetal
and virtualized contexts.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21658>
2025-01-16 12:24:32 +00:00
Samuel Pitoiset
7e6159d10c ac/surface: honor RADEON_SURF_PREFER_xxx_ALIGNMENT on GFX12
This allows to select a better alignment to not waste memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33033>
2025-01-16 08:49:38 +00:00
Marek Olšák
d160252270 ac: use Z_EXPORT_FORMAT=32_AR for Z + Alpha mrtz exports
This should be faster than 32_ABGR.

Also, stencil exports are changed from UINT16_ABGR to 32_GR,
which should have no effect on performance.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>
2025-01-16 02:58:03 +00:00
Timur Kristóf
50035f0316 ac/nir: Move all ac_nir_* files to a new folder.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:30 +01:00
Timur Kristóf
fe9eda9969 ac: Stop including ac_nir.h from ac_shader_util.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:28 +01:00
Timur Kristóf
cc43bd151b ac: Move AC_HS_MSG_VOTE_LDS_BYTES to ac_shader_util.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:27 +01:00
Timur Kristóf
736f61fa80 ac/nir: Move ac_nir_lower_ngg_mesh to separate file.
Along with it, move some functions to the prerast utils file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:25 +01:00
Timur Kristóf
c1eb006695 ac/nir: Rename ac_nir_lower_ngg_ms to ac_nir_lower_ngg_mesh.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:23 +01:00
Timur Kristóf
955315f831 ac/nir: Move pre-rasterization related utilities in separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:21 +01:00
Timur Kristóf
a986f9b90d ac/nir: Move ac_nir_lower_sin_cos to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:19 +01:00
Timur Kristóf
19bca6d425 ac/nir: Move ac_nir_lower_mem_access_bit_sizes to separate file.
Also ac_nir_flag_smem_for_loads along with it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:17 +01:00
Timur Kristóf
85eab189ee ac/nir: Move ac_nir_opt_pack_half to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:15 +01:00
Timur Kristóf
e79c77b1ef ac/nir: Move ac_nir_gs_shader_query declaration to ac_nir_helpers.h
This is a helper function, so drivers don't need to call it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:13 +01:00
Timur Kristóf
88c951bd46 ac/nir: Move ac_nir_lower_legacy_gs to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:11 +01:00
Timur Kristóf
6dd3f53204 ac/nir: Move ac_nir_lower_legacy_vs to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:10 +01:00
Timur Kristóf
d0e71ac9cd ac/nir: Move ac_nir_lower_intrinsics_to_args to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:08 +01:00
Timur Kristóf
a0b226bafb ac/nir: Expose ac_nir_unpack_value in ac_nir_helpers.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:06 +01:00
Timur Kristóf
1181348e80 ac/nir: Move ac_nir_create_gs_copy_shader to separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:04 +01:00
Timur Kristóf
1191408d4b ac: Move ac_nir_config struct to ac_nir.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:03 +01:00
Timur Kristóf
4cad0bc438 ac/nir: Rename emit_streamout to ac_nir_emit_legacy_streamout
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:46:01 +01:00
Timur Kristóf
015e5080e9 ac: Stop including nir.h in ac_shader_util.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:36 +01:00
Timur Kristóf
305fdfddb5 ac/nir: Move ac_set_nir_options to ac_nir.c
And rename it to ac_nir_set_options to match other functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:34 +01:00
Timur Kristóf
855de0483f ac/nir: Move ac_nir callback functions to ac_nir.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:32 +01:00
Timur Kristóf
cc0166462e ac/nir: Move ac_nir_get_mem_access_flags to ac_nir.c
And change its name to indicate that it is NIR specific.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:30 +01:00
Timur Kristóf
ad5c0b7103 ac/nir: Move ac_nir_lower_bit_size_callback to ac_nir.c
ac_shader_util should not concern itself with NIR stuff.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:28 +01:00
Marek Olšák
7e21b48a2e ac/nir: split ac_nir_lower_ps into 2 passes
It's split into ac_nir_lower_ps_early ac_nir_lower_ps_late.

ac_nir_lower_ps_early doesn't generate any AMD specific intrinsics except
some system values and is mainly an optimization pass with some lowering.
The new change here is that it also eliminates output components not needed
by spi_shader_col_format.

ac_nir_lower_ps_late lowers output stores to exports and does the bc_optimize
thing.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:25 +01:00
Marek Olšák
62c184c491 ac/nir: remove broadcast_last_cbuf because it can be deduced from NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>
2025-01-14 13:45:22 +01:00
Samuel Pitoiset
603541f1a2 ac/gpu_info: add cp_dma_use_L2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32971>
2025-01-13 08:07:58 +00:00
Marek Olšák
e640d5a9c3 amd: vectorize SMEM loads aggressively, allow overfetching for ACO
If there is a 4-byte hole between 2 loads, they are vectorized. Example:
    load 4 + hole 4 + load 8 -> load 16
This helps GLSL uniform loads, which are often sparse. See the code for more
info.

RADV could get better code by vectorizing later.

radeonsi+ACO - TOTALS FROM AFFECTED SHADERS (45482/58355)
  Spilled SGPRs: 841 -> 747 (-11.18 %)
  Code Size: 67552396 -> 65291092 (-3.35 %) bytes
  Max Waves: 714439 -> 714520 (0.01 %)

This should have no effect on LLVM because ac_build_buffer_load scalarizes
SMEM, but it's improved for some reason:

radeonsi+LLVM - TOTALS FROM AFFECTED SHADERS (4673/58355)
  Spilled SGPRs: 1450 -> 1282 (-11.59 %)
  Spilled VGPRs: 106 -> 107 (0.94 %)
  Scratch size: 101 -> 102 (0.99 %) dwords per thread
  Code Size: 14994624 -> 14956316 (-0.26 %) bytes
  Max Waves: 66679 -> 66735 (0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
2025-01-09 22:01:54 +00:00
Marek Olšák
abd5216ae8 ac,radeonsi: scalarize overfetching loads
There is nothing preventing ACO from generating loads with unused
components. This happens often with GLSL uniforms. Some of those loads
are partially re-vectorized after this.

radeonsi+ACO:

TOTALS FROM AFFECTED SHADERS (19564/58918)
  VGPRs: 732900 -> 728448 (-0.61 %)
  Spilled SGPRs: 429 -> 433 (0.93 %)
  Code Size: 38446004 -> 38485612 (0.10 %) bytes
  Max Waves: 305440 -> 305549 (0.04 %)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
2025-01-09 22:01:54 +00:00
Marek Olšák
58a88bbdb9 ac/nir/ngg: export positions after streamout to improve performance
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
fc73749d6c ac/nir/ngg: fold so_vertex_index * so_stride into immediate offset
Instead of using a different voffset VGPR per streamout vertex,
point voffset to the first vertex for all 3 vertices because
the stride and vertex index are constant and can be in the immediate
offset.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
97e82af162 ac/nir/ngg: vectorize streamout stores for NGG optimally
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.

It also removes the memory_modes field from VMEM stores because I don't
think it's needed.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
4f2e2e10bc ac/nir: vectorize streamout stores for legacy pipeline optimally
Walk the whole vertex stride thanks to XFB info sorted by offset, gather
individual components from same or different outputs, and once we have
gathered 4, store them as vec4.

It also removes the COHERENT flag from VMEM stores because NGG streamout
doesn't use it either and I don't think it's needed.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Marek Olšák
e399f3bed9 ac/nir: sort xfb info to facilitate vectorization of xfb stores
xfb stores are not vectorized properly, leading to generating random soup
of b32, b64, b96, and b128 stores.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>
2025-01-09 20:47:16 +00:00
Samuel Pitoiset
f09f31d093 ac/nir: fix a comment typo in load_subgroup_id_lowered()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>
2025-01-09 08:02:19 +00:00
Samuel Pitoiset
44ba856089 ac/nir: fix lowering subgroup ID for compute shaders on GFX12
This is lowered in backend compilers (LLVM or ACO) because it needs
to access ttmp registers which aren't exposed to NIR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>
2025-01-09 08:02:19 +00:00
Marek Olšák
c20c46cf7b ac: update ATOMIC_MEM definitions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32877>
2025-01-07 20:24:19 +00:00
Samuel Pitoiset
c5fe9dcf16 ac/descriptors: fix configuring NBC views on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32892>
2025-01-07 09:15:12 +00:00
David Rosca
e33452a6d3 ac/surface: Don't force linear for VIDEO_REFERENCE with emulated image opcodes
This caused regression by using higher pitch than needed on compute-only
devices, resulting in video decode errors.

Fixes: 308bae950f ("ac/surface: Add RADEON_SURF_VIDEO_REFERENCE")
Tested-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32863>
2025-01-04 09:13:44 +00:00
Marek Olšák
7fbca998b1 amd: optimize atomics before lowering intrinsics
ac_nir_lower_intrinsics_to_args will lower most system values.

I have to keep the divergence analysis in ACO, otherwise it goes haywire.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:56 +00:00
Marek Olšák
5dd9171765 ac/nir: set upper ranges for range analysis while lowering system values
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00
Marek Olšák
0d5b03f2b9 ac/nir: split local_invocation_ids to 3 separate VGPR inputs
so that we can set the upper range per VGPR.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00
Marek Olšák
65d241c947 ac/nir: set arg_upper_bound_u32 for vs_rel_patch_id
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00
Marek Olšák
1d9fbe5387 ac/nir: add helper ac_nir_load_arg_upper_bound
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00
Marek Olšák
cfeaa45dc6 ac/nir: clean up ac_nir_lower_indirect_derefs
IO variables can't occur here anymore.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00
Marek Olšák
ae22da2ff8 ac/nir: lower more loads in ac_nir_lower_intrinsics_to_args instead of drivers
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00
Marek Olšák
ceb6f8fc32 amd: lower load_tess_rel_patch_id/primitive_id/tess_coord and overwrite.. in NIR
The overwrite instruction complicates it a little, which is why these
intrinsics are lowered together.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>
2025-01-02 17:36:55 +00:00