63652 commits

Author SHA1 Message Date
Marek Olšák
a84729d368 radeonsi/ci: add gfx11 flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
205646cd77 winsys/amdgpu: simplify code using amdgpu_cs_context::chunk_ib
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
44df9517cd winsys/amdgpu: don't use amdgpu_fence::ctx for fence dependencies
The only remaining use of ctx is amdgpu_fence_is_syncobj.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
7ccdcae4b5 winsys/amdgpu: use pipe_reference for amdgpu_ctx refcounting
this is the standard utility for refcounting

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
33980355d4 winsys/amdgpu: implement explicit fence dependencies as sequence numbers
This eliminates redundant fence dependencies if BOs add the same ones.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
6d7a76595d winsys/amdgpu: remove dependency_flags parameter from cs_add_fence_dependency
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
6ac0b4ef05 winsys/amdgpu: rename amdgpu_bo_real::lock to map_lock
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
1e2c02d76b winsys/amdgpu: rename amdgpu_bo_sparse::lock -> commit_lock
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
e1261c77b5 winsys/amdgpu: rename amdgpu_winsys_bo::bo -> bo_handle
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
4d486888ee winsys/amdgpu: rewrite BO fence tracking by adding a new queue fence system
This decreases the time spent in amdgpu_cs_submit_ib from 15.4% to 8.3%
in VP2020/Catia1, which is a decrease of CPU load for that thread by 46%.
Overall, it increases performance by a small amount in CPU-bound benchmarks.
The biggest improvement I have seen is VP2020/Catia2, where it increases
FPS by 12%.

It no longer stores pipe_fence_handle references inside amdgpu_winsys_bo.

The idea is to have a global fixed list of queues (only 1 queue per IP
for now) where each queue generates its own sequence numbers (generated
by the winsys, not the kernel). Each queue also has a ring of fences.
The sequence numbers are used as indices into the ring of fences, which
is how sequence numbers are converted to fences.

With that, each BO only has to keep a list of sequence numbers, 1 for each
queue. The maximum number of queues is set to 6. Since the system can
handle integer wraparounds of sequence numbers correctly, we only need
16-bit sequence numbers in BOs to have accurate busyness tracking. Thus,
each BO uses only 12 bytes to represent all its fences for all queues.
There is also a 1-byte bitmask saying which sequence numbers are
initialized.

amdgpu_winsys.h contains the complete description. It has several
limitations that exist to minimize the memory footprint and updating of
BO fences.

Acked-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
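The queue fence scheme described in 4d486888ee can be illustrated with a minimal sketch. It is not the code from amdgpu_winsys.h: every struct and function name below is hypothetical, the ring size is an assumed value, and the "fence" is reduced to a boolean. The rule that a sequence number older than one ring length counts as idle is also an assumption of this sketch. Only the 6 queues, the 16-bit per-queue sequence numbers (12 bytes per BO), and the 1-byte validity bitmask come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_QUEUES 6        /* maximum number of queues, from the commit message */
#define FENCE_RING_SIZE 32  /* assumed ring size; must divide 65536 */

/* Hypothetical queue: generates its own sequence numbers and owns a ring
 * of fences. A bool stands in for a real fence object. */
struct queue {
   uint16_t latest_seq_no;
   bool fence_signalled[FENCE_RING_SIZE];
};

/* Hypothetical per-BO fence state: one 16-bit sequence number per queue. */
struct bo_fences {
   uint16_t seq_no[NUM_QUEUES]; /* 6 * 2 = 12 bytes */
   uint8_t valid_mask;          /* bit i set => seq_no[i] is initialized */
};

/* A sequence number is converted to a fence by using it as a ring index. */
static inline unsigned ring_index(uint16_t seq_no)
{
   return seq_no % FENCE_RING_SIZE;
}

/* Allocate the next sequence number; it wraps from 65535 to 0. */
static uint16_t queue_submit(struct queue *q)
{
   uint16_t seq = ++q->latest_seq_no;
   q->fence_signalled[ring_index(seq)] = false;
   return seq;
}

/* Record that the BO was last used by submission "seq" on one queue. */
static void bo_add_fence(struct bo_fences *bo, unsigned queue_index, uint16_t seq)
{
   assert(queue_index < NUM_QUEUES);
   bo->seq_no[queue_index] = seq;
   bo->valid_mask |= (uint8_t)(1u << queue_index);
}

/* Busyness check. The unsigned 16-bit subtraction stays correct across
 * wraparound of the sequence numbers. */
static bool bo_is_busy(const struct bo_fences *bo, const struct queue *queues)
{
   for (unsigned i = 0; i < NUM_QUEUES; i++) {
      if (!(bo->valid_mask & (1u << i)))
         continue;
      uint16_t age = (uint16_t)(queues[i].latest_seq_no - bo->seq_no[i]);
      if (age >= FENCE_RING_SIZE)
         continue; /* assumption: the ring slot was reused, so the old fence is idle */
      if (!queues[i].fence_signalled[ring_index(bo->seq_no[i])])
         return true;
   }
   return false;
}
```

A caller would call queue_submit() once per submission and bo_add_fence() for every BO referenced by that submission; bo_is_busy() then needs no per-BO fence references.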
Marek Olšák
b976f8fc1e winsys/amdgpu: compute bo->unique_id at pb_slab_alloc, not at memory allocation
We would compute the unique IDs for 1000 slab entries and then only use
a few, wasting the IDs. Assign the IDs only when we actually need to
return a new buffer.

This decreases the number of collisions we get in amdgpu_lookup_buffer,
and thus the number of times we have to search in the BO list.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
32dae84d43 winsys/amdgpu: allocate 1 amdgpu_bo_slab_entry per cache line
The structure size is exactly 64 bytes, so every entry occupies exactly
1 cache line.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
6d913a2bcc r300,r600,radeonsi: switch to pb_buffer_lean
to remove pb_buffer::vtbl from all buffer structures

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
d2c76c4d77 winsys/radeon: stop using pb_buffer::vtbl
Only the destroy function used it.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
6c4ab02674 gallium/pb_cache: remove pb_cache_entry::buffer
The buffer pointer is always at a constant offset from pb_cache_entry,
so just pass the "offsetof" value to pb_cache and use that to get
the pointer.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
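The "offsetof" idea from 6c4ab02674 can be sketched as follows. The struct and function names are hypothetical and do not reproduce the pb_cache API; the sketch only shows how a stored offset replaces a per-entry back pointer to the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry that no longer stores a pointer to its buffer. */
struct cache_entry {
   unsigned start_ms;
};

/* Hypothetical buffer type that embeds the cache entry at a fixed offset. */
struct my_bo {
   unsigned size;
   struct cache_entry cache_entry;
};

/* The cache is told once where the entry lives inside the buffer struct. */
struct cache {
   size_t entry_offset; /* offsetof(buffer struct, embedded cache entry) */
};

/* Recover the buffer pointer by subtracting the constant offset. */
static void *cache_entry_to_buffer(const struct cache *c, struct cache_entry *e)
{
   return (char *)e - c->entry_offset;
}
```

With `struct cache c = { offsetof(struct my_bo, cache_entry) };`, calling `cache_entry_to_buffer(&c, &bo.cache_entry)` yields `&bo` for any `struct my_bo bo`, which saves one pointer per entry.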
Marek Olšák
20bf2a06fb gallium/pb_cache: remove pb_cache_entry::mgr
We can just pass it via functions.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
d7de372358 gallium/pb_cache: switch to pb_buffer_lean
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
39c1311766 gallium/pb_buffer: define pb_buffer_lean without vtbl, inherit it by pb_buffer
amdgpu doesn't need vtbl.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
eb19f0daa3 winsys/amdgpu: don't use gpu_address to compute slab entry offset in bo_map
use the code we have in amdgpu_bo_get_va

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
a8e98882ea winsys/amdgpu: remove va (gpu_address) from amdgpu_bo_slab_entry
Keep it only in amdgpu_bo_real and amdgpu_bo_sparse. Slab entries can
compute it from the slab BO and their entry index.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
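As a rough illustration of a8e98882ea, a slab entry's GPU address can be derived instead of stored. The field and function names are hypothetical, and the formula (slab base address plus index times entry size) is an assumption about what "from the slab BO and the entry index" means.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slab: one backing BO split into equally sized entries. */
struct slab {
   uint64_t va;          /* GPU address of the backing BO */
   uint32_t entry_size;  /* size of every entry in bytes */
   uint32_t num_entries;
};

/* Derive the GPU address of entry "index" instead of storing it per entry. */
static uint64_t slab_entry_va(const struct slab *slab, uint32_t index)
{
   assert(index < slab->num_entries);
   return slab->va + (uint64_t)index * slab->entry_size;
}
```

For the extreme case mentioned in cf2dc2d512 (8192 entries of 256 bytes in a 2 MB slab), the last entry starts 256 bytes before the end of the slab.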
Marek Olšák
3cc2562ac0 winsys/amdgpu: remove now-redundant amdgpu_bo_slab_entry::real
The pb_slab pointer can be used to get the BO pointer because pb_slab is
inside the BO structure now.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
49bf2545fe winsys/amdgpu: add amdgpu_bo_real_reusable slab for the backing buffer
Add contents of amdgpu_bo_slab into it. This will allow removing the "real"
pointer from amdgpu_bo_slab_entry because "(char*)entry.slab" is now
pointing next to it.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
cf2dc2d512 winsys/amdgpu: don't layer slabs, use only 1 level of slabs, it improves perf
This increases FPS in VP2020/Catia1 by 10-18%!!!!!!!!!!!!!!!!!!!!!!!

I have no rational explanation for this.

In the most extreme case, 8192 256B slab BOs (smallest size) are now
allocated from a single 2MB slab.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
4a078e693e r300,r600,radeon/winsys: always pass the winsys to radeon_bo_reference
This will allow the removal of pb_cache_entry::mgr.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
643f390de5 radeon_winsys: add struct radeon_winsys* parameter into fence_reference
Since the radeon winsys implements fences as buffers, we need radeon_winsys*
to destroy them. This will enable the removal of pb_cache_entry::mgr.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:30 +00:00
Marek Olšák
e847c6e301 gallium/pb_cache: switch time variables to milliseconds and 32-bit type
to decrease pb_cache_entry by 8 bytes.

Add msecs_base_time to offset time == 0 to the creation of pb_cache.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
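The time representation described in e847c6e301 might look roughly like the sketch below. Only the name msecs_base_time is taken from the commit message; the other names and the wraparound-safe expiry test are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache: remembers the absolute time at which it was created. */
struct cache {
   int64_t msecs_base_time; /* absolute creation time in milliseconds */
   uint32_t msecs_timeout;  /* how long an entry may stay cached */
};

/* Convert an absolute 64-bit time to a 32-bit time relative to creation,
 * so that time == 0 corresponds to the creation of the cache. */
static uint32_t cache_relative_ms(const struct cache *c, int64_t now_ms)
{
   return (uint32_t)(now_ms - c->msecs_base_time);
}

/* An entry only needs a 32-bit start time. The unsigned subtraction keeps
 * the comparison valid even if the 32-bit counter wraps. */
static bool entry_expired(const struct cache *c, uint32_t entry_start_ms,
                          int64_t now_ms)
{
   return (uint32_t)(cache_relative_ms(c, now_ms) - entry_start_ms) >=
          c->msecs_timeout;
}
```

A 32-bit millisecond counter covers about 49.7 days before wrapping, which is why a base time plus a relative value is enough for a cache timeout.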
Marek Olšák
896c8b67cb gallium/pb_cache: remove pb_cache_entry::end to save space
just compute it at each use

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
Marek Olšák
523a4f71f2 winsys/amdgpu: stop using pb_buffer::vtbl
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
Marek Olšák
7752579202 winsys/amdgpu: rename amdgpu_bo_slab to amdgpu_bo_slab_entry
It's a slab entry. "Slab" is the whole buffer, which is AMDGPU_BO_REAL
if we want to be precise.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
Marek Olšák
b3c64638b4 iris,zink,winsys/amdgpu: remove unused/redundant slab->entry_size
slab->base has the same field.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
Marek Olšák
9431c33899 gallium/pb_slab: move group_index and entry_size from pb_slab_entry to pb_slab
This removes 8 bytes from every slab entry, and thus amdgpu_bo_slab.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
Marek Olšák
5a3bacc376 winsys/amdgpu: reduce wasted memory due to the size tolerance in pb_cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>
2024-01-06 20:55:29 +00:00
Karol Herbst
1e5bc00715 rusticl/program: add LLVM functions to cache timestamp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24386>
2024-01-06 03:09:48 +00:00
Karol Herbst
299f949775 rusticl/meson: generate bindings for LLVM
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24386>
2024-01-06 03:09:48 +00:00
Eric Engestrom
a0fab95bc0 lvp: update symbols that have become aliases for newer ones
All of these have been renamed in the spec (usually by being promoted);
rename them in our code too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26494>
2024-01-06 00:49:53 +00:00
Mark Janes
a6a95591aa intel/dev: poison macros for workarounds fixed at a stepping
INTEL_NEEDS_WA macros are valid when a workaround applies to all
platforms which have the GFX_VERx10 versions for the workaround.

Some workarounds were fixed at a stepping after the platform release.
If a workaround applies partially to any platform, then GFX_VERx10
cannot be used to correctly apply the workaround.

This change invalidates INTEL_NEEDS_WA_16014538804 and
INTEL_NEEDS_WA_22014412737, which were fixed for MTL platforms at
stepping b0.  The run-time checks were already present for all uses of
these macros.  Updating the poisoned macros to INTEL_WA_{num}_GFX_VER
compiles out the run-time checks on platforms where they cannot apply.

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>
2024-01-05 22:51:45 +00:00
Yonggang Luo
d6c258d9ee util: Add align_uintptr and use it treewide to replace ALIGN that works on size_t and uintptr_t
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26866>
2024-01-05 21:54:35 +00:00
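For readers unfamiliar with this kind of helper, a pointer-width align-up function can be sketched as below. This is not copied from Mesa: the name carries a _sketch suffix to mark it as hypothetical, and the power-of-two requirement is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Round "value" up to the next multiple of "alignment".
 * Assumes "alignment" is a non-zero power of two. All arithmetic is done
 * in uintptr_t so pointer-sized values are never narrowed. */
static inline uintptr_t align_uintptr_sketch(uintptr_t value, uintptr_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

Keeping the computation in uintptr_t matters when the value is an address or a size_t: doing the same arithmetic in a narrower integer type could truncate it on 64-bit targets.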
Mary Guillemard
b6d828576e zink: Always fill external_only in zink_query_dmabuf_modifiers
Fix piglit.spec@ext_image_dma_buf_import@ext_image_dma_buf_import-modifiers
randomly skipping some tests as external_only content was never initialized.

Cc: mesa-stable

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26305>
2024-01-05 13:30:45 +00:00
Mary Guillemard
db0f177edd zink: Initialize pQueueFamilyIndices for image query / create
Fixes: d922850e36 ("zink: break out image/buffer create info structs into helper funcs")

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26303>
2024-01-05 13:59:49 +01:00
Karol Herbst
5ff33f9905 rusticl: use real buffer for cb0 for drivers preferring
At the moment it's radeonsi and zink.

Consequently, this also fixes data races in zink that occur for
driver-internal reasons.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25946>
2024-01-05 01:26:44 +01:00
Karol Herbst
900ce1f4f4 rusticl/queue: release bound constant buffer
This fixes memory leaks in drivers.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25946>
2024-01-05 01:26:44 +01:00
Karol Herbst
5f97ef3d03 rusticl: add QueueContext to track GPU state
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25946>
2024-01-05 01:26:44 +01:00
Karol Herbst
a4f47ba52c rusticl: specify buffer bindings explicitly
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25946>
2024-01-05 01:26:44 +01:00
Karol Herbst
b06f6e00fb zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources
zink_bo_create can run into a heap-use-after-free when the bo is still
referencing a batch_state from an older destroyed context. In order to
fix this, every context gives its batch_states back to zink, where
they can be reused for new contexts.

Cc: mesa-stable
Suggested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26889>
2024-01-04 20:56:09 +00:00
Corentin Noël
b8e06fa48a virgl: Only send as much data as declared in pipe_sampler_state
Adjust the masks to only send the data that we are sure to actually use.

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26856>
2024-01-04 11:00:12 +00:00
Pavel Ondračka
cc7ce6c01f r300: mark load_ubo_vec4 with ACCESS_CAN_SPECULATE
This is safe to do in all circumstances due to the age of the hardware.
(we don't have UBOs, just constant registers with automatic OOB checks)

R500 hardware doesn't have a standard address register in fragment shaders,
and while we have the loop register, which we could in theory use for indirect
access, this is currently not possible to wire through NIR. So any time
there is an indirect uniform array access in a loop, we end up with an if
ladder whose size depends on the size of the uniform array. The two worst
behaving apps here are glamor and some GTK shaders, both of which
sometimes end up over the 512-instruction limit. Flattening the if
ladders helps a LOT, so we can fit within the instruction limit in most
cases (all glamor shaders are OK now). So just enable the flattening by
marking all load_ubo_vec4 with ACCESS_CAN_SPECULATE.

Shader-db RV530:
total instructions in shared programs: 128762 -> 128440 (-0.25%)
instructions in affected programs: 540 -> 218 (-59.63%)
helped: 3
HURT: 0
total temps in shared programs: 17543 -> 17550 (0.04%)
temps in affected programs: 11 -> 18 (63.64%)
helped: 0
HURT: 3
total cycles in shared programs: 196984 -> 196657 (-0.17%)
cycles in affected programs: 592 -> 265 (-55.24%)
helped: 3
HURT: 0

LOST:   0
GAINED: 7

No changes for R300/R400 because there we don't have control flow
anyway.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6366
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26877>
2024-01-04 08:27:42 +01:00
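The difference between an if ladder and its flattened form, as discussed in cc7ce6c01f, can be shown with a small C analogy. This is not NIR or r300 compiler code; the function names and the fixed array size of 4 are made up for illustration.

```c
#include <assert.h>

/* Indirect access lowered to an if ladder: each element is loaded only
 * inside its own branch, so the control flow grows with the array size. */
static float load_ladder(const float u[4], int i)
{
   if (i == 0)
      return u[0];
   else if (i == 1)
      return u[1];
   else if (i == 2)
      return u[2];
   else
      return u[3];
}

/* Flattened form: every element is loaded unconditionally (speculatively)
 * and the result is picked with selects. This is only valid when the
 * loads can never fault, which is the guarantee that a flag such as
 * ACCESS_CAN_SPECULATE expresses. */
static float load_flattened(const float u[4], int i)
{
   float r = u[3];
   r = (i == 2) ? u[2] : r;
   r = (i == 1) ? u[1] : r;
   r = (i == 0) ? u[0] : r;
   return r;
}
```

Both functions return the same value for every index; the flattened version trades branches for a chain of selects.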
Pavel Ondračka
f8a5cba3b4 r300: remove backend LRP lowering
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
2024-01-04 08:02:01 +01:00
Pavel Ondračka
f62a128274 r300: remove backend CMP lowering
Leave assert in place for now though.

No changes in shader-db.

Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
2024-01-04 08:02:01 +01:00
Pavel Ondračka
e6e1da8124 r300: lower ftrunc in NIR
and remove the backend TRUNC lowering.

Shader-db RV370:
total instructions in shared programs: 82155 -> 82154 (<.01%)
instructions in affected programs: 38 -> 37 (-2.63%)
helped: 1
HURT: 0
total consts in shared programs: 80719 -> 80733 (0.02%)
consts in affected programs: 2775 -> 2789 (0.50%)
helped: 0
HURT: 14

Shader-db RV530:
total presub in shared programs: 7676 -> 7702 (0.34%)
presub in affected programs: 81 -> 107 (32.10%)
helped: 0
HURT: 26

Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
2024-01-04 08:02:01 +01:00
Pavel Ondračka
77f429e1a5 r300: fcsel_ge lowering from lowered ftrunc
The fcsel lowering for R3xx already happens in the main loop; here we
only do it for the fcsel_ge that comes from the ftrunc.

No change in shader-db

Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
2024-01-04 08:02:01 +01:00