Commit graph

19483 commits

Author SHA1 Message Date
Georg Lehmann
0c42141299 aco: allow opsel for last v_alignbyte/bit operand
For completeness' sake.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13285
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39061>
2025-12-31 08:58:24 +00:00
Georg Lehmann
cbedced5e8 ac/nir/cull: do not reuse variables if subgroup ops are used
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Subgroup ops make divergence information useless for our purpose,
we would need workgroup divergence.

The game affected here has control flow dependent on vote_any,
so it's possible that a wave only executes the code after culling/reordering
invocations.
That means we can't reuse the maybe undefined value from before culling.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14459
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39060>
2025-12-29 18:38:29 +00:00
Samuel Pitoiset
78e1f53429 ac/perfcounter: update configuration of many blocks on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
2025-12-29 07:22:41 +00:00
Samuel Pitoiset
e377060e5c ac/perfcounter: rework computing the number of block instances on GFX12
This needs to be generalized to older generations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
2025-12-29 07:22:41 +00:00
Samuel Pitoiset
a90b913817 ac/perfcounter: fix the number of static instances for some blocks on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
2025-12-29 07:22:41 +00:00
Samuel Pitoiset
a62ca19010 ac/perfcounter: update the number of events for GRBME_SE on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
2025-12-29 07:22:40 +00:00
Samuel Pitoiset
3317ea5122 ac/perfcounter: define a distribution mode for all perf blocks on GFX12
This will be used to compute the number of instances and more stuff.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
2025-12-29 07:22:40 +00:00
Samuel Pitoiset
5de9390d4c ac/perfcounter: move configuration for GFX12 in a separate file
Performance counters are too different between generations and it's
less error prone to define them separately for each generations.

I'm starting with GFX12 first.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39083>
2025-12-29 07:22:39 +00:00
Samuel Pitoiset
b3c983b8dd amd,radv,radeonsi: add a new function to update windowed perf counters
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39065>
2025-12-24 07:20:01 +00:00
Samuel Pitoiset
47366527ce radv: fix capturing performance counters with SPM
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14333
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39065>
2025-12-24 07:20:01 +00:00
Samuel Pitoiset
e03461f3bd radv: change the default value of RADV_TRACE_CACHE_COUNTERS on < GFX10
To not print a warning about missing SPM by default on < GFX10.
Also move the function to radv_physical_device.c and make it non-static.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39065>
2025-12-24 07:20:01 +00:00
Timur Kristóf
450a6189de radv: Initialize transfer queue gang when needed
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Initialize gang CS on unsupported transfer operations.

Add a wait when:
- SDMA needs to wait for previous transfer operations on ACE
- ACE needs to wait for previous transfer operations on SDMA

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:59 +00:00
Timur Kristóf
cc5190829f radv: Declare some gang submit functions in radv private header.
They will be called from the transfer copy functions.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:59 +00:00
Timur Kristóf
b1938901d0 radv: Use SDMA fence packet when flushing gang semaphores
Add back the SDMA fence packet to radv_flush_gang_semaphore.
This was regressed by 9666bd1245.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:59 +00:00
Timur Kristóf
d71a05dffa radv: Implement gang semaphores for transfer queues.
We need to use gang semaphores in the following two scenarios:

1. Leader to follower semaphore:
Increment the leader to follower semaphore when the leader wants
to block the follower: a transfer operation on ACE needs to wait
for a previous operation on SDMA.

2. Follower to leader semaphore:
Increment the follower to leader semaphore when the follower wants
to block the leader: a transfer operation on SDMA needs to wait
for a previous operation on ACE.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:58 +00:00
Timur Kristóf
4d0975dc83 radv: Update comments for gang semaphores
Change the explanation to use "leader" and "follower" terminology.
Explain better how it is used with GFX/ACE and SDMA/ACE.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:58 +00:00
Timur Kristóf
65bf4e7dcd radv: Require gang submit and compute for transfer queues
RADV's transfer queue implementation will use compute for
the transfer operations that aren't supported by the SDMA,
so we'll need gang submissions for that.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:58 +00:00
Timur Kristóf
f481a5f887 radv: Add function to determine if SDMA supports an image.
The following are not supported by SDMA:
- Sparse images (aka. PRT) on older GPUs
- Multisampled images

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:58 +00:00
Timur Kristóf
f55771a17d radv: Bypass L2 for gang semaphore BO with SDMA/ACE
When the "gang leader" is SDMA, we need to ensure that the
gang semaphores BO is coherent between SDMA and CP.
To achieve this, we need bypass the L2 cache when either SDMA
or CP are connected to L2.

Suggested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39057>
2025-12-23 12:14:58 +00:00
Timur Kristóf
7dbabc6acc ac/nir/lower_taskmesh_io_to_mem: Use AC_TASK_DRAW_ENTRY_BYTES
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Replace draw_entry_bytes with AC_TASK_DRAW_ENTRY_BYTES.
This is 16 on all AMD HW that supports task/mesh shaders.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
2025-12-22 15:17:59 +00:00
Timur Kristóf
fc57fa4589 radv, radeonsi: Don't pass task ring info to mesh/task payload lowering
The pass now uses the ring descriptors to figure these out.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
2025-12-22 15:17:59 +00:00
Timur Kristóf
4d381c9136 ac/nir/lower_taskmesh_io_to_mem: Don't hardcode payload entry size in shaders
Currently the number of task payload entry size is hardcoded
in shaders as a constant. This isn't a good idea because it
makes the code inflexible, eg. doesn't allow us
to change the number of entries dynamically.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
2025-12-22 15:17:59 +00:00
Timur Kristóf
5348d953aa ac/nir/lower_taskmesh_io_to_mem: Don't hardcode num_entries in shaders
Currently the number of task shader ring entries is hardcoded
in shaders as a constant. This isn't a good idea because it
makes the code inflexible, eg. prevents us from using the same
shader binary accross some chips as well as doesn't allow us
to change the number of entries dynamically.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>
2025-12-22 15:17:58 +00:00
Samuel Pitoiset
3b18fa348e ac/rgp: enable new performance counters for RGP 2.6 on GFX10-GFX11
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
GFX12 needs more work and it will be added separately.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:52:14 +01:00
Samuel Pitoiset
8bc37d0d19 ac/spm: add support for Ray Tracing counters in RGP
These aren't new in RGP 2.6, they have been added since a while. But
because RADV wasn't supporting the new derived SPM chunk it wasn't
possible to expose them.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:51:44 +01:00
Samuel Pitoiset
0b5ae0758e ac/spm: add support for new Memory percentage counters in RGP 2.6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:51:14 +01:00
Samuel Pitoiset
3d2bb52a81 ac/spm: add support for new Memory bytes counters in RGP 2.6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:50:44 +01:00
Samuel Pitoiset
84ecdc534c ac/spm: add support for new LDS counters in RGP 2.6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:50:41 +01:00
Samuel Pitoiset
07d9fc574c ac/spm: implement the new derived SPM chunk for performance counters
This is the new method to add performance counters to RGP captures.
This will be used to add the new RGP 2.6 counters too.

The previous SPM code will be deprecated at some point but it's hard
to support all generations in one batch. So, I will implement this
step by step.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:59 +01:00
Samuel Pitoiset
3e4d629458 ac/spm: add an ID to raw performance counters
This will be used to compute derived values for the new RGP/SPM chunk.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:29 +01:00
Samuel Pitoiset
21ad7e4e32 ac/spm: print an error message when a group is unknown
Help debugging.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:21 +01:00
Samuel Pitoiset
7da6fe6a00 ac/spm: fix programming more than one counter slot
Some blocks have two or more SPM counters and they should be used when
more than 4 counters are programmed (ie. 16-bit per counter).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:14 +01:00
Samuel Pitoiset
e5a041ee1c ac/spm: add an assertion to check the number of global instances
To make sure counters aren't silently discarded.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:48:06 +01:00
Samuel Pitoiset
eca9c00430 ac/spm: adjust configuration of some GPU blocks
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:58 +01:00
Samuel Pitoiset
6613dfb234 ac/perfcounter: add GCEA block description on GFX10-11
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:29 +01:00
Samuel Pitoiset
25e28819bd ac/perfcounter: adjust the number of events for TD on GFX10.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:21 +01:00
Samuel Pitoiset
a4cb114f5a ac/perfcounter: add a separate group for GFX10.3
This is just a copy&paste but GFX10.3 has way more counters than GFX10
that will be added later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39013>
2025-12-22 09:47:09 +01:00
Samuel Pitoiset
044e7f6017 radv/nir: fix front_face opts for points/lines and unknown prim
Fixes new VKCTS coverage dEQP-VK.glsl.builtin_var.frontfacing.*.

Fixes: af375c6756 ("radv: Optimize fs builtins using static gfx state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39041>
2025-12-22 07:59:30 +00:00
Daniel Schürmann
7b1f6fa6fc aco: remove radeon_family from aco::Program
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:48 +00:00
Daniel Schürmann
1e8d367537 amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:48 +00:00
Daniel Schürmann
febc29907c amd: add and use ac_cu_info::has_gfx6_mrt_export_bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
7b7bdb76ab amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
cfb745592d amd: add ac_cu_info::has_mad32 flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
1e3db50170 aco: use additional flags from ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f7c4aa48a0 ac/gpu_info: add some more flags to ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f791e46c47 aco: add ac_cu_info to aco_compiler_options
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
addd4ea59f aco: pass aco_compiler_options to init_program()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
bf9bec07c2 aco/tests: don't pass CHIP_UNKNOWN to ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
6f4e8046b5 ac/gpu_info: create separate function ac_fill_cu_info() to fill out CU info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:45 +00:00
Daniel Schürmann
749c619c45 ac/gpu_info: correct some SGPR and VGPR allocation values in ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:45 +00:00