Commit graph

17349 commits

Author SHA1 Message Date
Rhys Perry
1bd5ae7b14 aco: refactor can_use_vopd so that it returns flags
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34246>
2025-04-17 14:00:29 +00:00
Rhys Perry
d4b418bbb9 aco: add are_src_banks_compatible helper for VOPD creation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34246>
2025-04-17 14:00:29 +00:00
Rhys Perry
4b0da5b51f aco: rename is_opy_only to can_be_opx
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34246>
2025-04-17 14:00:29 +00:00
Rhys Perry
408fa33c09 aco/gfx12: don't use second VALU for VOPD's OPX if there is a WaR
fossil-db (gfx1201):
Totals from 38908 (49.02% of 79377) affected shaders:
Instrs: 30268107 -> 30268131 (+0.00%); split: -0.00%, +0.00%
CodeSize: 180843648 -> 180843640 (-0.00%); split: -0.00%, +0.00%
Latency: 224905962 -> 224906072 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 44322988 -> 44323004 (+0.00%)
VALU: 15124145 -> 15124167 (+0.00%)
VOPD: 4018504 -> 4018482 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34246>
2025-04-17 14:00:29 +00:00
Samuel Pitoiset
209a0ede98 radv: add a function to emit meshlet registers on GFX11+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
836757bec3 radv: tidy up radv_emit_ps_epilog_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
dca35b7226 radv: tidy up radv_emit_geometry_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
d999afeb7a radv: tidy up radv_emit_vertex_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
85fdf69027 radv: simplify combining TES/VS+GS config registers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
0dd9833348 radv: remove redundant assertion when emitting PS epilog state
It's already checked by radv_emit_32bit_pointer().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
a230d2daa3 radv: use radeon_set_sh_reg() for only 1 DWORD
It's just shorter to write.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34518>
2025-04-17 12:49:47 +00:00
Samuel Pitoiset
11e8a96495 radv: use common scratch tmpring size programming
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
No logical changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>
2025-04-17 10:35:40 +00:00
Samuel Pitoiset
710d7ea8b8 radv: compute the optimal scratch wavesize
This might increase the scratch BO sizes but it's supposed to be
faster because scratch waves would be distributed among memory channels.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>
2025-04-17 10:35:40 +00:00
Samuel Pitoiset
e433a57650 ac,radeonsi: rework computing scratch wavesize and tmpring register
To be re-used by RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>
2025-04-17 10:35:40 +00:00
Samuel Pitoiset
d94f8b4460 ac/gpu_info,radv: add scratch_wavesize_granularity info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>
2025-04-17 10:35:40 +00:00
Daniel Stone
8d08cde667 ci/piglit: Use structured tagging for Piglit
Structured tagging (cf. mesa/mesa!33421) captures a checksum of the
thing we think we're building, and verifies this through the chain.

When we run container builds, we check that the tag we've captured in
the CI variables matches the calculated checksum, to make sure the
declared tags are consistent and we always have traceability.

When we run tests, we check the tags again between what was declared in
the CI variables and what we're actually running from the test
container. This makes sure that we're always testing what we think we're
testing.

As a side advantage, the rule inheritance we need to make this work
means that we can start doing more optional downloads via overlays,
instead of pulling a whole container full of stuff we might not ever
use.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34539>
2025-04-17 09:22:39 +00:00
Samuel Pitoiset
e616761fb2 radv: re-introduce the compute vs CP DMA heuristic for copy/fill operations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This caused a -5% performance regression in Control because using
compute always eats resources.

This new approach introduces a flag called RADV_COPY_FLAGS_DEVICE_LOCAL
which can be used to indicate if the underlying memory is device local.
This should also help for future work.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12639
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34556>
2025-04-17 08:59:58 +00:00
Samuel Pitoiset
5e2508e7c4 radv: simplify radv_fill_xxx() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34556>
2025-04-17 08:59:58 +00:00
Samuel Pitoiset
8ba94d8263 radv: add radv_fill_image() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34556>
2025-04-17 08:59:58 +00:00
Samuel Pitoiset
0fa43b5bfb radv: use radv_fill_memory() in the accel struct path
It's now possible to remove the NULL BO check.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34556>
2025-04-17 08:59:58 +00:00
Caio Oliveira
d5ad798140 spirv, radv, intel: Add NIR intrinsic for cmat conversion
A cooperative matrix conversion operation was represented in NIR by the
cmat_unary_op intrinsic with an nir_alu_op as extra parameter,
that was already lowered to a specific conversion operation
based on the matrix types.

Instead of that, add a new intrinsic `cmat_convert` that is specific
for that conversion.  In addition to the src/dst matrix descriptions
already available, also include the signedness information in the
intrinsic (reuse nir_cmat_signed for that).  This is needed because
different Convert operations define different interpretations for
integers, regardless their original type.

In this patch, both radv and intel were changed to use the same logic
that was previously used to pick the lowered ALU op.

This change will help represent cmat conversions involving BFloat16,
because it avoids having to create new NIR ALU ops for all the
combinations involving BFloat16.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34511>
2025-04-16 23:13:36 +00:00
Samuel Pitoiset
b4940255ed radv/sdma: add support for compression on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to previous generations that support compression, except that
the driver don't need to configure a meta VA because DCC is completely
transparent to the userspace.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
efa0b16bb2 radv/sdma: add a new flag to know if the surface is compressed
On GFX12, DCC is transparent to the driver and there is no meta VA.
Adding a new flag to know if the SDMA surface is compressed is needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
03671ccf9e radv/sdma: use the correct helper to get the number type field
This wasn't technically incorrect because V_028C70_BU_NUM_xxx values
are similar to V_028C70_NUMBER_xxx but it's better to use the corect
helper.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
b44dc98cde radv/sdma: remove redundant check for compression when getting metadata
It's already checked by the caller.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
d3d5d2fe86 radv/sdma: use SDMA5_DCC_xxx bitfields
It's cleaner.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
f44342199a radv/sdma: simplify configuring the number of uncompressed DCC blocks
SDMA doesn't support MSAA, so the value can be
V_028C78_MAX_BLOCK_SIZE_256B.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
2025-04-16 06:57:00 +00:00
Samuel Pitoiset
13db408e59 ac/perfcounter: add support for GFX12
Sourced from PAL to add SPM support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34524>
2025-04-16 06:35:33 +00:00
Samuel Pitoiset
c42d43e8eb radv: print more error messages during SPM initialization
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34524>
2025-04-16 06:35:33 +00:00
Marek Olšák
78cacfd9ce ac/surface: select 3D tile mode without overallocating too much for gfx6-8
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
195e7b4f75 ac/surface: make gfx12_estimate_size reusable by gfx6
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
2c122d478b ac/nir: set X=0 for task->mesh shader dispatch when Y or Z is 0
The code set X=0 when Y and Z is 0, not "or".

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
963147d7fd ac/gpu_info: add 256 to payload_entry_size to increase future task shader perf
It has no effect because num_entries is 1K, but the table shows a lot of
potential.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
d7c903f258 ac/gpu_info: add payload_entry_size into ac_task_info
to stop causing full RADV recompiles when it's changed.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
0dafd04695 ac/gpu_info: remove has_tmz_support function
It's not needed since:
    8b3056343f - ac/gpu_info: bump required DRM minor version to 3.42.0 (kernel 5.15+)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Marek Olšák
0be5a3559a ac/gpu_info: increase the attribute ring size for gfx12
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
2025-04-16 06:08:48 +00:00
Eric Engestrom
54bcfb4c1f ci/deqp: fix vulkan video build
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34532>
2025-04-15 17:23:05 +00:00
Samuel Pitoiset
e86e0fc525 radv: allocate the SPM BO in GTT for faster readback
Reading VRAM from CPU is very slow.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34467>
2025-04-15 06:30:38 +00:00
Samuel Pitoiset
8ea46b14fa ci: update VKCTS main to 76c1572eaba42d7ddd9bb8eb5788e52dd932068e
RADV is the only driver using VKCTS main.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34299>
2025-04-14 08:24:14 +00:00
Samuel Pitoiset
410f7f9f6e radv: only enable DCC for invisible VRAM on GFX12
DCC should only be allowed on invisible VRAM, otherwise the CPU could
read the data and it will read garbage if it's compressed.

This also caused GPU hangs after suspend/resume probably because
some buffers were compressed when moved back from GTT to VRAM.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12962
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12922
Fixes: 9af11bf306 ("radv: add initial DCC support on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34347>
2025-04-14 07:39:33 +00:00
Samuel Pitoiset
75be860eec radv: use paired context regs when optimal on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
CP is very slow on GFX12 and parsing the packet header is the main
bottleneck. Using paired context regs reduce the number of packet
headers and it should be more optimal.

It doesn't seem worth when only one context reg is emitted (one packet
header and same number of DWORDS) or when consecutive context regs are
emitted (would increase the number of DWORDS).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34421>
2025-04-14 06:18:13 +00:00
Samuel Pitoiset
f92f50c58a radv: add macros for paired context registers on GFX12
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34421>
2025-04-14 06:18:13 +00:00
Konstantin Seurer
676e26aed5 radv: Fix rayTracingPositionFetch with multiple geometies
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The fix adds more indirections to avoid increasing register pressure by
tracking the primitive address.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34460>
2025-04-11 22:26:08 +00:00
Timur Kristóf
371b1bf789 radv: Don't call nir_opt_varyings a second time when unnecessary.
When nir_opt_varyings doesn't make progress the first time,
it should not be necessary to call it a second time.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
403b3958c1 radv: Move preparation and fixup to separate loops in varying optimization.
This is to stop calling nir_shader_gather_info repeatedly for
some stages, and also as a pre-requisite to the work in the next commits.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
a98186bbf6 radv: Refactor loops in radv_graphics_shaders_link_varyings.
No functional changes, just improved code readability.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
1942227e73 radv: Inline radv_graphics_shaders_link_varyings_{first/second}.
The first step of reorganizing this code.

No Fossil DB changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
Timur Kristóf
412af41258 radv: Add radv_foreach_stage to ForEachMacros again.
This was lost when .clang-format was removed
from the amd folder.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
2025-04-11 18:01:47 +00:00
David Rosca
f1f87d302f radv/video: Always enable B pictures for H264 encode
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We always allocate the extra memory needed for B pictures, so there is
no reason not to also enable B pictures always.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
2025-04-11 11:15:47 +00:00
David Rosca
a1fbaddc9c radv/video: Use ac_vcn_enc_init_cmds
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
2025-04-11 11:15:47 +00:00