Commit graph

2812 commits

Author SHA1 Message Date
Samuel Pitoiset
f7652de1f1 Revert "ac/surface: add RADEON_SURF_VIEW_3D_AS_2D_ARRAY for GFX9+"
This reverts commit dc5ef90547.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869>
2024-10-28 12:47:38 +00:00
Samuel Pitoiset
aa19bf3d93 amd/descriptors: set fmask_tile_swizzle for TC-compat CMASK images on GFX8
This is required.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31797>
2024-10-28 08:21:12 +01:00
Samuel Pitoiset
927a17f30a amd: do not emit PA_SU_PRIM_FILTER_CNTL in the common GFX preamble
RADV needs to adjust this register for user sample locations because
it seems possible to have a sample on the -8 coordinate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31815>
2024-10-25 07:41:22 +00:00
Daniel Schürmann
87cb42f953 treewide: don't lower to LCSSA before calling nir_divergence_analysis()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
8d1abd4996 treewide: use nir_src_is_divergent() rather than checking the divergence of the SSA
Without LCSSA, divergence between src and def might differ.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c8348139fd nir: change signature of nir_src_is_divergent()
Now, it takes nir_src * instead of nir_src.
Also move the implementation to nir_divergence_analysis.c.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Marek Olšák
45d8cd037a ac/nir: rewrite ac_nir_lower_ps epilog to fix dual src blending with mono PS
Unigine Heaven with AMD_DEBUG=mono has incorrect rendering on gfx11
because it doesn't set nir_io_semantics::dual_source_blend_index for
the second output, resulting in garbage asm.

Instead of trying to find out what's wrong, I decided to rewrite this
to make it the same as the LLVM IR path. It simplifies the code and fixes
Unigine Heaven with AMD_DEBUG=mono.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31669>
2024-10-21 12:06:14 +00:00
Pierre-Eric Pelloux-Prayer
5607c7ee49 ac/surface: fix determination of gfx12_enable_dcc
For surfaces without a modifier, the surf_size check wasn't
necessary, but it was also invalid since surf_size is set later
(in gfx12_compute_miptree).

Since it's not required anyway, drop this check.

Fixes: 060d5dacfd ("ac: add gfx12 DCC shared code")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31683>
2024-10-18 14:04:04 +02:00
Georg Lehmann
cba575f4df nir: always emit ddx intrinsics
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Marek Olšák
02923e237d nir: add hole_size parameter into the vectorize callback
It will be used to allow merging loads with a hole between them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
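
A minimal sketch of the idea behind 02923e237d, assuming a hole is the number of bytes between the end of the lower load and the start of the higher one. The struct, the function names, and the max_hole threshold are hypothetical illustrations; they are not the NIR callback or its signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one load: byte offset and byte size. */
struct load_range {
   uint32_t offset;
   uint32_t size;
};

/* Bytes between the end of `low` and the start of `high`.
 * Zero means the loads are adjacent; negative means they overlap. */
static int64_t hole_size_between(struct load_range low, struct load_range high)
{
   return (int64_t)high.offset - ((int64_t)low.offset + (int64_t)low.size);
}

/* Hypothetical policy: allow merging when the hole is non-negative and
 * no larger than max_hole bytes (max_hole == 0 only merges adjacent loads). */
static bool should_merge(struct load_range low, struct load_range high,
                         int64_t max_hole)
{
   int64_t hole = hole_size_between(low, high);
   return hole >= 0 && hole <= max_hole;
}
```

With a load of 8 bytes at offset 0 and a load of 4 bytes at offset 12, the hole is 4 bytes, so this policy merges them only when max_hole is at least 4.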
David Heidelberg
d14d3c5bdd amd: Pass addrlib cpp args to the tests
The declaration and definition used by tests otherwise differ from
addrlib.
Found by LTO -Werror=lto-type-mismatch.

Fixes: 1d69c0419b ("amd/addrlib: prevent defining regparm differently")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31613>
2024-10-14 16:52:31 +00:00
David Rosca
1e1f078099 radeonsi/vcn: Add support for VCN5 AV1 compound
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31520>
2024-10-10 09:02:21 +00:00
David Rosca
8b2f0fb574 radeonsi/vcn: Support raw packed headers for AV1
Same as for H264/HEVC, we still write the sequence header ourselves
and the slice header is sent to FW; everything else gets copied
directly to the output bitstream buffer.
Fixes generating correct output with libva-utils/av1encode.
Also fixes temporal delimiter insertion: it's no longer forced
on every frame; instead, the application handles it.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31520>
2024-10-10 09:02:21 +00:00
David Rosca
813812b925 radeonsi/vcn: Switch to app DPB management for AV1
Also move the common part of the frame header into shared function.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31520>
2024-10-10 09:02:20 +00:00
Samuel Pitoiset
dc5ef90547 ac/surface: add RADEON_SURF_VIEW_3D_AS_2D_ARRAY for GFX9+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31385>
2024-10-01 08:33:51 +00:00
Marek Olšák
246051ebc6 ac/gpu_info: print 32bpp modifiers
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
89db355cc4 ac/llvm: use LLVM processor gfx942 for GFX940 when it's available
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
163222abd0 ac/nir: set .image_dim and .image_array for all opcodes
for consistency

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
14b576e023 ac: make sure VEGA20 and MI200 version ranges don't overlap with other chips
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Georg Lehmann
151cd9c92b ac/lower_ngg: use is_subgroup_invocation_lt_amd offset
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
David Rosca
1459193b99 ac: Add VCN IB parser
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31275>
2024-09-23 19:25:08 +00:00
David Rosca
72ae8e25a8 ac: Add remaining VCN encode defines
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
David Rosca
aed89d28d3 ac: Add ac_vcn_init_enc_cmds
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
David Rosca
8ecad47695 ac: Fix typo RENCDOE -> RENCODE
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
David Rosca
d6cf36b4d2 radeonsi/vcn: Add rc_per_pic_ex encode command
This makes it a bit cleaner as VCN5 goes back to using base rc_per_pic.

Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
Georg Lehmann
2789cee0c0 amd/nir: add ac_nir_opt_shared_append
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>
2024-09-19 16:21:47 +00:00
Marek Olšák
0d8fe2d03b ac/nir/meta: tune clear/copy_buffer performance for gfx6-10.3
Finally, old GPUs have optimal clear/copy_buffer performance, but only
the top dGPU of each generation gets the best behavior.

Other dGPUs might need slightly different conditions.
APUs likely need very different conditions.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31082>
2024-09-17 15:19:32 -04:00
Ganesh Belgur Ramachandra
62592674e0 amd: fix incorrect PIPE_INTERLEAVE_BYTES size for CDNA chips
The expected PIPE_INTERLEAVE_BYTES size is ADDR_PIPEINTERLEAVE_256B on
gfx940 (and other CDNA-based chips). Since CDNA-based chips like gfx940
don't support image opcodes, the driver gets a gibberish value from the kernel.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30891>
2024-09-16 09:31:49 +00:00
Marek Olšák
1537b9355a ac,radeonsi: update comments related to the L2 cache, use "L2", not "TC"
"GL2" is also OK. "TC-compatible" is also OK.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30869>
2024-09-07 01:51:23 +00:00
Marek Olšák
1b94137039 ac/nir/meta: move the "skip compute if no DCC image stores" condition to common
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30869>
2024-09-07 01:51:23 +00:00
Marek Olšák
5250128c6a ac: fix WAVES_PER_SH value for gfx12
Not a serious issue because we only use it for PRIME without SDMA, IIRC.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30869>
2024-09-07 01:51:23 +00:00
Timur Kristóf
79df320463 ac/nir: Move varying cost functions from radeonsi to common code.
This code will be shared between RADV and RadeonSI.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28676>
2024-09-06 09:16:21 +00:00
Timur Kristóf
4d5bc893b4 ac/nir/tess: Remove no_inputs_in_lds.
When there are no VS outputs, we expect that the drivers set
the LS-HS vertex stride to zero, which will produce the
same result as no_inputs_in_lds did.

Remove the unnecessary code path from the output lowering.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
2024-09-05 19:54:29 +00:00
Marek Olšák
52c41f25de ac/nir/tess: don't allocate LDS for HS inputs that are passed via VGPRs
Right now we don't allocate LDS for HS inputs when all HS inputs are passed
via VGPRs.

This changes it to skip allocating exactly the HS inputs passed via VGPRs
by reducing the inputs_read mask to remove holes.

radeonsi changes to the LDS allocation will be in a different MR.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
2024-09-05 19:54:29 +00:00
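
A minimal sketch of the mask compaction described in 52c41f25de, assuming each input slot is one bit in a 64-bit inputs_read mask. The helper names and the passed_via_vgprs mask are hypothetical and are not taken from the Mesa sources; the point is only that removing bits from the mask closes the holes in the LDS layout.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits in a 64-bit mask. */
static unsigned popcount64(uint64_t v)
{
   unsigned n = 0;
   while (v) {
      v &= v - 1; /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Hypothetical: drop the slots passed via VGPRs from the mask of inputs
 * that need LDS space. */
static uint64_t lds_input_mask(uint64_t inputs_read, uint64_t passed_via_vgprs)
{
   return inputs_read & ~passed_via_vgprs;
}

/* Hypothetical: compact LDS index of `slot`, i.e. the number of slots
 * still in the mask below it. The slot must be present in the mask. */
static unsigned lds_slot_index(uint64_t lds_mask, unsigned slot)
{
   assert(slot < 64 && (lds_mask & (UINT64_C(1) << slot)));
   return popcount64(lds_mask & ((UINT64_C(1) << slot) - 1));
}
```

For example, with slots 0, 1 and 3 read and slot 1 passed via VGPRs, only slots 0 and 3 need LDS space, and slot 3 lands at compact index 1 rather than 2.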
Qiang Yu
588a65f29a ac: do not lower some ops in nir_lower_packing
AMD does not implement nir_op_pack_32_4x8_split; the others
are implemented, so don't lower them.

Fixes: 0f937426cc ("radeonsi: lower subgroup ops after wave size is known")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11781
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30885>
2024-08-30 05:46:51 +00:00
Yinjie Yao
7f1c0fbe61 radeonsi/vcn: Rename transform_skip_disabled and remove hardcoded value for VCN5
This fixes the HEVC encode corruption caused by a mismatch between the PPS
header and the IB setting; the fix only applies to VCN5.
Rename transform_skip_dicarded to transform_skip_disabled.

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30930>
2024-08-30 01:17:22 +00:00
Samuel Pitoiset
2fda0db66f ac,radeonsi,radv: add common GFX preambles
RADV and RadeonSI have a few differences.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Samuel Pitoiset
80e8e18cc6 ac: add ac_gfx103_get_cu_mask_ps()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Benjamin Cheng
95a980b61f radv/video: add event support for VCN4
This was the main missing piece for passing the Vulkan video CTS,
as the video firmware couldn't do proper Vulkan events.

With new enough firmware this is now possible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Qiang Yu
58e412014a ac,radv,radeonsi: stop using quad vote any/all when llvm
Clustered AND with a bool argument and cluster_size==4 is lowered
to quad_vote_all. The same applies to ALU nir_iand/ior ops with a bool src.

OpenGL and Vulkan subgroup clustered_and tests with a bool argument
fail when using LLVM. It seems LLVM has a bug when a quad vote bool
is in complex control flow, so stop using it for now.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:15 +08:00
David Rosca
6e2ae9c581 radeonsi/vcn: Use pipe header params in H264 header encoder
This now supports writing all fields as we get them on input from
packed headers.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 10:00:02 +02:00
David Rosca
af849516f0 radeonsi/vcn: Use pipe header params in HEVC header encoder
This now supports writing all fields as we get them on input from
packed headers.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 10:00:02 +02:00
David Rosca
32c6a61e2b radeonsi/vcn: Switch to app DPB management for H264 and HEVC encode
This removes the internal DPB management logic, which was unnecessary as
it was duplicating what applications already do, and it was also causing
issues when the internal DPB would de-sync from the application DPB (e.g.
the driver removes a reference that the application still intends to use).

The DPB is now dynamically resized instead of using a fixed number of slots.
This also saves a lot of memory with HEVC encoding, as that was always
using max_references, which the VA frontend sets to 15.

Move reconstructed pictures to the end of the context and meta buffers
to ensure resizing works correctly.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 09:59:58 +02:00
Marek Olšák
665eae51ef amd: update addrlib
There are some changes in ac_surface.c to make this work.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30507>
2024-08-16 21:44:32 +00:00
Marek Olšák
07554d32db ac/nir: adjust gfx11 tuning for the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
db7823e8b9 ac/nir: adjust performance-related decisions for clear/copy_buffer shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
361266fec7 ac/nir: import the clear/copy_buffer compute shader from radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Timur Kristóf
f317311bad ac/nir: Shorten the name of ac_nir_calc_io_offset_mapped.
The other variant of this function doesn't
exist anymore, so there is no ambiguity.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
c9b5ef0e53 ac/nir/tess: Simplify calculation of HS output LDS offset.
No functional change, just make the code more readable.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
d43466e917 ac/nir: Remove ac_nir_calc_io_offset function.
This function is not used anymore, because none of the callers
rely on driver locations (intrinsic base) anymore.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00