Commit graph

2796 commits

Author SHA1 Message Date
Marek Olšák
89db355cc4 ac/llvm: use LLVM processor gfx942 for GFX940 when it's available
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
163222abd0 ac/nir: set .image_dim and .image_array for all opcodes
for consistency

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Marek Olšák
14b576e023 ac: make sure VEGA20 and MI200 version ranges don't overlap with other chips
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31187>
2024-09-27 19:21:55 +00:00
Georg Lehmann
151cd9c92b ac/lower_ngg: use is_subgroup_invocation_lt_amd offset
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
David Rosca
1459193b99 ac: Add VCN IB parser
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31275>
2024-09-23 19:25:08 +00:00
David Rosca
72ae8e25a8 ac: Add remaining VCN encode defines
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
David Rosca
aed89d28d3 ac: Add ac_vcn_init_enc_cmds
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
David Rosca
8ecad47695 ac: Fix typo RENCDOE -> RENCODE
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
David Rosca
d6cf36b4d2 radeonsi/vcn: Add rc_per_pic_ex encode command
This makes it a bit cleaner as VCN5 goes back to using base rc_per_pic.

Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31113>
2024-09-20 06:58:29 +00:00
Georg Lehmann
2789cee0c0 amd/nir: add ac_nir_opt_shared_append
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>
2024-09-19 16:21:47 +00:00
Marek Olšák
0d8fe2d03b ac/nir/meta: tune clear/copy_buffer performance for gfx6-10.3
Finally, old GPUs have optimal clear/copy_buffer performance, but only
the top dGPU of each generation gets the best behavior.

Other dGPUs might need slightly different conditions.
APUs likely need very different conditions.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31082>
2024-09-17 15:19:32 -04:00
Ganesh Belgur Ramachandra
62592674e0 amd: fix incorrect PIPE_INTERLEAVE_BYTES size for CDNA chips
The expected PIPE_INTERLEAVE_BYTES size is ADDR_PIPEINTERLEAVE_256B on
gfx940 (or other CDNA based chips). Since CDNA based chips like gfx940
doesn't support image opcodes, it gets gibberish value from the kernel.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30891>
2024-09-16 09:31:49 +00:00
Marek Olšák
1537b9355a ac,radeonsi: update comments related to the L2 cache, use "L2", not "TC"
"GL2" is also OK. "TC-compatible" is also OK.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30869>
2024-09-07 01:51:23 +00:00
Marek Olšák
1b94137039 ac/nir/meta: move the "skip compute if no DCC image stores" condition to common
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30869>
2024-09-07 01:51:23 +00:00
Marek Olšák
5250128c6a ac: fix WAVES_PER_SH value for gfx12
not a serious issue because we only use it for PRIME without SDMA IIRC

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30869>
2024-09-07 01:51:23 +00:00
Timur Kristóf
79df320463 ac/nir: Move varying cost functions from radeonsi to common code.
This code will be shared between RADV and RadeonSI.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28676>
2024-09-06 09:16:21 +00:00
Timur Kristóf
4d5bc893b4 ac/nir/tess: Remove no_inputs_in_lds.
When there are no VS outputs, we expect that the drivers set
the LS-HS vertex stride to zero, which will produce the
same result as no_inputs_in_lds did.

Remove the unnecessary code path from the output lowering.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
2024-09-05 19:54:29 +00:00
Marek Olšák
52c41f25de ac/nir/tess: don't allocate LDS for HS inputs that are passed via VGPRs
Right now we don't allocate LDS for HS inputs when all HS inputs are passed
via VGPRs.

This changes it to skip allocating exactly the HS inputs passed via VGPRs
by reducing the inputs_read mask to remove holes.

radeonsi changes to the LDS allocation will be in a different MR.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
2024-09-05 19:54:29 +00:00
Qiang Yu
588a65f29a ac: do not lower some ops in nir_lower_packing
AMD does not implement nir_op_pack_32_4x8_split, others
are implemented, so don't lower them.

Fixes: 0f937426cc ("radeonsi: lower subgroup ops after wave size is known")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11781
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30885>
2024-08-30 05:46:51 +00:00
Yinjie Yao
7f1c0fbe61 radeonsi/vcn: Rename transform_skip_disabled and remove hardcoded value for VCN5
This fix the HEVC encode corruption caused by mismatch between PPS
header and IB setting, the fix only apply for VCN5.
Rename from transform_skip_dicarded to transform_skip_disabled.

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30930>
2024-08-30 01:17:22 +00:00
Samuel Pitoiset
2fda0db66f ac,radeonsi,radv: add common GFX preambles
RADV and RadeonSI have a few differences.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Samuel Pitoiset
80e8e18cc6 ac: add ac_gfx103_get_cu_mask_ps()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Benjamin Cheng
95a980b61f radv/video: add event support for VCN4
This was the main missing piece for passing vulkan video CTS
as the video firmwares couldn't do proper vulkan events.

With new enough firmware this is now possible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Qiang Yu
58e412014a ac,radv,radeonsi: stop using quad vote any/all when llvm
ClustedAnd with bool argument and cluster_size==4 will be lowered
to quad_vote_all. So does ALU nir_iand/ior op with bool src.

OpenGL and Vulkan subgroup clustered_and tests with bool argument
fail when using LLVM. It seems LLVM has bug when quad vote bool
is in complex control flow. So stop using it for now.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:15 +08:00
David Rosca
6e2ae9c581 radeonsi/vcn: Use pipe header params in H264 header encoder
This now supports writing all fields as we get them on input from
packed headers.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 10:00:02 +02:00
David Rosca
af849516f0 radeonsi/vcn: Use pipe header params in HEVC header encoder
This now supports writing all fields as we get them on input from
packed headers.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 10:00:02 +02:00
David Rosca
32c6a61e2b radeonsi/vcn: Switch to app DPB management for H264 and HEVC encode
This removes the internal DPB management logic, which was unnecessary as
it was duplicating what applications already do, and it was also causing
issues when the internal DPB would de-sync from application DPB (eg.
driver removes reference that application still intends to use).

DPB is now dynamically resized instead of using fixed number of slots.
This also saves a lot of memory with HEVC encoding, as that was always
using the max_references which va frontend sets to 15.

Move reconstructed pictures to the end of the context and meta buffers
to ensure resizing works correctly.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 09:59:58 +02:00
Marek Olšák
665eae51ef amd: update addrlib
There are some changes in ac_surface.c to make this work.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30507>
2024-08-16 21:44:32 +00:00
Marek Olšák
07554d32db ac/nir: adjust gfx11 tuning for the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
db7823e8b9 ac/nir: adjust performance-related decisions for clear/copy_buffer shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Marek Olšák
361266fec7 ac/nir: import the clear/copy_buffer compute shader from radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30208>
2024-08-10 02:14:44 +00:00
Timur Kristóf
f317311bad ac/nir: Shorten the name of ac_nir_calc_io_offset_mapped.
The other variant of this function doesn't
exist anymore, so there is no ambiguity.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
c9b5ef0e53 ac/nir/tess: Simplify calculation of HS output LDS offset.
No functional change, just make the code more readable.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
d43466e917 ac/nir: Remove ac_nir_calc_io_offset function.
This function is not used anymore, because none of the callers
rely on driver locations (intrinsic base) anymore.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
7c009172e3 ac/nir/esgs: Map linked ES/GS I/O based on GS input mask.
With this change, ES/GS	linking	will not rely on driver	locations
anymore (driver locations are considered deprecated in NIR).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
d758bea8dd ac/nir/tess: Map linked LS/HS I/O based on TCS input mask.
With this change, LS/HS linking will not rely on driver locations
anymore (driver locations are considered deprecated in NIR).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
b162c7962f ac/nir: Add helper for I/O location mapping.
Map I/O locations based on a prefix sum (for linked shaders),
or based on the provided callback.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
ed6499db6b ac/nir/esgs: Don't emit ES outputs that aren't read by GS.
This is	necessary to prevent a regression from a future	commit,
in order to allow us to	map output locations differently.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
6d83389a39 ac/nir/esgs: Add gs_inputs_read to ES output lowering.
This commit just adds the field, it will be taken into use
in a following commit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
9aa5c38e8d ac/nir/tess: Don't emit VS outputs that aren't read by TCS.
This is necessary to prevent a regression from a future commit,
in order to allow us to map output locations differently.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Timur Kristóf
b5f53fdf32 ac/nir/tess: Add tcs_inputs_read to LS output lowering.
This commit just adds the field, it will be taken into use
in a following commit.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
2024-08-08 16:55:02 +00:00
Alyssa Rosenzweig
daa97bb41a amd: switch to derivative intrinsics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30565>
2024-08-08 15:26:07 +00:00
Marek Olšák
97d664b22f ac/surface/gfx12: turn off HiZ for pre-production samples
Fixes: f703dfd1bb - radeonsi: add gfx12

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30525>
2024-08-07 20:35:18 +00:00
Zan Dobersek
7fd5f76393 nir/lower_vars_to_scratch: calculate threshold-limited variable size separately
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.

nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
2024-08-07 14:32:28 +00:00
Marek Olšák
de83b5ef77 ac/surface/gfx12: fix setting tile_swizzle
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
2024-08-05 19:35:39 +00:00
yinjiyao
50ff1e4f86 radeonsi/vcn: add HDR sei in hevc enc
Enable HDR sei in hevc encoder.

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30388>
2024-07-29 21:48:04 +00:00
David Rosca
8b1a889e45 radeonsi/vcn: Add support for QVBR rate control mode
This rate control mode needs pre-encode enabled and currently is
supported for VCN3 and VCN4.
AV1 needs quality level scaled down from 255 to 51 range.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4024
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30221>
2024-07-24 20:18:49 +00:00
Ruijing Dong
8a5ef9413b radeonsi/vcn: add HDR metadata obu in av1enc
Enable HDR metadata obu in av1 encoder for
both vcn4/5.

Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30237>
2024-07-23 17:16:21 +00:00
Ruijing Dong
aa86c3a235 radeonsi/vcn: input av1 hdr metadata
get av1 hdr metadata from frontends.

Reviewed-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30237>
2024-07-23 17:16:21 +00:00
Marek Olšák
d90080b51b nir/opt_vectorize_io: optionally don't vectorize IO with different types
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11443

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
2024-07-23 16:13:17 +00:00