RADV needs to adjust this register for user sample locations because
it seems possible to have a sample on the -8 coordinate.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31815>
Unigine Heaven with AMD_DEBUG=mono has incorrect rendering on gfx11
because it doesn't set nir_io_semantics::dual_source_blend_index for
the second output, resulting in garbage asm.
Instead of trying to find out what's wrong, I decided to rewrite this
to make it the same as the LLVM IR path. It simplifies the code and fixes
Unigine Heaven with AMD_DEBUG=mono.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31669>
For surfaces without a modifier, the surf_size check wasn't
necessary, but it was also invalid since surf_size is set later
(in gfx12_compute_miptree).
Since it's not required anyway, drop this check.
Fixes: 060d5dacfd ("ac: add gfx12 DCC shared code")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31683>
It will be used to allow merging loads with a hole between them.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
The declaration and definition used by tests otherwise differs from
addrlib.
Found by LTO -Werror=lto-type-mismatch.
Fixes: 1d69c0419b ("amd/addrlib: prevent defining regparm differently")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31613>
Same as H264/HEVC, we still write sequence header ourselves
and slice header is sent to FW, everything else gets copied
directly to output bitstream buffer.
Fixes generating correct output with libva-utils/av1encode.
Also fixes temporal delimiter insertion, it's no longer forced
on every frame, but instead it lets application handle it.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31520>
Finally, old GPUs have optimal clear/copy_buffer performance, but only
the top dGPU of each generation gets the best behavior.
Other dGPUs might need slightly different conditions.
APUs likely need very different conditions.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31082>
The expected PIPE_INTERLEAVE_BYTES size is ADDR_PIPEINTERLEAVE_256B on
gfx940 (or other CDNA based chips). Since CDNA based chips like gfx940
doesn't support image opcodes, it gets gibberish value from the kernel.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30891>
This code will be shared between RADV and RadeonSI.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28676>
When there are no VS outputs, we expect that the drivers set
the LS-HS vertex stride to zero, which will produce the
same result as no_inputs_in_lds did.
Remove the unnecessary code path from the output lowering.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
Right now we don't allocate LDS for HS inputs when all HS inputs are passed
via VGPRs.
This changes it to skip allocating exactly the HS inputs passed via VGPRs
by reducing the inputs_read mask to remove holes.
radeonsi changes to the LDS allocation will be in a different MR.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>
This fix the HEVC encode corruption caused by mismatch between PPS
header and IB setting, the fix only apply for VCN5.
Rename from transform_skip_dicarded to transform_skip_disabled.
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30930>
This was the main missing piece for passing vulkan video CTS
as the video firmwares couldn't do proper vulkan events.
With new enough firmware this is now possible.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
ClustedAnd with bool argument and cluster_size==4 will be lowered
to quad_vote_all. So does ALU nir_iand/ior op with bool src.
OpenGL and Vulkan subgroup clustered_and tests with bool argument
fail when using LLVM. It seems LLVM has bug when quad vote bool
is in complex control flow. So stop using it for now.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
This removes the internal DPB management logic, which was unnecessary as
it was duplicating what applications already do, and it was also causing
issues when the internal DPB would de-sync from application DPB (eg.
driver removes reference that application still intends to use).
DPB is now dynamically resized instead of using fixed number of slots.
This also saves a lot of memory with HEVC encoding, as that was always
using the max_references which va frontend sets to 15.
Move reconstructed pictures to the end of the context and meta buffers
to ensure resizing works correctly.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
The other variant of this function doesn't
exist anymore, so there is no ambiguity.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
No functional change, just make the code more readable.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>
This function is not used anymore, because none of the callers
rely on driver locations (intrinsic base) anymore.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29812>