This is to indicate when an argument was loaded from VMEM
and needs a waitcnt before it can be used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696>
This patch will add support for dpb resize when low to high resolution
change/ svc use-cases.
With DPB tier1 type,vp9 svc decoder use cases are failed. This
Change will fix this[VCN1/VCN2].
Signed-off-by: SureshGuttula <suresh.guttula@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21548>
This is required by register shadowing (required by the new PAIRS packets),
preemption, user queues, and we only have to wait for VS after streamout,
not PS. This is how gfx11 streamout should have been done.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21584>
Swizzle of 8-bit stencil format is defined as _x__ but the hw expects
BC_SWIZZLE_XYZW.
Fixes dEQP-VK.pipeline.monolithic.sampler.border_swizzle.*s8_uint*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21482>
Modifying pitch for all LINEAR surface isn't correct;
the original change that modified surf_pitch was only
intended for YUV textures.
This fixes vkGetImageSubresourceLayout rowPitch return value
for VK_FORMAT_BC3_UNORM_BLOCK + VK_IMAGE_TILING_LINEAR.
Fixes: fcc499d5 (ac/surface: adjust gfx9.pitch[*] based on surf->blk_w)
v2: add check for UYVY format (Pierre-Eric)
v3: move blk_w division to above if check (Pierre-Eric)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21595>
It's just v_fma with fixed DPP8 and builtin s_waitcnt_expcnt, so it can mostly
be handled as a pure VALU instruction.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
It's unnecessary and already checked elsewhere like in
vkCmdBindDescriptorSets(). This improves performance of vkoverhead
test #76 (descriptor_1image) by +18%. It's the same performance as
PRO on my Threadripper 1950X now. This should also slightly improve
texel and buffer descriptors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20699>
This is preliminary work for separate shader functions.
The ray_tracing_module is eventually intended as self-contained
pipeline struct per RT group.
For now, these modules only contain the group handles.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21667>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21390>
because the hw and LLVM only support subdword single-component SSBO loads,
and ac_nir_to_llvm splits multi-component loads because of that, which is
inefficient.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>
This fixes broken subdword UBO loads with LLVM.
It's only needed for LLVM, but it's done for both LLVM and ACO because
the pass can be fully validated only with ACO and the Vulkan CTS right now.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>
Instead of having ac_set_reg_cu_en that sets the register, replace it with
ac_apply_cu_en that only returns the modified register value,
which allows a large simplification in both drivers because a lot of code
becomes duplicated after it's switched to ac_apply_cu_en.
RADV also didn't apply it to a few registers. Fixed.
This removes 82 lines of code in total.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21641>