Doing it at bind-time causes a 1.4% overhead (among all driver calls) in
Overwatch 2. !24502 mentions that it can be precomputed in case overhead
is a concern, so do it here.
max_waves is stored directly in the radv_shader struct, because
ac_shader_config conforms to LLVM ABI and we cannot add anything custom,
and radv_shader_info needs to be determined from NIR only.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26692>
This is for enabling ROI (region of interest)
feature in VAAPI interface. It will support
both CQP and rate control mode.
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26659>
We don't use these registers and since RDNA3 removed the explicit usage,
it is unlikely that we will properly support them in the future.
Removing the registers from the ACO IR prevents accidentally using them
without proper support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26664>
With dynamic rendering, it's allowed to begin rendering with depth or
stencil only but still with a depth/stencil format. The test below
checks that unbound part of ds isn't modified, if depth is bound and
stencil not and vice versa.
This fixes a recent CTS
dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.partial_binding_depth_stencil.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25350>
Now that PS epilogs are always compiled during cmdbuf recording, we
have all information to enable MRT compaction, for optimal performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26667>
This has no effects because key->spi_shader_col_format isn't used when
the graphics pipeline needs to compile a PS epilog.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26663>
The only major change is the code removal of the legacy BO list path
in the winsys, which required switching "debug_all_bos" to the new path.
Reviewed-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26547>
A BO can be always resident by two ways:
1. Through kernel bookkeeping. The BO is created with
AMDGPU_GEM_CREATE_VM_ALWAYS_VALID and bo->is_local gets set to true.
2. Through the driver global BO list. On every submission, the global
BO list is added to the CS's BO list.
Until now, use_global_list reflected either 1. or 2. . This commit
changes it to reflect 2. only, and update callsites that checks for
residency to use a new helper.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26591>
meshShaderQueries has been recently disabled because it causes random
GPU hangs in CI, I'm still investigating it. But let's clean the CI
lists to avoid any confusion, I will re-introduce them if needed but
this issue can also be reproduced without mesh shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26651>
This is a promotion from the EXT, except the new property
supportsNonZeroFirstInstance which should already be supported.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26595>
TC-compat CMASK means Fmask decompression isn't needed because the hw
can read it directly from shaders, so this shouldn't have any effects.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26575>
Crysis 2 and 3 Remastered's RT shaders non-uniformly index into SSBO
descriptor arrays without specifying the NonUniformEXT qualifier on the
relevant access chains/load ops. This leads to artifacts around objects.
To add insult to injury, the game fails to provide a meaningful
applicationName/engineName in the Vulkan part of the DX11-Vulkan interop
solution used for RT. Both of these fields are set to "nvpro-sample"
(perhaps the code has been copied from NVIDIA's sample applications).
Therefore, fall back to executable name matching.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9883
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26391>
These are hard limits and don't depend on wave size.
Accordingly, also update the usage in order to avoid
reporting unreasonable occupancy.
Totals from 192 (0.24% of 79330) affected shaders:
MaxWaves: 5814 -> 3072 (-47.16%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26521>
The current flush flags in RADV don't really match the SDMA HW,
so just always emit a NOP packet, for now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26580>
Instead, the retile will be executed on another queue type
when the image is transitioned to another queue.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
DCC and HTILE are only supported by SDMA on GFX10+ (unless disabled by a workaround).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
radv_init_metadata hits several assert failures when the image is
multi-planar. Make sure we use plane 0.
This change should make no difference in practice. Also, this is done
only to follow radeonsi. Since the opaque metadata is mainly for
validations and DCC, and we don't enable DCC for multi-planar images, we
probably don't need to call radv_query_opaque_metadata at all.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
Do not report DCC modifiers for multi-planar formats. We don't support
DCC for them and drmFormatModifierPlaneCount had incorrect values.
Fix vkGetImageSubresourceLayout for multi-planar images with modifiers.
In that case, memory planes and format planes are equivalent.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>