SDMA doesn't support MSAA, so RADV falls back to the compute queue in
this case, which makes it possible to get there.
Some tests only pass because RDNA2 and older don't support image
stores with depth/stencil and MSAA.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41492>
When a VRS view is used with a depth/stencil view, the driver is
expected to copy the VRS rates to the HTILE buffer of the depth/stencil
view. However, if the image uses mipmaps and the base level can't
support HTILE, there is no way to copy the rates. The workaround is to
force VRS to 1x1, which is valid in Vulkan.
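A minimal sketch of that decision, assuming simplified, hypothetical
types (ds_image_info, vrs_rate and select_vrs_rate are illustrative
names, not RADV's actual structures or functions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified description of a depth/stencil image. */
struct ds_image_info {
   uint32_t level_count;      /* number of mip levels */
   bool base_level_has_htile; /* whether level 0 can use HTILE */
};

struct vrs_rate {
   uint32_t width;
   uint32_t height;
};

/* Returns the requested rate when the rates can be copied to HTILE,
 * and 1x1 when the image is mipmapped and its base level has no HTILE. */
static struct vrs_rate
select_vrs_rate(const struct ds_image_info *img, struct vrs_rate requested)
{
   const struct vrs_rate rate_1x1 = {1, 1};

   assert(img->level_count >= 1);

   if (img->level_count > 1 && !img->base_level_has_htile)
      return rate_1x1;

   return requested;
}
```

The sketch follows the description literally; how the driver handles
other cases without HTILE is not covered here.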
This fixes old VKCTS failures on RAPHAEL, which only show up there
because it supports fragmentShadingRateWithShaderDepthStencilWrites,
unlike the other GPUs in CI (NAVI21/VANGOGH).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41427>
It's required with VK_KHR_maintenance11. This allows way more transfer
queue related CTS tests to run and all issues I found should already
be fixed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41316>
The Vulkan spec says:

    "If a fragment shader entry point statically uses an input variable
     decorated with a BuiltIn of SampleId or SamplePosition, sample
     shading is enabled and a value of 1.0 is used instead of
     minSampleShading. If a fragment shader entry point statically uses
     an input variable decorated with Sample, sample shading may be
     enabled and a value of 1.0 will be used instead of minSampleShading
     if it is."

This means we have to overwrite the command buffer state entirely.
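A minimal sketch of the rule quoted from the spec, assuming hypothetical
types (fs_info, sample_shading_state and resolve_sample_shading are
illustrative names, not the driver's). It also assumes the
implementation chooses to enable sample shading in the "may be enabled"
case:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the fragment shader statically uses. */
struct fs_info {
   bool uses_sample_id_or_pos; /* SampleId or SamplePosition built-in */
   bool uses_sample_qualifier; /* input variable decorated with Sample */
};

struct sample_shading_state {
   bool enable;
   float min_sample_shading;
};

/* Returns the state to program: the shader can override both the enable
 * bit and the minSampleShading value that came from the API. */
static struct sample_shading_state
resolve_sample_shading(const struct fs_info *fs,
                       struct sample_shading_state api)
{
   const struct sample_shading_state forced = {true, 1.0f};

   assert(api.min_sample_shading >= 0.0f && api.min_sample_shading <= 1.0f);

   /* Assumption: the "may be enabled" case is treated as enabled. */
   if (fs->uses_sample_id_or_pos || fs->uses_sample_qualifier)
      return forced;

   return api;
}
```

Because both fields can change, overriding only minSampleShading would
not be enough.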
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41482>
This shouldn't be necessary because SDMA can detile the image just fine;
only buffer->image and image->image need to fall back.
It just works on GFX10+ because RADV is using NBC views, and I think
it works on e.g. VEGA10 just by luck due to different
swizzles/alignments.
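A minimal sketch of which copy directions need the compute fallback for
emulated formats, assuming a hypothetical copy_kind enum and helper
(not RADV's actual code):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical copy directions on a transfer queue. */
enum copy_kind {
   COPY_BUFFER_TO_IMAGE,
   COPY_IMAGE_TO_BUFFER,
   COPY_IMAGE_TO_IMAGE,
};

/* Returns true when the copy should use the compute path instead of SDMA. */
static bool
emulated_copy_needs_compute(enum copy_kind kind, bool format_is_emulated)
{
   switch (kind) {
   case COPY_IMAGE_TO_BUFFER:
      /* SDMA can detile the source image, so no fallback is needed. */
      return false;
   case COPY_BUFFER_TO_IMAGE:
   case COPY_IMAGE_TO_IMAGE:
      return format_is_emulated;
   }

   assert(!"invalid copy kind");
   return false;
}
```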
Fixes: 3d803d7a2e ("radv: Use compute copy for emulated formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41384>
This can only happen with RADV_DEBUG=fullsync which literally flushes
all caches, but INV_ICACHE is invalid with RELEASE_MEM apparently.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41396>
This is useful to verify that some GPUs are compatible and that shader
binaries can be shared to avoid precompiling twice for SteamOS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41346>
sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f`
sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f`
sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f`
There are two kinds of "parent" in relation to a src/def:
- the instruction where the def or src's def is defined
- the instruction which the src is a part of and where the def is used
Clarify that the parent here is where the src's def is used, not where
it's defined.
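A minimal sketch of the two meanings, assuming a simplified,
hypothetical IR (the struct layouts and the helpers src_def_instr and
src_use_instr below are illustrative and do not mirror NIR's real data
structures):

```c
#include <assert.h>
#include <stddef.h>

struct instr;

/* A value, together with the instruction that produces it. */
struct def {
   struct instr *def_instr;
};

/* A reference to a value from inside an instruction. */
struct src {
   struct def *def;         /* the value being read */
   struct instr *use_instr; /* the instruction this src is a part of */
};

struct instr {
   const char *name;
   struct def def;
   struct src srcs[2];
};

/* First kind of "parent": where the src's def is defined. */
static struct instr *
src_def_instr(const struct src *s)
{
   assert(s->def != NULL);
   return s->def->def_instr;
}

/* Second kind of "parent": where the def is used. */
static struct instr *
src_use_instr(const struct src *s)
{
   return s->use_instr;
}
```

With an instruction b reading the result of an instruction a, the first
helper returns a and the second returns b for the same src.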
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>
addrlib has an extra optimization for memcpy with HIC, there are two
modes:
- blockMemcpy: chip-specific layout but better performance overall
- hybridMemcpy: chip-agnostic
Because matching UUIDs doesn't matter on desktop, use the block memcpy
by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41019>
The vertex input state can be NULL if rasterization is disabled with
dynamic vertex inputs.
The input assembly state can be NULL if rasterization is disabled
and both states are dynamic (primitive topology and primitive restart
enable).
This fixes a segfault with gpu-ratemeter vk_dyn.prim
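A minimal sketch of the kind of NULL handling this implies, assuming
hypothetical stand-ins for the Vulkan create-info structures (the types
and import_static_state are illustrative, not the driver's code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vertex input and input assembly state. */
struct vertex_input_state {
   uint32_t binding_count;
};

struct input_assembly_state {
   uint32_t topology;
   bool primitive_restart_enable;
};

struct pipeline_create_info {
   const struct vertex_input_state *vertex_input;     /* may be NULL */
   const struct input_assembly_state *input_assembly; /* may be NULL */
};

struct static_state {
   uint32_t binding_count;
   uint32_t topology;
   bool primitive_restart_enable;
};

/* Copies static state only from the structures that are present. */
static void
import_static_state(const struct pipeline_create_info *info,
                    struct static_state *out)
{
   assert(info != NULL && out != NULL);

   if (info->vertex_input)
      out->binding_count = info->vertex_input->binding_count;

   if (info->input_assembly) {
      out->topology = info->input_assembly->topology;
      out->primitive_restart_enable =
         info->input_assembly->primitive_restart_enable;
   }
}
```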
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41335>
This has a bit of sorting overhead, but can significantly increase BVH
quality, especially in big BVHs. gfx12 is faster at intersecting, so
only enable it for gfx11 and earlier right now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41300>
This property is unrelated to the CTS conformance process from Khronos;
it just means that the driver passes that CTS version, even if not
"officially" conformant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41258>
nir_build_frag_coord generates the correct sysval loads based on NIR
options. nir_load_frag_coord shouldn't be used directly because drivers
don't have to support it.
v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
We used this for two different purposes with different caching
requirements:
- it's always needed for LLVM, and needs to be part of the cache key
- it's needed for disassembly with ACO, and shouldn't be part of the cache
key
Eventually, we'll want the family to only be part of the cache key if LLVM
is used, but still accessible for when ACO needs the disassembler.
If we put it in radv_compiler_info::debug, we'll need to treat that
specially to hash it into the key when LLVM is used.
If we put it in radv_compiler_info::key, that will hash it into the key
unnecessarily if ACO is used and disassembly might be needed.
So just put the family in both, and use debug::family for disassembly and
key::family for LLVM.
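A minimal sketch of the "family in both" layout, assuming hypothetical
types and a simple FNV-1a hash as a stand-in for the real cache key
hashing. It illustrates the eventual goal where key::family is only
meaningful with LLVM; FAMILY_UNKNOWN, compiler_info_init and
hash_compiler_key are illustrative names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAMILY_UNKNOWN 0u /* hypothetical "not set" value */

struct compiler_key {
   uint32_t family;   /* hashed; only set when LLVM is used */
   uint32_t use_llvm;
};

struct compiler_debug {
   uint32_t family;   /* never hashed; used for disassembly */
};

struct compiler_info {
   struct compiler_key key;
   struct compiler_debug debug;
};

static struct compiler_info
compiler_info_init(uint32_t family, bool use_llvm)
{
   struct compiler_info info;

   assert(family != FAMILY_UNKNOWN);

   memset(&info, 0, sizeof(info));
   info.key.use_llvm = use_llvm ? 1u : 0u;
   info.key.family = use_llvm ? family : FAMILY_UNKNOWN;
   info.debug.family = family;
   return info;
}

/* FNV-1a over the key bytes only; the debug part is deliberately skipped. */
static uint32_t
hash_compiler_key(const struct compiler_key *key)
{
   const unsigned char *bytes = (const unsigned char *)key;
   uint32_t h = 2166136261u;

   for (size_t i = 0; i < sizeof(*key); i++) {
      h ^= bytes[i];
      h *= 16777619u;
   }
   return h;
}
```

In this sketch, two ACO configurations for different families hash to
the same key while each still knows its family for disassembly.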
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41261>
This just separates tex coord lowering into a new pass.
The gfx_level parameter is now unused in ac_nir_lower_image_tex, but I'm
keeping it because it will be used in the future.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41173>
The expand only considered the first sample; this is a very old bug.
This fixes test_{copy,compute}_queue_depth_stencil_msaa from
vkd3d-proton on GFX11-GFX11.7 GPUs. Older GPUs don't support image
stores with depth/stencil MSAA images.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41267>