ACO doesn't create a waitcnt for barriers between texture samples and
image stores because texture samples are supposed to use read-only
memory. It could also schedule the barrier to above the texture sample.
We also have use a larger memory scope to avoid an ACO optimization.
Tested on GFX8 with Sachsa Willems deferred sample. With some DCC
decompressions and the compute path forced.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 21.1 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9496>
Allocating a descriptor set is aligned to 32 bytes, so just like the
other buffer types, bump the descriptor size to 32 bytes when allocating
MUTABLE descriptor types from a pool.
Fixes: 86644b84b9 ("radv: Implement VK_VALVE_mutable_descriptor_type.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10132>
The resource indices we get point to descriptor map entries that include
all shader stages, so we need to size the arrays to account for more than
just one stage.
For now we only support up to 2 stages in a pipeline, so we use that.
Fixes: 002304482c ('v3dv: avoid redundant BO job additions for UBO/SSBO')
Fixes: fa170dab4c ('v3dv: avoid redundant BO job additions for textures and samplers')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10252>
There were various issues here:
- MAX_DYNAMIC_UNIFORM_BUFFERS was larger than MAX_UNIFORM_BUFFERS.
- In some cases we were exposing more than the minimums required.
While that is not incorrect, it is not following what we have
been doing in general.
- The Vulkan spec states that some of the MaxDescriptorSet limits
need to be multipled by 6 to include all shader stages, even
if the implementation doesn't support all shader stages.
Fixes: cbd299b051 ('v3dv/device: do not compute per-pipeline limits multiplying per-stage')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10252>
piglit_gl clocked in at 6:12 end-to-end runtime, and piglit_shader spent
2:53 in deqp-runner, so merging them together should be about 9 minutes.
Removing a boot should save us a minute or two of runner time per
pipeline.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10243>
Accept non-linear tiling for multi-planar formats on GFX9+, as long
as DCC is disabled. DCC support is possible in theory, but untested
for now.
GFX8 is still restricted to linear tiling because it's not yet clear
how modifiers should be handled on these chips for multi-planar
formats. Each plane may need a different modifier.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
Instead of having a special case for YUV formats in
si_query_dmabuf_modifiers, let ac_get_supported_modifiers handle
them. Keep setting external_only = 1 for YUV formats, since we
can only sample from such formats (we can't use them as render
targets).
This shouldn't change si_query_dmabuf_modifiers's behavior, because
for YUV formats ac_get_supported_modifiers will return a single
LINEAR modifier.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
util_format_get_blocksize asserts that the blocksize isn't zero.
However the blocksize will be zero if the format's channel encoding
is unspecified. The channel encoding is only meaningful for the
plain u_format layout, so util_format_get_blocksize can't be used
for formats with another layout. For example, YUV formats don't have
the channel encoding specified.
Use util_format_get_blocksizebits, which just returns zero without
an assertion for formats which don't have a channel encoding.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
After the pixel block width and height, a third field is used to
store the pixel block depth. Document this field.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
Set the swizzle mode when decoding.
Add a safe-guard to make sure the provided surface isn't DCC, because
we don't handle this situation.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
When the depth or stencil state changes dynamically, that might affect
LRZ state and we need to recalculate it and emit it again.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
Move them up, so they are initialized even when the dynamic state is
not used.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>
In these cases we know that the BO has not been added to the job
before, so we can skip the usual process for adding the BO where
we check if we had already added it before to avoid duplicates.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10210>
We are flushing tile cache more often than is necessary. In
unified cache mode, tile cache flushing is expensive, evicting all
depth/pixel data from the L3$. This is only need for a handful of
cases, such as: making cpu or gpu changes globally visible
(e.g. map), fast color clears, or slow depth clears. Tile cache
flushing is a gen12+ feature.
Remove blanket flushing of tile cache on all depth/RT flushes.
Replace with selective tile cache flushing.
Improves performance in several workloads:
AztecRuins.ogl-high-offscreen-1440p 1%
UnigineValley.ogl-g2 1%
Dota 2 (replay Jul 2020).ogl-g2 1%
Counter-Strike GO.ogl-g2 1%
Manhattan.ogl-Off-19x10 2%
CarChase.ogl-Off-19x10 1%
Bioshock Infinite.ogl-g2 1%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10217>
enable per-format querying of texture_filter_minmax support if the ARB extension
is enabled
also now return 0 if neither extension is supported
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10030>
this enables detection for the EXT vs the ARB extension, which have
different specifications regarding which formats must be supported
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10030>
This change is for supporting VK_ANDROID_native_buffer implementation,
and it does not advertise VK_KHR_external_memory_fd.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10195>
We are no longer limited to Vulkan 1.1 in VMs.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>
Add vn_renderer_info::has_implicit_fencing. Force vkQueueWaitIdle
during vkQueuePresentKHR when it is false.
This kills the performance, but we have to do this until the kernel does
implicit fencing.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>
Move away from vn_renderer_sync and toward a userspace-only solution
temporarily until the kernel does what we need.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>
Move away from vn_renderer_sync and toward a userspace-only solution
temporarily until the kernel does what we need.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>
Move away from vn_renderer_sync and toward a userspace-only solution
temporarily until the kernel does what we need.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>