This means we can actually implement varying subgroup size correctly.
It also means that we implement the implicit SPIR-V 1.6 full subgroups
requirement in compute shaders with cswave32/rtwave32.
In the future it will also allow more optimizations that use the subgroup size.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
The only somewhat complex case here is GFX10 geometry shaders, if gewave32 is
used. We then only know the subgroup size when is_ngg is decided, as legacy
GS doesn't support wave32.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
In case the source or destination were previously written
through L2, we need to writeback L2 to avoid the CP DMA accessing
stale data.
However, as the CP DMA doesn't write L2 either, an invalidation
is also needed to make sure other clients don't access stale data
when they read it through L2 after the CP DMA is complete.
Doing an invalidation before the CP DMA operation should take
care of both.
Additionally, radv_src_access_flush also invalidates L2 before
the copied data can be read.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36820>
VirtIO devices can be configured as platform devices instead of PCI,
which is especially common in microVM projects like libkrun.
Let's allow RADV to probe MMIO virtgpu devices.
Signed-off-by: Val Packett <val@invisiblethingslab.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37281>
The border color index must be captured and returned to the app,
otherwise it's broken if samplers aren't replayed in-order.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37291>
The full HiZ workaround is the only one that fixes the issue reliably.
Sadly, the performance results are mixed and sometimes it hurts. To
maintain performance, RADV will opt-in by selecting which workarounds
to apply from drirc.
As we can't know the full list of games that will be affected by a
potential performance regression, users are encouraged to try with
RADV_GFX12_HIZ_WA (see Mesa documentation for more explanations).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
This is actually wrong because if radv_handle_fbfetch_output() triggers
a decompression pass and graphics shaders (ESO) are saved/restored
they won't be updated because radv_bind_graphics_shaders() was called
before.
This fixes a very recent regression that I noticed while implementing
a new extension.
This reverts commit 9b912f00c7.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37194>
RADV_DYNAMIC_VERTEX_INPUT_BINDING_STRIDE doesn't emit any context
registers, so it can be excluded for the late scissor workaround to
avoid re-emitting scissors all the time it's dirty.
This fixes a performance regression noticed with Cyberpunk on Vega10,
but other games are likely affected too. The late scissor workaround is
only applied on Raven/Vega10.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13828
Fixes: d7f401c2bb ("radv: bind the vertex binding strides like a normal dynamic state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37252>
If we never get rate control info, we would treat it as rate control
disabled and use QP value from slice info which is always 0 for default
rate control.
Cc: mesa-stable
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37156>
It alwways comes in through the create flags now.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
MEC probably doesn't support EVENT_WRITE_EOP.
Both PAL and RadeonSI use RELEASE_MEM.
RADV used RELEASE_MEM too but "is_gfx8_mec" was very misleading.
This commit just cleans that up. No functional changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>
EOS events are buggy on GFX7 and can cause hangs when used
together in the same IB with CP DMA packets that use L2.
While we don't use the L2 for CP DMA copies, we still use it
with CP DMA prefetches, so the issue needs to be mitigated.
As a mitigation, avoid using EVENT_WRITE_EOS and prefer to use
the BOTTOM_OF_PIPE event instead of PS_DONE/CS_DONE, which should
be close enough.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>
GFX6 actually supports IB2, but doesn't support chaining between
chunks inside the IB2. See WaCpIb2ChainingUnsupported in PAL.
Disable IB2 on GFX6 for now.
The proper fix will be to disable use_ib in just secondary
command buffers on GFX6 and emit multiple IB2 packets in the
main command buffer. This will be implemented later.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>
After a refactor last year, the noibs option stopped working
because it hits an assertion when empty IBs are submitted.
Emit a single large NOP packet to avoid submitting empty IBs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37121>