Only GFX6-8 support for now because merged shaders make it harder to
implement but I have a bunch of local work for that.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26930>
Similarly to VS prologs and PS epilogs, this needs to be re-emitted
otherwise the config shader state can be overwritten.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26930>
GFX10 has a hw bug and it can't handle 0-sized index buffer. The
non-indirect draw path was fine but not the indirect path where RADV
emits the index buffer.
This fixes flakes with dEQP-VK.*maintenance6* on NAVI14, and possibly
GPU hangs if there is an indirect draw with a valid index buffer right
before because it would re-use the same index buffer.
Fixes: db9816fd66 ("radv: add support for NULL index buffer")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27142>
GFX7 CP requires the indirect dispatch VA to be aligned to 32-bytes.
This fixes dEQP-VK.api.command_buffers.many_indirect_disps_on_secondary,
but it's unexpected that it uncovered this bug.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27148>
Since Vulkan 1.3.271, the spec allowed vkCmdBeginTransformFeedbackEXT
to be called without an active graphics pipeline bound when using
shader objects.
That means that the last VGT shader would be NULL once VKCTS is
updated accordingly. This change delays emitting streamout enable at
draw time to make sure the last VGT shader is present, regarldess if
ESO is enabled or not.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27075>
Because we use unaligned dispatches, 1D launches only use 8 threads per
wave. Converting to 2D and fixing up launch IDs in the prolog
significantly increases occupancy.
Gives ~30% uplift in Ghostwire Tokyo.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26105>
Implementations are supposed to do that per the Vulkan spec.
This fixes the following new VKCTS tests
dEQP-VK.pipeline.*.stencil.no_stencil_att.*
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26847>
With dynamic rendering, it's allowed to begin rendering with depth or
stencil only but still with a depth/stencil format. The test below
checks that unbound part of ds isn't modified, if depth is bound and
stencil not and vice versa.
This fixes a recent CTS
dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.partial_binding_depth_stencil.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25350>
Now that PS epilogs are always compiled during cmdbuf recording, we
have all information to enable MRT compaction, for optimal performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26667>
TC-compat CMASK means Fmask decompression isn't needed because the hw
can read it directly from shaders, so this shouldn't have any effects.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26575>
The current flush flags in RADV don't really match the SDMA HW,
so just always emit a NOP packet, for now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26580>
Instead, the retile will be executed on another queue type
when the image is transitioned to another queue.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
This isn't supposed to work nor does it match radeonsi but setting
AUTO_RESET_CNTL=0 by default for GFX9 chips is what gets it passing
linestrip CTS tests:
dEQP-VK.rasterization.primitives.dynamic_stipple.bresenham_line_strip
dEQP-VK.rasterization.primitives.dynamic_stipple_and_topology.bresenham_line_strip
dEQP-VK.rasterization.primitives.dynamic_stipple_and_topology.bresenham_line_strip_wide
dEQP-VK.rasterization.primitives.static_stipple.bresenham_line_strip
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24623>