GFX7 CP requires the indirect dispatch VA to be aligned to 32-bytes.
This fixes dEQP-VK.api.command_buffers.many_indirect_disps_on_secondary,
but it's unexpected that it uncovered this bug.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27148>
Since Vulkan 1.3.271, the spec allowed vkCmdBeginTransformFeedbackEXT
to be called without an active graphics pipeline bound when using
shader objects.
That means that the last VGT shader would be NULL once VKCTS is
updated accordingly. This change delays emitting streamout enable at
draw time to make sure the last VGT shader is present, regarldess if
ESO is enabled or not.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27075>
Because we use unaligned dispatches, 1D launches only use 8 threads per
wave. Converting to 2D and fixing up launch IDs in the prolog
significantly increases occupancy.
Gives ~30% uplift in Ghostwire Tokyo.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26105>
Implementations are supposed to do that per the Vulkan spec.
This fixes the following new VKCTS tests
dEQP-VK.pipeline.*.stencil.no_stencil_att.*
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26847>
With dynamic rendering, it's allowed to begin rendering with depth or
stencil only but still with a depth/stencil format. The test below
checks that unbound part of ds isn't modified, if depth is bound and
stencil not and vice versa.
This fixes a recent CTS
dEQP-VK.dynamic_rendering.primary_cmd_buff.basic.partial_binding_depth_stencil.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25350>
Now that PS epilogs are always compiled during cmdbuf recording, we
have all information to enable MRT compaction, for optimal performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26667>
TC-compat CMASK means Fmask decompression isn't needed because the hw
can read it directly from shaders, so this shouldn't have any effects.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26575>
The current flush flags in RADV don't really match the SDMA HW,
so just always emit a NOP packet, for now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26580>
Instead, the retile will be executed on another queue type
when the image is transitioned to another queue.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
This isn't supposed to work nor does it match radeonsi but setting
AUTO_RESET_CNTL=0 by default for GFX9 chips is what gets it passing
linestrip CTS tests:
dEQP-VK.rasterization.primitives.dynamic_stipple.bresenham_line_strip
dEQP-VK.rasterization.primitives.dynamic_stipple_and_topology.bresenham_line_strip
dEQP-VK.rasterization.primitives.dynamic_stipple_and_topology.bresenham_line_strip_wide
dEQP-VK.rasterization.primitives.static_stipple.bresenham_line_strip
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24623>
pipeline_is_dirty was never TRUE because it's emitted in the before
helper. This might fix bad interactions between DGC and RT because
they both use compute shaders and descriptor bindings need to be
re-emitted.
Found by inspection.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26417>
On GFX10.3, VRS rates need to be copied to the HTILE buffer but in some
situations, like for mips, it's not always possible to enable HTILE.
In this case, we can fallback to our internal HTILE buffer and tweak
the depth/stencil registers to use this HTILE buffer.
This fixes a bunch of VRS crashes on GFX10.3.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26025>