We can only use the entire HTILE buffer if TILE_STENCIL_DISABLE is
TRUE. On GFX8+, this is only true if the depth image has no stencil
and if it's not TC-compatible because of the ZRANGE_PRECISION issue.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
To make sure the stencil compare state is properly initialized and
cleared when the driver performs a fast depth clear.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039>
Not doing this for APUs because spilling is quite likely, due to
overall VRAM pressure.
Also adding a flag to disable for performance debugging.
Finally adds some memset for places where we depended on the memory
being initialized to zero, which we won't get with VRAM anymore.
(I think these places should stop depending on it since it hides
issues with executing the cmdbuffer multiple times, but this
preserves behavior)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979>
Fixes dEQP-VK.api.copy_and_blit.*.4_bit. I think the MSAA2x and
MSAA8x just passed by luck.
Fixes: 7b21ce401f ("radv: disable FMASK compression when drawing with GENERAL layout")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7915>
This can result in meaningful compression changes so we shouldn't skip.
Fixes: 66131ceb8b "radv: Pass through render loop detection to internal layout decisions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7004>
Only supported on GFX10.3+. Attachment Fragment Shading Rate is
for later.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7837>
It might be useful to know if the VA is valid and if other info
like the stride seems correct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7753>
This workaround fixes a hang while loading a renderdoc trace for me.
Since the workload does 1 mip per cmdbuffer it is quite hard to confirm
what exactly the conditions for the hang are but this is the most
restrictive set I found and it corresponds to a workaround in AMDVLK as
well.
CC: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7210>
It's expected to be 0.
Fixes: 62d9ca696e ("radv: use 32-bit predication for conditional rendering on GFX10.3+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7789>
It seems that only gfx queue doesn't support it, except on GFX10.3
which supports all queues.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7732>
The fce metadata can always be set to false as we don't care about
the compressed clear color.
Avoiding useless fast clear eliminates improves basemark performance by
1%-1.5%.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7005>
Not doing the EOP TS cacheflush event because that break wave counting
in RGP for some reason. But the rest looks to be all there.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6550>
Since the flushes really happen on the next draw delay the barrier
end to include the flushes.
This fixes the barrier duration in RGP.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6550>
From the Vulkan 1.2.154 spec:
"If pCounterBufferOffsets is NULL, then it is assumed the
offsets are zero."
Fix new CTS
dEQP-VK.transform_feedback.simple.backward_dependency_no_offset_array.
CC: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6798>
If radv_layout_can_fast_clear() is false, 028C70_COMPRESSION is unset when
the image is rendered to and CMASK isn't updated. This appears to cause
FMASK to be ignored and the 0th sample to always be used.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3449
Fixes: 7b21ce401f
('radv: disable FMASK compression when drawing with GENERAL layout')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6745>
Because the driver now waits for idle after every draw/dispatch
calls, we shouldn't report gfx pipelines when the GPU hang happens
after a dispatch (or the opposite).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6471>
It reduces traffic between CB, DB and TCP blocks if buffers
respect a certain alignment.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6482>
Fixes some dEQP-VK.renderpass2.* flakes. Valgrind:
Test case 'dEQP-VK.renderpass2.dedicated_allocation.attachment.8.724'..
==754520== Conditional jump or move depends on uninitialised value(s)
==754520== at 0x575B21C: radv_layout_is_htile_compressed (radv_image.c:1690)
==754520== by 0x572F470: radv_handle_depth_image_transition (radv_cmd_buffer.c:5855)
==754520== by 0x572F2F2: radv_handle_image_transition (radv_cmd_buffer.c:6123)
==754520== by 0x572EEC6: radv_handle_subpass_image_transition (radv_cmd_buffer.c:3385)
==754520== by 0x572A104: radv_cmd_buffer_begin_subpass (radv_cmd_buffer.c:4843)
==754520== by 0x572A007: radv_CmdBeginRenderPass (radv_cmd_buffer.c:4913)
==754520== by 0x572A197: radv_CmdBeginRenderPass2 (radv_cmd_buffer.c:4921)
Why false?
A renderloop happens when the same attachment is both used as input
attachment and output (color, ds) attachment in a subpass. Of course
this doesn't happen outside of a renderpass and hence we can initialize
it to false at the start of the renderpass.
Fixes: 66131ceb8b "radv: Pass through render loop detection to internal layout decisions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3074
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6068>
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978>
Out-of-order rasterization is disabled if a pipeline uses an
extended dynamic depth/stencil state because the driver doesn't
support enabling/disabling out-of-order dynamically.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5718>