Because there is no way to know where the address has been allocated
(GTT or VRAM), the existing entrypoints aren't dropped and the sparse
bit is derived from VK_ADDRESS_COMMAND_FULLY_BOUND_BIT_KHR.
It would be nice to figure out if the CP DMA vs compute heuristic for
GTT BOs on dGPUs could be removed to simplify this implementation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40386>
Clearing on graphics updates HiZ correctly, and always expanding it
after the clear might hurt because it means HiZ will be disabled.
This probably helps performance with the full GFX12 HiZ WA.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40176>
All BOs allocated from vkAllocateMemory are either local BOs or added
to the global BO list. Only BOs allocated internally should be added
to the per-cmdbuf list.
Verified this by doing a full CTS run with amdgpu.debug=0x1.
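A minimal sketch of the rule above, assuming a hypothetical enum that
records how a BO is tracked (none of these names exist in RADV):

```c
#include <stdbool.h>

/* Hypothetical classification of how a BO is tracked. */
enum bo_tracking {
   BO_TRACKING_LOCAL,       /* local BO from vkAllocateMemory */
   BO_TRACKING_GLOBAL_LIST, /* vkAllocateMemory BO on the global BO list */
   BO_TRACKING_INTERNAL,    /* BO allocated internally by the driver */
};

/* Only internally allocated BOs need a per-cmdbuf list entry. */
static bool
bo_needs_cmdbuf_list_entry(enum bo_tracking tracking)
{
   return tracking == BO_TRACKING_INTERNAL;
}
```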
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
This was presumably removed by mistake during a rebase.
This fixes a regression with
dEQP-VK.pipeline.monolithic.multisample.m10_resolve.* on <= GFX8.
Fixes: 1746837a71 ("radv/meta: remove CB_RESOLVE")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40449>
CB_RESOLVE isn't very fast and we already have two different paths.
It has been removed in hw since GFX11, and PAL and RadeonSI removed
support for it too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39957>
No need to invalidate the VCACHE again (applications are supposed to
emit a barrier) and INV_SCACHE/INV_L2 are not necessary either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40342>
No reason to require a barrier either, because there is already one
before doing resolves, and decompressions should already be correctly
synchronized.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40135>
The source image layout must be either TRANSFER_SRC or GENERAL and the
application must emit the image layout transition. There is no reason
the source image wouldn't be readable by shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40135>
This barrier is only needed for rendering resolves (i.e. not for
vkCmdResolveImage()). These barriers are likely unnecessary, but
let's keep them for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40064>
There's no reason to have these checks be smeared between
radv_image_need_retile and radv_retile_transition.
Make radv_image_need_retile verify that the image might ever
need to have its displayable DCC updated.
Also, radv_image_need_retile should not care about the command
buffer. We should never try to do a retile transition on a
command buffer that can't do compute to begin with.
Make radv_retile_transition only check whether the layout
we're transitioning to might involve reading the displayable
DCC, and perform retiling if so.
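The split of responsibilities can be sketched roughly as follows. The
types, the layout enum and the choice of which layout reads displayable
DCC are simplified assumptions for illustration, not the real RADV
definitions:

```c
#include <stdbool.h>

/* Hypothetical, simplified image description. */
struct sketch_image {
   bool has_displayable_dcc; /* image may ever need displayable DCC updates */
};

/* Hypothetical layouts for this sketch. */
enum sketch_layout {
   SKETCH_LAYOUT_GENERAL,
   SKETCH_LAYOUT_COLOR_ATTACHMENT,
   SKETCH_LAYOUT_PRESENT,
};

/* Image-only check: does not look at the command buffer at all. */
static bool
sketch_image_need_retile(const struct sketch_image *image)
{
   return image->has_displayable_dcc;
}

/* Layout-only check. Assumption: only PRESENT reads displayable DCC here.
 * Returns true when a retile would be performed. */
static bool
sketch_retile_transition(enum sketch_layout dst_layout)
{
   return dst_layout == SKETCH_LAYOUT_PRESENT;
}

/* Caller: each helper answers exactly one question. */
static bool
sketch_handle_transition(const struct sketch_image *image,
                         enum sketch_layout dst_layout)
{
   if (!sketch_image_need_retile(image))
      return false;
   return sketch_retile_transition(dst_layout);
}
```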
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39990>
This is possible since VK_KHR_maintenance10.
This fixes new VKCTS coverage in
dEQP-VK.pipeline.*.multisample.m10_resolve.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39956>
While reworking image resolves completely in RADV, I found a very weird
bug where the only fix was to emit cache flushes immediately after
decompressing the source resolve image (after FMASK_DECOMPRESS).
I struggled with this for a few hours and figured out that it was
something related to context rolls (i.e. as long as the context was
rolled, emitting the flushes immediately was required).
It turns out this is a known hardware bug on GFX6 whose workaround is
implemented in PAL. PAL only applies it on GFX6, but GFX7-8 are also
affected based on my testing. Note that RadeonSI flushes CB_META too.
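As a rough sketch of the resulting condition, assuming a hypothetical
generation enum (the real driver uses its own gfx level type and flush
flags, which are not reproduced here):

```c
#include <stdbool.h>

/* Hypothetical GPU generation enum, ordered from oldest to newest. */
enum sketch_gfx_level {
   SKETCH_GFX6,
   SKETCH_GFX7,
   SKETCH_GFX8,
   SKETCH_GFX9,
   SKETCH_GFX10,
   SKETCH_GFX11,
};

/* Flush right after FMASK_DECOMPRESS on GFX6-8; PAL only does so on
 * GFX6, but GFX7-8 were found to be affected as well. */
static bool
sketch_flush_after_fmask_decompress(enum sketch_gfx_level level)
{
   return level <= SKETCH_GFX8;
}
```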
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39959>
The current approach of explicitly saving/restoring some states is
unnecessarily complicated and inefficient. For example, some meta OPs
that use memory fills/copies will have nested save/restores. This patch
is the first step towards avoiding unnecessary state re-emits around
meta OPs.
The changes are:
- Move radv_meta_saved_state to radv_cmd_buffer::state
- Add radv_meta_begin/end helpers that initialize radv_meta_saved_state
  and restore states used by the meta OP
- Remove all explicit saves/restores, use the new helpers
radv_meta_begin/end are called inside the entrypoint rather than in
some nested helper function, which means that state is only restored
once per meta OP.
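The pattern can be illustrated with a minimal sketch. All structures
and function names below are simplified, hypothetical stand-ins and not
the actual RADV code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical saved state, stored inside the command buffer state. */
struct sketch_meta_saved_state {
   bool active;       /* true between begin and end */
   uint32_t pipeline; /* user state captured at begin */
};

struct sketch_cmd_buffer {
   uint32_t pipeline;                    /* currently bound state */
   struct sketch_meta_saved_state saved; /* lives in the cmdbuf state */
   unsigned restore_count;               /* for illustration only */
};

/* Called once at the entrypoint: capture the user state. */
static void
sketch_meta_begin(struct sketch_cmd_buffer *cmd)
{
   cmd->saved.active = true;
   cmd->saved.pipeline = cmd->pipeline;
}

/* Called once at the entrypoint: restore the user state. */
static void
sketch_meta_end(struct sketch_cmd_buffer *cmd)
{
   cmd->pipeline = cmd->saved.pipeline;
   cmd->saved.active = false;
   cmd->restore_count++;
}

/* Nested helper: binds its own state but never saves or restores. */
static void
sketch_meta_fill(struct sketch_cmd_buffer *cmd)
{
   cmd->pipeline = 0xF111;
}

/* Entrypoint-level meta OP that uses the nested helper twice. */
static void
sketch_meta_clear_entrypoint(struct sketch_cmd_buffer *cmd)
{
   sketch_meta_begin(cmd);
   sketch_meta_fill(cmd);
   sketch_meta_fill(cmd);
   sketch_meta_end(cmd);
}
```

In this sketch, the two nested fills share a single begin/end pair, so
the user state is restored once rather than once per fill.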
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774>