Cache flushes should be skipped on SDMA. In practice,
radv_emit_cache_flush() should only be called on GFX/ACE.
SDMA NOP packets are emitted in barriers directly.
This fixes recent VKCTS coverage
dEQP-VK.api.command_buffers.secondary_on_transfer_queue.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39964>
This should definitely be an OR operation if MRT0 and MRT1 don't write
the same channels. This also requires setting the writemask manually
because when it's 0 (in case a dual-source output is missing), the
intrinsic computes the mask itself with the number of components.
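
A minimal sketch of the mask combination described above. The helper name
is hypothetical and this is not the actual ac_nir_lower_ps code; it only
shows why OR is needed when the two outputs write disjoint channels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine the 4-bit channel masks written by MRT0 and
 * MRT1. A channel has to be exported if either output writes it. */
static inline uint8_t
sketch_combine_dual_src_writemask(uint8_t mrt0_mask, uint8_t mrt1_mask)
{
   /* Masks are assumed to cover at most the four xyzw channels. */
   assert(((mrt0_mask | mrt1_mask) & ~0xfu) == 0);
   return mrt0_mask | mrt1_mask;
}
```

With MRT0 writing xy (0x3) and MRT1 writing zw (0xc), OR yields 0xf, while
any intersection-style combination would yield 0 and drop every channel.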
No fossils-db changes on NAVI33.
Fixes: 45d8cd037a ("ac/nir: rewrite ac_nir_lower_ps epilog to fix dual src blending with mono PS")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14878
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39996>
There's no reason to have these checks be smeared between
radv_image_need_retile and radv_retile_transition.
Make radv_image_need_retile verify that the image might ever
need to have its displayable DCC updated.
Also, radv_image_need_retile should not care about the command
buffer. We should never try to do a retile transition on a
command buffer that can't do compute to begin with.
Make radv_retile_transition only check whether the layout
we're transitioning to might involve reading the displayable
DCC, and perform retiling if so.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39990>
This probably doesn't do anything because sgpr_read_by_valu are all set
already for raytracing shaders.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825>
If a set layout is missing the driver can't compute the dynamic buffer
start offsets correctly. The only solution is to load these offsets from
a user SGPR.
To avoid adding more complexity, these offsets are re-emitted every
time dynamic buffers are dirty. That shouldn't matter because the
combination of dynamic buffers and independent sets is just super rare.
This fixes new VKCTS coverage
dEQP-VK.pipeline.pipeline_library.graphics_library.independent_sets_random.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39988>
The Vulkan spec says:
"The contents of pRenderingInfo must match between suspended
render pass instances and the render pass instances that resume
them, other than the presence or absence of the
VK_RENDERING_RESUMING_BIT, VK_RENDERING_SUSPENDING_BIT, and
VK_RENDERING_CONTENTS_SECONDARY_COMMAND_BUFFERS_BIT flags. No
action or synchronization commands, or other render pass
instances, are allowed between suspending and resuming render
pass instances. All pairs of resuming and suspending render passes
must be submitted in the same batch."
So it should be safe to avoid re-emitting the rendering state because
nothing can blow it up.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40004>
This is possible since VK_KHR_maintenance10.
This fixes new VKCTS coverage in
dEQP-VK.pipeline.*.multisample.m10_resolve.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39956>
Since the stride is always 32 dwords, we need to treat the workgroup
size as multiples of that value. Using MAX2() only works for cases where
the workgroup size is less than 32, which was hit by some CTS tests with
1x1 workgroups.
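
To illustrate the difference, here is a small sketch comparing a
MAX2-style clamp with rounding up to a multiple of the stride. The macro
and function names are made up for this example; only the 32-dword stride
comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed name; the value is the 32-dword stride mentioned above. */
#define SKETCH_STRIDE_DWORDS 32u

/* MAX2-style clamp: matches the aligned value only while size <= 32. */
static inline uint32_t
sketch_wg_size_clamped(uint32_t wg_size)
{
   return wg_size > SKETCH_STRIDE_DWORDS ? wg_size : SKETCH_STRIDE_DWORDS;
}

/* Round the workgroup size up to the next multiple of the stride. */
static inline uint32_t
sketch_wg_size_aligned(uint32_t wg_size)
{
   assert(wg_size > 0);
   return (wg_size + SKETCH_STRIDE_DWORDS - 1) / SKETCH_STRIDE_DWORDS *
          SKETCH_STRIDE_DWORDS;
}
```

Both helpers agree for a 1x1 workgroup (32), but for a size of 40 the
clamp returns 40 whereas the aligned variant returns 64.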
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39981>
While reworking image resolves completely in RADV, I found a very weird
bug where the only fix was to emit cache flushes immediately after
decompressing the source resolve image (after FMASK_DECOMPRESS).
I struggled with this for a few hours and figured out that it was
related to context rolls (i.e. as long as the context was rolled,
emitting the flushes immediately was required).
It turns out this is a known hardware bug on GFX6 whose workaround is
implemented in PAL. PAL only applies it on GFX6, but GFX7-8 are also
affected based on my testing. Note that RadeonSI flushes CB_META too.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39959>
Fixes baldurs_gate_3/60c8b7ff623fbb18 with vega10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 310f588f92 ("aco/ra: move variables from affinity register to avoid waitcnt")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39986>
Might happen with radv_emulate_rt=true.
Fixes the_great_circle/a6079328b8df7712 with polaris10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: e006f68b11 ("aco/isel: Don't add scratch offset as gfx8- soffset if no offsets exist")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39986>
In some cases, this would incorrectly set a higher dpbArraySize
when overwriting an already existing DPB slot.
This didn't seem to cause any issues, but the extra slot would
have a zero VA, which was wrong.
Get the actual ref count from the codec param instead of using
cmd->num_refs, which always includes the current slot. Also add a
sanity check that the ref surface was found.
Fixes: 79af03556c ("ac: Add VCN ac_video_dec implementation")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39877>
The NIR_PASS macro only overwrites this when the pass actually makes
progress. If the pass doesn't make progress, the variable stays
uninitialized.
Clang correctly spots this and warns about it.
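
The pattern can be reproduced with a simplified stand-in for the macro.
SKETCH_PASS and the two dummy passes below are hypothetical and much
smaller than the real NIR_PASS; they only mirror the "assign on progress"
behaviour described above.

```c
#include <stdbool.h>

/* Simplified stand-in: the variable is only written when the pass
 * reports progress, so it must be initialized by the caller. */
#define SKETCH_PASS(progress, pass, ...) \
   do {                                  \
      if (pass(__VA_ARGS__))             \
         (progress) = true;              \
   } while (0)

/* Dummy passes operating on a fake "shader" counter. */
static bool sketch_pass_no_progress(int *shader) { (void)shader; return false; }
static bool sketch_pass_progress(int *shader) { (*shader)++; return true; }

static bool
sketch_run_pass(int *shader, bool (*pass)(int *))
{
   bool progress = false; /* without this, the no-progress path reads garbage */
   SKETCH_PASS(progress, pass, shader);
   return progress;
}
```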
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39968>
The hardware only provides 13 bits for encoding the stack base (in
dwords). That translates to the stack base being required to be below
8192 dwords, or 32kB. It's possible to exceed this - LDS is 64kB after
all. Add an explicit check to make sure we don't end up with offsets
that overflow the hw's address fields. This fixes Metro Exodus Enhanced
Edition, which was using ray queries in a 1024-thread sized workgroup,
resulting in exactly 64kB of LDS being required for the stack.
This check isn't required for RT pipelines as we always use 32 or 64
wide workgroups with no other LDS used, so it's impossible to reach this
stack base limit.
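
As a rough sketch of the limit, assuming a dword-aligned stack base given
in bytes (the names below are hypothetical and not taken from the driver):

```c
#include <stdbool.h>
#include <stdint.h>

/* 13 bits of stack base, counted in dwords, as described above. */
#define SKETCH_STACK_BASE_BITS 13u
#define SKETCH_STACK_BASE_LIMIT_DWORDS (1u << SKETCH_STACK_BASE_BITS) /* 8192 */

/* Returns true if the stack base fits in the hardware field. */
static inline bool
sketch_stack_base_encodable(uint32_t stack_base_bytes)
{
   return stack_base_bytes / 4 < SKETCH_STACK_BASE_LIMIT_DWORDS;
}
```

Any base at or above 32kB (8192 dwords) fails the check, so a caller would
have to fall back to some other path in that case.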
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39691>
The current approach of explicitly saving/restoring some states is
unnecessarily complicated and inefficient. For example, some meta OPs
that use memory fills/copies will have nested save/restores. This patch
is the first step towards avoiding unnecessary state re-emits around
meta OPs.
The changes are:
- Move radv_meta_saved_state to radv_cmd_buffer::state
- Add radv_meta_begin/end helpers that initialize radv_meta_saved_state
and restore states used by the meta OP
- Remove all explicit saves/restores, use the new helpers
radv_meta_begin/end are called in the entrypoint rather than in a nested
helper function, which means that state is only restored once per meta
OP.
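
A toy model of the idea, with made-up types and names (this is not the
RADV implementation, only the shape of the begin/end pairing):

```c
#include <assert.h>

/* Hypothetical command buffer with a single piece of user state. */
struct sketch_cmd_buffer {
   int user_state;
   int saved_state;
   unsigned num_restores;
   int in_meta;
};

static void
sketch_meta_begin(struct sketch_cmd_buffer *cb)
{
   assert(!cb->in_meta); /* begin/end pairs are not nested */
   cb->in_meta = 1;
   cb->saved_state = cb->user_state;
}

static void
sketch_meta_end(struct sketch_cmd_buffer *cb)
{
   assert(cb->in_meta);
   cb->user_state = cb->saved_state;
   cb->num_restores++;
   cb->in_meta = 0;
}

/* Nested helper: clobbers state but never saves or restores it. */
static void
sketch_fill_memory(struct sketch_cmd_buffer *cb)
{
   assert(cb->in_meta);
   cb->user_state = -1;
}

/* Entrypoint: the only place that calls begin/end. */
static void
sketch_cmd_copy(struct sketch_cmd_buffer *cb)
{
   sketch_meta_begin(cb);
   sketch_fill_memory(cb);
   sketch_fill_memory(cb);
   sketch_meta_end(cb);
}
```

Even though the entrypoint runs the helper twice, the user state is
restored exactly once.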
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774>
ac_prepare_cs_clear_copy_buffer determines whether to use CP DMA, and
the driver obeys that.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>
This should make copying sparse faster if we get aligned buffer bounds.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>