Some copy operations are poorly supported by the SDMA hardware,
meaning that the built-in packets don't support them, so we will
need to work around that by copying to and from a temporary BO.
The size of the temporary buffer was chosen so that it can fit
at least one full pixel row of the largest possible image.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
Previously, RADV only had a simple implementation of
image to buffer copies using the SDMA for the PRIME copy.
This commit replaces that with a full-featured implementation
that includes buffer to image and image to buffer copies and
removes the assumptions that the PRIME copy had, as well as
adds new helper functions which will be shared with other copy
functions in upcoming commits.
Unaligned buffer/image copies require a workaround, which
will be implemented by a future commit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
This register seems needed to enable compute shader shader invocations
on GFX7. On GFX8+ it's working fine without emitting this register but
I think it doesn't hurt.
This fixes dEQP-VK.query_pool.statistics_query.*_cq on GFX7.
Fixes: a9945216ba ("radv: fix COMPUTE_SHADER_INVOCATIONS query on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25957>
Looks like GFX6 always writes the number of compute shader invocations
at offset 0 when used on compute queue.
This fixes dEQP-VK.query_pool.statistics_query.*_cq on GFX6.
Fixes: a9945216ba ("radv: fix COMPUTE_SHADER_INVOCATIONS query on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25957>
This partially reverts 74ab940156 which
was a fix for random GPU hangs with binning on GFX10+. Though,
according to RadeonSI, only GFX10+ is affected and this reduced perf
on GFX9 chips.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25951>
The following sequence is valid (although weird) but many other drivers
(including RADV) were broken:
- bind pipeline with some static state
- set state command for that static state (to a bad value)
- bind the same pipeline again
- draw
Fixes new dEQP-VK.dynamic_state.*.double_static_bind.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25954>
This used to be a PS_PARTIAL_FLUSH to fix synchronization issues with
NGG streamout on RDNA2 but it's no longer needed for RDNA3. It's
already synchronized in CmdEndTransformFeedbackEXT().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25903>
This was useful for experimenting it on RDNA2 and during RNDA3 bringup,
but now the support is rock solid on RDNA3 and it's useless to keep the
RADV_PERFTEST=ngg_streamout option.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25903>
Only RDNA1-2 are affected because RADV needs to handle the legacy vs
NGG path for this query, and the NGG results are stored with 2 extra
64-bit values.
Fixes flakes with
dEQP-VK.transform_feedback.primitives_generated_query.* since VKCTS
1.3.7.0.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25862>
The range alignment didn't happen through GetDescriptorEXT as it called
write_buffer_descriptor directly. So simply move the align
from write_buffer_descriptor_impl into write_buffer_descriptor.
Fixes: 46e0c77582 ("radv: implement VK_EXT_descriptor_buffer")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
On NVIDIA, we can do a vote_ieq on bool in one hardware op so we don't
want that lowered. We do want to lower vote_feq and other vote_ieq,
though.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25894>
Tested with a small use-after-free Vulkan test. It's pretty basic
for now but I think I will add status decoding support to report
more useful information.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238>
It's been a very long time since this was useless because dmesg
required sudo access for a while.
This will be replaced by the new GPUVM fault interface which allows
to query from the kernel directly instead of trying to parse dmesg.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238>
Unlike AMDVLK, this has separate loops for attribute stores and exports,
so that the stores from different rows can overlap.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25040>