Looks like GFX6 always writes the number of compute shader invocations
at offset 0 when used on compute queue.
This fixes dEQP-VK.query_pool.statistics_query.*_cq on GFX6.
Fixes: a9945216ba ("radv: fix COMPUTE_SHADER_INVOCATIONS query on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25957>
Only RDNA1-2 are affected because RADV needs to handle the legacy vs
NGG path for this query, and the NGG results are stored with 2 extra
64-bit values.
Fixes flakes with
dEQP-VK.transform_feedback.primitives_generated_query.* since VKCTS
1.3.7.0.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25862>
Create a version of this function that takes a CS and queue family.
move it to radv_cs.h so it can be called from multiple other files.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25770>
This doesn't fix anything because gds_needed should already be TRUE
because it's initialized at pipeline bind time, but this will be needed
for skipping GDS allocation on GFX11.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25284>
Move emitting the EOP even which writes the availability bit after the
GDS copy to ensure it's available.
This should fix all GS primitives/invocations flakes in CI.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25457>
The VA needs to be adjusted, otherwise the hw always writes at offset 0.
This fixes dEQP-VK.query_pool.statistics_query.*_cq.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25406>
SAMPLE_STREAMOUTSTATS requires PIPELINESTAT_START to be enabled,
otherwise the hw doesn't count anything.
This fixes
dEQP-VK.transform_feedback.primitives_generated_query.concurrent.pipeline_statistics_2.*
on GFX8. GFX6-9 are probably also affected by this bug, but with NGG
these queries are slightly different and don't use legacy streamout.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25049>
The number of geometry shader invocations is correctly counted by the
hardware for both NGG and the legacy GS path but it increments for
NGG VS/TES because they are merged with GS, but it shouldn't. Fix this
by emulating the number of geometry shader invocations.
This fixes piglit/bin/arb_query_buffer_object-qbo and recent
dEQP-VK.query_pool.statistics_query.gs_invocations_no_gs.* failures
with NGG.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24231>
NGG is enabled by default on RDNA1-2 but the driver might fallback to
legacy GS for some reasons, like XFB. On these generations, the number
of generated primitives by GS needs to be emulated from the NGG shader
because the hw doesn't increment the related pipelinestat counter.
In order to support NGG and legacy GS with that query (remember that
we can't know pipelines when starting/ending queries), we used to
reserve 2x 64-bit counters to store the GDS results, and the results
were accumulated.
Now that legacy GS also uses GDS counters, we can simplify this path
and overwrite the pipelinestat counter directly instead of having two
separate counters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24231>
Primitives generated queries write 1 integer, the primitives-generated
count that is incremented every time a primitive emitted to that stream
reaches the transform feedback stage.
Fixes: 1ebf463a5a ("radv: implement VK_EXT_primitives_generated_query")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23915>
Equivalent of 284e604872 but for acceleration structure queries.
If an app inserts a barrier between AS builds and writing AS properties,
we must respect it or things will blow up.
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23568>
On RDNA1&2, the driver needs to support both NGG and legacy for
primitives generated query because we can't know that before starting
queries.
To get the query pool results, we check the availability bit wrote by
the SAMPLE_STREAMOUTSTATS packet but the GDS copy was emitted after,
which means the availability bit might be TRUE before the GDS copy is
actually done.
Fix this by emitting the GDS copy before to ensure the availability is
TRUE for both results.
This fixes recent updates in
dEQP-VK.transform_feedback.primitives_generated_query.* because the
tests no longer wait for the fence.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23080>
UE4's Vulkan backend uses vkCmdWriteTimestamp with TOP_OF_PIPE
to measure how long a workload took in the GPU Benchmark. This is wrong
and writes the timestamp before the workload is actually finished,
making it seem like the GPU is much faster than it actually is.
This caused subsequent benchmark passes to contain way too big workloads,
which caused soft hangs on slower GPUs.
Fixes GPU hangs with Splitgate during automatic settings configuration.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22823>
The flag was ignored for VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT and
VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22781>
Most applications have a sequence like BeginQuery/Draw/EndQuery which
can be optimized by delaying DB_COUNT_CONTROL at draw time instead of
enabling/disabling for every draw.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22556>
This is really noticeable for games that resolve a bunch of occlusion
queries (in this case 4096) because it seems that emitting 4096
WAIT_REG_MEM packets can stall more than expected. Fixes this by
waiting for queries in the resolve query shader.
This improves performance of an unreleased game by +~10% (71->78 FPS).
RADV should now be really close to Windows performance for that title.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22579>
Instead, check if the cache is the meta shader cache. This catches the
shaders created by radv_create_radix_sort_u64().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21606>
If the buffer hasn't been bound to memory yet, we will dereference a
NULL pointer in radv_CreateAccelerationStructureKHR.
cc: mesa-stable
Closes: #8199
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21019>
Don't emit the NGG query user SGPR if its state doesn't change.
Based on original work by Mike Blumenkrantz.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18819>
According to RadeonSI, only GFX10 and GFX10.3 need to emulate.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19319>
According to the AMD registers database, SAMPLE_STREAMOUTSTATS no
longer exists on GFX11. This fixes primitives generated query if only
the NGG path is used. Tested on GFX10.3 by forcing NGG everywhere.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19410>