According to RadeonSI, only GFX10 and GFX10.3 need to emulate.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19319>
XFB queries need to be enabled with NGG streamout and VS/TES.
Previously, the NGG lowering code relied on has_prim_query for XFB.
This fixes failures with RADV_PERFTEST=ngg_streamout on GFX10.3 with
the vkd3d-proton testsuite. Vulkan CTS is missing TES tests with XFB
queries apparently.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19493>
With dynamic color write mask and rasterization samples, the binning
state will have to be re-computed dynamically. This shouldn't hurt
anything right now because it's only done at pipeline bind time.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19164>
The hardware can detect binning transitions apparently, so it can be
hardcoded. This matches RadeonSI and PAL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19164>
Meta operations change dynamic states like viewports and previously,
the guardband state was also always re-emitted because it relied on
dynamic viewport/scissor changes.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7577
Fixes: 40d8df7280 ("radv: emit the guardband state separately from the scissor state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19521>
The Vulkan spec says:
"When a command buffer is allocated, it is in the initial state.
Some commands are able to reset a command buffer (or a set of
command buffers) back to this state from any of the executable,
recording or invalid state. Command buffers in the initial state
can only be moved to the recording state, or freed."
Because the status wasn't initialized, it was implicitly set to
RADV_CMD_BUFFER_STATUS_INVALID and that triggered a reset for newly
allocated command buffers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19506>
GDS is used for NGG queries/streamout (GFX10+ only) and the BOs were
only added to the graphics queue because compute doesn't need them.
Though, the kernel emits a GDS switch when a queue submission doesn't
use GDS. That means that submitting jobs on the compute queue without
GDS can reset the state of the graphics queue and lead to GPU hangs.
The only viable solution for now is to make the GDS BOs resident to
avoid resetting the state between queues. This shouldn't introduce
more syncs between queues because GDS BOs are similar for both.
This fixes a GPU hang with Warhammer Chaosbane during loading time and
possibly some spurious random GPU hangs. Note that this GPU hang was
workarounded on the Steam side with RADV_DEBUG=nongg.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19466>
Found by inspection because the MIN_LOD bits were moved.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19496>
We need to pass the family to register parsing functions.
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19419>
It would assert anyways. Found by inspection.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19495>
Copied wrong from radeonsi. The registers following the scratch
buffer address are the shader rsrc1/rsrc2. Not the user SGPR0
containing the ring resource word 1.
Fixes: 278e533ec9 ("radv: update scratch buffer registers on GFX11")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19488>
6698753cdb switched our GS output stores to use MUBUF.
The stride doesn't matter for the ESGS descriptor (because idxen=false and
the index stride is 64), but this fixes it anyway.
This also changes ACO to use MUBUF store too, since MTBUF doesn't seem to
work correctly with an invalid data format in the descriptor.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6698753cdb ("ac/llvm: don't use tbuffer_store as a fallback for swizzled stores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18885>
Otherwise, queries might return invalid data because they used
the same offsets as NGG streamout.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19442>
The kernel keeps returning -ENOMEM if multiple processes allocate GDS,
this always happen while running VKCTS. This solution is loosely based
on RadeonSI, except that it includes a timeout of 1s to exit the loop.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19440>
NGG streamout lowering requires nir_shader::num_outputs to be set to
the total number of outputs in order to compute the pervertex LDS size
correctly. This is wasting LDS memory but it's currently the only viable
solution.
This fixes a bunch of dEQP-VK.transform_feedback.* failures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19436>
If the size is passed through CmdBindTransformFeedback() uses that.
This partially fixes dEQP-VK.transform_feedback.simple.multiquery_1
by reporting the correct number of primitives written (the computation
is based on the buffer size). There is still a bug around GDS offsets
that will be fixed later.
Tested on GFX10.3 by forcing NGG streamout.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19433>
According to the AMD registers database, SAMPLE_STREAMOUTSTATS no
longer exists on GFX11. This fixes primitives generated query if only
the NGG path is used. Tested on GFX10.3 by forcing NGG everywhere.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19410>
With NGG only, only the GDS query counter will be incremented.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19410>