This is really noticeable for games that resolve a bunch of occlusion
queries (in this case 4096) because it seems that emitting 4096
WAIT_REG_MEM packets can stall more than expected. Fixes this by
waiting for queries in the resolve query shader.
This improves performance of an unreleased game by +~10% (71->78 FPS).
RADV should now be really close to Windows performance for that title.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22579>
VK_FORMAT_FEATURE_2_COLOR_ATTACHMENT_BIT is equal to
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT but we want COLOR_ATTACHMENT_BIT.
Found by inspection.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22540>
For some reasons this seems broken only on GFX6. Note that PAL doesn't
allowed block-compressed with 1D on all GPUs.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22540>
mantissa needs to be at the lower part for shift left.
This fixes large integer value conversion.
Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22570>
Also add a comment that explains the dword aligned mode.
Note that the SDMA shader uploads are always dword aligned
so this commit doesn't fix any issues but just prepares this
function for more general use.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22551>
IB2 packets hang GFX6 when they contain any indirect draws,
not just the MULTI versions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22533>
The GPUs which need the workaround do not support mesh shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22533>
Move compute IB2 check to the winsys, because IB2 only works on
GFX queues and not any other queue types.
Then, simplify the workaround condition in the cmd buffer.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22533>
radeonsi does not pass scratch buffer address by arg,
but dynamical relocation symbol when upload. Just skip
this part to enable radeonsi use aco, but it will fail
when spill.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22525>
They use same instruction. Just because when the time
nir_load_smem_buffer_amd was introduced, radeonsi didn't support
pass buffer descriptor to nir_load_ubo directly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22523>
Partial revert of 590959057c ("ci/amd: raven is currently downgraded
to 2 machines only, adapt")
Test which remains disabled: radeonsi-raven-va:amd64 (VAAPI testing).
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22447>
It should be handled like DEPTH_STENCIL_READ_ONLY_OPTIMAL.
This fixes an issue with VRS attachment because HTILE was considered
disabled for READ_ONLY_OPTIMAL but there is no reasons to disable it
as long as the image is only used as a depth/stencil attachment.
Otherwise, when HTILE is disabled, VRS rates are ignored.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8675
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22468>
This packet is supported on GFX6 too, its name should relect that.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22406>
Also don't check whether chaining is enabled in radv_queue, the
winsys will take care of that anyway.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22406>
GFX6 supports IB chaining since the PFP firmware version 20.
Note that the very first amdgpu firmware for GFX6 already had
version 29, so we can assume that all GPUs supported by RADV
have this feature.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22406>
These used to guard against chaining on GFX6 and on HW IP types
that don't support chaining, but these things are now guarded
elsewhere and these assertions are no longer necessary.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22406>
With the on_demand shaders feature, meta pipelines are only created
when they are used, otherwise they are NULL. Though, inside secondary
cmdbuffers, the graphics pipeline might be also NULL. In this specific
case, radv_is_{dcc,fmask}_decompress_pipeline() would return
TRUE because these pipelines are NULL too...
This fixes flakes with tests that use secondary cmdbuffers with
TC-compat images.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22440>
To simplify both llvm and aco backend and remove unnecessary
workaround code where prim count is known to be not zero.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22381>
I moved to many things to radv_pipeline_graphics.c without checking.
Fixes: 7783b7f697 ("radv: split radv_pipeline.c into radv_pipeline_{compute,graphics}.c")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22441>
Primitive export used the gs_accepted variable after culling,
so we overwrote this variable after vertex compaction to make
sure not to hang the GPU.
This had an unintended side effect when storing the primitive ID
to LDS on GS threads: the LDS store was done even on threads whose
triangle was culled; potentially causing issues.
As a fix, create a separate boolean variable that remembers
which invocations need to export a primitive; and don't store
the primitive ID to LDS when gs_accepted is false.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8805
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22424>