When there are lots of inlined shaders, going over each one and checking
if the call index matches becomes expensive. Instead, use a
binary-search-like selection to skip most of the checks.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26380>
This is needed for DXVK and VKD3D-Proton in order to implement
DXGI frame latency and frame latency waitable objects.
Gamescope also requires this to keep in-step with the host.
Using the same enablement system as RADV, etc to be safe.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26752>
We will use this in the future to enable present_id + present_wait like
in RADV.
This also enables the common WSI driconf entries for image count, etc
overrides to work by default, fixing some games.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26752>
We can't query the actually required alignment from llvmpipe, but 16
bytes is insufficient. In particular, llvmpipe's pixel backend code for
depth/stencil will load/store 4 values at a time, which crashes
with d32s8x24 format if alignment is only 16 bytes (the color backend
does similar things but will use unaligned loads if alignment exceeds
16 bytes at least for now).
32 bytes would be enough, however the "ordinary" llvmpipe resource
layout code currently always aligns to at least 64 bytes, so use
this value as well.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26724>
This lets us support multi-planar images properly on transfer queues.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
This removes a bunch of redundant calculations, thus making the code
less error-prone.
Also move the row pitch calculation to a separate function.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
This makes it possible to gather the surface information
only once and pass it to the various SDMA functions, therefore
the code can be less error-prone.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
This makes it possible to call the function just once instead
of calling different functions repeatedly.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
Using just one struct for both types of images (and buffers)
will simplify a lot of code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
Very few parts of RADV actually need the SDMA functions, so
moving them to a separate header makes the driver cleaner and
also improves compilation time when SDMA functions change.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26353>
These dynamic rendering failures/flakes are a known issue that will
be resolved soon as part of VKCTS. Until that, make sure CI is green.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26734>
The initial logic was to remember the place were SPI_SHADER_PGM_LO_*
are written, then assume that we can get the register offset because
the sequence would always be:
PKT3_SET_SH_REG
SPI_SHADER_PGM_LO_* register offset
VA low 32 bits value <- reg_va_low_idx
The problem is that this sequence isn't guaranteed, for instance we
can get this instead:
0 c0067600 |
1 00000046 |
2 003ffffd | SPI_SHADER_PGM_RSRC3_VS
3 00000020 | SPI_SHADER_LATE_ALLOC_VS
4 * 00002080 | SPI_SHADER_PGM_LO_VS
5 00000080 | SPI_SHADER_PGM_HI_VS
So the assert in si_state_draw.cpp would fail as well as the VA
update logic.
So instead remember which the SPI_SHADER_PGM_LO_* offset, and the low
32 bits of the VA in si_update_shaders.
Fixes: 8034a71430 ("radeonsi/sqtt: re-export shaders in a single bo")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26774>
Not expecting this to actually fix anything externally visible,
but reduces some invalid usage when the resulting vector is
not 16 elements long (e.g. the C/result matrix).
Fixes: 9df4703fbb ("radv: Add cooperative matrix lowering.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26768>
No hash updates as I didn't find a facility to do it in radeonsi
(even though there are flags like forcing fma32).
Note that we do this very late to avoid any optimizations that
might remove the dead stores. (Checked that LLVM doesn't remove
them, but it is admittedly potentially brittle)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
Only shaders which explicitly allow shared memory are included for
now. The pass is very late to avoid optimizations removing the stores
and to ensure the clear gets added after MS outputs get loaded from LDS.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26679>
RAW shader did not have dma shader upload, this commit share
the pre/post upload code with ELF, so RAW and ELF can have same
upload mechanism.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26750>
There is nothing to be done, as we lower this op unconditionally.
We need to add an arm in the match to not panic, though.
Fixes a panic in
dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_immutable.vertex_fragment.multiple_contiguous_descriptors.2d
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26587>
Supporting these means we don't have to depend on calling the GLSL
IR optimisation loop for shaders that contain these parameter types.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26755>
The main motivation for doing this is that some tests and even the
st tracker linking code dump out the GLSL IR for debugging before
glsl_to_nir() is called expecting it to already be in its final
form. Moving these to the linker makes those assumptions true.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26755>