More common, and this implicitly enables it for Path Of Exile and X4
Foundations. Zero VRAM allocations are already the default in AMDGPU
though, so this doesn't change anything in practice (except on very old
kernels).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41735>
libdrm dups the fd internally, so local_fd and get_fd() are different
fd numbers but they refer to the same open file description. Close
local_fd right after the amdgpu device is initialized to avoid keeping
two fds open for the same thing.
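The dup behaviour described above can be sketched in plain POSIX C. This
is only an illustration, not the RADV or libdrm code: fake_device_init()
is a hypothetical stand-in for a library that dups the fd it is given,
and the other helper names are made up as well.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true if both fds refer to the same file. */
static bool
fds_refer_to_same_file(int a, int b)
{
   struct stat sa, sb;

   if (fstat(a, &sa) != 0 || fstat(b, &sb) != 0)
      return false;
   return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Hypothetical stand-in for a library that dups the fd internally.
 * Returns the library-owned fd, or -1 on failure. */
static int
fake_device_init(int fd)
{
   return dup(fd);
}

/* Open a file, hand the fd to the "library", then close the local fd
 * right away. Returns the fd that is still open, or -1 on failure. */
static int
open_device_and_drop_local_fd(const char *path)
{
   int local_fd = open(path, O_RDONLY);
   if (local_fd < 0)
      return -1;

   int dev_fd = fake_device_init(local_fd);
   if (dev_fd < 0) {
      close(local_fd);
      return -1;
   }

   /* Different fd numbers, same underlying file. */
   assert(dev_fd != local_fd);
   bool same = fds_refer_to_same_file(local_fd, dev_fd);

   /* The duplicate keeps the file open, so the local fd can go. */
   close(local_fd);

   if (!same) {
      close(dev_fd);
      return -1;
   }
   return dev_fd;
}
```

Calling open_device_and_drop_local_fd("/dev/null") should return an fd
that is still usable after the local one has been closed.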
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41753>
The register values will depend on new fields in PS_STATE and it doesn't
seem like dynamic state belongs in radv_emit_fragment_shader_state.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41689>
to reduce the number of initialized PS VGPRs, increasing the PS wave launch
rate.
The pass will have more RADV-specific stuff.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41689>
buffer_size is a uint32_t, so we must be careful not to overflow it.
radeonsi had code for this but radv doesn't, which means radv will
hang if RADV_THREAD_TRACE_BUFFER_SIZE is too big or if buffer_size
is doubled to the point where it overflows.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41383>
We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
Otherwise, if a cmdbuf is recycled, it would assume that a gang CS is
always present even if it's not used. That means it would emit
useless synchronization and use gang submit with a mostly empty gang
CS for nothing.
It seems better to create the gang CS on-demand only when it's strictly
required (for compute fallback with SDMA and task shaders). Even for
heavy uses of task shaders, that shouldn't hurt.
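The on-demand idea can be sketched as a lazily created object that is
dropped again on reset. All names here (fake_cs, fake_cmdbuf and the
functions) are hypothetical and do not match the RADV structures; the
point is only that a recycled cmdbuf starts without a gang CS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a command stream and a command buffer. */
struct fake_cs {
   unsigned num_dwords;
};

struct fake_cmdbuf {
   struct fake_cs *gang_cs; /* NULL until something needs it */
};

/* Create the gang CS the first time it is requested. */
static struct fake_cs *
fake_cmdbuf_get_gang_cs(struct fake_cmdbuf *cmdbuf)
{
   assert(cmdbuf != NULL);

   if (!cmdbuf->gang_cs)
      cmdbuf->gang_cs = calloc(1, sizeof(*cmdbuf->gang_cs));
   return cmdbuf->gang_cs;
}

/* Gang submit is only needed if a gang CS was actually created. */
static bool
fake_cmdbuf_needs_gang_submit(const struct fake_cmdbuf *cmdbuf)
{
   return cmdbuf->gang_cs != NULL;
}

/* Drop the gang CS on reset so a recycled cmdbuf starts without one. */
static void
fake_cmdbuf_reset(struct fake_cmdbuf *cmdbuf)
{
   free(cmdbuf->gang_cs);
   cmdbuf->gang_cs = NULL;
}
```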
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41543>
This splits the nir_move_to_top_input_loads option into 2 options. The latter
option is mainly for at_offset/at_sample loads. Then it updates most places to
use only the first option.
The rationale is that moving at_sample loads makes Control (game) shaders
worse, as per the code comment.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41167>
SDMA doesn't support MSAA, so RADV falls back to the compute queue in
this case, which makes it possible to get there.
Some tests only pass because RDNA2 and older don't support image
stores with depth/stencil and MSAA.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41492>
When a VRS view is used with a depth/stencil view, the driver is
expected to copy the VRS rates to the HTILE buffer of the depth/stencil
view. Though if the image uses mipmaps and the base level can't support
HTILE there is no way to copy the rates. The workaround is to force VRS
to be 1x1 which is valid in Vulkan.
This fixes old VKCTS failures that only show up on RAPHAEL because it
supports fragmentShadingRateWithShaderDepthStencilWrites, unlike the
other GPUs in CI (NAVI21/VANGOGH).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41427>
It's required with VK_KHR_maintenance11. This allows many more transfer
queue related CTS tests to run, and all issues I found should already
be fixed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41316>
The Vulkan spec says:
    If a fragment shader entry point statically uses an input variable
    decorated with a BuiltIn of SampleId or SamplePosition, sample
    shading is enabled and a value of 1.0 is used instead of
    minSampleShading.
    If a fragment shader entry point statically uses an input variable
    decorated with Sample, sample shading may be enabled and a value of
    1.0 will be used instead of minSampleShading if it is.
This means we have to overwrite the command buffer state entirely.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41482>
This shouldn't be necessary because SDMA can detile the image just fine;
only buffer->image and image->image need to fall back.
It just works on GFX10+ because RADV is using NBC views, and I think
it works on e.g. VEGA10 just by luck due to different
swizzles/alignments.
Fixes: 3d803d7a2e ("radv: Use compute copy for emulated formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41384>
This can only happen with RADV_DEBUG=fullsync which literally flushes
all caches, but INV_ICACHE is invalid with RELEASE_MEM apparently.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41396>