Only GFX10+ can support 1x user sample locations, but MSAA_ENABLE
needs to be enabled.
Fixes new VKCTS coverage dEQP-VK.pipeline.*samples_1*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35492>
(cherry picked from commit 061bc6151a)
DCC tilings info needs to be set for all surfaces, including
depth/stencil. But because this is a C union, settings those fields
for depth/stencil surfaces might accidentally overwrite HiZ info.
This fixes rendering issues with RADV_DEBUG=nohiz.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35515>
(cherry picked from commit 251b23f6c2)
This prevents a regression from the next commit which would write
garbage for combined image+sampler descriptors and that might break
capture&replay.
It seems also more robust to write zeroes than garbage overall.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35455>
(cherry picked from commit 22e06d65d7)
This mirrors AMDVLK. 128-byte alignment is possible, but DOOM: The Dark
Ages screws up scratch allocation with alignments <256 bytes.
Fixes hangs in DOOM: The Dark Ages.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35152>
(cherry picked from commit dac6f09451)
This try to mitigate the HiZ GPU hang by increasing a timeout. Loosely
based on PAL but I can confirm it delays the hang when
BOTTOM_OF_PIPE_TS is used as a workaround.
This must be emitted when the GFX queue is idle.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35212>
(cherry picked from commit 47f5d25f93)
The sparse image VA needs to be returned to the application for replay.
Reported by Baldur.
VKCTS has coverage but it doesn't verify this yet.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35162>
(cherry picked from commit 63758bc093)
On GFX11+, DISABLE_FOR_AUTO_INDEX=1 automatically disables primitive
restart enable for non-indexed draws.
On GFX10-GFX10.3 the hw considers primitive restart enable for
non-indexed draws and the driver must disable it explicitly.
GFX9 and older gens aren't affected but applying the change for them
simplifies the implementation.
To fix that, move emitting primitive restart enable at draw time
because it needs to know if the draw is indexed or not.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13037
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34996>
(cherry picked from commit 4d1fcd75f9)
Similar to indirect draw barrier, need similar fixups for conditional
rendering access.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34956>
(cherry picked from commit e674823d55)
When the hardware doesn't natively support 32-bit predication, the
driver has a fallback which allocates a 64-bit predicate to the upload
BO in order to copy the original value.
But when conditional rendering is enabled in the stateCommandBuffer
which is used by preprocess() and the execute() is recorded also in the
stateCommandBuffer. If the preprocess() is recorded in a different
cmdbuf which is submitted before the cmdbuf that contains execute(),
the fallback (ie. alloc + COPY_DATA) will be performed after. This would
cause the predicate value to be always 0.
To fix that, keep track of the user predication VA which is the only
VA that needs to be used by DGC because it reads 32-bit from the shader.
This fixes a very weird corner case with vkd3d-proton.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13143
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34953>
(cherry picked from commit 3ca2f71f3d)
This optimization used to optimize the allocated space for descriptors
when immutable samplers are equal. Though, this was basically broken :
- descriptor copies were broken for combiner image sampler (or sampler)
with equal immutable samplers because 96 bytes were copied instead of
64 bytes (cf. the linked ticket). This could be fixed but it's not
worth it.
- the value returned by vkGetDescriptorLayoutSupport() was broken, it
should have been 96 with no immutable samplers (or when they aren't
equal)
This optimization was also not applied for descriptor buffers which is
the default for vkd3d-proton and Zink. DXVK doesn't use db but it
doesn't use immutable samplers, so basically only native vulkan games
would be concerned.
Note that immutable samplers would still be inlined in shaders if no
indirect access which should be 99.9% of the usecase.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11165
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34928>
(cherry picked from commit 69ff204422)
In a scenario where the viewports/scissors are a dynamic state but the
count is static (ie. updated when a graphics pipeline is bound), the
driver wasn't considering that and it was re-emitting the previous
number of viewports/scissors.
This fixes rendering issue with Blender.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34921>
(cherry picked from commit 9a07ccbc89)
The hardware requires a power of two bpe. To do that, the driver
needs to adjust the pitch/offset/extent based on a texel scale factor
which only applies to 96-bits formats.
This fixes new VKCTS coverage.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34927>
(cherry picked from commit 4b73d7e817)
CmdTraceRays is neither a dispatch or a draw command which means it
shouldn't be affected by conditional rendering.
Fixes recent VKCTS coverage.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34868>
(cherry picked from commit 4b76d04f7f)
VK_ERROR_INITIALIZATION_FAILED will fail physical device enumeration.
Returning VK_ERROR_INCOMPATIBLE_DRIVER means that the driver can still
be used on supported GPUs when multiple GPUs are installed.
cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34783>
(cherry picked from commit 84b9c281fe)
Emitting compute dispatches on SDMA just hangs. It might be needed
to switch to gang submit for these to work but fixing the GPU hang is
more important for now.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34805>
(cherry picked from commit 0684dc5fa8)
Similar to previous generations that support compression, except that
the driver don't need to configure a meta VA because DCC is completely
transparent to the userspace.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
On GFX12, DCC is transparent to the driver and there is no meta VA.
Adding a new flag to know if the SDMA surface is compressed is needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
This wasn't technically incorrect because V_028C70_BU_NUM_xxx values
are similar to V_028C70_NUMBER_xxx but it's better to use the corect
helper.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34517>
CP is very slow on GFX12 and parsing the packet header is the main
bottleneck. Using paired context regs reduce the number of packet
headers and it should be more optimal.
It doesn't seem worth when only one context reg is emitted (one packet
header and same number of DWORDS) or when consecutive context regs are
emitted (would increase the number of DWORDS).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34421>
When nir_opt_varyings doesn't make progress the first time,
it should not be necessary to call it a second time.
No Fossil DB changes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>
This is to stop calling nir_shader_gather_info repeatedly for
some stages, and also as a pre-requisite to the work in the next commits.
No Fossil DB changes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33880>