The previous code not only left them undefined, but also
didn't increment the array index, so subsequent PS inputs
would be broken after the undefined one.
Note that this doesn't affect any valid Vulkan apps, but it makes
the code a bit simpler and it makes undefined inputs a little more
forgiving, at no expense for valid PS.
This code actually uncovers a bug in Zink, so I'm also documenting
the failing Zink test case.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
GFX10.3 keeps track of per-vertex and per-primitive PS inputs
separately in NUM_INTERP / NUM_PRIM_INTERP,
which we only really know when emitting the inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
Add export_prim_id_per_primitive for mesh shaders.
This prepares to also configure some of these to be per-primitive
in the future, even in the traditional pipeline.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
There are some FS built-ins that can be per-vertex or
per-primitive depending on whether a mesh shader is used:
primitive ID (implicit in VS), layer and viewport.
However, the HW requires per-primitive FS inputs to be ordered last.
This causes bugs when the same unlinked FS is used together
with VS/TES/GS and MS (with unlinked ESO or fast-linked GPL).
To solve this problem, we reorder the FS inputs so that these
potentially per-primitive inputs go after per-vertex inputs but
before per-primitive inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
It's also possible to use ALL_GRAPHICS and PRE_RASTERIZATION as
alternatives.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32323>
It should be enough to do this at the end of each submit instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31695>
This will allow us to optimize the pipe misaligned special case for
GFX11 because only the first mip in the mip-tail needs the L2 cache
invalidation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921>
This fixes a CTS hang on Hawaii.
We previously only did a CB/DB flush,
but that doesn't include a L2 cache flush.
Also fix the comment that said this is for GFX9+.
Fixes: 7c62f6fa01
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31906>
The major differences compared to the NV extensions are:
- support for the sequence index as push constants
- support for draw with count tokens (note that DrawID is zero for
normal draws)
- support for raytracing
- support for IES (only compute is supported for now)
- improved preprocessing support with the state command buffer param
The NV DGC extensions were only enabled for vkd3d-proton and it will
maintain both paths for a while, so they can be replaced by the EXT.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31383>
If blending is disabled or the color write mask is 0, dual-source
blending would be ignored, and this can be simplified a bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31681>
We cannot set the {X,Y}MAX_RIGHT_EXCLUSION bits
if we have a sample location at a pixel boundary.
CTS does not seem to be catching this.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Co-authored-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31839>
It's possible to enable NGG culling with ESO if shaders are linked, or
if the VS doesn't need a prolog or if TES is used. This wasn't
supposed to be enabled but I think it worked just by luck because the
user SGPR value was probably zero and NGGC was disabled at draw time.
Found by inspection.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31829>
Looks like this was disabled by mistake. RadeonSI relies on the default
value which is "no force" and PAL only sets this to FORCE_DISABLE when
HTILE is completely disabled using settings.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31690>
The IB2 packet is only supported on the graphics queue. To execute DGC
IB on compute, the previous solution was to submit it separately
without any chaining. Though this solution was incomplete because it's
easy to reach the maximum number of IBs per submit when there is a lot
of ExecuteIndirect() calls.
To fix that, the proposed solution is to implement DGC IB chaining when
it's executed on the compute only. The idea is to add a trailer that is
added at the beginning of the DGC IB (to know the offset). This trailer
is used to chain back back the DGC IB to a normal CS, it's patched at
execution time. Patching is fine because it's not allowed to execute
the same DGC IB concurrently and the entire solution relies on that.
When the DGC IB is executed on graphics, the trailer isn't patched and
it only contains NOPs padding. Performance should be mostly similar.
This fixes
dEQP-VK.dgc.nv.compute.misc.execute_many_*_primary_cmd_compute_queue.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30809>
Preprocess now must use the same conditional rendering state as the
execute, so the DGC prepare shader must reset the number of sequences
to generate an empty cmdbuf for compute.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31563>
Instead of re-emitting some dynamic states when a new shader is bound,
only re-emit the user SGPR states. This is slightly more optimal.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>
This doesn't change anything in practice because if we have a task
shader, we also have a mesh shader and the state was already re-emitted.
Though, this will prevent a regression from the upcoming patches because
the user SGPR layout will change.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31160>