Vulkan 1.4 raises the minimum for maxPushConstantsSize to 256. Since we
intend to support 1.4 eventually and the change is very simple, we might
as well do it now.
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35191>
All existing Malis support at most 64 user-supplied FAUs.
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35191>
We create hierarchy masks based on the number of levels available,
creating a bitmask with `max_levels` bits set. Originally these bits
were all contiguous. Modify this to spread the bits out, which improves
performance on chips like the G31 with only 2 levels of hierarchy.
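As a rough illustration of the difference, the sketch below builds both a
contiguous mask and a spread-out one. The function names, the
`total_levels` parameter and the even-spacing policy are assumptions made
for this example and not necessarily what the driver does.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous mask: the lowest max_levels bits are set. */
static uint32_t
contiguous_mask(unsigned max_levels)
{
   assert(max_levels > 0 && max_levels < 32);
   return (1u << max_levels) - 1;
}

/* Hypothetical spreading policy: distribute max_levels bits evenly over
 * total_levels bit positions, always keeping bit 0 set. */
static uint32_t
spread_mask(unsigned max_levels, unsigned total_levels)
{
   assert(max_levels > 0 && max_levels <= total_levels);
   assert(total_levels <= 32);

   uint32_t mask = 0;
   for (unsigned i = 0; i < max_levels; i++)
      mask |= 1u << (i * total_levels / max_levels);
   return mask;
}
```

With 2 levels out of 8 possible positions, `contiguous_mask(2)` gives 0x03
while `spread_mask(2, 8)` gives 0x11; both have the same number of bits set.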
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34744>
PRE_POST_FRAME_SHADER_MODE_EARLY_ZS_ALWAYS was introduced in
architecture version 7.2, not 7.0 as we assumed. Using it on
G31 (a 7.0 device) caused some CTS failures.
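A minimal sketch of the corrected gating, assuming the architecture
version is available as separate major and minor numbers. Both helper
names are hypothetical and only illustrate the 7.2 cutoff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if (major, minor) >= (req_major, req_minor). */
static bool
arch_at_least(unsigned major, unsigned minor,
              unsigned req_major, unsigned req_minor)
{
   return major > req_major ||
          (major == req_major && minor >= req_minor);
}

/* EARLY_ZS_ALWAYS is only usable from architecture version 7.2 onwards,
 * so a 7.0 device such as the G31 must not use it. */
static bool
supports_early_zs_always(unsigned major, unsigned minor)
{
   return arch_at_least(major, minor, 7, 2);
}
```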
Cc: mesa-stable
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34744>
We have way more registers to work with, and we are going to need an
additional register for indirect scoreboard handling, so let's increase
our scratch limits.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
This was forgotten when introducing v12+ support.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
We never use this form anywhere else in that header, and it kept
tripping up clang-format.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
v11 and later allow indirectly waiting on a scoreboard mask and signaling
a scoreboard (as set via SET_STATE).
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
This adds all bit operations and a helper to indirectly wait on
scoreboards.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
Instead of hardcoding the scoreboard count and mask, we now derive that
information from Panthor CSIF properties.
We still limit iters to 5 as we currently don't support more.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
These formats don't depend on the ASTC HDR texfeat, they depend on the
ASTC LDR texfeat. The ASTC HDR texfeat simply adds support for more
endpoint encodings to these formats.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35132>
We only have two of these boards, and can't get more as they're EOL.
Demote them to nightly until we can source different boards with this
SoC, and more of them.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35200>
This should have been a multiply not an add.
Fix an assertion when running in tracing mode on panvk.
Fixes: 79a1d98e1e ("pan/csf: make cs_builder.h usable from c++")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35176>
Turns out, some Midgard GPUs don't support more than 4x MSAA. Add a
quirk for those GPUs, so we don't expose higher sample counts where they
don't work.
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
We can't easily pass multiple initializers here, because a comma in the
initializer list would be treated as a preprocessor argument separator
and not a separator in the initializer list.
We could also have fixed this with some nested macro ugliness, but let's
instead do what nir_builder does for intrinsic indices and use __VA_ARGS__
to keep this neat.
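The sketch below shows the general technique with a made-up struct and
macro. `struct quirks`, its fields and `DEFINE_QUIRKS` are hypothetical
names for illustration only and are not the macro touched by this change.

```c
#include <assert.h>

/* Hypothetical struct, used only to demonstrate the macro. */
struct quirks {
   int max_samples;
   int no_hierarchical_tiling;
};

/* Everything after `name` is pasted between the braces, commas included,
 * so any number of initializers can be passed without extra wrapping. */
#define DEFINE_QUIRKS(name, ...) \
   static const struct quirks name = {__VA_ARGS__}

DEFINE_QUIRKS(example_quirks, .max_samples = 4, .no_hierarchical_tiling = 1);

static int
example_max_samples(void)
{
   return example_quirks.max_samples;
}
```

A fixed-arity macro such as `M(name, init)` would see the call above as
three arguments and fail to expand.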
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
When using 16x MSAA, we have two sample-positions on the negative
boundary of the unit-square covering the pixel. This causes problems
when using the default tie-breaking rule, where we miss some
sample-positions when rasterizing primitives covering the entire
viewport.
This works fine on Bifrost and later, but this setting is ignored on
those GPUs, and they assume the default (i.e. MINUS_180_OUT_0_IN).
Because we'd prefer for rasterization to match between Midgard and
Bifrost when we can, we only apply this when we have 16x MSAA.
As an added bonus, this behavior matches what the DDK does.
Fixes these tests when 16x MSAA is enabled:
- dEQP-GLES31.functional.texture.multisample.samples_16.use_texture_*
- dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_alpha_to_coverage
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
The V4 GPUs don't have the dynamic allocation logic that V5 and later
have. There's nothing to calculate here; the GPU either supports 8x MSAA,
or 4x MSAA.
Since 8x MSAA is the architectural max, let's have this function report
that. We deal with the 4x limit separately as a quirk, because this
applies to some V5 GPUs as well.
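Roughly, the split between the architectural maximum and the quirk could
look like the sketch below. The constant and function names are
hypothetical, and the V5+ calculation is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Architectural maximum reported for V4: nothing to calculate. */
#define V4_ARCH_MAX_MSAA 8u

/* Hypothetical helper: apply the 4x quirk on top of whatever the
 * architecture-level query reported, independent of the GPU generation. */
static unsigned
effective_max_msaa(unsigned arch_max, bool quirk_max_4x)
{
   if (quirk_max_4x && arch_max > 4)
      return 4;
   return arch_max;
}
```

Keeping the clamp separate means the same quirk flag can cover the V5 GPUs
that share the 4x limit.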
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
This value isn't valid on V4, so let's make sure we don't try to use it.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
for pipelines, we know enabled features. for classic shader objects, we do not.
therefore, we want to plumb this through explicitly for drivers using common
pipelines, rather than making drivers guess whether they can use the device
features.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35169>
This was mostly wired up, but we currently require an alignment of 64
for uniform texel buffers, because we're currently using
plane-descriptors for this.
We could lift that limitation by switching to buffer descriptors and use
LD_CVT for the format-conversion, but that's a bigger change.
Let's just fix up the alignment and enable the extension for now.
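For reference, a 64-byte requirement on texel buffer offsets boils down to
checks like the following. The names are hypothetical; only the value 64
comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment required while plane descriptors are used for texel buffers. */
#define TEXEL_BUFFER_ALIGNMENT 64u

/* Hypothetical check: is this buffer offset usable as-is? */
static bool
texel_buffer_offset_ok(uint64_t offset)
{
   return (offset & (TEXEL_BUFFER_ALIGNMENT - 1)) == 0;
}

/* Hypothetical helper: round an offset up to the next valid one. */
static uint64_t
texel_buffer_align_up(uint64_t offset)
{
   return (offset + TEXEL_BUFFER_ALIGNMENT - 1) &
          ~(uint64_t)(TEXEL_BUFFER_ALIGNMENT - 1);
}
```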
Reviewed-by: Olivia Lee <benjamin.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34826>
For partial secondary cmdbufs, we emit FBDs/TDs in the primary cmdbuf
before calling the secondary. In order to set the provoking vertex mode
correctly here, we need to look at the mode set by pipelines bound in
the secondary cmdbuf.
This leaves one edge case: reemitting FBDs/TDs in a secondary cmdbuf
after a flush. If the secondary cmdbuf only contains vk_meta draws,
without ever binding a pipeline, we won't know which provoking vertex
mode to use here. This is actually okay, because in that case the
provoking vertex mode doesn't matter for any of the draws in the
secondary, and the FBDs/TDs will be reemitted on the primary with the
correct mode.
Fixes: 7a9f14d3c2 ("panvk: advertise VK_EXT_provoking_vertex")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
In this case, we need to emit the FBDs and TDs for the meta command
before we know what provoking vertex mode the application is going to
use. To handle this, we make a guess for which provoking vertex mode we
need. Then we use cs_maybe to leave space to flip the provoking vertex
bit if the guess was wrong.
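The guess-then-patch idea can be sketched on the CPU side as below. This
is only an analogy: it does not model cs_maybe or the command stream, and
the bit position, struct and function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical position of the provoking vertex bit in a descriptor word. */
#define PROVOKING_LAST_BIT (1u << 0)

struct patch_point {
   uint32_t *word;    /* descriptor word that was emitted with the guess */
   bool guessed_last; /* the mode we guessed at emission time */
};

/* Emit the descriptor word using the guess and remember where it lives. */
static struct patch_point
emit_with_guess(uint32_t *desc, bool guess_last)
{
   *desc = guess_last ? PROVOKING_LAST_BIT : 0;
   return (struct patch_point){.word = desc, .guessed_last = guess_last};
}

/* Once the real mode is known, flip the bit only if the guess was wrong. */
static void
resolve_guess(struct patch_point p, bool actual_last)
{
   if (p.guessed_last != actual_last)
      *p.word ^= PROVOKING_LAST_BIT;
}
```

A correct guess costs nothing; a wrong one costs a single bit flip on data
that was already emitted.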
This case is still unhandled on JM.
Fixes: 7a9f14d3c2 ("panvk: advertise VK_EXT_provoking_vertex")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
Because we advertise provokingVertexModePerPipeline=false, the provoking
vertex mode must be set the same for all pipelines used in a renderpass.
vk_meta doesn't care about the provoking vertex mode, but the Vulkan API
doesn't provide a way to express this, so it always sets
PROVOKING_VERTEX_MODE_FIRST (the Vulkan default). This causes an
assertion failure when vk_meta is used in a renderpass where the
application sets PROVOKING_VERTEX_MODE_LAST.
There are a few different cases here that need different handling. The
simplest is when vk_meta is used after the first application draw, in
which case we can just ignore the state passed by vk_meta and use the
existing state.
Fixes: 7a9f14d3c2 ("panvk: advertise VK_EXT_provoking_vertex")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
This is needed to handle the provoking vertex mode correctly. vk_meta
doesn't care which provoking vertex mode is used, but there is no way to
express this directly in the Vulkan API.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
The tiler OOM exception handler allocated a region of memory to dump
saved/restored registers. To allow defining more functions in the future,
we now allocate a register dump region for each subqueue that can hold
the largest number of registers needed by any function executed on that
subqueue.
This does mean that we cannot have function calls more than one deep. If
we ever need nested function calls, we will have to consider a real
stack.
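Sizing such a shared region amounts to taking a maximum over the functions
of a subqueue, as in the sketch below. The function name is hypothetical,
and the 32-bit register size is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size in bytes of a dump region that fits the
 * largest register set saved by any function on one subqueue. */
static size_t
reg_dump_region_size(const unsigned *regs_per_function, size_t function_count)
{
   unsigned max_regs = 0;

   for (size_t i = 0; i < function_count; i++) {
      if (regs_per_function[i] > max_regs)
         max_regs = regs_per_function[i];
   }

   /* Assumes 32-bit registers. */
   return (size_t)max_regs * sizeof(uint32_t);
}
```

One region per subqueue is enough because only one function is active at
a time; nested calls would overwrite the caller's saved registers.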
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
The register save/restore machinery is useful for more general callable
functions, not just exception handlers.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
We have an edge case with VK_EXT_provoking_vertex where we may need to
emit FBDs and TDs before we know what provoking vertex mode the
application is using for the renderpass. To handle this, we want to
retroactively patch the provoking vertex bit. This commit introduces an
abstraction to do that.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
For now, just comparing the raw contents of the output buffer. Possibly
in the future we could hook this up to the disassembly from decode_csf.c
to make it easier to edit.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
We need to do this in order to test it with gtest. Most of the changes
are just fixing integer truncation warnings.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34974>
this allows eliminating surface refcounting and objects
which, relatively speaking, don't serve much purpose
see MR for details
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34054>
Apply the direct dispatch WLS instance limit to PanVK/JM as well to keep
compute jobs with large workgroup counts from hitting
VK_ERROR_OUT_OF_DEVICE_MEMORY.
Fixes: 005703e5b5 ("panvk: Move TLS preparation logic to cmd_dispatch_prepare_tls")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
During direct dispatch, we calculate the size of the WLS allocation
based on the number of WLS instances, which grows without bound with the
number of workgroups.
This leads to extreme allocation sizes and potentially
VK_ERROR_OUT_OF_DEVICE_MEMORY for direct dispatches with a large number
of workgroups.
This change adds an upper bound to the number of WLS instances, using
the same value we assume for indirect dispatches.
Additionally, this commit fixes the WLS max instance calculation (which
should be per core).
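The effect of the bound can be sketched as follows. The cap value, the
function names and the exact size formula are assumptions for
illustration; the real limit is whatever the indirect dispatch path
already assumes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on WLS instances (applied per core). */
#define MAX_WLS_INSTANCES 8u

/* Clamp the instance count derived from the workgroup counts. Clamping
 * after each multiply keeps the intermediate value small, so it cannot
 * overflow. */
static unsigned
wls_instances(uint32_t wg_x, uint32_t wg_y, uint32_t wg_z)
{
   uint64_t n = wg_x < MAX_WLS_INSTANCES ? wg_x : MAX_WLS_INSTANCES;

   n *= wg_y;
   if (n > MAX_WLS_INSTANCES)
      n = MAX_WLS_INSTANCES;
   n *= wg_z;
   if (n > MAX_WLS_INSTANCES)
      n = MAX_WLS_INSTANCES;
   return (unsigned)n;
}

/* Total allocation: per-instance size times instances, for every core. */
static uint64_t
total_wls_size(uint64_t wls_size, unsigned instances, unsigned core_count)
{
   return wls_size * instances * core_count;
}
```

With the clamp, a dispatch of 65535 x 65535 x 65535 workgroups needs no
more WLS memory than a dispatch that just reaches the cap.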
Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
The CSF version of dispatch_precomp allocates TLS/WLS prior to calling
cmd_dispatch_prepare_tls, which will do the same.
This commit removes this unnecessary allocation.
Fixes: cc02c5deb4 ("panvk: Implement precomp dispatch")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
Generate the IDPADD instruction to support integer dot products.
Support is added for both signed and unsigned dot products, as well as
saturated dot products.
This is only supported on v9+.
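For reference, the sketch below gives plain C semantics for a packed 4x8
dot product with accumulate, in signed and unsigned flavours with optional
saturation. It assumes 4x8-bit packing and says nothing about the IDPADD
encoding; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed: treat each byte of a and b as an int8 lane, sum the four
 * products and add the accumulator. */
static int32_t
sdot_4x8_add(uint32_t a, uint32_t b, int32_t acc, bool saturate)
{
   int64_t sum = acc;

   for (unsigned i = 0; i < 4; i++) {
      int8_t la = (int8_t)(uint8_t)(a >> (8 * i));
      int8_t lb = (int8_t)(uint8_t)(b >> (8 * i));
      sum += (int64_t)la * lb;
   }

   if (saturate) {
      if (sum > INT32_MAX)
         return INT32_MAX;
      if (sum < INT32_MIN)
         return INT32_MIN;
   }

   /* Non-saturating variant wraps to 32 bits. */
   return (int32_t)(uint32_t)sum;
}

/* Unsigned: same idea with uint8 lanes. */
static uint32_t
udot_4x8_add(uint32_t a, uint32_t b, uint32_t acc, bool saturate)
{
   uint64_t sum = acc;

   for (unsigned i = 0; i < 4; i++)
      sum += (uint64_t)((a >> (8 * i)) & 0xff) * ((b >> (8 * i)) & 0xff);

   if (saturate && sum > UINT32_MAX)
      return UINT32_MAX;
   return (uint32_t)sum;
}
```

The same byte 0xFF is -1 in the signed variant and 255 in the unsigned
one, which is why the two cases need distinct handling.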
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34812>
No idea why this was failing on CI in the first place, as I can't
reproduce it failing locally at any point. But at some point recently,
this stopped failing on CI as well. It's not clear what caused this, and
I can't find anything in the history around the time it changed that
seems particularly suspicious either.
Let's just accept the new behavior, and investigate further if it
reappears.
Acked-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35085>
Specialize the texture emission logic for buffer views, which are much
simpler to deal with than image views.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34767>