There's no point in looping over all the descriptors.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
There are formats that require a 128B alignment but they're compressed
and not allowed for texel buffers. The biggest texel size we can have
for a texel buffer is RGBA32, which is 16B. The only reason why we
needed the large alignment was to work around a bug in the way we were
turning texel buffers into attribute descriptors on Bifrost. That bug
is now fixed so we can reduce to a reasonable alignment requiremdnt.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
We can handle that inside pan_buffer.c and make the interface simpler.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
In the attribute model, the size is for the attribute binding and the
offset is an offset into that range. If we're going to use that to
offset the buffer itself, we need to increase the size accordingly.
Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
They swizzle just like anything else. Technically, we could maybe do a
little better than the generic case for these since they only read 8
bits per 16 bits in the destination but the generic case is correct,
even if it isn't optimal.
Fixes: f7d44a46cd ("pan/bi: Optimize replication")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
We used to lower multisampled arrays to 3D images by adjusting the
height and the Y coordinate so that addressing samples became
addressing into the new base image. This worked for gallium, but
was never implemented for vulkan, and also had the disadvantages
that (a) we handled arrays and non-arrays differently, and
(b) the image height was restricted to 4096.
Change this so that we lower samples into the Z coordinate instead,
adding new layers for each sample. This requires that we know the
number of samples (so we have to save a sysval for this in gallium)
but means that we handle arrays and non-arrays the same. More
importantly, we can fit 3 bits to indicate the number of samples
into the attribute descriptor in Vulkan, so this scheme works
there as well as in OpenGL.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
We reduce the number of bits used for pixel stride from 10 to 7. This
gives us space to store the log2 of the number of samples, which
we will need later.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
pan_nir_lower_fs_outputs already handles channel masks and vec3 and
smaller outputs. We don't need the extra precursor pass.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40544>
pan_nir_lower_fs_outputs() already does this so there's no need to have
it in a separate pass.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40544>
These are already implemented by common code, so there's nothing to be
done here, really.
A few tests fail due to timeouts. But this seems no different than on
other drivers, we just skip less WSI tests than most drivers does. Skip
those for now.
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
Enable the extension and feature bits on CSF (v10+). Map the conditional
rendering pipeline stage to all three subqueues for barrier handling.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Meta operations (blit, copy, clear images, fill/update buffer) internally
emit draws or dispatches that must not be affected by conditional
rendering. Save and disable the state in meta_gfx_start/meta_compute_start,
then restore it in the corresponding end functions.
CmdClearAttachments is the exception: it IS affected by conditional
rendering per the Vulkan spec, so its state is restored right after
meta_gfx_start.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
When a secondary command buffer is recorded with
conditionalRenderingEnable, its draws load the conditional rendering
flag from panvk_cs_subqueue_context::cond_render_flag via the
cs_subqueue_ctx_reg register.
In CmdExecuteCommands, evaluate the primary's predicate and write the
normalized result (0=skip, non-zero=render) to the subqueue context.
Since the context is per-queue and cs_subqueue_ctx_reg is never
clobbered across cs_call, this works correctly with
SIMULTANEOUS_USE_BIT and across multiple queues. For nested
inheritance, propagate the caller's own flag value.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Use panvk_cond_render() to wrap RUN_IDVS and RUN_COMPUTE GPU commands.
When active, the predicate is loaded from the buffer and a cs_if
branches over the GPU commands when the condition says to skip.
This per-command approach correctly leaves render pass operations
(BeginRendering, EndRendering, load ops) unaffected as required by the
spec. Only draws, dispatches, and CmdClearAttachments are conditional.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
Add panvk_cond_render_state, CmdBeginConditionalRendering2EXT,
CmdEndConditionalRenderingEXT, and the panvk_cond_render() macro
that wraps GPU commands in a cs_if block.
For direct conditional rendering, the macro loads the predicate from
the buffer. For inherited secondaries, it loads a pre-evaluated flag
from panvk_cs_subqueue_context::cond_render_flag. When conditional
rendering is inactive, no GPU instructions are emitted.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40452>
The code was built with the assumption that VS shaders could magically
read FS varying descriptors because we never emitted any VS varying
descriptor. The truth is that VS shaders never read any varying
descriptor at all, lea_buf is never used for varying stores on v9.
This simplifies varying handling by a lot since we can assume different
descriptor types for VS and FS without hacks.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
Previously the driver decided when the backend should use
LD_VAR_BUF[_IMM] instructions based on the total number of varyings
read, falling back to LD_VAR[_IMM] + descriptors when the varying index
could overflow the immediate index in the instructions. That means that
even adding a single varying read could overflow the index and make
everything fall back to LD_VAR.
With this patch the backend decides when to use LD_VAR_BUF for each
varying load, reporting that decision to the driver. This helps with
index overflows because only the instruction that actually overflow the
immediate use the LD_VAR fallback, leaving all other instructions on the
fast path.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
The function was a leftover from the varying ABI rework.
Fixes: 7fc6af99ea ("pan: Remove dead code for sso_abi builder and fixed_varyings")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
Purely a visual change, but aligns with DDK disassembly.
For example:
- FMA.f32 r1, ^r1, u1, ^r4
+ FMA.f32 r1, r1^, u1, r4^
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
A lot of masks and shifts were hard-coded in the disassembler. This
commit tries to move them to shared logic.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
Rather than opcode/opcode2 hardcoded, treat the opcode as a list of
one or more subcodes.
This implies modifying the disassembler to hold an arbitrary depth dict
of dicts and recursively build the switch statements used to look up
each level.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
This splits the printing logic from the iteration logic, making it
easier to reason about either.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
Opcode2 was a bit all over the place, so utilize the new opcode modifier
to gather opcode2 information in a single place.
This cleans up the implicit va_mods "left", "descriptor_type" and
"memory_width".
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
Rather than having the opcode as an attribute and the offset/mask being
implicit, make all of this information explicit in the xml.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
These instructions were not generated as they do not exist.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40199>
This is already the default value, so there's no point in overriding it
to itself.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
Dividing this by itself is nonsensical, and just always gives us one.
That's obviously not what we want here.
But in this case we also know that the extent is divisible by the tile
extent, so there's no need for DIV_ROUND_UP, we can just divide.
Fixes: e6f8cab698 ("pan/layout: Split the logic per modifier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
We can't use the stencil-aspect of a color-attachment. That's going to
fail, so let's use the color-aspect instead. We already have it around
anyway.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
This is controlled by the writeback-mode when using AFRC, not by an YUV
Enable field. This Filed doesn't exist in these, and should according to
the spec be zero.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40489>
In preparation for IO lowering in NIR. The varying size does not change
between variants and we'll need the real store width in NIR if we want
to lower it correctly.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>