The index buffer unrolling logic was based on asahi's implementation in
libagx/geometry.cl.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
Mali hardware handles a bunch of OOB conditions by using addresses with
the top bit set. When the top bit is set, any load/store from a shader
will treat the address as OOB and read zero and discard writes. We can
use this to implement ro_sink_address_poly.
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
To count primitives generated when primitive restart is enabled, we need
to dispatch a precomp shader with dimensions that depend on the indirect
draw count. For this, we need some kind of precomp dispatch mechanism
where the dimensions are determined at execution time. The standard
shape for this would be an "indirect precomp" dispatch which reads
dimensions from memory, but in this case that would be a little awkward
because the draw count is min(*draw_count_buffer, max_draw_count). So to
implement that with a typical indirect dispatch mechanism, we would need
to read the buffer, compare it with the max draw count, and then write
it back to scratch memory, just to read it again. To simplify this, I
went for a precomp dispatch mechanism where it just reading the
dimensions from the JOB_SIZE registers set by the caller, which can use
whatever CS code it wants to calculate them.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
Primitive restart requires scanning the index buffer to determine how
many primitives are present, and will be handled in a later commit.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
These are based on digging through the git history on each file.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39397>
This replaces all full lisence headers with SPDX identifiers and
generally makes things more consistent. I've also dropped the few
remaining author tags. If someone wants to know who wrote a bit of
code, `git blame` is going to be way more accurate than author tags
anyway.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39397>
Vulkan spec requires binding flags to be matched with the binding with
the same index, however currently bindings are sorted with flags not
properly sorted, which leads to bindings and flags mismatch.
Resolve this by adding optional flags info to the parameters of
vk_create_sorted_bindings(), and refactoring panvk/pvr (which really
pair bindings and flags instead of only iterating flags) to use sorted
flags.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38967>
Instead of making it explicitly about outputs, this switchies it to
being a NIR version of LD_TILE. It means we have to do a bit of work in
NIR and add a builder helper but the end result is something much more
versatile.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
We already mark all fragment shaders as always using blend constants so
this is handled by the existing code and our setting of push.count.
Also, this hack is wrong as it pushes way too much and might overflow
memory. Also, it will overflow the HW limit of 64 as soon as we
increase the size of panvk_graphics_sysvals.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
Outputs may be re-mapped by VK_KHR_dynamic_rendering_local_read, in
which case outputs_written won't actually correspond to render targets.
Fortunately, we already have rt_written, which is properly remapped.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
This is what the JM code is doing and it lets us assume we always have a
fragment shader. Right now, we're already making that assumption, we
just haven't been bitten by it yet because no one has tried to use a
shader-requiring blend configuration with no FS yet.
Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
This uses the vk-meta framework, so it feels more like it belongs here.
While we're at it, rename the function as well.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39304>
The official name for the architecture after Valhall is 'Arm 5th
Gen'. In code we can use 'FIFTHGEN' or 'fifthgen', while in
documentation and printed output we should use 'Arm 5th Gen' or '5th
Gen'.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Report the -rXpY info in the deviceID.
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Texel buffers are currently described by a TextureDescriptor, which
leads to restrictive limits on size and alignment. These limits can be
avoided by using AttributeDescriptors + AttributeBufferDescriptors
instead.
This requires us to access texel buffers using attributes rather than
textures, which involves setting up AttributeDescriptors and
AttributeBufferDescriptors in their respective allocations, rather than
the previous TextureDescriptors in the texture allocation.
This is already done for images, so we simply place the texel buffer
attributes after the images and ensure the indexing if offset correctly.
Accessing a texel buffer thus becomes:
1. Get the buffer address and ConversionDescriptor with LEA_ATTR[_IMM]
2. Use LD_CVT to get the value
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
Once we move texel buffers to use Attribute Buffers, we'll encode them
as MALI_ATTRIBUTE_TYPE_1D. This means we can store the actual Attribute
Descriptor in what is usually the space for an
ATTRIBUTE_BUFFER_CONTINUATION_3D.
This is good, as we'll need to use the offset functionality of the
Attribute Descriptor to allow lowering
uniformTexelBufferOffsetAlignmentBytes.
This commit therefore creates a path for setting up the Attribute
Descriptor and Attribute Buffer Descriptor for texel buffers in
meta_desc_copy.
Note that this path will not be taken yet, as no images use
MALI_ATTRIBUTE_TYPE_1D.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
nir_u2u32 is presumably a leftover from when the texture size was loaded
from a 16 bit field, but should no longer be necessary now that we're
loading 32 bits
Also renames the "bytes" variable to "size" to align with the descriptor
we're reading.
Fixes: 4573110e4e ("pan/v9+: Make texel buffers use BufferDescriptor")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
should_split_render_pass()? No! You should not split the render pass.
You should never split the render pass! There's a better way and it's
the DCD0.primitive_barrier bit. This bit tells the fragment unit to
treat the current primitive as a barrier which forces it to wait until
all previous primitives for the covered fragments have executed before
executing this one. This give us a nice, pipelined, way to do fragment
barriers that doesn't involve splitting the render pass.
This new approach also has the advantage that it works properly in
secondary command buffers as it needs to know nothing about the actual
rendering state since it's just a draw call. Even the dimensions of the
primitive itself are determined by the hardware.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14328
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39051>
The one non-trivial change here is that we're now using BLEND with a
constant descriptor instead of ST_TILE for MSAA blend shaders. However,
this shouldn't make any practical difference.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
ANGLE is a massive pain to debug because it threads like mad. This at
least ensures the shaders aren't weirdly interleaved from multiple
threads.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
Instead of initializing the disk cache earlier, the vk_properties
filling can be deferred after disk cache init is done. Otherwise, the
created disk cache will be zero'ed out in vk_physical_device_init,
ending up with leaked alloc and disabled disk cache (though advertised).
Fixes: acd00c07f6 ("panvk: Initialize the disk cache earlier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39248>
We already optimize the case where the destination format does not
contain alpha. However, there are a few more cases around formats and
blend constants which we can optimize. In particular, float blending
doesn't support constants so we really want to check if the client hands
us a 0/1 constant.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
This is duplicated between the two drivers and about to get more
complicated.
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
This is not actually necessary and moreover was corrupting
mipmapped arrayed 2D images in cases when the transition barrier
wasn't transitioning all mips, but more than one layer.
Keep the layout transition infrastructure in place as we'll need
it for transaction elimination CRC zeroing on v10-.
Fixes: c95f8993 ("panvk: add a meta command for transitioning image layout")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38972>
Vulkan allows destroying an image without destroying the views of
this image first. These views can not be used in any way and the
only thing that the user can do with such a view is destroy it.
This also means that the driver can not refer to the image inside
the image view's destructor.
Fixes cb3f6481 ("panvk: Create MS shadow images and views")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38972>
Mix of Coccinelle patch, manual fix ups, sed, etc. Probably best to review the diff
as-if hand written:
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38955>
While the MAX2 thing here is correct for some formats, it's not correct
for all; for instance R8_SNORM doesn't need 32-bits here.
This should enable some higersample-counts on some 8 and 16-bit formats
on some Mali GPUs.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38968>
When only the tile buffers are touched, it's okay to take care of the
dependency at the draw level, with DCD_FLAGS_2, but as soon as one side
of the dep has side effects that could impact the other side, we need to
split the render pass and insert a real barrier, with a proper flush on
read-only L1 caches.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38950>
If we don't do that and something fails in the middle, we leak
the decode context.
Fixes: d155d6b7a3 ("panvk: Add a decode context at the panvk_device level")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38923>
cs_finish() is doing two things:
1. wrapping up the CS to prepare for its execution
2. freeing the temporary instrs array and maybe_ctx allocations
Mixing those two things lead to confusion and leaks, so let's split
those into cs_end() and cs_builder_fini(), and make sure panvk/panfrost
call both when appropriate.
Fixes: 50d2396b7e ("pan/cs: add helpers to emit contiguous csf code blocks")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38923>
own_bin needs to be set to true if we want the bin_ptr to be freed.
Fixes: 3d2cc01f8a ("panvk: Add create_shader_from_binary")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38923>
The desc_heap field is unconditionally initialized, so we need to
call util_vma_heap_finish() on it.
Fixes: ec02137c86 ("panvk: Support DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38923>
VK_EXT_multisampled_render_to_single_sampled needs those to be able
to render to the MS attachment when the app only provides a single-
sampled one.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38825>