The index buffer unrolling logic was based on asahi's implementation in
libagx/geometry.cl.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
Mali hardware handles a bunch of OOB conditions by using addresses with
the top bit set. When the top bit is set, any load/store from a shader
will treat the address as OOB and read zero and discard writes. We can
use this to implement ro_sink_address_poly.
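
A minimal CPU-side sketch of the behaviour described above, assuming
"top bit" means bit 63 of a 64-bit address. The names
SINK_ADDRESS_TOP_BIT, make_sink_address(), is_sink_address() and the
model_* helpers are hypothetical and are not taken from the driver. The
model indexes a small word array instead of real GPU memory:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the "top bit" is bit 63 of a 64-bit GPU address. */
#define SINK_ADDRESS_TOP_BIT (UINT64_C(1) << 63)

/* Turn any address into one that is treated as out of bounds. */
static inline uint64_t
make_sink_address(uint64_t addr)
{
   return addr | SINK_ADDRESS_TOP_BIT;
}

static inline bool
is_sink_address(uint64_t addr)
{
   return (addr & SINK_ADDRESS_TOP_BIT) != 0;
}

/* Model of a shader load: OOB addresses read zero. */
static inline uint32_t
model_load_u32(const uint32_t *mem, uint64_t mem_words, uint64_t word_addr)
{
   if (is_sink_address(word_addr))
      return 0;

   assert(word_addr < mem_words);
   return mem[word_addr];
}

/* Model of a shader store: writes to OOB addresses are discarded. */
static inline void
model_store_u32(uint32_t *mem, uint64_t mem_words, uint64_t word_addr,
                uint32_t value)
{
   if (is_sink_address(word_addr))
      return;

   assert(word_addr < mem_words);
   mem[word_addr] = value;
}
```

With this model, a load through make_sink_address(1) returns zero and a
store through it leaves the backing memory untouched.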
Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
To count primitives generated when primitive restart is enabled, we need
to dispatch a precomp shader with dimensions that depend on the indirect
draw count. For this, we need some kind of precomp dispatch mechanism
where the dimensions are determined at execution time. The standard
shape for this would be an "indirect precomp" dispatch which reads
dimensions from memory, but in this case that would be a little awkward
because the draw count is min(*draw_count_buffer, max_draw_count). So to
implement that with a typical indirect dispatch mechanism, we would need
to read the buffer, compare it with the max draw count, and then write
it back to scratch memory, just to read it again. To simplify this, I
went for a precomp dispatch mechanism that just reads the
dimensions from the JOB_SIZE registers set by the caller, which can use
whatever CS code it wants to calculate them.
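
For reference, the clamp that the caller's CS code has to perform can be
sketched on the CPU as below. effective_draw_count() is a hypothetical
helper name; the real calculation runs in CS instructions and lands in
the JOB_SIZE registers rather than in a C return value:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Effective draw count: min(*draw_count_buffer, max_draw_count). */
static inline uint32_t
effective_draw_count(const uint32_t *draw_count_buffer,
                     uint32_t max_draw_count)
{
   assert(draw_count_buffer != NULL);

   uint32_t count = *draw_count_buffer;
   return count < max_draw_count ? count : max_draw_count;
}
```

Because the caller computes this value directly into registers, no
round trip through scratch memory is needed.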
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
Primitive restart requires scanning the index buffer to determine how
many primitives are present, and will be handled in a later commit.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38547>
These are based on digging through the git history on each file.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39397>
This replaces all full license headers with SPDX identifiers and
generally makes things more consistent. I've also dropped the few
remaining author tags. If someone wants to know who wrote a bit of
code, `git blame` is going to be way more accurate than author tags
anyway.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39397>
This reverts a single hunk from 669ddc5241 ("pan/blend: Use the blend
builder helpers instead of nir_lower_blend()"). That commit did fix the
test on a bunch of platforms, just not G52.
Fixes: 669ddc5241 ("pan/blend: Use the blend builder helpers instead of nir_lower_blend()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39397>
The Vulkan spec requires binding flags to be matched with the binding at
the same index. However, bindings are currently sorted without sorting
the flags alongside them, which leads to a bindings and flags mismatch.
Resolve this by adding optional flags info to the parameters of
vk_create_sorted_bindings(), and refactoring panvk/pvr (which really
pair bindings and flags instead of only iterating flags) to use sorted
flags.
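
The idea of keeping flags paired with their binding through the sort can
be sketched as follows. This is not the actual
vk_create_sorted_bindings() code: struct binding_with_flags and
sort_bindings_with_flags() are hypothetical, simplified stand-ins that
use plain integers instead of the Vulkan structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pairing of a binding index with its binding flags. */
struct binding_with_flags {
   uint32_t binding;
   uint32_t flags;
};

static int
compare_bindings(const void *a, const void *b)
{
   const struct binding_with_flags *ba = a;
   const struct binding_with_flags *bb = b;

   return (ba->binding > bb->binding) - (ba->binding < bb->binding);
}

/* Pair bindings[i] with flags[i] before sorting, so each binding keeps
 * its own flags afterwards. flags may be NULL, meaning "no flags".
 */
static void
sort_bindings_with_flags(const uint32_t *bindings, const uint32_t *flags,
                         uint32_t count, struct binding_with_flags *out)
{
   assert(bindings != NULL && out != NULL);

   for (uint32_t i = 0; i < count; i++) {
      out[i].binding = bindings[i];
      out[i].flags = flags ? flags[i] : 0;
   }

   qsort(out, count, sizeof(*out), compare_bindings);
}
```

Sorting the pairs rather than the bindings alone is what prevents the
mismatch: after the sort, out[i].flags still belongs to out[i].binding.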
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38967>
Now that we have intrinsics which map directly to the hardware opcodes,
we can lower PLS inside the gallium driver instead of the back-end
compiler having to know anything about it. This simplifies the back-end
and is less code, if you ignore the new copyright header.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
This is a little more manual (though it's actually less code) but it
gives us a lot more control and makes the whole flow nicer.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
Like we just did with load_tile_pan, this maps directly to ST_TILE in
the hardware. This is more versatile and lets us do more of our
lowering in NIR.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
Instead of making it explicitly about outputs, this switches it to
being a NIR version of LD_TILE. It means we have to do a bit of work in
NIR and add a builder helper but the end result is something much more
versatile.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
We already mark all fragment shaders as always using blend constants so
this is handled by the existing code and our setting of push.count.
Also, this hack is wrong as it pushes way too much and might overflow
memory. Also, it will overflow the HW limit of 64 as soon as we
increase the size of panvk_graphics_sysvals.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
Otherwise, we'll only mark the first location for matrix and array
variables. Ideally, someone would split these before we get here but we
should at least be correct.
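
As an illustration only, assuming "marking" means setting bits in a
64-bit location mask, covering every slot of a variable looks like the
sketch below. location_mask() is a hypothetical helper and is not the
code touched by this commit:

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask covering num_slots consecutive locations starting at
 * location. A mat4 or a 4-element array occupies four slots, so marking
 * only the first one would miss the other three.
 */
static inline uint64_t
location_mask(unsigned location, unsigned num_slots)
{
   assert(num_slots >= 1 && location + num_slots <= 64);

   uint64_t slots =
      num_slots == 64 ? UINT64_MAX : (UINT64_C(1) << num_slots) - 1;

   return slots << location;
}
```

For a variable at location 4 that occupies four slots, this yields 0xF0
rather than the 0x10 obtained by marking only the first location.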
Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
Outputs may be re-mapped by VK_KHR_dynamic_rendering_local_read, in
which case outputs_written won't actually correspond to render targets.
Fortunately, we already have rt_written, which is properly remapped.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
I'm not sure if this is the best way to enforce this as it's really a
restriction on the blend array pointer. Individual blend descriptors
only have to be 16B-aligned. But this seems to work and doesn't affect
the array stride, which is still 16B.
Also, I couldn't actually find the restriction in the v10 docs but it's
there on v11 and this is enough of a pain to debug that we're probably
better off playing it safe.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
This is what the JM code is doing and it lets us assume we always have a
fragment shader. Right now, we're already making that assumption; we
just haven't been bitten by it because no one has tried to use a
shader-requiring blend configuration with no FS yet.
Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
On Bifrost, we have to return the blend return offset in the compiled
shader info and that means we need to be able to index into an array by
blend target deep inside the compiler. Instead of assuming bound blend
targets and subtracting BIR_FAU_BLEND_0 from fau_idx, add a separate
blend_target to bi_instr and use that. This way what we return will be
based on the nir_io_semantics::location, regardless of where the actual
blend descriptor comes from.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39274>
This uses the vk-meta framework, so it feels more like it belongs here.
While we're at it, rename the function as well.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39304>
G720 isn't wired up in CI, and nobody is running these regularly. So we
have missed that these CTS issues were fixed upstream.
I haven't actually verified that this works, because I don't have a G720
around at the moment. But the fixes fixed both v10 and v13 GPUs, so it
would be really, really strange if this wasn't effective on v11 as well.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39034>
The official name for the architecture after Valhall is 'Arm 5th
Gen'. In code we can use 'FIFTHGEN' or 'fifthgen', while in
documentation and printed output we should use 'Arm 5th Gen' or '5th
Gen'.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Report the -rXpY info in the deviceID.
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Texel buffers are currently described by a TextureDescriptor, which
leads to restrictive limits on size and alignment. These limits can be
avoided by using AttributeDescriptors + AttributeBufferDescriptors
instead.
This requires us to access texel buffers using attributes rather than
textures, which involves setting up AttributeDescriptors and
AttributeBufferDescriptors in their respective allocations, rather than
the previous TextureDescriptors in the texture allocation.
This is already done for images, so we simply place the texel buffer
attributes after the images and ensure the indexing is offset correctly.
Accessing a texel buffer thus becomes:
1. Get the buffer address and ConversionDescriptor with LEA_ATTR[_IMM]
2. Use LD_CVT to get the value
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
Once we move texel buffers to use Attribute Buffers, we'll encode them
as MALI_ATTRIBUTE_TYPE_1D. This means we can store the actual Attribute
Descriptor in what is usually the space for an
ATTRIBUTE_BUFFER_CONTINUATION_3D.
This is good, as we'll need to use the offset functionality of the
Attribute Descriptor to allow lowering
uniformTexelBufferOffsetAlignmentBytes.
This commit therefore creates a path for setting up the Attribute
Descriptor and Attribute Buffer Descriptor for texel buffers in
meta_desc_copy.
Note that this path will not be taken yet, as no images use
MALI_ATTRIBUTE_TYPE_1D.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
As we'll need to pack texel buffer attributes after image attributes in
panfrost and after vertex attributes in panvk, this adds a lowering pass
to achieve that.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
Rather than loading a single 64bit channel with
load_texel_buf_index_address_pan, load three channels of 32bit each. The
last channel is required by the next commit.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
nir_u2u32 is presumably a leftover from when the texture size was loaded
from a 16 bit field, but should no longer be necessary now that we're
loading 32 bits.
Also renames the "bytes" variable to "size" to align with the descriptor
we're reading.
Fixes: 4573110e4e ("pan/v9+: Make texel buffers use BufferDescriptor")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
should_split_render_pass()? No! You should not split the render pass.
You should never split the render pass! There's a better way and it's
the DCD0.primitive_barrier bit. This bit tells the fragment unit to
treat the current primitive as a barrier which forces it to wait until
all previous primitives for the covered fragments have executed before
executing this one. This gives us a nice, pipelined way to do fragment
barriers that doesn't involve splitting the render pass.
This new approach also has the advantage that it works properly in
secondary command buffers as it needs to know nothing about the actual
rendering state since it's just a draw call. Even the dimensions of the
primitive itself are determined by the hardware.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14328
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39051>
Blend shaders are already lowered. However, since the lowering pass is
idempotent, it's harmless to lower again.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
It's a lot more explicit to just have an intrinsic for this than to
treat blend shaders as their own weird stage. Also, the new intrinsic
uses the same io_semantics as a fragment store so the back-end code is a
little easier to read because it now checks sem.dual_source_blend_index
instead of the generic load_input offset.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>