I'm not convinced these really should be separate opcodes at all in NIR, but
that's not what this patch is about. Here we just infer the opcodes in the
texture builder to allow simplified usage.
This lets us drop nir_txl() & nir_txb() helpers in favour of nir_tex(.lod/bias)
which is more normalized. We could also drop nir_txf_ms in favour of nir_txf but
that affects more callsites and is not obviously a win (unlike nir_txl which is
used once and nir_txb which is unused).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39271>
It is not allowed in SDS. This matters on gen8.
Possibly it wasn't enforced, or worked by luck, on older gens. But at
least for gen7 and gen8 it is documented as not allowed in SDS (so
working or not may depend on fw version, phase of the moon, etc).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39254>
The format of the packet changed with gen8. This went unnoticed because
in this particular case the contents of the packet work out the same.
But lets be correct about this.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39254>
At least on gen8 (unsure if hw or fw change) this makes things pissy and
results in writes to undefined offsets.
Non-zero chance that this was even UB on older gens, but ended up
writing somewhere harmless(ish).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39254>
vi_bindings_valid doesn't necessarily match the # of VBOs emitted,
resulting an invalid size in the CP_SET_DRAW_STATE packet. Somehow
this didn't seem to cause problems prior to Dxx (although may
potentially have been a source of flakes, depending on what random
cmds followed in memory). But caused hangs on Dxx.
See, for example, dEQP-VK.pipeline.fast_linked_library.vertex_input.misc.unused_binding
Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39254>
The Android 16 Cuttlefish build comes with a new Xe minigbm backend that
we can use on the host in the android-angle-venus-anv-adl nightly job.
Also update two outdated comments.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39297>
As part of coopmat, I want to make reductions faster as I need
them to implement coopmat.
The intrinsics can't be used directly as we have to take into
account the exec_mask, but it can be done by picking the
a value to insert into the disabled lanes, then calling
the LLVM intrinsic.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39225>
This fixes a llvm validation error seen in dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.img.samples_1.1d_array.rgen
Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39225>
As mentioned in earlier commit, for sem fd export on async present
thread, it occurs after internal queue submissions for all swapchains.
So we can allow queue commands while sync wait w/o violating ordering.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39283>
For sem fd export on async present thread, it occurs after internal
queue submissions for all swapchains and before presenting to WSI
backend. We can safely unlock the chain acquire lock though we still
have to lock against chain destroy as well as chain async present on
other queues.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39283>
Brief:
1. present info is deep-copied and passed to async present thread
2. normal queue present always waits for async present to take over
3. queue access is protected by async present queue lock
4. chain access is protected by chain locks
5. no perf gain in practice since we haven't allowed parallel queue
submit or acquire image yet
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39283>
During async presentation, swapchain is accessed by the async present
thread. With proper queue and chain access locks within venus driver,
we still have to rely on mesa common wsi implementation specifics for
optimal async present performance:
Below need to be protected:
- vkAcquireNextImage2KHR
- vkDestroySwapchainKHR
Below are safe:
- vkGetSwapchainImagesKHR
- vkWaitForPresentKHR
- vkWaitForPresent2KHR
- vkReleaseSwapchainImagesKHR
- VkSwapchainPresentFenceInfoKHR::pFences
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39283>
When suboptimal is returned, the fence payload is missed to be installed
unexpectedly. Instead, we can directly return errors from sync import.
With this change, dEQP-VK.wsi.xcb.maintenance1.release_images.* can pass
robustly now.
Fixes: a312bb4285 ("venus: refactor wsi acquire to use semaphore and fence SYNC_FD import")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39283>
The official name for the architecture after Valhall is 'Arm 5th
Gen'. In code we can use 'FIFTHGEN' or 'fifthgen', while in
documentation and printed output we should use 'Arm 5th Gen' or '5th
Gen'.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Report the -rXpY info in the deviceID.
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Texel buffers are currently described by a TextureDescriptor, which
leads to restrictive limits on size and alignment. These limits can be
avoided by using AttributeDescriptors + AttributeBufferDescriptors
instead.
This requires us to access texel buffers using attributes rather than
textures, which involves setting up AttributeDescriptors and
AttributeBufferDescriptors in their respective allocations, rather than
the previous TextureDescriptors in the texture allocation.
This is already done for images, so we simply place the texel buffer
attributes after the images and ensure the indexing if offset correctly.
Accessing a texel buffer thus becomes:
1. Get the buffer address and ConversionDescriptor with LEA_ATTR[_IMM]
2. Use LD_CVT to get the value
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
Once we move texel buffers to use Attribute Buffers, we'll encode them
as MALI_ATTRIBUTE_TYPE_1D. This means we can store the actual Attribute
Descriptor in what is usually the space for an
ATTRIBUTE_BUFFER_CONTINUATION_3D.
This is good, as we'll need to use the offset functionality of the
Attribute Descriptor to allow lowering
uniformTexelBufferOffsetAlignmentBytes.
This commit therefore creates a path for setting up the Attribute
Descriptor and Attribute Buffer Descriptor for texel buffers in
meta_desc_copy.
Note that this path will not be taken yet, as no images use
MALI_ATTRIBUTE_TYPE_1D.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
As we'll need to pack texel buffer attributes after image attributes in
panfrost and after vertex attributes in panvk, this adds a lowering pass
to achieve that.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
Rather than loading a single 64bit channel with
load_texel_buf_index_address_pan, load three channels of 32bit each. The
last channel is required by the next commit.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
panfrost_emit_image_attribs is only called when the context's image
state is dirtied and uses the context's image_mask to write attributes
and attribute buffers.
However, it uses the shader's attribute_count (which in this context is
the last bit of the shader's images_used) to calculate the size of the
buffers.
In case more images are bound than the currently bound shader uses, this
would lead to out-of-bounds writes.
This change updates the allocation to use the last bit in the context's
image_mask for size calculations.
It also removes an unused parameter from emit_image_bufs and updates a
parameter name in emit_image_attribs to be more descriptive and match
the documentation.
Fixes: dc85f65e05 ("panfrost: emit shader image attribute descriptors")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
nir_u2u32 is presumably a leftover from when the texture size was loaded
from a 16 bit field, but should no longer be necessary now that we're
loading 32 bits
Also renames the "bytes" variable to "size" to align with the descriptor
we're reading.
Fixes: 4573110e4e ("pan/v9+: Make texel buffers use BufferDescriptor")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>