I want to make reductions faster, as I need them to implement
coopmat.
The intrinsics can't be used directly because we have to take the
exec_mask into account, but it can be done by picking a value to
insert into the disabled lanes and then calling the LLVM
intrinsic.
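A minimal sketch of the idea, in plain C rather than LLVM IR: disabled
lanes are replaced with the identity of the operation, after which an
unmasked reduction gives the right result. The helper names and the
64-lane limit are assumptions made for illustration, and the loops stand
in for the LLVM reduction intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: integer add reduction over the enabled lanes.
 * The identity for addition is 0, so disabled lanes contribute nothing. */
static uint32_t
example_masked_reduce_iadd(const uint32_t *lanes, unsigned num_lanes,
                           uint64_t exec_mask)
{
   assert(num_lanes <= 64);

   uint32_t acc = 0;
   for (unsigned i = 0; i < num_lanes; i++) {
      /* Insert the identity into lanes whose exec bit is clear. */
      uint32_t v = ((exec_mask >> i) & 1) ? lanes[i] : 0;
      /* Stand-in for the unmasked reduction. */
      acc += v;
   }
   return acc;
}

/* Hypothetical helper: unsigned min reduction over the enabled lanes.
 * The identity for umin is UINT32_MAX. */
static uint32_t
example_masked_reduce_umin(const uint32_t *lanes, unsigned num_lanes,
                           uint64_t exec_mask)
{
   assert(num_lanes <= 64);

   uint32_t acc = UINT32_MAX;
   for (unsigned i = 0; i < num_lanes; i++) {
      uint32_t v = ((exec_mask >> i) & 1) ? lanes[i] : UINT32_MAX;
      if (v < acc)
         acc = v;
   }
   return acc;
}
```

The value inserted depends on the operation: 0 for add, all ones for
umin, and so on.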
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39225>
This fixes an LLVM validation error seen in dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.img.samples_1.1d_array.rgen
Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39225>
The official name for the architecture after Valhall is 'Arm 5th
Gen'. In code we can use 'FIFTHGEN' or 'fifthgen', while in
documentation and printed output we should use 'Arm 5th Gen' or '5th
Gen'.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
Texel buffers are currently described by a TextureDescriptor, which
leads to restrictive limits on size and alignment. These limits can be
avoided by using AttributeDescriptors + AttributeBufferDescriptors
instead.
This requires us to access texel buffers using attributes rather than
textures, which involves setting up AttributeDescriptors and
AttributeBufferDescriptors in their respective allocations, rather than
the previous TextureDescriptors in the texture allocation.
This is already done for images, so we simply place the texel buffer
attributes after the images and ensure the indexing is offset correctly.
Accessing a texel buffer thus becomes:
1. Get the buffer address and ConversionDescriptor with LEA_ATTR[_IMM]
2. Use LD_CVT to get the value
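The index offset can be sketched as follows. This assumes the texel
buffer attributes directly follow one attribute per image; the struct and
function names are hypothetical and not taken from the driver.

```c
#include <assert.h>

/* Hypothetical layout: image attributes first, texel buffers after. */
struct example_attr_layout {
   unsigned num_images;
   unsigned num_texel_buffers;
};

/* Attribute index to use when accessing texel buffer `tb`. */
static unsigned
example_texel_buffer_attr_index(const struct example_attr_layout *layout,
                                unsigned tb)
{
   assert(tb < layout->num_texel_buffers);
   /* Skip over the image attributes that come first. */
   return layout->num_images + tb;
}

/* Total number of attribute descriptors to allocate. */
static unsigned
example_attr_count(const struct example_attr_layout *layout)
{
   return layout->num_images + layout->num_texel_buffers;
}
```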
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
panfrost_emit_image_attribs is only called when the context's image
state is dirtied and uses the context's image_mask to write attributes
and attribute buffers.
However, it uses the shader's attribute_count (which in this context is
the last bit of the shader's images_used) to calculate the size of the
buffers.
In case more images are bound than the currently bound shader uses, this
would lead to out-of-bounds writes.
This change updates the allocation to use the last bit in the context's
image_mask for size calculations.
It also removes an unused parameter from emit_image_bufs and updates a
parameter name in emit_image_attribs to be more descriptive and match
the documentation.
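A minimal sketch of the sizing rule described above. example_last_bit()
mirrors the contract of util_last_bit() (index of the highest set bit
plus one, 0 for an empty mask); the other helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit plus one, or 0 if no bit is set. */
static unsigned
example_last_bit(uint32_t mask)
{
   unsigned n = 0;
   while (mask) {
      n++;
      mask >>= 1;
   }
   assert(n <= 32);
   return n;
}

/* Hypothetical sizing helper: attributes are written for every bit in
 * the context's image_mask, so the allocation has to cover that mask
 * rather than the images used by the currently bound shader. */
static unsigned
example_image_attrib_count(uint32_t ctx_image_mask)
{
   return example_last_bit(ctx_image_mask);
}
```

With three images bound (mask 0x7) and a shader that uses only image 0,
sizing by the shader would give room for 1 entry while 3 are written.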
Fixes: dc85f65e05 ("panfrost: emit shader image attribute descriptors")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
The one non-trivial change here is that we're now using BLEND with a
constant descriptor instead of ST_TILE for MSAA blend shaders. However,
this shouldn't make any practical difference.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
PIPE_VIDEO_VPP_PRI_RESERVED0 and PIPE_VIDEO_VPP_TRC_RESERVED0 have value 0,
and this is what we will get from apps that don't set primaries and transfer
characteristics at all.
Fixes: a284bff8ad ("frontends/va: Set color properties when not using explicit color standard")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38892>
This has always been disabled by default, because VAAPI doesn't provide
all the parameters we need, which makes it impossible to correctly decode
most streams.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38780>
Frontend was using PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE for
VAProfileH264ConstrainedBaseline, so this hasn't caused any issues.
Change it to the correct enum value to make it less confusing.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38748>
When running ``Models.Op/yolox_114`` on Etnaviv with TEFLON_DEBUG=verbose,
the delegate currently reports supported operations as unknown, for example:
    Teflon delegate: loaded etnaviv driver
    idx type ver support inputs
    =================================================================
    0 unknown v2 supported in: 0(i8) 1(i32) out: 2(i8)
This happens because not all operations supported by Teflon are mapped
in ``tflite_builtin_op_name()``. Therefore, extend ``tflite_builtin_op_name()``
to include all operations supported, ensuring that the operation type
is reported correctly during debug.
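The failure mode can be sketched as a name lookup with an "unknown"
fallback. The enum and function names below are hypothetical stand-ins;
the real op codes come from the TensorFlow Lite builtin operator enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical op codes, for illustration only. */
enum example_op {
   EXAMPLE_OP_ADD,
   EXAMPLE_OP_CONV_2D,
   EXAMPLE_OP_PAD,
   EXAMPLE_OP_UNMAPPED, /* supported by the delegate, but not named */
};

/* Any op without a case falls through to "unknown", which is what the
 * verbose debug output then prints in the type column. */
static const char *
example_op_name(enum example_op op)
{
   switch (op) {
   case EXAMPLE_OP_ADD:     return "ADD";
   case EXAMPLE_OP_CONV_2D: return "CONV";
   case EXAMPLE_OP_PAD:     return "PAD";
   default:                 return "unknown";
   }
}

/* True if the op has a real name in the lookup above. */
static bool
example_op_is_named(enum example_op op)
{
   const char *name = example_op_name(op);
   assert(name != NULL); /* the lookup always returns a string */
   return strcmp(name, "unknown") != 0;
}
```

Adding a case for every supported op removes the "unknown" rows.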
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38942>
Works on CL and is also in use by lavapipe. However, the
shader_subgroup_size cap must be set for it to actually be advertised in
OpenGL.
Also support subgroup rotate while at it.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38015>
The bit values are taken from Vulkan to make it easy for Zink. Those new
subgroup features will be used by rusticl for cl_khr_subgroup_rotate.
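A sketch of why matching values help: with identical bit values, the
translation from Vulkan flags is a mask and needs no per-bit remapping.
The numeric values are those of VkSubgroupFeatureFlagBits in the Vulkan
specification, copied here so the sketch builds without Vulkan headers;
the EXAMPLE_* names and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical gallium-side flags reusing the Vulkan bit values. */
enum example_subgroup_feature {
   EXAMPLE_SUBGROUP_FEATURE_BASIC            = 0x00000001,
   EXAMPLE_SUBGROUP_FEATURE_ROTATE           = 0x00000200,
   EXAMPLE_SUBGROUP_FEATURE_ROTATE_CLUSTERED = 0x00000400,
};

/* Translate a Vulkan subgroup feature mask: keep the known bits as-is. */
static uint32_t
example_subgroup_features_from_vk(uint32_t vk_flags)
{
   const uint32_t known = EXAMPLE_SUBGROUP_FEATURE_BASIC |
                          EXAMPLE_SUBGROUP_FEATURE_ROTATE |
                          EXAMPLE_SUBGROUP_FEATURE_ROTATE_CLUSTERED;
   return vk_flags & known;
}

/* Query a single feature bit. */
static bool
example_has_subgroup_feature(uint32_t features, uint32_t bit)
{
   assert(bit != 0 && (bit & (bit - 1)) == 0); /* exactly one bit */
   return (features & bit) != 0;
}
```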
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38015>
The Android CTS job now takes about 25 minutes with Android 16, which is
too long for a pre-merge job.
The deqp-runner-powered `android-angle-lavapipe` job remains in the Marge
pipeline.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39197>
Update the Cuttlefish image to Android 16, move to the r29 NDK, and build
Mesa with SDK version 35, the latest version currently supported.
The new Cuttlefish build switches the 'venus_guest_angle' mode to use the
`venus:cross-domain` context type instead of `virgl:virgl2:venus`, which
now works on Android 16. This mode also moves to the `skiavk` Vulkan
backend for HWUI and SurfaceFlinger.
The Cuttlefish repositories have also been moved to the new
https://gitlab.freedesktop.org/gfx-ci/android namespace.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39197>
nr_params & params array are gone.
brw_ubo_range is not stored on the prog_data structure anymore (Anv
already stored a copy of that with its own additional information).
The backend now only deals with load_push_data_intel. load_uniform &
load_push_constant have to be lowered by the driver.
Pre-Gfx12.5 platforms have to provide a subgroup_id_param to specify
where the subgroup_id value is located in the push constants.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
Anv already manages this itself. This allows removing the logic from
the compiler.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
Drivers can do all the lowering to push constants to find the only
value useful in that array (subgroup_id). Then drivers call into
brw_cs_fill_push_const_info() to get the cross/per thread constant
layout computed in the prog_data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
The format parameters should come from the buffer itself,
not be taken from the process_properties,
because the buffer used for geometric scaling does not
originate from an externally provided buffer.
Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38948>
The users of exportable might have different expectations for what can
be exported, and some are stricter than others. So we need a new
exportable_dmabuf flag to track where dmabuf is actually needed.
If the underlying driver does not advertise the dmabuf extension,
requesting dmabuf export violates the spec VU:
> VUID-VkMemoryGetFdInfoKHR-handleType-00671
>
> handleType must have been included in
> VkExportMemoryAllocateInfo::handleTypes when memory was created
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38439>
Drop the assignment entirely (and fall back to the default of 1024).
Fixes GL_OUT_OF_MEMORY errors when calling e.g., glTexStorage2D.
Fixes: 24ba57259f ("mesa: remove MaxTextureMbytes, use the cap instead")
Signed-off-by: Alyssa Milburn <amilburn@zall.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39143>
The Vulkan feature fillModeNonSolid is used to implement the OpenGL
function glPolygonMode(), which does not exist in OpenGL ES, and many
mobile GPUs lack hardware support for it.
The use of this Vulkan feature is only triggered when glPolygonMode() is
really called, and among current gallium drivers at least lima and
panfrost do not properly handle polygon modes either.
Only warn about this feature being missing when it's really needed,
instead of warning at screen initialization time. This will prevent the
warning from being raised when running OpenGL ES on Zink.
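The deferred warning can be sketched like this. All names are
hypothetical and the warn-once behaviour is an assumption; the point is
only that the check moves from screen creation to the place where a
non-fill polygon mode is requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum example_polygon_mode {
   EXAMPLE_POLYGON_MODE_FILL,
   EXAMPLE_POLYGON_MODE_LINE,
   EXAMPLE_POLYGON_MODE_POINT,
};

/* Hypothetical screen state. */
struct example_screen {
   bool has_fill_mode_non_solid; /* Vulkan feature bit */
   bool warned_fill_mode;        /* warning already printed */
   unsigned warnings_emitted;    /* kept so the sketch is checkable */
};

/* Returns true if the requested mode can be honoured. The warning is
 * printed only when a non-fill mode is requested without the feature. */
static bool
example_set_polygon_mode(struct example_screen *screen,
                         enum example_polygon_mode mode)
{
   assert(screen != NULL);

   if (mode == EXAMPLE_POLYGON_MODE_FILL || screen->has_fill_mode_non_solid)
      return true;

   if (!screen->warned_fill_mode) {
      fprintf(stderr, "warning: fillModeNonSolid is not supported\n");
      screen->warned_fill_mode = true;
      screen->warnings_emitted++;
   }
   return false;
}
```

An OpenGL ES application never requests a non-fill mode, so it never
reaches the warning.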
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38897>