128-bit formats (RGBA32) are emulated as two stacked G32R32 planes. The
bound sampler reads the RG plane and a companion sampler reads the BA plane,
which etna_nir_lower_128bit(..) reassembles in the shader. Only the
descriptor path set up the companion, so the state path could not sample
these formats. Set up the companion on the state path too and share
companion_slot(..) between both paths.
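The two-plane scheme can be pictured with a minimal CPU-side sketch. The
struct names and the plain-array plane layout below are assumptions made
for illustration only; they are not the driver's real layout, and
etna_nir_lower_128bit(..) does the equivalent merge in NIR, not in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical texel types, for illustration only. */
struct rgba32 { uint32_t r, g, b, a; };
struct rg32   { uint32_t x, y; };

/* Split a row of RGBA32 texels into an RG plane and a BA plane. */
static void split_planes(const struct rgba32 *src, size_t n,
                         struct rg32 *rg_plane, struct rg32 *ba_plane)
{
   for (size_t i = 0; i < n; i++) {
      rg_plane[i] = (struct rg32){ src[i].r, src[i].g };
      ba_plane[i] = (struct rg32){ src[i].b, src[i].a };
   }
}

/* Conceptual view of the shader-side lowering: one fetch from each
 * plane, then merge the two 64-bit results into one 128-bit texel. */
static struct rgba32 sample_rgba32(const struct rg32 *rg_plane,
                                   const struct rg32 *ba_plane, size_t i)
{
   return (struct rgba32){ rg_plane[i].x, rg_plane[i].y,
                           ba_plane[i].x, ba_plane[i].y };
}
```

The bound sampler corresponds to rg_plane and the companion sampler to
ba_plane in this picture.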
The real requirement is the plane format, not the descriptors. The float
plane G32R32F samples through the half-float pipe, so gate it on HALF_FLOAT
and advertise GL_OES_texture_float, also on halti2 GPUs like GC3000. The
integer plane G32R32I needs halti5, so keep the integer formats there.
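The gating rule above can be sketched as follows. The struct, enum and
function names are hypothetical stand-ins for the driver's feature
queries, not its actual API.

```c
#include <stdbool.h>

/* Hypothetical capability flags, for illustration only. */
struct gpu_caps {
   bool half_float;   /* HALF_FLOAT feature bit is set */
   bool halti5;       /* GPU is halti5 or newer */
};

enum plane_format { PLANE_G32R32F, PLANE_G32R32I };

/* A 128-bit format is usable only if its plane format is. */
static bool plane_supported(const struct gpu_caps *caps,
                            enum plane_format plane)
{
   switch (plane) {
   case PLANE_G32R32F:
      return caps->half_float;   /* float plane uses the half-float pipe */
   case PLANE_G32R32I:
      return caps->halti5;       /* integer plane needs halti5 */
   }
   return false;
}
```

With caps of { .half_float = true, .halti5 = false }, as on a halti2
part with HALF_FLOAT, the float plane is accepted and the integer plane
is rejected.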
The KHR-GLES2 internalformat tests for sized RGB32F/RGBA32F need an ES3
context, so list them as expected fails on GC3000 too.
Verified on GC7000 with and without ETNA_MESA_DEBUG=no_texdesc.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>
The 128-bit emulation now covers the clear, blit, copy and sample paths,
so stop rejecting the three emulated RGBA32 formats. The format table is
the remaining filter. Sampling still relies on the halti5 texture
descriptors, so halti5 is the gate.
Sampling RGBA32F enables GL_OES_texture_float, and with the existing
half-float support also GL_ARB_texture_float, so advertise both.
The KHR-GLES2 internalformat tests for sized RGB32F/RGBA32F need an ES3
context, so they fail on the ES2 driver. List them as expected fails, as
other ES2 drivers do.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>
Enable VK_EXT_rasterization_order_attachment_access and
VK_ARM_rasterization_order_attachment_access for PAN_ARCH >= 10.
All three feature flags are enabled: color, depth, and stencil
rasterization order attachment access.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40675>
Add configurable SE masks for instruction timing capture and export the selected mask in RGP metadata so hit counts match the traced shader engine coverage.
The environment variable RADV_THREAD_TRACE_INSTRUCTION_TIMING_SE_MASK configures the SE mask. If it is not specified, data from all SEs is captured.
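A minimal sketch of the "unset means all SEs" rule, assuming the mask is
given as an integer bitmask. The helper name, the strtoul() base-0
parsing and the fallback for an empty selection are assumptions, not
taken from the patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse an SE mask string; NULL or empty selects every SE.
 * Hypothetical helper: the accepted syntax is an assumption. */
static uint32_t parse_se_mask(const char *str, unsigned num_se)
{
   uint32_t all = num_se >= 32 ? UINT32_MAX : ((1u << num_se) - 1u);

   if (!str || !*str)
      return all;

   /* Drop bits for SEs that do not exist on this GPU. */
   uint32_t mask = (uint32_t)strtoul(str, NULL, 0) & all;

   /* Assumed fallback: an empty selection captures everything. */
   return mask ? mask : all;
}
```

A caller would pass
getenv("RADV_THREAD_TRACE_INSTRUCTION_TIMING_SE_MASK") and the number of
shader engines on the device.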
Signed-off-by: Gu, Wangfeng <Wangfeng.Gu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42264>
This improves RADV_DEBUG=hang's pipeline.log when shader caching is not
disabled.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42175>
Also adds dEQP-VK.reconvergence.subgroup_uniform_control_flow_ballot.compute.nesting4.7.10
to the CI skips. It runs for more than 5 minutes before reporting:
Test case 'dEQP-VK.reconvergence.subgroup_uniform_control_flow_ballot.compute.nesting4.7.10'..
NotSupported (No compatible memory type found at vkMemUtil.cpp:652)
and therefore hits the timeout.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41833>
Adds debug variables to control forced spilling and to disable optimal allocation.
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42078>
Compute derivatives can use the same lane-based path as fragment shaders
because a workgroup's invocations map to subgroup lanes in order. This
gives correct derivative quads on Valhall.
Advertise the extension for PAN_ARCH >= 9 with both derivative groups.
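As a rough illustration of lane-based derivatives, the sketch below
models a subgroup as a plain array indexed by lane. It assumes each quad
is four consecutive lanes with bit 0 selecting x and bit 1 selecting y;
that layout and the function names are assumptions for the example, not
a description of the hardware path.

```c
/* Fine x-derivative within a quad: right lane minus left lane.
 * Assumed layout: bit 0 of the lane index is the x coordinate. */
static float quad_ddx(const float *lane_vals, unsigned lane)
{
   return lane_vals[lane | 1u] - lane_vals[lane & ~1u];
}

/* Fine y-derivative within a quad: bottom lane minus top lane.
 * Assumed layout: bit 1 of the lane index is the y coordinate. */
static float quad_ddy(const float *lane_vals, unsigned lane)
{
   return lane_vals[lane | 2u] - lane_vals[lane & ~2u];
}
```

Because invocations map to lanes in order, invocations 4n..4n+3 land in
the same quad under this assumption, so no extra shuffling is needed
before taking the differences.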
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Jakob Sinclair <jakob.sinclair@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42142>
vtn lowers OpFmaKHR to nir_op_ffma and every Mali has a native fused
multiply-add, so there is nothing to do in the backend.
fp16 is gated on shaderFloat16. A 16-bit OpFmaKHR also needs the Float16
capability and only shaderFloat16 turns that on, so without it the bit
would not be usable. Mali has no fp64, so that one stays off.
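The per-width decision can be summarized in a small sketch. The struct
and field names here are hypothetical and do not claim to match the
Vulkan feature struct or the driver code.

```c
#include <stdbool.h>

/* Hypothetical per-width FMA feature bits, for illustration only. */
struct fma_features {
   bool fp16;
   bool fp32;
   bool fp64;
};

static struct fma_features pick_fma_features(bool shader_float16)
{
   return (struct fma_features){
      .fp16 = shader_float16, /* unusable without the Float16 capability */
      .fp32 = true,           /* native fused multiply-add on every Mali */
      .fp64 = false,          /* Mali has no fp64 */
   };
}
```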
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42075>
Using multiple `MTL4Compiler` instances concurrently may result in
the process crashing from within the Metal driver. Work around this
by maintaining one `MTL4Compiler` per `MTLDevice`.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41842>
Based on current v3dv support for clip control.
v1: original version (Andrew Copland)
v2: update docs, really enable extension (Alejandro Piñeiro)
v3: adjusted viewport and its dirty flag (Chema Casanova)
v4: avoid dirty flags when no rasterization is enabled (Iago Toral)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42082>
Sample locations are render pass state in Metal. In the best case
(same sample positions for the whole subpass), we simply configure them
at the start and proceed as normal. In the sub-optimal case (sample
positions change during the subpass), we support it by restarting
the Metal render pass with the new values.
This also interacts with the existing logic for centering sample
positions for Bresenham lines. The user's custom sample positions
are prioritized, and centering applies in the default case. Some bug
fixes have also been made to prevent losing attachment contents from
render pass restarts and ensure the render pass restart happens before
other draw state is flushed.
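The restart-on-change idea can be sketched as below. All types and
function names are hypothetical; the sketch only counts restarts and
does not model attachment load/store or the Metal API.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SAMPLES 16

struct sample_pos { float x, y; };

/* Hypothetical tracking state; not the driver's real data structure. */
struct pass_state {
   struct sample_pos active[MAX_SAMPLES]; /* positions the pass began with */
   unsigned count;
   unsigned restarts;
};

/* Begin a pass with the given positions (the best-case setup). */
static void pass_begin(struct pass_state *s, const struct sample_pos *pos,
                       unsigned count)
{
   if (count > MAX_SAMPLES)
      count = MAX_SAMPLES;
   memcpy(s->active, pos, count * sizeof(*pos));
   s->count = count;
}

/* Called before a draw: restart only if the positions actually changed.
 * Returns true when a restart was needed. */
static bool pass_update_positions(struct pass_state *s,
                                  const struct sample_pos *pos,
                                  unsigned count)
{
   if (count > MAX_SAMPLES)
      count = MAX_SAMPLES;
   if (count == s->count &&
       memcmp(s->active, pos, count * sizeof(*pos)) == 0)
      return false;

   pass_begin(s, pos, count);
   s->restarts++;
   return true;
}
```

In this sketch the check runs before any other draw state is applied,
mirroring the ordering fix described above.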
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42036>
This covers some drivers which expose KHR_display and EXT_present_timing.
Based on Emma Anholt's work from 2025, rebased on current Mesa 26.2-devel,
with tiny compile fixes and docs/features updates by Mario Kleiner.
See MR 38472 for Emma's work, which is in turn based on Keith's work.
Tested locally on AMD Polaris for radv, Intel Kabylake for anv, and on
Mesa CI's VK-CTS VK_GOOGLE_display_timing test case for AMD radv,
Intel anv, Qualcomm Adreno tu.
Original code of Emma is
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Update of docs/features.txt + new_features.txt updates is
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41168>
Metal does not support importing host memory pointers into MTLHeap,
only MTLBuffer. Buffers can import without issue, and images are
restricted to linear images without flags requiring aliasing.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41894>
Similar to RADV, this restarts the render pass with resolve attachments.
This is not ideal for tiling, but we don't even use native resolve for
built-in modes due to Metal format limitations.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41888>
This is implemented in common code in d8ef386f98 ("vulkan: add support
for VK_KHR_internally_synchronized_queues").
Passes dEQP-VK.synchronization2.internally_synchronized_queues.*
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41926>
This implements the extension on the Graphics and Compute queues using
Blorp OpenCL compute shaders. Support for the Transfer queue will come
in a later patch. We also don't support 24/48/96 bpp formats yet.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39338>
Directories are named using the process name and PID so that subsequent
runs of the same application do not overwrite earlier dumps.
v2 (Caio): Use util_get_process_name(). Change to be default behavior.
Old behavior still accessible via MDA_OUTPUT_DIR="." env var.
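A minimal sketch of the naming rule, assuming a "<name>.<pid>" pattern.
The helper name and the exact pattern are hypothetical; only the use of
the process name, the PID and the MDA_OUTPUT_DIR override come from the
description above.

```c
#include <stddef.h>
#include <stdio.h>

/* Build the dump directory name. A non-empty override (the value of
 * MDA_OUTPUT_DIR) wins; otherwise combine process name and PID.
 * The "<name>.<pid>" pattern is an assumption for illustration. */
static int mda_dir_name(char *buf, size_t size, const char *override,
                        const char *process_name, long pid)
{
   if (override && *override)
      return snprintf(buf, size, "%s", override);

   return snprintf(buf, size, "%s.%ld", process_name, pid);
}
```

A caller would pass getenv("MDA_OUTPUT_DIR"), the result of
util_get_process_name() and the current PID.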
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39125>
This was originally disabled by a22ad99bdd ("pvr: set device
features/props/extensions to Vulkan 1.0 minimums (unless implemented)") in order
to concentrate efforts on passing "base" Vulkan conformance before layering on
additional functionality. The driver is now Vulkan 1.2 conformant.
As the functionality is already implemented, simply enable the extension.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41859>