Commit graph

220425 commits

Author SHA1 Message Date
Lorenzo Rossi
245d54397d pan/compiler: Make lower_vs_outputs write needs_extended_fifo
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40537>
2026-03-26 20:53:21 +00:00
Lorenzo Rossi
445a22acbd pan/compiler: Group outputs in lower_vs_outputs
Previously bifrost_nir_lower_shader_output grouped outputs in separate
if blocks and made a best-effort attempt to group them together.  This
also assumed that pan_nir_lower_store_component wrote each output only
once and that nir_lower_io_vars_to_temporaries pulled them out of any
control flow.

Now all of these are handled by the new pan_nir_lower_vs_outputs pass
that handles write masks, control flow, per_view and grouping for IDVS.
This makes the overall dependencies much simpler, ensures that the
stores are grouped in the same ifs and should be more robust.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40537>
2026-03-26 20:53:21 +00:00
Lorenzo Rossi
66bee415ad pan/compiler: Split lower_varyings_io into fs_inputs and vs_outputs
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40537>
2026-03-26 20:53:21 +00:00
Lorenzo Rossi
b1acc1aa89 pan/compiler: Refactor va_shader_output_from_ in common code
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40537>
2026-03-26 20:53:20 +00:00
Lorenzo Rossi
024c66ec0f pan: Add PAN_MAX_MULTIVIEW_VIEW_COUNT
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40537>
2026-03-26 20:53:20 +00:00
Lorenzo Rossi
922405ab71 panfrost/bi: Separate va_shader_output from bitmasks
The new pass will need to iterate on the enum varyings

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40537>
2026-03-26 20:53:20 +00:00
Zan Dobersek
d5b9411331 fd: support a8xx in rddecompiler
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Propery initialize rddecompiler's RNN instance for a8xx dumps.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40661>
2026-03-26 18:46:07 +00:00
emre
fe558d8328 nvk: fix barrier cache invalidation
Fixes: e1c1cdbd5f ("nvk: Implement vkCmdPipelineBarrier2 for real")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40637>
2026-03-26 18:29:16 +00:00
Icenowy Zheng
252904f3d1 pvr: consider the size of DMA request when setting msize of DDMADT
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The DDMADT instruction of PDS has out-of-bound test capability, which is
used for implementation of robust vertex input fetch.

According to the pseudocode in the comment block before the "LAST DDMAD"
mark in pvr_pipeline_pds.c, the check is between
`calculated_source_address + (burst_size << 2)` and `base_address +
buffer_size`, in which the `burst_size` seems to correspond to the BSIZE
field set in the low 32-bit of DDMAD(T) src3 and the `buffer_size`
corresponds to the MSIZE field set in the DDMADT-specific high 32-bit of
src3. As the calculated source address is just the base address adds the
multiplication result (the offset), the base address could be eliminated
from the check, results in the check between `offset + (BSIZE * 4)` and
`MSIZE` .

Naturally it's expected to just set the MSIZE field to the buffer size.
In addition, as the Vulkan spec says "Reads from a vertex input MAY
instead be bounds checked against a range rounded down to the nearest
multiple of the stride of its binding", the driver rounds down the
accessible buffer size before setting MSIZE to it.

However when running OpenGL ES 2.0 CTS, two problems are exhibited about
the setting of the size to check:

- dEQP-GLES2.functional.buffer.write.basic.array_stream_draw sets up a
  VBO with 3 bytes per vertex (RGB colors and 1B per color) and 340
  vertices (results in a buffer size of 1020 = 0x3fc). However as the
  DMA request size, which is specified by BSIZE, is counted by dwords,
  3 bytes are rounded up to 1 dword (which is 4 bytes). When the bound
  check of the last vertex happens, the vertex's DMA start offset is
  0x3f9, so the DDMADT check happens between 0x3fd (0x3f9 + 1 * 4) and
  0x3fc, and indicates a check failure. This prevents the last vertex,
  which is perfectly in-bound, from being properly fetched; this is
  against the Vulkan specification, and needs to be fixed.
- dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.
  buffer_0_32_float2_vec4_dynamic_draw_quads_1 sets up a VBO with a size
  of 168 bytes, and tries to draw 6 vertices (each vertex consumes 2
  floats (thus 8 bytes) of attribute) with a stride of 32 bytes using
  this VBO. Zink then translates the VBO to a Vulkan vertex buffer bound
  with size = 168B, stride = 32B. Here the optional rule about rounding
  down buffer size happens in the current PowerVR driver, and the
  checked bound is rounded down to 160B, which prevented the last
  vertex's 8B attributes to be fetched. It looks like this kind of
  situation is considered in the codepath without DDMADT, but omitted
  for the codepath utilizing DDMADT for bound check.

So this patch tries to mimic the behavior of DDMADT when setting the
MSIZE field of it to prevent false out-of-bounds. It first calculates
the offset of the last valid vertex DMA, then adds the DMA request size
to it to form the final MSIZE value. With the code calculating the last
valid DMA offset considering the situation of fetching the attribute
from the space after the last whole multiple of stride, both problems
mentioned above are solved by this rework.

There're 99 GLES CTS testcases fixed by this change, and Vulkan CTS
shows no regression on `dEQP-VK.robustness.robustness1_vertex_access.*`
tests.

Fixes: 4873903b56 ("pvr: Enable PDS_DDMADT")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
2026-03-26 18:12:06 +00:00
Icenowy Zheng
d992474be9 pvr: move PVR_BUFFER_MEMORY_PADDING_SIZE definition to pvr_buffer.h
This memory padding is enforced by GetBufferMemoryRequirements2 and
might be then checked against to decide whether it's enough.

Move it to pvr_buffer.h for further assertions.

Backport-to: 25.3
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
2026-03-26 18:12:06 +00:00
Icenowy Zheng
aa8dad141c pvr: save vertex attribute size for DMA checking
Currently the size of single components inside one attribute is saved
and checked against when checking DMA capability. However, the vertex
attribute DMA happens for a whole attribute instead of individually for
its components, so checking against the component size is useless -- the
size of the whole attribute is what needs to be saved and checked.

Rename all component_size_in_bytes fields to attrib_size_in_bytes, and
save the size of the whole attribute inside them.

Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
2026-03-26 18:12:06 +00:00
Icenowy Zheng
caea72cffc pvr: fix "obb" typo in oob_buffer_size when building vertex pds data
The ddmadt_oob_buffer_size structure to be filled is named
`obb_buffer_size`, which is obviously a typo.

Change to `oob_buffer_size` to fix the typo.

Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
2026-03-26 18:12:06 +00:00
Faith Ekstrand
c2fc7d49e8 pan/bi: Rework mem_vectorize_cb
Intstead of focusing on numbers of components and bit sizes, focus on
the total number of bytes read.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:40 +00:00
Faith Ekstrand
a5801b1a23 pan/bi: Simplify unpack_64_2x32_split_*
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:40 +00:00
Faith Ekstrand
79f8c1ca9a pan/bi: Unify handling of unpack_*
These are just a fancy mov on Mali.  We need to use bi_make_vec_to()
because it handles 64-bit movs as well.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:40 +00:00
Faith Ekstrand
6cc10835b6 pan/bi: Unify handling of pack_*
These are the same as the split versions and vecN except they have one
source and they use src[0].swizzle[i] instead of src[i].swizzle[0].
While we're here, it's trivial to implement pack_64_4x16 as well.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:39 +00:00
Faith Ekstrand
56f5899786 pan/bi: Handle pack_*_split with vecN
They're literally the same thing since vectors are packed.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:39 +00:00
Faith Ekstrand
682ab923e6 pan/bi: Move nir_op_mov handling to the top
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:39 +00:00
Faith Ekstrand
0b029f319f pan/bi: Properly handle large 8-bit vectors in bi_alu_src_index()
Previously, we used bi_src_index() directly and ignored the offset we
took all that care to calculate at the top of the function.  For most
cases, this is fine since the offset is 0.  But if we ever have an i8v8,
or larger, this doesn't work.  It's not really more work to handle this
case.  All we have to do is use the offset and &3 the swizzle.  It just
means we can't have false code sharing with the bi_make_vec_to() case.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:38 +00:00
Faith Ekstrand
9950a98d5e pan/bi: Handle 64-bit sources in bi_alu_src_index()
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:38 +00:00
Faith Ekstrand
4be0e46e61 pan/bi: Allow 64-bit vectors in bi_make_vec_to()
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:38 +00:00
Faith Ekstrand
f09f080835 pan/bi: Vectorize SSBOs when not robust
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:37 +00:00
Faith Ekstrand
3c1a1d2006 panvk: Replace robust2_modes with robust_modes
There's no real difference for us between robustness and robustness2.
The only thing robust_modes does in nir_opt_load_store_vectorize() is to
tell it to be a bit more careful about integer overflow in address
calculations so you don't end up wrapping something around and getting a
non-zero load when you should have gotten an zero from OOB.  There's no
good reason why we should only set it for robustness2.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:37 +00:00
Faith Ekstrand
fb7e1fe81c pan/bi: Always vectorize UBO access
Now that we claim 16B robustness alignments, we can vectorize UBO
access, even when robustness2 is enabled.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
3bbacfe8d7 panvk: Set min_ubo/ssbo_alignment in spirv_options
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
e52e7019b9 panvk: Increase robust buffer access alignments
We can't go any higher than 4B for SSBOs but we can go up to 16B for
UBOs.  This will let us start vectorizing UBO access, even when robust
because max-size loads (LD_PKA.i128) will never overrun a binding unless
they're entirely outside the binding.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
f350a69759 panvk: Track which dynamic buffers are SSBOs
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
12e1f5d0ea panvk: Rework setting dyn_buf_offsets
There's no point in looping over all the descriptors.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:35 +00:00
Faith Ekstrand
813e399803 panvk: Reduce minTexelBufferOffsetAlignment
There are formats that require a 128B alignment but they're compressed
and not allowed for texel buffers.  The biggest texel size we can have
for a texel buffer is RGBA32, which is 16B.  The only reason why we
needed the large alignment was to work around a bug in the way we were
turning texel buffers into attribute descriptors on Bifrost.  That bug
is now fixed so we can reduce to a reasonable alignment requiremdnt.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:35 +00:00
Faith Ekstrand
8471a90eb4 pan/buffer: Drop pan_buffer_view::offset
We can handle that inside pan_buffer.c and make the interface simpler.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:35 +00:00
Faith Ekstrand
ce56f49561 pan/buffer: Add the offset to the size for buffer textures
In the attribute model, the size is for the attribute binding and the
offset is an offset into that range.  If we're going to use that to
offset the buffer itself, we need to increase the size accordingly.

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:34 +00:00
Faith Ekstrand
8dc458225b pan/bi: v2x16 conversions don't replicate
They swizzle just like anything else.  Technically, we could maybe do a
little better than the generic case for these since they only read 8
bits per 16 bits in the destination but the generic case is correct,
even if it isn't optimal.

Fixes: f7d44a46cd ("pan/bi: Optimize replication")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:34 +00:00
Faith Ekstrand
dbefdb2376 pan/bi/ra: Dump verbose debug logging to stderr
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:34 +00:00
Tomeu Vizoso
fc0770d5e3 ethosu: parse optional SRAM size from device spec string
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The spec format is now GEN-MACS[-SRAM], e.g. "65-256-4096" or
"85-256". When the SRAM parameter is omitted it defaults to 0.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Tomeu Vizoso
abd681c169 ethosu: add U85-256 support to ethosu_ml_device_create()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Tomeu Vizoso
3b68c5b4bc ethosu: move hardware description from ethosu_screen to ethosu_ml_device
Move target-specific fields (is_u65, ifm_ublock, ofm_ublock,
max_concurrent_blocks, sram_size) from ethosu_screen into
ethosu_ml_device. This decouples the compilation phase from the DRM
file descriptor and pipe_screen, allowing ahead-of-time compilation
where the target NPU is not present on the compilation host.

The ethosu_device_screen() helper is retained only for runtime paths
that need the DRM fd (buffer allocation, job submission, destroy).

Compilation code now accesses hardware parameters through
ethosu_ml_device() cast of pipe_ml_device, which can be created
either from a DRM-backed screen or standalone via
ethosu_ml_device_create() with a target string like "65-256".

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Qiang Yu
06e5026e28 docs: add GL_NV_timeline_semaphore support for radeonsi
Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
00b1d77176 radeonsi: advertise GL_NV_timeline_semaphore
Set max_timeline_semaphore_difference = UINT64_MAX when timeline syncobj
is supported and GFX uses the kernel queue path (not userq). The GL
state tracker auto-enables GL_NV_timeline_semaphore when this cap is
non-zero.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15113
Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
26418f0f58 radeonsi: add timeline semaphore support to fence operations
Thread timeline_point through si_add_fence_dependency and
si_add_syncobj_signal to the winsys. Remove the assert(!value)
guards in si_fence_server_sync and si_fence_server_signal so that
non-zero timeline point values are passed through to the winsys
fence dependency and signal lists.

Add PIPE_FD_TYPE_TIMELINE_SEMAPHORE_VK handling in si_create_fence_fd,
importing the fd as a syncobj (the timeline point is applied at
wait/signal time, not at import time).

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
379bf43084 winsys/amdgpu: use timeline syncobj chunks in kernelq submission
When has_timeline_syncobj is available, use AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT
with drm_amdgpu_cs_chunk_syncobj for dependencies and
AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL for signals in kernelq submission.
This passes timeline point values from the fence lists through to the kernel.

Keep the existing binary SYNCOBJ_IN/SYNCOBJ_OUT path as fallback when
timeline syncobj is not available.

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
c4edd58a74 winsys/amdgpu: add timeline point support to fence lists
Add a parallel uint64_t *points array to amdgpu_fence_list to store
timeline semaphore point values alongside each fence. Point=0 means
binary semaphore (preserving existing behavior).

Update cs_add_fence_dependency and cs_add_syncobj_signal winsys
interfaces to accept a timeline_point parameter, and thread it
through to the fence lists. All existing callers pass 0.

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Icenowy Zheng
765a9f4fd9 pvr: Align width for PBE write when creating linear image
Even if a linear image isn't created with usages declaring PBE writes,
the image might be exported and then re-imported with a usage that
allows rendering to.

Always align linear images' width for being written by PBE.

This fixes WSI creating surfaces with odd width, exporting them and
re-importing for rendering.

Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40250>
2026-03-26 14:08:10 +00:00
Georg Lehmann
0d8e2354ed nir: add fp_math_ctrl to convert_alu_types
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
8470bb59f6 lavapipe: preserve fp_math_ctrl when lowering cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
eef0fa22e0 brw: preserve fp_math_ctrl when lowering cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
123d8c230e nak: preserve fp_math_ctrl when lowering cmat
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
bcdef7c79b radv: preserve fp_math_ctrl when lowering cmat alu ops
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
b8b1ce9667 spirv: set fp_math_ctrl for cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
35ca85176c nir: add fp_math_ctrl to cmat alu ops
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
9cba104e11 nir/opt_fp_math_ctrl: use ddx/ddy fp_math_ctrl
No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00