Commit graph

52276 commits

Author SHA1 Message Date
Tomeu Vizoso
f0e4ccf664 ethosu: handle NULL bias tensor in convolution
PyTorch Conv2d without explicit bias produces a NULL bias_tensor
in the Gallium pipe_ml_operation. Guard against NULL dereferences
in two places:

- ethosu_lower.c: pass NULL to fill_coefs when bias_tensor is NULL
- ethosu_coefs.c: treat missing biases as zero

Fixes crashes when running Conv2d models without bias through the
Ethos-U NPU backend.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>
2026-03-27 09:33:52 +01:00
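The "treat missing biases as zero" behaviour described in this commit can be illustrated with a minimal sketch. The helper name `bias_or_zero` and its signature are hypothetical and are not taken from ethosu_coefs.c; the sketch only assumes that biases are stored as one 32-bit value per output channel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the actual ethosu_coefs.c code): returns the
 * bias for one output channel, treating a NULL bias array as all zeros. */
static int32_t
bias_or_zero(const int32_t *biases, size_t channel_count, size_t channel)
{
   assert(channel < channel_count);
   return biases ? biases[channel] : 0;
}
```

Routing every bias read through one accessor like this keeps the NULL check in a single place instead of at each call site.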
Tomeu Vizoso
e0b401aa87 ethosu: implement ml_subgraph_deserialize()
Add ethosu_ml_subgraph_deserialize() which reconstructs a subgraph
from a serialized byte buffer. Parses the header (cmdstream size,
coefs size, io size, tensors size), restores the tensor array,
cmdstream, and coefficient buffers.

DRM buffer object creation is deferred to prepare_for_submission()
which is called lazily on first invoke.

Wire pctx->ml_subgraph_deserialize in ethosu_create_context().

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>
2026-03-27 09:33:52 +01:00
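As a rough illustration of the header parsing step, the sketch below reads four size fields from the start of a byte buffer. The struct name, the 32-bit field width, the field order, and the host-endian encoding are all assumptions made for this example; the commit message only names the four sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: four host-endian 32-bit sizes, in the order listed in
 * the commit message. The real serialized format may differ. */
struct subgraph_header {
   uint32_t cmdstream_size;
   uint32_t coefs_size;
   uint32_t io_size;
   uint32_t tensors_size;
};

/* Copies the header out of buf; fails if the buffer is too short. */
static bool
read_subgraph_header(const uint8_t *buf, size_t len,
                     struct subgraph_header *hdr)
{
   assert(buf && hdr);
   if (len < sizeof(*hdr))
      return false;
   /* memcpy avoids unaligned reads from the byte buffer. */
   memcpy(hdr, buf, sizeof(*hdr));
   return true;
}
```

A real deserializer would also need to check each size against the remaining buffer length before restoring the tensor array, cmdstream, and coefficient buffers.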
Tomeu Vizoso
aff92add98 ethosu: Specifying SRAM size in pipe_ml_device ID
The spec format is now GEN-MACS-SRAM, e.g. "65-256-4096".

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>
2026-03-27 09:07:12 +01:00
Tomeu Vizoso
fc0770d5e3 ethosu: parse optional SRAM size from device spec string
The spec format is now GEN-MACS[-SRAM], e.g. "65-256-4096" or
"85-256". When the SRAM parameter is omitted it defaults to 0.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
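A minimal sketch of parsing the GEN-MACS[-SRAM] format described above, using only standard C. The names `npu_spec` and `parse_npu_spec` are hypothetical and this is not the driver's parser; it only reproduces the stated behaviour that an omitted SRAM field defaults to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the three fields of the spec string. */
struct npu_spec {
   unsigned gen;
   unsigned macs;
   unsigned sram; /* 0 when the optional field is omitted */
};

/* Parses "GEN-MACS" or "GEN-MACS-SRAM"; returns false on malformed input. */
static bool
parse_npu_spec(const char *spec, struct npu_spec *out)
{
   assert(spec && out);
   unsigned gen, macs, sram = 0;
   int consumed = 0;

   if (sscanf(spec, "%u-%u%n", &gen, &macs, &consumed) != 2)
      return false;

   if (spec[consumed] == '-') {
      int tail = 0;
      if (sscanf(spec + consumed, "-%u%n", &sram, &tail) != 1)
         return false;
      consumed += tail;
   }

   /* Reject trailing characters after the last field. */
   if (spec[consumed] != '\0')
      return false;

   out->gen = gen;
   out->macs = macs;
   out->sram = sram;
   return true;
}
```

With this sketch, "65-256-4096" yields gen 65, macs 256, sram 4096, and "85-256" yields sram 0.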
Tomeu Vizoso
abd681c169 ethosu: add U85-256 support to ethosu_ml_device_create()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Tomeu Vizoso
3b68c5b4bc ethosu: move hardware description from ethosu_screen to ethosu_ml_device
Move target-specific fields (is_u65, ifm_ublock, ofm_ublock,
max_concurrent_blocks, sram_size) from ethosu_screen into
ethosu_ml_device. This decouples the compilation phase from the DRM
file descriptor and pipe_screen, allowing ahead-of-time compilation
where the target NPU is not present on the compilation host.

The ethosu_device_screen() helper is retained only for runtime paths
that need the DRM fd (buffer allocation, job submission, destroy).

Compilation code now accesses hardware parameters through
ethosu_ml_device() cast of pipe_ml_device, which can be created
either from a DRM-backed screen or standalone via
ethosu_ml_device_create() with a target string like "65-256".

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Qiang Yu
00b1d77176 radeonsi: advertise GL_NV_timeline_semaphore
Set max_timeline_semaphore_difference = UINT64_MAX when timeline syncobj
is supported and GFX uses the kernel queue path (not userq). The GL
state tracker auto-enables GL_NV_timeline_semaphore when this cap is
non-zero.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15113
Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
26418f0f58 radeonsi: add timeline semaphore support to fence operations
Thread timeline_point through si_add_fence_dependency and
si_add_syncobj_signal to the winsys. Remove the assert(!value)
guards in si_fence_server_sync and si_fence_server_signal so that
non-zero timeline point values are passed through to the winsys
fence dependency and signal lists.

Add PIPE_FD_TYPE_TIMELINE_SEMAPHORE_VK handling in si_create_fence_fd,
importing the fd as a syncobj (the timeline point is applied at
wait/signal time, not at import time).

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
c4edd58a74 winsys/amdgpu: add timeline point support to fence lists
Add a parallel uint64_t *points array to amdgpu_fence_list to store
timeline semaphore point values alongside each fence. Point=0 means
binary semaphore (preserving existing behavior).

Update cs_add_fence_dependency and cs_add_syncobj_signal winsys
interfaces to accept a timeline_point parameter, and thread it
through to the fence lists. All existing callers pass 0.

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
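The parallel-array idea in this commit can be sketched as follows. The struct and function names are hypothetical stand-ins, with an `int` in place of a real fence handle; only the convention that point 0 means a binary semaphore comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fence list with a parallel points array. */
struct fence_list {
   int *fences;       /* stand-in for fence handles */
   uint64_t *points;  /* points[i] belongs to fences[i]; 0 = binary */
   unsigned num, max;
};

/* Appends a fence and its timeline point, growing both arrays together. */
static bool
fence_list_add(struct fence_list *l, int fence, uint64_t point)
{
   assert(l);
   if (l->num == l->max) {
      unsigned new_max = l->max ? l->max * 2 : 4;
      int *f = realloc(l->fences, new_max * sizeof(*f));
      if (!f)
         return false;
      l->fences = f;
      uint64_t *p = realloc(l->points, new_max * sizeof(*p));
      if (!p)
         return false;
      l->points = p;
      l->max = new_max;
   }
   l->fences[l->num] = fence;
   l->points[l->num] = point;
   l->num++;
   return true;
}

/* True when entry i carries a non-zero timeline point. */
static bool
fence_is_timeline(const struct fence_list *l, unsigned i)
{
   assert(l && i < l->num);
   return l->points[i] != 0;
}
```

Because callers that do not use timeline semaphores pass 0, they see the same behaviour as before the extra array existed.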
Alyssa Milburn
a6992c7bbe nv50,nvc0: Avoid uninitialized cbuf reads in blits
Overwrite the whole framebuffer cbuf rather than copying it from the
stack; fixes util_framebuffer_get_num_samples getting uninitialized
stack contents during validation.

Suggested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Alyssa Milburn <amilburn@zall.org>
Fixes: 2eb45daa9c ("gallium: de-pointerize pipe_surface")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14082
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39138>
2026-03-25 17:48:43 +00:00
Tomeu Vizoso
16e15ee205 gallium: add pipe_ml_device, pipe_screen::get_ml_device()
For compiling models, we don't really need a context for a real device.

To support ML frameworks in which compilation happens
ahead-of-time (AoT), add API for compilation that doesn't require a
pipe_context.

Add struct pipe_ml_device with function pointers for:
- ml_operation_supported: query operation support
- ml_subgraph_create: compile a subgraph
- ml_subgraph_serialize: serialize a compiled subgraph
- ml_subgraph_destroy: free subgraph resources

Move ml_operation_supported, ml_subgraph_create, and
ml_subgraph_destroy from pipe_context to pipe_ml_device.

Add pipe_screen::get_ml_device() to obtain the device.

Change pipe_ml_subgraph.context (pipe_context*) to
pipe_ml_subgraph.device (pipe_ml_device*).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40167>
2026-03-25 16:58:05 +00:00
Tomeu Vizoso
1d4d1fc61d gallium: replace padding_same with per-side padding
Replace the boolean padding_same field in pipe_ml_operation.conv
and .pooling with explicit per-side padding fields: padding_top,
padding_bottom, padding_left, padding_right.

Frontends always compute these from their own padding representation
(e.g. TFLite same/valid, PyTorch (pad_h, pad_w)). Drivers use
them directly, removing the need for drivers to derive padding.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40167>
2026-03-25 16:58:05 +00:00
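As an example of how a frontend might derive per-side padding from a "same" setting, the sketch below applies the usual TensorFlow-style rule along one axis, where any odd leftover goes on the trailing side. The names `pad_1d` and `same_padding_1d` are hypothetical, and dilation is ignored for brevity.

```c
#include <assert.h>

/* Hypothetical result type: padding on the leading and trailing side of
 * one spatial axis (top/bottom for height, left/right for width). */
struct pad_1d {
   unsigned before;
   unsigned after;
};

/* TensorFlow-style SAME padding along one axis, without dilation. */
static struct pad_1d
same_padding_1d(unsigned in, unsigned kernel, unsigned stride)
{
   assert(in > 0 && kernel > 0 && stride > 0);
   unsigned out = (in + stride - 1) / stride;      /* ceil(in / stride) */
   unsigned needed = (out - 1) * stride + kernel;  /* input span required */
   unsigned total = needed > in ? needed - in : 0;
   struct pad_1d p = { total / 2, total - total / 2 };
   return p;
}
```

Calling it once for the height and once for the width gives the four values padding_top, padding_bottom, padding_left, and padding_right; a "valid" setting would simply use zero for all four.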
Tomeu Vizoso
db866eca28 gallium: pipe_tensor.resource → pipe_tensor.data
Change the tensor backing storage from pipe_resource* to uint8_t*.

This simplifies tensor data management by using raw memory pointers
instead of pipe_resource objects. Frontends allocate tensor data with
malloc() and drivers access it directly, removing the need for
pipe_buffer_map/unmap for tensor data access.

We initially used resources thinking that the NPU would want to directly
access the data in those tensors. It is clear now that all NPUs will
need the data to be compressed and reformatted in some way, so let's
drop the inconvenient resources and just use allocated memory.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40167>
2026-03-25 16:58:04 +00:00
Eric R. Smith
a2e61ee1b9 pan: change image2DMSArray lowering to use Z instead of Y
We used to lower multisampled arrays to 3D images by adjusting the
height and the Y coordinate so that addressing samples became
addressing into the new base image. This worked for gallium, but
was never implemented for vulkan, and also had the disadvantages
that (a) we handled arrays and non-arrays differently, and
(b) the image height was restricted to 4096.

Change this so that we lower samples into the Z coordinate instead,
adding new layers for each sample. This requires that we know the
number of samples (so we have to save a sysval for this in gallium)
but means that we handle arrays and non-arrays the same. More
importantly, we can fit 3 bits to indicate the number of samples
into the attribute descriptor in Vulkan, so this scheme works
there as well as in OpenGL.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
2026-03-25 15:05:53 +00:00
Eric R. Smith
89288722e7 panfrost: add sysval for number of samples
Not really used yet, but we will need it later when we change how we
lower multisampled image arrays.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
2026-03-25 15:05:53 +00:00
Valentine Burley
17d38c9668 zink/ci: Move zink-tu-a618 to sc7180-trogdor-kingoftown
The sc7180-trogdor-lazor-limozeen devices are having issues, so move the
job to a different device with available capacity.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40566>
2026-03-24 15:22:12 +00:00
Marek Olšák
dee99b38c5 radeonsi: fix an assertion failure for sampler descriptor loads with LLVM
Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40589>
2026-03-24 01:05:29 +00:00
Marek Olšák
e1a845c042 radeonsi: fix compiler selection for fixed-func TCS
Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40589>
2026-03-24 01:05:29 +00:00
Marek Olšák
55f5253976 radeonsi: remove unnecessary ac_to_integer in si_llvm_ps_build_end
Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40589>
2026-03-24 01:05:29 +00:00
Marek Olšák
235e32d560 ac/llvm: remove almost duplicated ac_build_varying_gather_values
Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40589>
2026-03-24 01:05:29 +00:00
Marek Olšák
d692ce4b34 radeonsi/meson: don't use llvm variables when LLVM is disabled
also winsys doesn't use LLVM

Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40589>
2026-03-24 01:05:29 +00:00
Marek Olšák
8ea3d794fb radeonsi: recompute IO bases after optimizations
to fix an assertion added by the commit, reproduced by viewperf13/catia

Fixes: d06616063c - radeonsi: assert that IO bases don't have holes & the same base isn't used twice

Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40589>
2026-03-24 01:05:29 +00:00
Eric Engestrom
731e5e466a zink+lvp/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40583>
2026-03-23 23:38:32 +00:00
Eric Engestrom
bb71c2dc34 zink+radv/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40583>
2026-03-23 23:38:32 +00:00
Eric Engestrom
b729dfcc9e llvmpipe/ci: document regressions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40583>
2026-03-23 23:38:32 +00:00
Pavel Ondračka
52d90752c2 r300/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40570>
2026-03-23 21:06:32 +00:00
Mike Blumenkrantz
d6958a5e43 gallium: kill off pipe_surface::context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:15 +00:00
Mike Blumenkrantz
9ffc4f43f9 svga: move surface context member onto internal surface type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:15 +00:00
Mike Blumenkrantz
e8ced90aab gallium: add a pipe_context param to pipe_surface_reference()
this shouldn't be used anymore, but for anyone still using it there
needs to be a context passed

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:15 +00:00
Mike Blumenkrantz
0615a276ca gallium: add a destructor param to surface refcounting functions
these functions should no longer be used by serious drivers. for those that
do use them, they now need to pass their own destructor function

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:15 +00:00
Mike Blumenkrantz
639c356894 r300: delete pipe_context surface hooks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:14 +00:00
Mike Blumenkrantz
8c37145e61 r300: clean up some surface management
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:14 +00:00
Mike Blumenkrantz
0cafd100fa freedreno: delete pipe_context surface hooks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:13 +00:00
Mike Blumenkrantz
1af551ed9f tegra: delete pipe_context surface hooks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:13 +00:00
Mike Blumenkrantz
199eff7538 nouveau: delete unused surface hook
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:13 +00:00
Mike Blumenkrantz
643d7b4b70 freedreno: clean up some surface management
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:13 +00:00
Mike Blumenkrantz
0115fc92c6 crocus: clean up surface management
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:13 +00:00
Mike Blumenkrantz
a4c0f5ba6f svga: simplify some surface management
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:13 +00:00
Mike Blumenkrantz
17d9f1dc64 llvmpipe: delete pipe_context surface hooks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:12 +00:00
Mike Blumenkrantz
fa350781ed svga: delete pipe_context surface hooks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:12 +00:00
Mike Blumenkrantz
5e2ecd64b0 softpipe: delete pipe_context::create_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40462>
2026-03-23 16:58:11 +00:00
Pierre-Eric Pelloux-Prayer
98cdcf9467 radeonsi/test: update failures
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40372>
2026-03-23 14:53:01 +00:00
Pierre-Eric Pelloux-Prayer
88986dcc9c radeonsi: account for outputs_written when updating spi_shader_col_format
Variants can modify which outputs get written so we must update
these fields otherwise spi_shader_col_format will be incorrect.

This can happen for instance with uniforms inlining:

   uniform bool depth_only;
   void main() {
      if (depth_only) return;
      ...
   }

When depth_only is true, this shader becomes empty after uniforms
inlining but spi_shader_col_format wasn't updated properly,
causing a hang.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14737
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40372>
2026-03-23 14:53:01 +00:00
Pierre-Eric Pelloux-Prayer
da7c515783 radeonsi: move spi_shader_*_format to si_shader_variant_info
Variants can affect these values, so it's best to store them
in this struct.

No functional changes.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40372>
2026-03-23 14:53:01 +00:00
Tomeu Vizoso
db5a1ed2fa rocket: Skip all synthetic tests as we now have several real models
And sync the baseline with the new models that were recently added.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40166>
2026-03-23 12:57:09 +00:00
Tomeu Vizoso
15f0c245c8 ethosu: Set test baseline for the Corstone 1000 (U85)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:59 +00:00
Tomeu Vizoso
ac0d6e7b7c ethosu: Properly emit IFM_BROADCAST and IFM2_BROADCAST on U85
On U85, both NPU_SET_IFM_BROADCAST and NPU_SET_IFM2_BROADCAST must be
emitted for elementwise operations, matching Vela's GenerateInputBroadcast.

Add calc_broadcast_mode() matching Vela's CalculateBroadcast(): broadcasts
a dimension of shape1 when it is 1 and shape2 is larger, producing a
broadcast_mode bitmask (H=1, W=2, C=4, SCALAR=8).

Split emit_ifm2_broadcast into U65 (legacy bitfields) and U85 paths.
The U85 path emits both IFM_BROADCAST and IFM2_BROADCAST using
calc_broadcast_mode in each direction.

Also fix emit_eltwise to call emit_ifm2_precision instead of
emit_ifm_broadcast for U85, which was emitting 0 instead of the
required IFM2_PRECISION register.

Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:59 +00:00
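The broadcast rule described for calc_broadcast_mode() can be sketched as below. The function and type names are hypothetical, the bit values are the ones listed in the commit message, and the SCALAR bit is deliberately left unset because the message does not say when it applies.

```c
#include <assert.h>

/* Bit values as listed in the commit message. */
enum {
   BROADCAST_H = 1,
   BROADCAST_W = 2,
   BROADCAST_C = 4,
   BROADCAST_SCALAR = 8, /* not modelled in this sketch */
};

/* Hypothetical height/width/channels shape. */
struct shape_hwc {
   unsigned h, w, c;
};

/* Sets a bit for each dimension of s1 that is 1 while s2 is larger. */
static unsigned
broadcast_mode(struct shape_hwc s1, struct shape_hwc s2)
{
   assert(s1.h && s1.w && s1.c && s2.h && s2.w && s2.c);
   unsigned mode = 0;
   if (s1.h == 1 && s2.h > 1)
      mode |= BROADCAST_H;
   if (s1.w == 1 && s2.w > 1)
      mode |= BROADCAST_W;
   if (s1.c == 1 && s2.c > 1)
      mode |= BROADCAST_C;
   return mode;
}
```

Calling the function once as `broadcast_mode(ifm, ifm2)` and once as `broadcast_mode(ifm2, ifm)` corresponds to the "in each direction" use mentioned for the U85 path.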
Tomeu Vizoso
2a6d181bc6 ethosu: Fix scalar ADD on U85
They added new registers to the command stream, with a new bitfield
layout.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:59 +00:00
Tomeu Vizoso
818e1835d7 ethosu: map BOs at creation time and unmap at destruction
Map DRM buffer objects once at resource_create and unmap at
resource_destroy, instead of mapping them in buffer_map where they
were never unmapped. This fixes a virtual memory leak that caused
SIGBUS under heavy workloads by exhausting CMA.

Also remove unused phys_addr and obj_addr fields from ethosu_resource,
and add asserts on pipe_buffer_create return values.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
Tomeu Vizoso
f9cd399eb0 ethosu: Fix ublock selection for 8-bit depthwise/pooling on U85-256
For U85-256 with 8-bit IFM, Vela's _uBlockToOpTable restricts which
microblocks are valid per operation type:

  {2,2,8}  and {4,1,8}:  conv, matmul, vectorprod, reducesum, eltwise, resize
  {2,1,16}:              depthwise, pool, eltwise, reduceminmax, argmax, resize

Mesa's find_ublock() was not enforcing these constraints, allowing
{4,1,8} or {2,2,8} to be selected for depthwise/pooling based on
minimum waste. For depthwise ops with OFM shapes that aligned better
to {4,1,8}, the wrong ublock was chosen, causing incorrect weight
encoding and NPU hangs.

Fix by skipping {4,1,8} and {2,2,8} for depthwise/pooling operations,
matching Vela's operation-validity table.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
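The constraint described above, filtering candidate microblocks by operation type before picking the one with minimum waste, might look roughly like the sketch below. It is not Mesa's find_ublock(): the names, the two-class operation model, the order of the three dimensions, and the waste metric (padded volume minus actual volume) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified operation classes; the real table lists more operations. */
enum op_class { OP_CONV, OP_DEPTHWISE_OR_POOL };

struct ublock {
   unsigned dim[3];     /* same (assumed) dimension order as the OFM shape */
   unsigned valid_ops;  /* bitmask of op_class values */
};

/* Candidate microblocks for U85-256 with 8-bit IFM, from the table above. */
static const struct ublock u85_256_8bit[] = {
   { {2, 2, 8},  1u << OP_CONV },
   { {4, 1, 8},  1u << OP_CONV },
   { {2, 1, 16}, 1u << OP_DEPTHWISE_OR_POOL },
};

/* Returns the index of the valid candidate with the least padding waste,
 * or -1 if no candidate is valid for this operation class. */
static int
pick_ublock(enum op_class op, const unsigned ofm[3])
{
   assert(ofm[0] && ofm[1] && ofm[2]);
   int best = -1;
   uint64_t best_waste = UINT64_MAX;
   const unsigned count = sizeof(u85_256_8bit) / sizeof(u85_256_8bit[0]);

   for (unsigned i = 0; i < count; i++) {
      const struct ublock *u = &u85_256_8bit[i];
      if (!(u->valid_ops & (1u << op)))
         continue; /* skip microblocks not valid for this operation */

      uint64_t padded = 1, actual = 1;
      for (unsigned d = 0; d < 3; d++) {
         uint64_t blocks = (ofm[d] + u->dim[d] - 1) / u->dim[d];
         padded *= blocks * u->dim[d];
         actual *= ofm[d];
      }
      if (padded - actual < best_waste) {
         best_waste = padded - actual;
         best = (int)i;
      }
   }
   return best;
}
```

With an OFM shape of {4, 1, 8}, a convolution picks {4,1,8} with zero waste, while a depthwise operation is restricted to {2,1,16} even though it wastes more, which is the behaviour the fix enforces.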