Commit graph

52221 commits

Author SHA1 Message Date
Tomeu Vizoso
2cf3d0b273 ethosu: Add a separate scheduler for the U85
As the performance details have changed quite a bit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:57 +00:00
Tomeu Vizoso
82d4f21106 ethosu: Don't emit redundant state changes
Keep track of the state and only emit meaningful changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:56 +00:00
Tomeu Vizoso
8872f5eea4 ethosu: Add debug option for forcing U85 generation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:56 +00:00
Tomeu Vizoso
45fb8b99df ethosu: Invert lowering order of concatenation suboperations
Just so we match the order in which Vela assigns offsets to the FMs so
it's easier to diff cmdstream dumps.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:56 +00:00
Tomeu Vizoso
d66d2c05d3 ethosu: Switch to the weight encoder from Regor
We vendor the encoder used in the Regor compiler in Vela, and replace
the previous one that was used by the Python compiler and doesn't
support U85.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:55 +00:00
Tomeu Vizoso
410d74e078 ethosu: Compute is_partkernel during scheduling
As we need it for encoding the weights.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:55 +00:00
Tomeu Vizoso
3ade0a4dd6 ethosu: Make the UBlock sizes arch-specific
As U85 has a different configuration.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:55 +00:00
Tomeu Vizoso
91137a9327 ethosu: Let maxblockdeps be arch-specific
As U85 can have up to 7.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:54 +00:00
Tomeu Vizoso
0af37552a7 ethosu: Add U85 fields, these are compatible with the U65
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:54 +00:00
Tomeu Vizoso
47aa30276e ethosu: Update test expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:53 +00:00
Marek Olšák
ae9ea27e0d Rename *_sha1 names to *_blake3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
353fe94c0e Rename SHA1 words to BLAKE3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
102d41799b Rename more sha and sha1 names to blake3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
282bd2e6db Rename sha words to blake3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
d4831aaf5f Rename sha1_* and sha_* names to blake3_*
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
c0ac992a2a Remove mesa-sha1.h
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
53c64973e8 Inline _mesa_sha1_compute/format, remove the other unused ones
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
699f9d7066 Inline _mesa_sha1_init/update/final functions
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
3ae8f910ad Inline SHA1* functions, remove sha1.h
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
a965ada6ee Inline mesa_sha1, SHA1_CTX
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
0da88d237a Inline SHA1_DIGEST_STRING_LENGTH
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
110632f702 Inline SHA1_DIGEST_LENGTH
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
2283244975 nir: change export_amd intrinsics to use target instead of base
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Rob Herring (Arm)
6b26cc2df3 ethosu: Fix buffer overrun in stridedslice
The slice.begin array length matches the tensor depth which may be less
than 4.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:20 +00:00
Rob Herring (Arm)
5e93ab5477 ethosu: Support ReLU activation for ADD ops
ReLU activations require the minimum to be set to the zero point.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:20 +00:00
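The reason the minimum is the zero point: in affine quantization the real value 0.0 is stored as the zero point, so clamping at the zero point is the quantized equivalent of max(x, 0). A minimal sketch, assuming int8 activations; `relu_clamp_range` and `apply_relu` are hypothetical names for illustration, not functions from the driver.

```python
def relu_clamp_range(zero_point, qmin=-128, qmax=127):
    """Return the (min, max) clamp bounds for a quantized ReLU.

    qmin/qmax default to the int8 range, which is an assumption here.
    """
    return max(qmin, zero_point), qmax


def apply_relu(quantized_values, zero_point):
    """Clamp already-quantized values to the ReLU range."""
    lo, hi = relu_clamp_range(zero_point)
    return [min(max(q, lo), hi) for q in quantized_values]
```

With a zero point of -5, every stored value below -5 represents a negative real number and is raised to -5.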
Rob Herring (Arm)
3780fb8494 ethosu: Handle IFM2 H/W/D broadcast
If the IFM and IFM2 dimensions are not the same, then the H/W/D broadcast
needs to be enabled.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:20 +00:00
Rob Herring (Arm)
1cb46e9304 ethosu: Handle reversing IFM and IFM2 operands
IFM2 must be scalar or smaller than IFM. If not, then the operands need
to be swapped.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:19 +00:00
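A rough sketch of the operand ordering rule described in this commit. Comparing tensors by element count is an assumption made for illustration, and `order_operands` is a hypothetical helper; the driver may decide "smaller" differently.

```python
from math import prod


def order_operands(ifm_shape, ifm2_shape):
    """Return (ifm_shape, ifm2_shape, swapped) with the larger tensor first.

    "Smaller" is judged by total element count, which is an assumption.
    """
    if prod(ifm2_shape) > prod(ifm_shape):
        return ifm2_shape, ifm_shape, True
    return ifm_shape, ifm2_shape, False
```

The `swapped` flag is returned so a caller can account for the reversal when the operation is not commutative.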
Rob Herring (Arm)
d962160e95 ethosu: Add scalar ADD support
An input tensor can contain a single scalar value to add to the IFM.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:19 +00:00
Rob Herring (Arm)
5606fd1ea6 ethosu: Add support for 16-bit tensors
Ethos-U can support 16-bit tensors. So far the driver just assumed 8-bit
tensors.

There are a few cases where 32-bit tensors are supported, but exactly what
those are hasn't been determined, so just reject them for now.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:19 +00:00
Rob Herring (Arm)
c29860e9e9 test_teflon: Add 32-bit integer output comparison
Add support for 32-bit integer output comparison. This fixes several
test failures for movenetlightning and movenetthunder.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
2026-03-21 08:32:18 +00:00
Mike Blumenkrantz
4b2022a8f5 llvmpipe: fix color fbfetch
with the unlowering pass, there is no longer a separate gl_LastFragData variable,
so this workaround just breaks color outputs

fixes dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40437>
2026-03-20 20:02:35 +00:00
Lorenzo Rossi
caf5a2640b panvk,panfrost: Always emit ld_var_buf when possible
Previously the driver decided when the backend should use
LD_VAR_BUF[_IMM] instructions based on the total number of varyings
read, falling back to LD_VAR[_IMM] + descriptors when the varying index
could overflow the immediate index in the instructions.  That means that
even adding a single varying read could overflow the index and make
everything fall back to LD_VAR.

With this patch the backend decides when to use LD_VAR_BUF for each
varying load, reporting that decision to the driver.  This helps with
index overflows because only the instructions that actually overflow the
immediate use the LD_VAR fallback, leaving all other instructions on the
fast path.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
2026-03-20 18:47:11 +00:00
Yiwei Zhang
0aa6d727c9 llvmpipe: follow winsys handle attributes when imported with explicit layout
TILE_SIZE round up can conflict with explicit plane size, which assumes
a smaller alignment e.g. sw YV12

Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40426>
2026-03-20 17:53:30 +00:00
Yiwei Zhang
ec06d0b634 llvmpipe: drop unused dt_format
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40426>
2026-03-20 17:53:27 +00:00
Mike Blumenkrantz
929eb9a021 mesa/renderbuffer: always add PIPE_BIND_SAMPLER_VIEW to rendering textures
this fixes expectations around e.g., using u_blitter to copy textures

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40444>
2026-03-20 15:15:32 +00:00
Georg Lehmann
ec331cc48a nir: replace lower_ldexp with has_ldexp
I can't be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>
2026-03-20 08:15:08 +00:00
Rob Clark
55ee6aa57c freedreno/a6xx: Move A2D reg write to ncrb
It is not a 3d context reg.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40458>
2026-03-19 19:58:20 +00:00
Silvio Vilerino
6e39982b32 d3d12: d3d12_video_encode_support_caps was assigning a stack variable address to capEncoderSupportData in/out arg
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40511>
2026-03-19 19:06:51 +00:00
Silvio Vilerino
0e37a80aca d3d12: Truncate move_rects_support.bits.max_motion_hints 16 bit var to 65535, not 65536
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40511>
2026-03-19 19:06:51 +00:00
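A small illustration of why 65536 cannot serve as the cap for a 16-bit field: it needs 17 bits, so a plain store keeps only the low 16 bits. The helper names below are hypothetical and this models the arithmetic only, not the d3d12 code.

```python
UINT16_MAX = 0xFFFF  # 65535, the largest value a 16-bit unsigned field holds


def store_in_u16(value):
    """Model a plain store into a 16-bit field: higher bits are dropped."""
    return value & UINT16_MAX


def clamp_to_u16(value):
    """Saturate at the largest representable value instead of wrapping."""
    return min(value, UINT16_MAX)
```

Storing 65536 this way yields 0, whereas clamping first yields 65535.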
Thong Thai
525fcdfcee radeonsi: remove radeonsi prefix from si_pipe.h includes
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40518>
2026-03-19 18:47:29 +00:00
Anders Roxell
ea731cda12 ethosu: fix blockdep to check for data dependencies
calc_blockdep always returned MAX_BLOCKDEP without checking if the
previous op writes to a buffer the current op reads from. This let
the NPU start reading before the previous write was done.

Add overlap check between previous OFM and current IFM so we set
blockdep to 0 when they share the same buffer.

Update ethos-imx93-fails.txt to remove the tests that now pass.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:13 +00:00
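The dependency check described in this commit can be sketched as an interval overlap test. This is an illustration under stated assumptions, not the driver's code: the `Region` type, the `buffer_id`/`offset`/`size` fields, and the `MAX_BLOCKDEP` value of 3 are placeholders chosen for the example.

```python
from dataclasses import dataclass

MAX_BLOCKDEP = 3  # placeholder value, assumed for this sketch


@dataclass
class Region:
    buffer_id: int  # which buffer the feature map lives in
    offset: int     # start offset in bytes
    size: int       # length in bytes


def regions_overlap(a, b):
    """True when both regions are in the same buffer and their byte ranges intersect."""
    if a.buffer_id != b.buffer_id:
        return False
    return a.offset < b.offset + b.size and b.offset < a.offset + a.size


def calc_blockdep(prev_ofm, cur_ifm):
    """Return 0 when the current input reads what the previous op wrote."""
    if prev_ofm is not None and regions_overlap(prev_ofm, cur_ifm):
        return 0
    return MAX_BLOCKDEP
```

Adjacent regions that merely touch (one ends where the other begins) do not count as overlapping, so they keep the maximum block dependency.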
Anders Roxell
17435b6a58 teflon/tests: add micronet_large anomaly detection model
Downloaded from the Arm ML Zoo [1]. Per-channel quantized INT8 model
with 14 operators: CONV_2D (7x), DEPTHWISE_CONV_2D (5x),
AVERAGE_POOL_2D, RESHAPE. All per-op tests pass but the full model
fails due to a bug in synchronization of operations.

[1] https://github.com/Arm-Examples/ML-zoo/tree/master/models/anomaly_detection/micronet_large/tflite_int8

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:13 +00:00
Anders Roxell
7c1ec56427 ethosu: clean up ADD elementwise scaling
Replace the two functions simplified_elementwise_add_sub_scale and
eltwise_emit_ofm_scaling with a single advanced_elementwise_add_sub_scale
that follows the ethos-u-vela naming. Remove the large block of
commented out Vela Python code.

No functional change.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:13 +00:00
Anders Roxell
69d3f080be ethosu: fix RESIZE upscale mode
The upscale field was a bool which happened to work since true maps
to 1 which is NEAREST in the hardware. Change from bool to an enum
ethosu_upscale_mode so the intent is clear and we don't rely on the
bool-to-int mapping.

Also add a check in operation_supported so RESIZE only accepts 2x
upscaling since that's what the NPU can do with IFM_UPSCALE. Other
sizes fall back to CPU.

Keep the original zero_points from tensors in RESIZE and STRIDED_SLICE
instead of forcing them to 0 since the requantization needs them.

Fixes the RESIZE_NEAREST_NEIGHBOR operations in EfficientDet-Lite
models that use BiFPN with 2x nearest neighbor upsampling.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:12 +00:00
Anders Roxell
e27ba5b437 ethosu: Handle per-channel zero_points
fill_weights subtracted a single zero_point from all weights which
did not handle models with per-channel zero_points. Use the
per-channel zero_point for each output channel when available.

Also decouple the zero_points copy from the scales copy in the lower
pass so they are handled independently.

Suggested-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:12 +00:00
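A minimal sketch of the per-channel subtraction this commit describes. The data layout (one list of weights per output channel) and the convention that a single-entry `zero_points` list means per-tensor quantization are assumptions made for the example; this `fill_weights` only borrows the name and is not the driver function.

```python
def fill_weights(weights, zero_points):
    """Subtract the zero point from each weight.

    weights:     one list of integer weights per output channel
    zero_points: one entry per output channel, or a single entry that
                 applies to every channel (per-tensor quantization)
    """
    per_channel = len(zero_points) > 1
    result = []
    for oc, channel in enumerate(weights):
        zp = zero_points[oc] if per_channel else zero_points[0]
        result.append([w - zp for w in channel])
    return result
```

With weights `[[10, 12], [20, 22]]`, per-channel zero points `[10, 20]` centre both channels, while a single zero point `[10]` leaves the second channel offset.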
Anders Roxell
63c028b5e0 ethosu: Add support for per-channel quantization
For those models with coefficients that have different quantization
parameters for each channel.

The NPU can handle per-channel scales as can be seen in
fill_scale_and_biases(), which already iterates per output channel.

Activation tensors (input/output) don't have per-channel quantization.

- Add scales/zero_points arrays to ethosu_kernel struct
- Copy per-channel scales from weight tensor in lower pass
- Use per-channel scale when computing conv_scale in coefs
- Allow per-channel quantization in operation_supported check

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
2026-03-19 16:43:12 +00:00
Lorenzo Rossi
636aba5811 panfrost: Lower indirect derefs before lower_io
This will surely lose performance in some cases, this is a temporary fix
to align ourselves with how the Vulkan compiler works.  We might be able
to use indirect varyings directly in the future depending on how we
handle their memory layout.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Icenowy Zheng
af8923bb01 zink: skip all post-process when importing and resource_create fails
When the pipe_resource pointer returned by resource_create is NULL, the
process importing the handle into the underlying Vulkan driver is known
to have failed, and the handle importing process shouldn't continue.

Just return NULL in this case to prevent further check of pres being
non-NULL.

This also fixes the issue that the renderonly code lacks a check for non-NULL
pres, and the conversion of pipe_resource to zink_resource in the renderonly
codepath is now gone because a converted zink_resource is already available
above.

Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40490>
2026-03-18 16:34:10 +00:00
Robert Mader
0bbc26d2c4 llvmpipe: Stop aligning height to raster block size for unbacked handles
This code path is usually used by lavapipe when importing dmabufs, not
for output.
The resulting size_required is then used to calculate the size
requirements for VkMemoryRequirements2 etc. Requiring a multiple of
LP_RASTER_BLOCK_SIZE - 4 - can eventually result in lavapipe rejecting
dmabuf imports.

An example is YUV420 at a resolution of 1680x1050 produced by GStreamer
1.28 - e.g. from a screencast. In this case we currently compute a size
of 3235840, while other drivers like radv compute 3225600. The actual
size is 3227648, fitting into the latter but not the former.

Removing the alignment brings lavapipe in line with other drivers.

Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40424>
2026-03-18 16:20:16 +01:00
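The two computed sizes quoted in this commit can be reproduced with a short calculation. The strides are not stated in the commit: 2048 bytes for the luma plane and 1024 bytes for each chroma plane are assumptions, chosen because they yield exactly the quoted figures for a three-plane YUV420 layout. `yuv420_size` is a hypothetical helper.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def yuv420_size(height, luma_stride, chroma_stride, block=1):
    """Total bytes for one luma plane and two half-height chroma planes.

    block is the row alignment applied to each plane's height;
    block=1 means no alignment.
    """
    luma_rows = align(height, block)
    chroma_rows = align((height + 1) // 2, block)
    return luma_stride * luma_rows + 2 * chroma_stride * chroma_rows


# Assumed strides: 2048 (luma) and 1024 (chroma)
aligned = yuv420_size(1050, 2048, 1024, block=4)    # 3235840
unaligned = yuv420_size(1050, 2048, 1024, block=1)  # 3225600
```

Under these assumptions, rounding 1050 up to 1052 luma rows and 525 up to 528 chroma rows accounts for the extra 10240 bytes.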
Eric R. Smith
3945421c17 panfrost: fix typos in architecture detection
The preprocessor symbol we want is `PAN_ARCH`, not `MALI_ARCH`.

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40459>
2026-03-18 12:53:37 +00:00