When initializing a BO using a lazy VMA, the iova is provided by
the sparse VMA and is not allocated from the device's VMA heap.
Avoid calling util_vma_heap_free in the error path for such BOs
to prevent heap corruption and potential double-frees.
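
A minimal sketch of the ownership rule described above, not the actual turnip code. Every name here (sketch_heap, sketch_bo, iova_from_lazy_vma, sketch_bo_init_fail) is hypothetical and only stands in for the driver's real types; the point is that the error path hands an iova back to the heap only when the heap allocated it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device VMA heap; it only counts frees. */
struct sketch_heap {
   unsigned free_calls;
};

/* Hypothetical stand-in for a BO. */
struct sketch_bo {
   uint64_t iova;
   uint64_t size;
   bool iova_from_lazy_vma; /* true when the sparse VMA supplied the iova */
};

static void
sketch_heap_free(struct sketch_heap *heap, uint64_t iova, uint64_t size)
{
   (void)iova;
   (void)size;
   heap->free_calls++;
}

/* Error-path cleanup: free the range only if the heap owns it. */
static void
sketch_bo_init_fail(struct sketch_heap *heap, struct sketch_bo *bo)
{
   if (!bo->iova_from_lazy_vma)
      sketch_heap_free(heap, bo->iova, bo->size);

   bo->iova = 0;
   bo->size = 0;
}
```

With this shape, a failed init of a lazy-VMA BO leaves the heap untouched, while a failed init of a heap-backed BO frees its range exactly once.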
Fixes: 88d001383a ("tu: Add support for a "lazy" sparse VMA")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
set_iova() was called unconditionally after tu_bo_init(), even on the
failure path where the BO has been zeroed. This would call set_iova()
with res_id 0 and a stale iova, corrupting the iova mapping.
Move set_iova() into the success branch so it is only called when
tu_bo_init() succeeds.
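
The control-flow change can be illustrated with a small sketch. The sketch_* functions below are hypothetical stand-ins for tu_bo_init() and set_iova() and do not reflect their real signatures; the fail parameter exists only to force the failure path.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a BO. */
struct sketch_bo {
   uint32_t res_id;
   uint64_t iova;
};

/* Counts how often the iova mapping would have been updated. */
static unsigned sketch_set_iova_calls;

static void
sketch_set_iova(uint32_t res_id, uint64_t iova)
{
   (void)res_id;
   (void)iova;
   sketch_set_iova_calls++;
}

/* Pretend init: on failure the BO is zeroed, as the message describes. */
static int
sketch_bo_init(struct sketch_bo *bo, uint32_t res_id, uint64_t iova, int fail)
{
   if (fail) {
      memset(bo, 0, sizeof(*bo));
      return -1;
   }
   bo->res_id = res_id;
   bo->iova = iova;
   return 0;
}

static int
sketch_alloc(struct sketch_bo *bo, uint32_t res_id, uint64_t iova, int fail)
{
   int ret = sketch_bo_init(bo, res_id, iova, fail);

   /* Only publish the mapping when init succeeded. */
   if (ret == 0)
      sketch_set_iova(bo->res_id, bo->iova);

   return ret;
}
```

Guarding the call on the return value means a zeroed BO never reaches the mapping update.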
Fixes: db88a490b8 ("tu: Avoid extraneous set_iova")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
All BOs allocated from vkAllocateMemory are either local BOs or added
to the global BO list. Only BOs allocated internally should be added
to the per-cmdbuf list.
Verified this by doing a full CTS run with amdgpu.debug=0x1.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
The global BO list for app allocations has been enabled by default
since Mesa 25.3 and we didn't find any blockers, so make it
unconditional. Note that vkd3d-proton and Zink always used that
path and DXVK started to use it in August 2025 after requiring BDA.
This removes RADV_DEBUG=nobolist, which was only added as a debugging
aid when the global BO list became the default for app
allocations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
It's supported on GFX9+ and on GFX8+ with a specific fw version. It's
more correct with preemption.
Also rewrite the comment now that we have more information from Marek.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
These are already implemented by common code, so there's nothing to be
done here, really.
A few tests fail due to timeouts. This seems no different from
other drivers; we just skip fewer WSI tests than most drivers do. Skip
those for now.
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
On U85, both NPU_SET_IFM_BROADCAST and NPU_SET_IFM2_BROADCAST must be
emitted for elementwise operations, matching Vela's GenerateInputBroadcast.
Add calc_broadcast_mode() matching Vela's CalculateBroadcast(): broadcasts
a dimension of shape1 when it is 1 and shape2 is larger, producing a
broadcast_mode bitmask (H=1, W=2, C=4, SCALAR=8).
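
As a rough sketch of the rule just described, assuming a three-dimensional H/W/C shape: the struct, the enum names, and the sketch_ prefix are hypothetical, and this is neither the Mesa function nor Vela's code. The message does not say when the SCALAR bit is set, so the sketch never sets it.

```c
#include <stdint.h>

/* Bit values as given in the commit message. */
enum {
   SKETCH_BCAST_H = 1,
   SKETCH_BCAST_W = 2,
   SKETCH_BCAST_C = 4,
   SKETCH_BCAST_SCALAR = 8, /* not handled by this sketch */
};

/* Hypothetical shape type. */
struct sketch_shape {
   uint32_t h, w, c;
};

/* Broadcast each dimension of shape1 that is 1 while shape2 is larger. */
static uint32_t
sketch_calc_broadcast_mode(struct sketch_shape shape1,
                           struct sketch_shape shape2)
{
   uint32_t mode = 0;

   if (shape1.h == 1 && shape2.h > 1)
      mode |= SKETCH_BCAST_H;
   if (shape1.w == 1 && shape2.w > 1)
      mode |= SKETCH_BCAST_W;
   if (shape1.c == 1 && shape2.c > 1)
      mode |= SKETCH_BCAST_C;

   return mode;
}
```

Calling the helper once as (ifm, ifm2) and once as (ifm2, ifm) gives one mask per input, which corresponds to using it "in each direction".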
Split emit_ifm2_broadcast into U65 (legacy bitfields) and U85 paths.
The U85 path emits both IFM_BROADCAST and IFM2_BROADCAST using
calc_broadcast_mode in each direction.
Also fix emit_eltwise to call emit_ifm2_precision instead of
emit_ifm_broadcast for U85, which was emitting 0 instead of the
required IFM2_PRECISION register.
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
Map DRM buffer objects once at resource_create and unmap at
resource_destroy, instead of mapping them in buffer_map where they
were never unmapped. This fixes a virtual memory leak that caused
SIGBUS under heavy workloads by exhausting CMA.
Also remove unused phys_addr and obj_addr fields from ethosu_resource,
and add asserts on pipe_buffer_create return values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
For U85-256 with 8-bit IFM, Vela's _uBlockToOpTable restricts which
microblocks are valid per operation type:
  {2,2,8} and {4,1,8}: conv, matmul, vectorprod, reducesum, eltwise, resize
  {2,1,16}: depthwise, pool, eltwise, reduceminmax, argmax, resize
Mesa's find_ublock() was not enforcing these constraints, allowing
{4,1,8} or {2,2,8} to be selected for depthwise/pooling based on
minimum waste. For depthwise ops with OFM shapes that aligned better
to {4,1,8}, the wrong ublock was chosen, causing incorrect weight
encoding and NPU hangs.
Fix by skipping {4,1,8} and {2,2,8} for depthwise/pooling operations,
matching Vela's operation-validity table.
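
One way to picture the fix is a per-ublock validity mask that is checked before the waste comparison. This is a sketch under stated assumptions rather than the real find_ublock(): all names are hypothetical, only four of the operation types listed above are modelled, and the waste values are supplied by the caller instead of being computed from the OFM shape.

```c
#include <stddef.h>

/* Hypothetical subset of operation types. */
enum sketch_op {
   SKETCH_OP_CONV,
   SKETCH_OP_DEPTHWISE,
   SKETCH_OP_POOL,
   SKETCH_OP_ELTWISE,
};

#define SKETCH_OP_BIT(op) (1u << (op))

struct sketch_ublock {
   int dims[3];        /* microblock as written in the message */
   unsigned valid_ops; /* bitmask of sketch_op values */
};

/* U85-256, 8-bit IFM, restricted to the ops modelled here. */
static const struct sketch_ublock sketch_ublocks[] = {
   { { 2, 2, 8 },
     SKETCH_OP_BIT(SKETCH_OP_CONV) | SKETCH_OP_BIT(SKETCH_OP_ELTWISE) },
   { { 4, 1, 8 },
     SKETCH_OP_BIT(SKETCH_OP_CONV) | SKETCH_OP_BIT(SKETCH_OP_ELTWISE) },
   { { 2, 1, 16 },
     SKETCH_OP_BIT(SKETCH_OP_DEPTHWISE) | SKETCH_OP_BIT(SKETCH_OP_POOL) |
     SKETCH_OP_BIT(SKETCH_OP_ELTWISE) },
};

#define SKETCH_NUM_UBLOCKS (sizeof(sketch_ublocks) / sizeof(sketch_ublocks[0]))

/* Return the index of the valid ublock with the least waste, or -1. */
static int
sketch_find_ublock(enum sketch_op op, const unsigned waste[SKETCH_NUM_UBLOCKS])
{
   int best = -1;

   for (size_t i = 0; i < SKETCH_NUM_UBLOCKS; i++) {
      /* Skip ublocks that are not valid for this operation type. */
      if (!(sketch_ublocks[i].valid_ops & SKETCH_OP_BIT(op)))
         continue;
      if (best < 0 || waste[i] < waste[best])
         best = (int)i;
   }

   return best;
}
```

With this filter, a depthwise op whose shape would waste least with {4,1,8} still ends up on {2,1,16}, because the other two entries are never considered for it.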
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>