Simulator is crashing when receiving GPGPU + Pixel as resource barrier signal
stage, what according to spec is invalid.
So here replacing the pixel stage by color, over synchronizing it a bit but
keeping it functional.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14641
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40516>
Simulator hangs if a resource barrier has wait stage = None, HW seens
to don't care but something bad could be happning internaly.
So here making sure Wait stage is set to TOP when it is None.
Simulator hangs if a resource barrier has wait stage = None.
The HW seems to ignore it, but something bad could be happening internally.
So here I'm making sure the wait stage is set to TOP when it is None.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40516>
pan_nir_lower_fs_outputs already handles channel masks and vec3 and
smaller outputs. We don't need the extra precursor pass.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40544>
pan_nir_lower_fs_outputs() already does this so there's no need to have
it in a separate pass.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40544>
Clearing on graphics updates HiZ correctly and expanding it always
after the clear might hurt because it means HiZ will be disabled.
This probably helps performance with the full GFX12 HiZ WA.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40176>
To update HiZ properly during depth/stencil clears. There is a risk
but it's very minimal and it's also much better for performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40176>
Variants can modify which outputs get written so we must update
these fields otherwise spi_shader_col_format will be incorrect.
This can happen for instance with uniforms inlining:
uniform bool depth_only;
void main() {
if (depth_only) return;
...
}
When depth_only is true, this shader becomes empty after uniforms
inlining but spi_shader_col_format wasn't updated properly,
causing a hang.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14737
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40372>
Mesa CI currently builds some libwayland components as part of the
container images while installing others from the Debian repositories.
This results in a version mismatch: we build 1.24.0 locally, while
Debian Trixie provides 1.23.1.
Mixing these versions is unsupported and breaks the arm32 jobs.
Instead of building libwayland in CI, switch back to using the Debian
packages (now updated to 1.24.0 via ci-deb-repo).
This ensures a consistent version across build and test containers and
fixes the arm32 failures.
Note that Cuttlefish ships its own older libwayland_client.so without
wl_fixes. As a result, we still need to keep the
-D legacy-wayland=bind-wayland-display workaround for both x86_64 and
arm64, which effectively builds Mesa against an older Wayland interface.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15072
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40487>
move_rt_instructions() only makes sense for CPS recursive shaders, where
later rt_trace_ray calls can overwrite the current shader's RT system
values.
Running it on the function-call path can hoist load_hit_attrib_amd
above merged intersection writes, which corrupts any-hit
hitAttributeEXT. Move the pass into the existing CPS-only
non-intersection branch before nir_lower_shader_calls().
Fixes: c5d796c902 ("radv/rt: Use function call structure in NIR lowering")
Closes: #15074
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40531>
When vdrm_handle_to_res_id fails in virtio_bo_init_dmabuf, the handle
obtained from vdrm_dmabuf_to_handle was leaked.
Closing the handle is safe despite the lack of vdrm refcounting
because dma_bo_lock is held and already-imported BOs return early.
At this point, we are the sole holder of the handle.
While here, use the local vdrm variable consistently.
Fixes: 6ca192f586 ("turnip: virtio: fix iova leak upon found already imported dmabuf")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
In tu_bo_init, if growing the submit BO list fails, the GEM handle
must be closed. However, bo->gem_handle is only populated later
via compound assignment. Use the gem_handle parameter directly
to ensure the correct handle is closed and not leaked.
Fixes: d67d501af4 ("tu/drm/virtio: Switch to vdrm helper")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
When initializing a BO using a lazy VMA, the iova is provided by
the sparse VMA and was not allocated from the device's VMA heap.
Avoid calling util_vma_heap_free in the error path for such BOs
to prevent heap corruption and potential double-frees.
Fixes: 88d001383a ("tu: Add support for a "lazy" sparse VMA")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
set_iova() was called unconditionally after tu_bo_init(), even on the
failure path where the BO has been zeroed. This would call set_iova()
with res_id 0 and a stale iova, corrupting the iova mapping.
Move set_iova() into the success branch so it is only called when
tu_bo_init() succeeds.
Fixes: db88a490b8 ("tu: Avoid extraneous set_iova")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
All BOs allocated from vkAllocateMemory are either local BOs or added
to the global BO list. Only BOs allocated internally should be added
to the per-cmdbuf list.
Verified this by doing a full CTS run with amdgpu.debug=0x1.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
The global BO list for app allocations has been enabled by default
since Mesa 25.3 and we didn't find any blockers, so let's make it the
default for real. Note that vkd3d-proton and Zink always used that
path and DXVK started to use it in August 2025 after requiring BDA.
This removes RADV_DEBUG=nobolist which was added only for debugging
purposes since the global BO list was enabled by default for app
allocations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
It's supported on GFX9+ and on GFX8+ with a specific fw version. It's
more correct with preemption.
Also rewrite the comment now that we got more information from Marek.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
These are already implemented by common code, so there's nothing to be
done here, really.
A few tests fail due to timeouts. But this seems no different than on
other drivers, we just skip less WSI tests than most drivers does. Skip
those for now.
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
On U85, both NPU_SET_IFM_BROADCAST and NPU_SET_IFM2_BROADCAST must be
emitted for elementwise operations, matching Vela's GenerateInputBroadcast.
Add calc_broadcast_mode() matching Vela's CalculateBroadcast(): broadcasts
a dimension of shape1 when it is 1 and shape2 is larger, producing a
broadcast_mode bitmask (H=1, W=2, C=4, SCALAR=8).
Split emit_ifm2_broadcast into U65 (legacy bitfields) and U85 paths.
The U85 path emits both IFM_BROADCAST and IFM2_BROADCAST using
calc_broadcast_mode in each direction.
Also fix emit_eltwise to call emit_ifm2_precision instead of
emit_ifm_broadcast for U85, which was emitting 0 instead of the
required IFM2_PRECISION register.
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
Map DRM buffer objects once at resource_create and unmap at
resource_destroy, instead of mapping them in buffer_map where they
were never unmapped. This fixes a virtual memory leak that caused
SIGBUS under heavy workloads by exhausting CMA.
Also remove unused phys_addr and obj_addr fields from ethosu_resource,
and add asserts on pipe_buffer_create return values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>