Commit graph

220402 commits

Author SHA1 Message Date
Faith Ekstrand
fb7e1fe81c pan/bi: Always vectorize UBO access
Now that we claim 16B robustness alignments, we can vectorize UBO
access, even when robustness2 is enabled.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
3bbacfe8d7 panvk: Set min_ubo/ssbo_alignment in spirv_options
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
e52e7019b9 panvk: Increase robust buffer access alignments
We can't go any higher than 4B for SSBOs but we can go up to 16B for
UBOs.  This will let us start vectorizing UBO access, even when robust
because max-size loads (LD_PKA.i128) will never overrun a binding unless
they're entirely outside the binding.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
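
The overrun argument in e52e7019b9 can be checked with a small model. This is only a sketch: it assumes the bounds-checked binding size is a multiple of 16 bytes and that loads are 16-byte aligned relative to the binding start, and the helper names are hypothetical rather than taken from panvk.

```python
LOAD_SIZE = 16  # bytes read by a max-size load (LD_PKA.i128 in the commit message)


def classify_load(load_offset, binding_size):
    """Classify a LOAD_SIZE-byte load; offsets are relative to the binding start."""
    end = load_offset + LOAD_SIZE
    if end <= binding_size:
        return "inside"
    if load_offset >= binding_size:
        return "outside"
    return "straddles"


def any_straddle(binding_size, load_alignment):
    """Return True if some aligned load partially overruns the binding."""
    offsets = range(0, binding_size + 4 * LOAD_SIZE, load_alignment)
    return any(classify_load(off, binding_size) == "straddles" for off in offsets)
```

With a 48-byte binding, every 16-byte-aligned load is either fully inside or fully outside. With a 52-byte binding (only 4B-aligned in size), the load at offset 48 straddles the end.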
Faith Ekstrand
f350a69759 panvk: Track which dynamic buffers are SSBOs
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:36 +00:00
Faith Ekstrand
12e1f5d0ea panvk: Rework setting dyn_buf_offsets
There's no point in looping over all the descriptors.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:35 +00:00
Faith Ekstrand
813e399803 panvk: Reduce minTexelBufferOffsetAlignment
There are formats that require a 128B alignment but they're compressed
and not allowed for texel buffers.  The biggest texel size we can have
for a texel buffer is RGBA32, which is 16B.  The only reason why we
needed the large alignment was to work around a bug in the way we were
turning texel buffers into attribute descriptors on Bifrost.  That bug
is now fixed so we can reduce to a reasonable alignment requirement.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:35 +00:00
Faith Ekstrand
8471a90eb4 pan/buffer: Drop pan_buffer_view::offset
We can handle that inside pan_buffer.c and make the interface simpler.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:35 +00:00
Faith Ekstrand
ce56f49561 pan/buffer: Add the offset to the size for buffer textures
In the attribute model, the size is for the attribute binding and the
offset is an offset into that range.  If we're going to use that to
offset the buffer itself, we need to increase the size accordingly.

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:34 +00:00
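
The size adjustment in ce56f49561 can be illustrated with a toy bounds check. This sketch assumes the descriptor's range check compares `offset + byte` against `size`. The commit message does not spell that model out, and the function names are hypothetical.

```python
def attribute_fields_for_view(view_offset, view_size):
    """Fields for a view when the offset field is used to offset the buffer itself."""
    # The size covers the whole attribute binding, so it must include the offset.
    return {"offset": view_offset, "size": view_offset + view_size}


def in_bounds(fields, byte_in_view):
    """Assumed range check: the offset access must fall inside the binding size."""
    return fields["offset"] + byte_in_view < fields["size"]
```

For a 32-byte view at offset 64, bytes 0..31 pass the check and byte 32 fails. Leaving the size at 32 would make even byte 0 fail under this model.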
Faith Ekstrand
8dc458225b pan/bi: v2x16 conversions don't replicate
They swizzle just like anything else.  Technically, we could maybe do a
little better than the generic case for these since they only read 8
bits per 16 bits in the destination but the generic case is correct,
even if it isn't optimal.

Fixes: f7d44a46cd ("pan/bi: Optimize replication")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:34 +00:00
Faith Ekstrand
dbefdb2376 pan/bi/ra: Dump verbose debug logging to stderr
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40576>
2026-03-26 16:28:34 +00:00
Tomeu Vizoso
fc0770d5e3 ethosu: parse optional SRAM size from device spec string
The spec format is now GEN-MACS[-SRAM], e.g. "65-256-4096" or
"85-256". When the SRAM parameter is omitted it defaults to 0.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
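
A minimal sketch of parsing the GEN-MACS[-SRAM] spec string from fc0770d5e3. The function name and the tuple it returns are illustrative only; the real parser lives in the ethosu driver and may validate more.

```python
def parse_device_spec(spec):
    """Parse "GEN-MACS[-SRAM]" into (gen, macs, sram); sram defaults to 0."""
    parts = spec.split("-")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected GEN-MACS[-SRAM], got {spec!r}")
    gen = int(parts[0])
    macs = int(parts[1])
    sram = int(parts[2]) if len(parts) == 3 else 0
    return gen, macs, sram
```

For the two examples in the commit message, "65-256-4096" yields (65, 256, 4096) and "85-256" yields (85, 256, 0).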
Tomeu Vizoso
abd681c169 ethosu: add U85-256 support to ethosu_ml_device_create()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Tomeu Vizoso
3b68c5b4bc ethosu: move hardware description from ethosu_screen to ethosu_ml_device
Move target-specific fields (is_u65, ifm_ublock, ofm_ublock,
max_concurrent_blocks, sram_size) from ethosu_screen into
ethosu_ml_device. This decouples the compilation phase from the DRM
file descriptor and pipe_screen, allowing ahead-of-time compilation
where the target NPU is not present on the compilation host.

The ethosu_device_screen() helper is retained only for runtime paths
that need the DRM fd (buffer allocation, job submission, destroy).

Compilation code now accesses hardware parameters through
ethosu_ml_device() cast of pipe_ml_device, which can be created
either from a DRM-backed screen or standalone via
ethosu_ml_device_create() with a target string like "65-256".

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>
2026-03-26 16:13:23 +00:00
Qiang Yu
06e5026e28 docs: add GL_NV_timeline_semaphore support for radeonsi
Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
00b1d77176 radeonsi: advertise GL_NV_timeline_semaphore
Set max_timeline_semaphore_difference = UINT64_MAX when timeline syncobj
is supported and GFX uses the kernel queue path (not userq). The GL
state tracker auto-enables GL_NV_timeline_semaphore when this cap is
non-zero.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15113
Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
26418f0f58 radeonsi: add timeline semaphore support to fence operations
Thread timeline_point through si_add_fence_dependency and
si_add_syncobj_signal to the winsys. Remove the assert(!value)
guards in si_fence_server_sync and si_fence_server_signal so that
non-zero timeline point values are passed through to the winsys
fence dependency and signal lists.

Add PIPE_FD_TYPE_TIMELINE_SEMAPHORE_VK handling in si_create_fence_fd,
importing the fd as a syncobj (the timeline point is applied at
wait/signal time, not at import time).

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
379bf43084 winsys/amdgpu: use timeline syncobj chunks in kernelq submission
When has_timeline_syncobj is available, use AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_WAIT
with drm_amdgpu_cs_chunk_syncobj for dependencies and
AMDGPU_CHUNK_ID_SYNCOBJ_TIMELINE_SIGNAL for signals in kernelq submission.
This passes timeline point values from the fence lists through to the kernel.

Keep the existing binary SYNCOBJ_IN/SYNCOBJ_OUT path as fallback when
timeline syncobj is not available.

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
Qiang Yu
c4edd58a74 winsys/amdgpu: add timeline point support to fence lists
Add a parallel uint64_t *points array to amdgpu_fence_list to store
timeline semaphore point values alongside each fence. Point=0 means
binary semaphore (preserving existing behavior).

Update cs_add_fence_dependency and cs_add_syncobj_signal winsys
interfaces to accept a timeline_point parameter, and thread it
through to the fence lists. All existing callers pass 0.

Author: Claude Opus 4.6 <noreply@anthropic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40526>
2026-03-26 14:26:56 +00:00
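
The parallel-array layout described in c4edd58a74 can be sketched as follows. The class is a hypothetical stand-in for amdgpu_fence_list, not the winsys structure itself. The only behavior taken from the commit message is that each fence gets a matching point and that point 0 means a binary semaphore.

```python
class FenceList:
    """Fences plus a parallel list of timeline points (0 = binary semaphore)."""

    def __init__(self):
        self.fences = []
        self.points = []

    def add(self, fence, timeline_point=0):
        # Keep both lists the same length so index i describes one entry.
        self.fences.append(fence)
        self.points.append(timeline_point)

    def is_binary(self, index):
        return self.points[index] == 0
```

Callers that do not know about timelines keep passing nothing (point 0) and see the existing binary behavior.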
Icenowy Zheng
765a9f4fd9 pvr: Align width for PBE write when creating linear image
Even if a linear image isn't created with usages declaring PBE writes,
the image might be exported and then re-imported with a usage that
allows rendering to it.

Always align linear images' width for being written by PBE.

This fixes WSI creating surfaces with odd width, exporting them and
re-importing for rendering.

Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40250>
2026-03-26 14:08:10 +00:00
Georg Lehmann
0d8e2354ed nir: add fp_math_ctrl to convert_alu_types
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
8470bb59f6 lavapipe: preserve fp_math_ctrl when lowering cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
eef0fa22e0 brw: preserve fp_math_ctrl when lowering cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
123d8c230e nak: preserve fp_math_ctrl when lowering cmat
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
bcdef7c79b radv: preserve fp_math_ctrl when lowering cmat alu ops
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
b8b1ce9667 spirv: set fp_math_ctrl for cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
35ca85176c nir: add fp_math_ctrl to cmat alu ops
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
9cba104e11 nir/opt_fp_math_ctrl: use ddx/ddy fp_math_ctrl
No Foz-DB changes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
85ff60e68a nir/opt_uniform_subgroup: use ddx/ddy fp_math_ctrl
Foz-DB Navi48:
Totals from 16 (0.01% of 139781) affected shaders:
Instrs: 12432 -> 11597 (-6.72%)
CodeSize: 66204 -> 62440 (-5.69%)
Latency: 77168 -> 76132 (-1.34%)
InvThroughput: 8942 -> 8332 (-6.82%)
VClause: 302 -> 290 (-3.97%)
SClause: 207 -> 201 (-2.90%)
Copies: 553 -> 517 (-6.51%)
PreVGPRs: 589 -> 577 (-2.04%)
VALU: 8007 -> 7473 (-6.67%)
SALU: 1057 -> 900 (-14.85%)
VMEM: 407 -> 395 (-2.95%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Georg Lehmann
5d2be211ea nir: add fp_math_ctrl to ddx/ddy
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:49 +00:00
Georg Lehmann
854911aeab nir: add fp_math_ctrl as intrinsic index
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:49 +00:00
Georg Lehmann
d2be2fd4c1 nir/opt_fp_math_ctrl: ignore ffract input sign of zero
ffract(-0.0) = ffract(+0.0) = +0.0

Foz-DB Navi48:
Totals from 23 (0.01% of 205040) affected shaders:
Instrs: 12036 -> 11836 (-1.66%)
CodeSize: 58392 -> 57716 (-1.16%); split: -1.19%, +0.03%
Latency: 57532 -> 57204 (-0.57%); split: -0.61%, +0.04%
InvThroughput: 10399 -> 10217 (-1.75%)
VClause: 72 -> 70 (-2.78%)
Copies: 324 -> 335 (+3.40%)
PreVGPRs: 640 -> 646 (+0.94%)
VALU: 8561 -> 8364 (-2.30%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:49 +00:00
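
The identity quoted in d2be2fd4c1 can be reproduced with IEEE arithmetic. This sketch assumes ffract(x) is defined as x - floor(x) with a floor that preserves the sign of zero, and it handles finite inputs only. Python's math.floor returns an integer, so the helper below restores the sign.

```python
import math


def floor_keep_zero_sign(x):
    """floor() for finite floats that returns -0.0 for -0.0, as IEEE floor does."""
    f = float(math.floor(x))
    return math.copysign(0.0, x) if f == 0.0 else f


def ffract(x):
    """Assumed definition: x - floor(x)."""
    return x - floor_keep_zero_sign(x)


def is_positive_zero(x):
    return x == 0.0 and math.copysign(1.0, x) == 1.0
```

Both ffract(-0.0) and ffract(+0.0) come out as +0.0, because x - x rounds to +0.0 under round-to-nearest, so the sign of a zero input cannot be observed in the result.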
Juan A. Suarez Romero
18a63522d6 v3dv: fix mutable resolve attachment format mismatch
When a resolve attachment is created with VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT,
the render pass may use a view format that differs from the image creation
format (e.g. view=R16G16_SINT on an image created as B8G8R8A8_SRGB).

cmd_buffer_emit_resolve() was calling v3dv_CmdResolveImage2() which only
receives images but not the view format. This means that blit_shader()
will use the wrong format, causing misrenderings.

So instead of calling v3dv_CmdResolveImage2() directly, let's have an
intermediate function that receives both images and view formats to do
the resolve.

This fixes dEQP-VK.image.mutable.* failures.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40234>
2026-03-26 13:25:16 +01:00
Alejandro Piñeiro
473b99b1d1 broadcom/vulkan: remove v3dv_private.h
We recently split it into smaller sub-headers, but forgot to also
remove the header itself.

Fixes: 70728fce57 ("v3dv: split v3dv_private.h into smaller headers")

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40650>
2026-03-26 12:56:54 +01:00
Icenowy Zheng
441bb8b947 pvr: drop master for the display FD if it's not needed
Currently the display FD is opened twice because pvr_winsys_create() is
called twice. However, the WSI (which will do modeset on the display FD
in case of VK_KHR_display) is registered against the winsys created at
PhysicalDevice enumeration time, and the display FD opened at Device
creation time will only be used for allocating dumb buffers (which does
not require master privilege).

Add a parameter to pvr_winsys_create() to indicate whether the master
privilege is desired on the display FD, and pass true only when creating
the winsys for PhysicalDevice initialization.

Fixes VK_KHR_display operation on the PowerVR driver, which has been
broken since the WSI code started dropping master in commit 870e233ca5
("vulkan/wsi/display: Avoid holding drm master for the device's fd.").

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15161
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40640>
2026-03-26 17:39:01 +08:00
Robert Mader
44fa9c8326 nir/lower_tex: Reinstate LSB to MSB shift
lower_sx10_external and lower_sx12_external are used for
LSB aligned formats such as DRM_FORMAT_S010, which are typically
used by software decoders. Unlike MSB aligned 10/12 bit formats
used by hardware decoders such as P010 they need to manually
get "shifted" in order to correctly map to the 0-1 range.

In the commit mentioned below the corresponding code got removed,
probably because it got confused with similar sounding code in
the common path - and because we don't have tests on the CI for the
affected formats yet.

Note: the formats in question are not yet supported in Vulkan.

Fixes: 5127568b98 ("compiler/nir: use common ycbcr math")
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40561>
2026-03-26 09:05:40 +00:00
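
Why LSB-aligned samples need the shift described in 44fa9c8326 can be seen with plain integers. This is a sketch under assumptions: it models a 16-bit container sampled as UNORM and uses an integer left shift, whereas the actual NIR lowering may apply an equivalent scale in floating point. All names are hypothetical.

```python
CONTAINER_BITS = 16


def unorm16(value):
    """Normalize a 16-bit container value to the 0-1 range."""
    return value / ((1 << CONTAINER_BITS) - 1)


def lsb_to_msb(sample, bits):
    """Move a bits-wide LSB-aligned sample to the top of the 16-bit container."""
    return sample << (CONTAINER_BITS - bits)
```

The largest 10-bit sample, 1023, normalizes to about 0.016 when left LSB-aligned, but to about 0.999 after being shifted to 65472.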
Samuel Pitoiset
8483acdc29 radv/ci: add new jobs that run full VKCTS on NAVI21/NAVI31/GFX1201
They are only nightly jobs that run full VKCTS. The main advantage is
that we have mesh shaders coverage on NAVI31/GFX1201. It's still not
possible to enable that on pre-merge because of random GPU hangs.

Expect random GPU hangs on NAVI31/GFX1201 nightly jobs but I think
it's better than no coverage at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40626>
2026-03-26 08:39:33 +00:00
Samuel Pitoiset
749eb41b59 radv/ci: add a new dEQP test suite for nightly jobs
These jobs only skip the tests that are known to hang. The timeout is
also increased to 120s.

Also rename them to -full for less confusion.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40626>
2026-03-26 08:39:33 +00:00
Samuel Pitoiset
1f8ff31b81 radv/ci: move slow tests to radv-slow-skips.txt
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40626>
2026-03-26 08:39:33 +00:00
Casey Bowman
db34a92c48 intel/tools: Add xe3p format for intel_monitor
The kernel uses an updated buffer format for EU stall sampling on xe3p
GPUs, so this updates intel_monitor to use the correct format, leaving
room for any future format updates.

This also addresses an issue with not packing the formatted structure
with the correct macro, which led to incorrect offsets being used for
parsing the buffer.

BSpec: 79847

v2: Add BSpec reference number, suggested by Lionel

Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40622>
2026-03-26 07:31:09 +00:00
Zan Dobersek
42ea820b26 tu/a8xx: add missing register state in tu_clear_sysmem_attachments()
For depth clears of sysmem attachments to work properly, additional
register state is required in tu_clear_sysmem_attachments().

Fixes various CTS tests on a8xx:
  - dEQP-VK.conditional_rendering.draw_clear.clear.depth.*
  - dEQP-VK.api.image_clearing.core.clear_depth_stencil_attachment.*
    with FD_DEV_FEATURES=has_generic_clear=0, which will result in
    tu_clear_sysmem_attachments() fallback being used

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40542>
2026-03-26 06:59:03 +00:00
Alyssa Milburn
a6992c7bbe nv50,nvc0: Avoid uninitialized cbuf reads in blits
Overwrite the whole framebuffer cbuf rather than copying it from the
stack; fixes util_framebuffer_get_num_samples getting uninitialized
stack contents during validation.

Suggested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Alyssa Milburn <amilburn@zall.org>
Fixes: 2eb45daa9c ("gallium: de-pointerize pipe_surface")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14082
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39138>
2026-03-25 17:48:43 +00:00
Gurchetan Singh
32bd3a6e4e gfxstream: simple compile fix
Fixes:

hardware/google/gfxstream/guest/OpenglSystemCommon/HostConnection.h:32:
external/mesa3d/src/gfxstream/guest/platform/include/VirtGpu.h:132:40:
    error: implicit conversion changes signedness: 'unsigned int' to
           'const int32_t' (aka 'const int') [-Werror,-Wsign-conversion]

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40492>
2026-03-25 17:33:56 +00:00
Tomeu Vizoso
16e15ee205 gallium: add pipe_ml_device, pipe_screen::get_ml_device()
For compiling models, we don't really need a context for a real device.

To support ML frameworks in which model compilation happens
ahead-of-time (AoT), add API for compilation that doesn't require a
pipe_context.

Add struct pipe_ml_device with function pointers for:
- ml_operation_supported: query operation support
- ml_subgraph_create: compile a subgraph
- ml_subgraph_serialize: serialize a compiled subgraph
- ml_subgraph_destroy: free subgraph resources

Move ml_operation_supported, ml_subgraph_create, and
ml_subgraph_destroy from pipe_context to pipe_ml_device.

Add pipe_screen::get_ml_device() to obtain the device.

Change pipe_ml_subgraph.context (pipe_context*) to
pipe_ml_subgraph.device (pipe_ml_device*).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40167>
2026-03-25 16:58:05 +00:00
Tomeu Vizoso
1d4d1fc61d gallium: replace padding_same with per-side padding
Replace the boolean padding_same field in pipe_ml_operation.conv
and .pooling with explicit per-side padding fields: padding_top,
padding_bottom, padding_left, padding_right.

Frontends always compute these from their own padding representation
(e.g. TFLite same/valid, PyTorch (pad_h, pad_w)). Drivers use
them directly, removing the need for drivers to derive padding.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40167>
2026-03-25 16:58:05 +00:00
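
As an illustration of what a frontend might compute for 1d4d1fc61d, here is a sketch that derives the four per-side fields from TFLite-style same/valid padding. It uses the usual TensorFlow SAME rule, ignores dilation, and the function names are hypothetical. Only the padding_top/bottom/left/right field names come from the commit message.

```python
def same_padding(in_size, kernel, stride):
    """Return (before, after) padding along one axis for SAME padding."""
    out_size = -(-in_size // stride)  # ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2
    return before, total - before  # any odd extra pixel goes after


def conv_padding(mode, in_hw, kernel_hw, stride_hw):
    """Map a same/valid mode to explicit per-side padding fields."""
    if mode == "valid":
        top = bottom = left = right = 0
    elif mode == "same":
        top, bottom = same_padding(in_hw[0], kernel_hw[0], stride_hw[0])
        left, right = same_padding(in_hw[1], kernel_hw[1], stride_hw[1])
    else:
        raise ValueError(f"unknown padding mode {mode!r}")
    return {"padding_top": top, "padding_bottom": bottom,
            "padding_left": left, "padding_right": right}
```

A boolean could not express the asymmetric case: a 3x3 kernel with stride 2 on a height of 4 needs 0 rows on top and 1 at the bottom.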
Tomeu Vizoso
db866eca28 gallium: pipe_tensor.resource → pipe_tensor.data
Change the tensor backing storage from pipe_resource* to uint8_t*.

This simplifies tensor data management by using raw memory pointers
instead of pipe_resource objects. Frontends allocate tensor data with
malloc() and drivers access it directly, removing the need for
pipe_buffer_map/unmap for tensor data access.

We initially used resources thinking that the NPU would want to directly
access the data in those tensors. It is clear now that all NPUs will
need the data to be compressed and reformatted in some way, so let's
drop the inconvenient resources and just use allocated memory.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40167>
2026-03-25 16:58:04 +00:00
Eric R. Smith
1a6809936f panvk: remove a redundant conditional
We've already checked PAN_ARCH just above here.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
2026-03-25 15:05:53 +00:00
Eric R. Smith
a2e61ee1b9 pan: change image2DMSArray lowering to use Z instead of Y
We used to lower multisampled arrays to 3D images by adjusting the
height and the Y coordinate so that addressing samples became
addressing into the new base image. This worked for gallium, but
was never implemented for vulkan, and also had the disadvantages
that (a) we handled arrays and non-arrays differently, and
(b) the image height was restricted to 4096.

Change this so that we lower samples into the Z coordinate instead,
adding new layers for each sample. This requires that we know the
number of samples (so we have to save a sysval for this in gallium)
but means that we handle arrays and non-arrays the same. More
importantly, we can fit 3 bits to indicate the number of samples
into the attribute descriptor in Vulkan, so this scheme works
there as well as in OpenGL.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
2026-03-25 15:05:53 +00:00
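
The Z-based lowering in a2e61ee1b9 can be sketched as a coordinate mapping. The exact ordering of layer and sample within Z is an assumption here (layer-major, with samples of one layer adjacent), as are the function names. The commit message only states that each sample adds new layers.

```python
def lower_ms_coord(x, y, layer, sample, log2_samples):
    """Map (x, y, layer, sample) to (x, y, z) in the lowered image."""
    samples = 1 << log2_samples
    # Assumed layout: all samples of one layer are consecutive in Z.
    return x, y, layer * samples + sample


def lowered_depth(layers, log2_samples):
    """Number of Z slices needed for `layers` array layers."""
    return layers << log2_samples
```

A non-array image is simply the layers == 1, layer == 0 case, so arrays and non-arrays go through the same path, and the image height is left untouched.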
Eric R. Smith
968b6896d5 panvk: store number of samples in unused bits in the attribute descriptor
We reduce the number of bits used for pixel stride from 10 to 7. This
gives us space to store the log2 of the number of samples, which
we will need later.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
2026-03-25 15:05:53 +00:00
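
The bit budget in 968b6896d5 works out as 10 - 7 = 3 spare bits. The sketch below packs a 7-bit pixel stride and a 3-bit log2 sample count into one 10-bit field. The bit positions and helper names are hypothetical, since the commit message does not give the descriptor layout.

```python
STRIDE_BITS = 7   # bits kept for the pixel stride
SAMPLES_BITS = 3  # bits freed up for log2(samples)


def pack_stride_samples(pixel_stride, samples):
    """Pack stride (low bits) and log2(samples) (high bits) into 10 bits."""
    if not 0 <= pixel_stride < (1 << STRIDE_BITS):
        raise ValueError("pixel stride does not fit in 7 bits")
    if samples < 1 or samples & (samples - 1):
        raise ValueError("sample count must be a power of two")
    log2_samples = samples.bit_length() - 1
    if log2_samples >= (1 << SAMPLES_BITS):
        raise ValueError("log2(samples) does not fit in 3 bits")
    return pixel_stride | (log2_samples << STRIDE_BITS)


def unpack_stride_samples(field):
    """Inverse of pack_stride_samples: return (pixel_stride, samples)."""
    pixel_stride = field & ((1 << STRIDE_BITS) - 1)
    log2_samples = (field >> STRIDE_BITS) & ((1 << SAMPLES_BITS) - 1)
    return pixel_stride, 1 << log2_samples
```

Storing the log2 rather than the count is what lets sample counts up to 128 fit in 3 bits.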
Eric R. Smith
89288722e7 panfrost: add sysval for number of samples
Not really used yet, but we will need it later when we change how we
lower multisampled image arrays.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40460>
2026-03-25 15:05:53 +00:00
Icenowy Zheng
54860bb4c7 pco: fix encoding of fred's s0abs bit
The s0abs bit in the encoding of the fred instruction is wrongly set to
the status of the .neg modifier instead of the .abs modifier.

Fix this copy-n-paste error.

Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.4
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.45
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.4
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.45

Fixes: 8ec174b3f9 ("pco: add support for various selection, complex, trig ops")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40611>
2026-03-25 14:37:19 +00:00