fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 22:08:10 +02:00

Author	SHA1	Message	Date
Benjamin Cheng	34e090ae11	radv/video: Add low-latency flags radv equivalent of `62f07b8c`. Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40524>	2026-03-29 15:56:50 +00:00
Benjamin Cheng	917dff0b22	ac: Update FW required for variable slice mode There are some compatiblity issues with variable slice mode and preencode that are fixed with newer FW. Fixes: `d9ba641e28` ("ac: Add variable slice mode interface") Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40604>	2026-03-29 15:30:55 +00:00
Benjamin Cheng	bb6d57c90d	radeonsi/vcn: Reorder get_slice_ctrl_param This will need to depend on quality_modes.pre_encode_mode, so reorder the calls to make it possible. Fixes: `d9ba641e28` ("ac: Add variable slice mode interface") Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40604>	2026-03-29 15:30:55 +00:00
Alyssa Rosenzweig	aebd76415b	agx: drop NIR continue handling Since `31af989270` ("nir/lower_continue_constructs: Simplify loops before lowering continue constructs"), we never ingest loops with continues. That lets us delete a bunch of now dead code (and outdated comments) around control flow. This patch is part of the treewide effort to improve loops in NIR. I already sent the Intel patch earlier this week and this weekend hit delete here too. https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40609 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenz.ca> Reviewed-by: Mary Guillemard <mary@mary.zone> Tested-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40690>	2026-03-29 14:06:14 +00:00
Kenneth Graunke	ca3cabd2f8	brw: Use nir_texop_resinfo_intel for query_levels and txs This eliminates the need to special case query_levels. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40451>	2026-03-29 12:53:10 +00:00
Kenneth Graunke	0e143ae663	nir: Add nir_texop_resinfo_intel This is a combination of txs and query_levels in a single vec4 result. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40451>	2026-03-29 12:53:09 +00:00
Georg Lehmann	e7077e8f5c	nir/lower_non_uniform_access: fix fusing loops for same index but different array variable struct nu_handle is hashed and deduplicated using struct nu_handle_key, which ignored parent_deref. That means all instructions will use the first parent_deref when rewriting the sources. Avoid this by not including the parent deref in the struct, and instead querying it when needed. Fixes: `4d09cd7fa5` ("nir/lower_non_uniform_access: Group accesses using the same resource") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15173 Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40654>	2026-03-29 08:31:51 +00:00
Rob Clark	6fb261147b	freedreno: Add a829 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15124 Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40655>	2026-03-28 21:19:58 +00:00
Rob Clark	04f9a82705	freedreno/common: Drop gen8 0x78000 offset Initially I'd added the offset to make things match up to blob driver on x2-85/a840. But this gets in the way on parts with smaller GMEM. Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40655>	2026-03-28 21:19:58 +00:00
Yiwei Zhang	a2e42eff52	ci/panvk: update expectations with new flakes Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40682>	2026-03-28 20:16:09 +00:00
Yiwei Zhang	73c9d35644	panvk: hide swapchainMaintenance1 behind WSI guard Fixes: `9ec387efb1` ("panvk: advertise wsi maintenance extensions") Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40682>	2026-03-28 20:16:09 +00:00
Marek Olšák	c361c82a5a	radeonsi: draw using a single triangle in u_blitter This fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency* when not using the rectangle path. Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>	2026-03-28 18:47:55 +00:00
Marek Olšák	6ce1b12a76	radeonsi: sink si_get_pipe_constant_buffer in si_blitter_begin Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>	2026-03-28 18:47:55 +00:00
Marek Olšák	7f846bc50a	radeonsi: remove always-set SI_SAVE_FRAGMENT_STATE Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>	2026-03-28 18:47:55 +00:00
Marek Olšák	2dc65308f8	radeonsi: add 64K texture support to gfx blits Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>	2026-03-28 18:47:55 +00:00
Marek Olšák	918e5764f4	radeonsi: disable streamout queries for u_blitter Cc: mesa-stable Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>	2026-03-28 18:47:55 +00:00
Marek Olšák	556ceb1b75	radeonsi: fix blits via util_blitter_draw_rectangle It didn't save states properly. The only correct place to save them is si_blitter_begin. Unfortunately, we can't skip saving and restoring those states because we don't know in advance whether the rectangle path will be used. Cc: mesa-stable Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>	2026-03-28 18:47:54 +00:00
Marek Olšák	ea9a31cc8c	gallium/u_blitter: allow using the single triangle for scaled blits too This should be faster because 2 triangles are inefficient on the diagonal, generating helper invocations and potentially extra memory loads from dst because tiles aren't fully covered. Reviewed-by: Pierre-Eric Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40633>	2026-03-28 18:01:40 +00:00
Natalie Vock	1f9bc71051	radv/rt: Remove RADV_OFFSET_UNUSED RADV_OFFSET_UNUSED became unused, itself. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39985>	2026-03-28 16:48:46 +01:00
Natalie Vock	579feda38b	radv/rt: Fix cases in which the bound BVH build pipeline gets clobbered The most egregious case was AS updates, in which case radv_copy_memory would decide to use compute, which overwrites the bound pipeline with a copy shader. Subsequent dispatches assumed the update pipeline to be bound, but dispatched another copy shader instead. There is also a chance of this happening for geometry info copying for RRA, so add another pass for that. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39985>	2026-03-28 16:48:46 +01:00
Natalie Vock	e713527aa9	vulkan: Bump MAX_ENCODE_PASSES RADV needs one more encode pass for a bugfix in the next commit. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39985>	2026-03-28 16:12:09 +01:00
Natalie Vock	6f80027447	vulkan: Rename {encode,update}_bind_pipeline to {encode,update}_prepare Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39985>	2026-03-28 16:12:09 +01:00
Icenowy Zheng	ee031d67b4	pvr: fix dirty tracking for stencil ops The dirty state of stencil ops is not checked when deciding whether to rebuild the ISP state, although the values are part of the ISP state (the 27:16 bits of ISPB word). Add MESA_VK_DYNAMIC_DS_STENCIL_OP to the condition for rebuilding ISP control registers. Fixes GLCTS tests when running on top of Zink: dEQP-GLES2.functional.fragment_ops.stencil.zero_stencil_fail Fixes: `88f1fad3f7` ("pvr: Use common pipeline & dynamic state frameworks") Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40623>	2026-03-28 19:39:01 +08:00
Icenowy Zheng	71880a2911	pvr: support VK_EXT_non_seamless_cube_map When running GLES2 conformance tests with Zink on the PowerVR driver, I found that the PowerVR driver has the same kind of weird behavior of not ignoreing wrap mode for seamless cubes with Apple AGX (See !21978 for the description of the quirk on AGX). As GLES2 exposes non-seamless cubes, exposing non-seamless cube support at PowerVR help seems to help lot about these GLES2 tests. Implementing full GLES 3 and relying on the workaround for AGX is another choice, but it's still too far. Implementing non-seamless cube seems to be as easy as setting a bit in the sampler control word, so do it. Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40660>	2026-03-28 11:17:12 +00:00
Zan Dobersek	468113efd4	fd/replay: kgsl context should use no-fault tolerance, report reset state Use KGSL_CONTEXT_NO_FAULT_TOLERANCE to push context into an error state when a GPU fault is detected. This is useful when dealing with replays of captures that are producing a GPU fault but might seem to replay just fine because of the KGSL kernel fault tolerance. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40667>	2026-03-28 07:58:05 +00:00
Olivia Lee	8d5ba04e65	panvk/csf: use different resource registers for precomp vs user dispatch This allows us to avoid dirtying all of the state for user compute dispatches when we run a precomp shader. Signed-off-by: Olivia Lee <olivia.lee@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37970>	2026-03-28 03:53:41 +00:00
Icenowy Zheng	ea783b4691	vulkan/wsi/headless: implement wait_for_present for swapchain The VK_KHR_present_wait extension contains no functionality to announce (the lack of) support for vkWaitForPresentKHR() on a WSI (or WSI-bound object) granularity. On any driver advertising that extension and the headless WSI, the application will expect vkWaitForPresentKHR() to be usable with the headless WSI, which leads to assertion failure in debug Mesa builds or crash in release builds. Create a trivial wait_for_present implementation for the headless WSI, which just assumes the image is immediately presented at the time of queue_present is called, so it only checks the WSI present semaphore. Tested with `dEQP-VK.wsi.headless.present_id_wait.wait.*` on RADV without any failures. Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40347>	2026-03-27 19:55:11 +00:00
Icenowy Zheng	2f540283b3	vulkan/wsi/headless: properly cleanup swapchain init failure Currently the wsi_headless_surface_create_swapchain() function abuses the corresponding destroy function to perform cleanup operations when any failure happens during images creation. This practice sounds fragile and prevents further changes to the swapchain creation procedure. Implement a proper cleanup sequence to reverse all operations. As another cleanup codepath above already contains call of vk_free(), the call is changed to a goto targetting the corresponding label. Regression tested with `dEQP-VK.wsi.headless.swapchain.simulate_oom.*` on RADV. Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40347>	2026-03-27 19:55:11 +00:00
Lorenzo Rossi	c0e0591999	pan/compiler: Replace frag_coord_zw_pan with var_special_pan Just a bit cleaner, and we can unify point size too. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40677>	2026-03-27 19:23:02 +00:00
Lorenzo Rossi	5be2b03b88	pan/compiler: Add bound assert on emit_split_i32 This could've saved me a lot of time debugging stack corruption. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40677>	2026-03-27 19:23:02 +00:00
Maíra Canal	691cfe40fa	v3d: use devinfo->page_size for state uploader default size The state uploader was hardcoded to 4096 bytes, which doesn't fill the full page on systems with 16KB pages. Use devinfo->page_size instead so the uploader default matches the actual allocation granularity. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>	2026-03-27 18:54:29 +00:00
Maíra Canal	4db32305ec	v3d: Rename cle_buffer_min_size to page_size The variable doesn't store a granularity specific to CLE buffers. It stores the granularity that the OS imposes on buffer allocations (that is, the OS page size). Therefore, rename the variable to best reflect its meaning. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>	2026-03-27 18:54:29 +00:00
Maíra Canal	bfe92d50ce	v3d: sub-allocate sampler view texture state from state uploader Previously, each sampler view allocated a dedicated BO for its, TEXTURE_SHADER_STATE packet (~24 bytes), which got rounded up to a full 4KB page. This wastes memory and inflates the per-job BO handle count. Use u_upload_alloc_ref() to sub-allocate texture shader state from the shared state_uploader, matching the pattern already used by image views. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>	2026-03-27 18:54:29 +00:00
Maíra Canal	751e0d26ec	v3d: use the state uploader for the image view texture shader state From the documentation, the state uploader should be used inside the driver for long-term state inside buffers, while the stream uploader should be used by Gallium's internals. Considering that the image view texture shader state can be considered long-lived state data, use `state_uploader` instead of `uploader` for consistency. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>	2026-03-27 18:54:29 +00:00
Rob Clark	b76678cddd	freedreno/a6xx: Fix supported-blit fmt check Fixes some KHR-GLES.core.internalformat.texture2d. failures. Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40665>	2026-03-27 17:48:21 +00:00
Julia Zhang	32d04bcdcd	vulkan: return pQueue with matching flags Searching device->queues only according to queueIndex and queueFamilyIndex could cause this issue: if there are two queues A and B created with same queueIndex and queueFamilyIndex but different flags. When user try to get B but vk_foreach_queue loop return A when it get A and find it have the request queueIndex and queueFamilyIndex. So this add a check of queue flags and return the queue with matching flags, queueIndex and queueFamilyIndex. Signed-off-by: Julia Zhang <Julia.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40669>	2026-03-27 17:08:01 +00:00
Trigger Huang	007cfd138d	vulkan/queue: pass protected submit info to driver Pass application's protected submission info to driver Signed-off-by: Trigger Huang <Trigger.Huang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40669>	2026-03-27 17:08:01 +00:00
Samuel Pitoiset	dede14cce3	radv: advertise VK_KHR_device_address_commands Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40386>	2026-03-27 16:17:02 +00:00
Samuel Pitoiset	a97c889a7b	radv: implement VK_KHR_device_address_commands Because there is no way to know where the address has been allocated (GTT or VRAM), the existing entrypoints aren't dropped and the sparse bit is derived from VK_ADDRESS_COMMAND_FULLY_BOUND_BIT_KHR. It would be nice to figure out if the CP DMA vs compute heuristic for GTT BOs on dGPUs could be removed to simplify this implementation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40386>	2026-03-27 16:17:02 +00:00
Samuel Pitoiset	479a992b02	radv: replace radv_copy_flags by VkAddressCopyFlagsKHR Same meaning. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40386>	2026-03-27 16:17:02 +00:00
Samuel Pitoiset	72ac5e6d29	radv/ci: fix a typo in radv-navi10-vkcts-full Oops. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40679>	2026-03-27 15:53:39 +00:00
Samuel Pitoiset	566e4c25d9	radv/ci: fix radv-slow-skips.txt path This was causing issues with personal branches. Suggested-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40676>	2026-03-27 14:53:37 +00:00
Rhys Perry	3b52d61bb0	radv: don't copy radv_vertex_input_state in CmdSetVertexInputEXT This doubles vkoverhead's draw_16vattrib_change_dynamic performance. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40603>	2026-03-27 13:38:29 +00:00
Georg Lehmann	ae2968c4ec	aco: allow spilling to LDS in RT shaders without stack pointer No Foz-DB changes because most RT shaders use function calls now. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36367>	2026-03-27 13:08:44 +00:00
Georg Lehmann	133ef9f94b	aco: spill VGPRs to LDS if it doesn't further limit occupancy Only use LDS for VGPR spilling if we can use addtid access, to avoid having a VGPR addr. Limit to single wave workgroups, to avoid needing the wave_id for the offset. If we have a scratch stack pointer, don't use LDS at all. Limit LDS spilling to not reduce occupancy further. Note that in theory, this can still limit occupancy of other shaders running on the CU at the same time, but that's unlikely and impossible to know at this point. Removes all scratch usage in emulated FSR4 and parallel_rdp. Besides that, only a single GoW shader is affected. Foz-DB Navi31: Totals from 9 (0.01% of 114641) affected shaders: Instrs: 68863 -> 68830 (-0.05%); split: -0.07%, +0.02% CodeSize: 416108 -> 416000 (-0.03%); split: -0.05%, +0.02% LDS: 2048 -> 45056 (+2100.00%) Scratch: 261888 -> 220672 (-15.74%) Latency: 727951 -> 657155 (-9.73%); split: -9.73%, +0.00% InvThroughput: 418644 -> 383269 (-8.45%) VClause: 1506 -> 1200 (-20.32%) Copies: 10651 -> 10624 (-0.25%) VALU: 48700 -> 48684 (-0.03%) SALU: 6200 -> 6199 (-0.02%); split: -0.05%, +0.03% VMEM: 4139 -> 3589 (-13.29%) VOPD: 580 -> 574 (-1.03%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36367>	2026-03-27 13:08:44 +00:00
Pavel Ondračka	56a6528744	r300/ci: expectation update Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40671>	2026-03-27 10:48:55 +01:00
Tomeu Vizoso	e23fcc1464	ethosu: implement ml_device_destroy for standalone ML device Use ralloc_free to release the device allocated by ethosu_ml_device_create(). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>	2026-03-27 09:35:40 +01:00
Tomeu Vizoso	f06b4dbe33	gallium: add ml_device_destroy callback to pipe_ml_device Add a destroy callback so that standalone ML devices created via *_ml_device_create() can properly free their resources. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>	2026-03-27 09:35:40 +01:00
Tomeu Vizoso	f0e4ccf664	ethosu: handle NULL bias tensor in convolution PyTorch Conv2d without explicit bias produces a NULL bias_tensor in the Gallium pipe_ml_operation. Guard against NULL dereferences in two places: - ethosu_lower.c: pass NULL to fill_coefs when bias_tensor is NULL - ethosu_coefs.c: treat missing biases as zero Fixes crashes when running Conv2d models without bias through the Ethos-U NPU backend. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>	2026-03-27 09:33:52 +01:00
Tomeu Vizoso	e0b401aa87	ethosu: implement ml_subgraph_deserialize() Add ethosu_ml_subgraph_deserialize() which reconstructs a subgraph from a serialized byte buffer. Parses the header (cmdstream size, coefs size, io size, tensors size), restores the tensor array, cmdstream, and coefficient buffers. DRM buffer object creation is deferred to prepare_for_submission() which is called lazily on first invoke. Wire pctx->ml_subgraph_deserialize in ethosu_create_context(). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>	2026-03-27 09:33:52 +01:00

220480 commits