fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-21 03:48:22 +02:00

Author	SHA1	Message	Date
Autumn Ashton	0dfe4aaa24	nvk: Allow nvk_cmd_upload_qmd to take a custom root descriptor The cubin kernel launches need to use a root descriptor that's compatible with the bytecode that nvcc generates which contains block dim, grid dim and the kernel params at specific layouts which can be influenced by ELF .nv.info attributes. Thus, expose the ability to input custom root descriptors in nvk_cmd_upload_qmd. Signed-off-by: Autumn Ashton <misyl@froggi.es> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com> Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Karol Herbst <kherbst@redhat.com> Tested-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40686>	2026-06-19 19:01:38 +00:00
Autumn Ashton	acfea9e03f	nak: Expose max_warps_per_sm Previously this was only accessible from Rust, but VK_NVX_binary_import needs to calculate this for imported cubin kernels from EIATTR_REGCOUNT. Signed-off-by: Autumn Ashton <misyl@froggi.es> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com> Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Karol Herbst <kherbst@redhat.com> Tested-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40686>	2026-06-19 19:01:38 +00:00
Adrián Larumbe	c95edade04	panvk: Talk directly to pankmod when binding sparse resources There's no longer need for the panvk_sparse library, or for panvk to care about whether the KMD can do native sparse mapping. Submit sparse VM bindings as a single operation and let pankmod handle the gory details. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:31 +00:00
Adrián Larumbe	19cf49f02f	panvk: Use pankmod instead of panthor drm interfaces in bind queues On top of that, leverage the new push/flush interface so that management of the black hole in older KMD versions can be handled by the pankmod layer. Merging of operations is now done in conjunction with buffering the latest submission, so that the very last operation can have its signal syncs assigned before being delivered to the pankmod layer. Co-developed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:31 +00:00
Adrián Larumbe	306539a3c7	pan/kmod: Introduce vm_op buffering and sparse mapping emulation The goal is moving the need for prebuffering when the total number of vm_bind operations isn't known in advance away from panvk, and into the pankmod layer, and also to consolidate that treatment in a single place. At the moment, both panvk_vX_bind_queue.c and panvk_sparse.c roll their own workarounds for the blackhole-mapping sparse bind mechanism. For older KMD versions with no sparse mapping support, emulate it by cyclically mapping over a dummy BO, which is allocated on demand and per VM. This behaviour is similar to that of the Panthor KMD. This moves responsibility over whether to use native KMD sparse mapping or the blackhole method into the pankmod layer, so that the sparse mapping mechanism is transparent to the Vulkan driver. Also disallow automatic VA assignment when sparse emulation is required, because relaying auto va's back to the caller is both cumbersome and unsafe, and also not a practical use case. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:31 +00:00
Adrián Larumbe	831bc9bb65	pan/kmod: Introduce sparse binding Register whether the underlying KMD supports sparse mappings in a device property. Add a new VM operation field that holds flags, for the time being only sparse is a valid operation modifier. Disallow sparse operations when an automatic VA is requested or when a BO is provided accidentally. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:31 +00:00
Adrián Larumbe	f20b7adbca	pan/kmod: Handle sync object signals in Panthor's vm_bind A future commit will want to have a binary sync object attached to a vm_bind operation or a sync operation only, so rather than creating a separate pankmod flag for it, we simply check the point (always 0). Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:31 +00:00
Adrián Larumbe	1dd5ea7feb	pan/kmod: Pass signal and wait syncs separately This is done in preparation of kmod support for blackhole sparse mappings. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:31 +00:00
Adrián Larumbe	9ded6f3d38	pan/kmod: Use kernel-reported page sizes for new VM when available Instead of hard-coding available page sizes in UM, have pan_kmod backends query the KMD when these are exposed by the kernel. This is not yet done for Panfrost, but it might be added soon. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:30 +00:00
Adrián Larumbe	1ca84d67ac	drm-uapi: Sync the panthor header Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40400>	2026-06-19 18:20:30 +00:00
Juan A. Suarez Romero	df96a100ae	v3dv: fix assertion on push constants Fixes a compiler warning regarding the assertion. Fixes: `6d6a3ab679` ("v3dv: asserts push constants data is valid") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42269>	2026-06-19 18:01:40 +00:00
David Rosca	08c2bb3b31	radeonsi/mm: Set correct usage in si_dec_fill_surface Fixes: `26979becec` ("radeonsi/video: Add video decoder using ac_video_dec") Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42149>	2026-06-19 15:18:03 +00:00
David Rosca	6cd7dd852a	radeonsi/mm: Only setup ref surfaces with tier3 For lower tiers this adds unnecessary dependency on the ref surface. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15630 Fixes: `26979becec` ("radeonsi/video: Add video decoder using ac_video_dec") Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42149>	2026-06-19 15:18:03 +00:00
David Rosca	61d2f8d0f1	radeonsi/mm: Return error when decoding H264 P/B frame with no refs The firmware expects at least one valid reference when decoding P and B frames, otherwise it may pagefault. If the app doesn't handle missing references by using dummy surfaces, error out when trying to decode such frame. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15659 Fixes: `26979becec` ("radeonsi/video: Add video decoder using ac_video_dec") Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42256>	2026-06-19 14:57:39 +00:00
Matt Turner	5bb025f953	gallivm: fix small_unorm -> unorm8 fetch path on big-endian Two bugs in lp_build_fetch_rgba_aos's small_unorm fast path: - vector_justify=true shifted the loaded value into the MSB of the wider type on big-endian. The format_desc already carries big-endian-corrected channel shifts, so the extra shift broke channel extraction for sub-32-bit formats (e.g. R8G8B8, B5G5R5). - The output OR-loop packed channels assuming little-endian byte order (shift = j * width), so after bitcast to vec4-u8 on big-endian the alpha channel landed at byte[0] instead of byte[3]. The fix is simple: gather with vector_justify=false so format_desc shifts apply directly; use (3-j)*width on UTIL_ARCH_BIG_ENDIAN to match the memory layout that big-endian bitcast produces. This fixes the lp_test_format test on big-endian platforms. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42228>	2026-06-19 14:17:27 +00:00
Matt Turner	a9d70225ec	gallivm: fix lp_build_round on altivec/VSX The software fallback in lp_build_round (used when arch_rounding_available returns false, e.g. altivec with length < 4) used lp_build_iround's bias-and-truncate path, which rounds half-away-from-zero due to float32 rounding of the (a + nextafterf(0.5)) sum. This caused lp_test_arit failures for v1 and v2 vector widths on ppc64. For altivec/VSX, llvm.nearbyint lowers to vrfin (AltiVec) or xvrspic (VSX) — both single instructions that round to nearest-even — for any vector width. Use it in the else branch when has_altivec is set, preserving the lp_build_iround path for x86 pre-SSE4.1 where llvm.nearbyint would expand to scalar nearbyintf calls. Update the length==2 expected-failure condition in lp_test_arit to exclude altivec (now fixed), keeping it for other platforms that still use the software fallback. This fixes the lp_test_arit test on ppc64. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42227>	2026-06-19 13:46:11 +00:00
Juan A. Suarez Romero	07b53cd328	v3d/ci: update expected results and document failures Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42292>	2026-06-19 12:31:56 +00:00
Job Noorman	99a268c889	ir3/lower_vars_to_scratch_global: use stable sort for variables To ensure we pick variables to spill deterministically. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42168>	2026-06-19 12:00:49 +00:00
Job Noorman	3596d63338	nir/lower_vars_to_scratch_global: make callback deterministic We pass the found variables as a pointer set to the driver. Since the callback is supposed to be used for global decisions, the driver might end up picking different variables based on the (non-deterministic) iteration order of the set. Fix this by passing the variables as a util_dynarray instead. To make sure the contents of the util_dynarray don't have to be shuffled around every time the driver wants to remove a variable from it, introduce nir_variable::pass_flags that we use to create an intrusive ordered set using a util_dynarray. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42168>	2026-06-19 12:00:49 +00:00
Eric Engestrom	eec4f5712d	ci: fix the fix for perfetto download in `make-git-archive` nightly job The previous fix used `grep -P` which is not supported by the grep implementation used in this job, so replace it with `grep -E` + `cut` which is supported by that implementation. Fixes: `df3756e6dc` ("ci: fix perfetto download in `make-git-archive` nightly job") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42331>	2026-06-19 11:22:49 +00:00
Christian Gmeiner	ad4e3cad54	etnaviv: Gate 128-bit render targets on HALF_FLOAT 128-bit render targets are emulated as paired G32R32F targets. There is no integer 64-bit PE format, so the integer formats also render through G32R32F, as the blob does. The real hardware requirement is the half-float pipe that provides G32R32F, so gate on HALF_FLOAT instead of the conservative halti5 level. This enables the formats on older GPUs that have the pipe. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:24 +00:00
Christian Gmeiner	57f5acf849	etnaviv: Support split sampler for 128-bit formats on the state path 128-bit formats (RGBA32) are emulated as two stacked G32R32 planes. The bound sampler reads the RG plane and a companion sampler reads the BA plane, which etna_nir_lower_128bit(..) reassembles in the shader. Only the descriptor path set up the companion, so the state path could not sample these formats. Set up the companion on the state path too and share companion_slot(..) between both paths. The real requirement is the plane format, not the descriptors. The float plane G32R32F samples through the half-float pipe, so gate it on HALF_FLOAT and advertise GL_OES_texture_float, also on halti2 GPUs like GC3000. The integer plane G32R32I needs halti5, so keep the integer formats there. The KHR-GLES2 internalformat tests for sized RGB32F/RGBA32F need an ES3 context, so list them as expected fails on GC3000 too. Verified on GC7000 with and without ETNA_MESA_DEBUG=no_texdesc. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:24 +00:00
Christian Gmeiner	0783eaf6d6	etnaviv: Set per-RT sRGB bit on non-zero render target slots sRGB encoding was only handled through the global PE.LOGIC_OP SRGB bit, which the hardware applies to the primary render target alone. An sRGB surface bound to any other MRT slot was written as linear. Fixes dEQP-GLES3.functional.fragment_out.random.{1,17,39,64,86,93,96}. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:23 +00:00
Christian Gmeiner	2e0b5f4b96	etnaviv: Update headers from rnndb Update to rnndb commit 0fd26f92cfd7 Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:23 +00:00
Christian Gmeiner	37eb09ed06	etnaviv: Disable TS per render target on mixed TS modes PE.MEM_CONFIG.COLOR_TS_MODE is a single global field, so every TS-enabled color render target in a framebuffer has to share one TS mode. With CACHE128B256BPERLINE the mode is picked per resource (256B for compressible formats, 128B otherwise), so a compressible format bound next to an integer format disagrees and the odd target gets decoded in the wrong mode, reading back as the clear color. The blob keeps TS on the targets that match the global mode and disables it only on the odd one, instead of giving up TS for the whole framebuffer. Compute a per-RT TS mask once in etna_set_framebuffer_state(..), store it in etna_framebuffer_state and reuse it when arming the BLT fast clear, so the two consumers stay consistent by construction. A disabled target keeps its tile status allocated, so it recovers once a later framebuffer is compatible again. Fixes 23 dEQP-GLES3.functional.draw_buffers_indexed.random.* cases that mix integer and unorm render targets, with no regression in fbo.color or fbo.blit. Fixes: `d70531ca93` ("etnaviv: Extend etna_update_ts_config(..) for MRTs") Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:23 +00:00
Christian Gmeiner	47a2f9e420	etnaviv: Advertise 128-bit color formats as renderable and samplable The 128-bit emulation now covers the clear, blit, copy and sample paths, so stop rejecting the three emulated RGBA32 formats. The format table is the remaining filter. Sampling still relies on the halti5 texture descriptors, so halti5 is the gate. Sampling RGBA32F enables GL_OES_texture_float, and with the existing half-float support also GL_ARB_texture_float, so advertise both. The KHR-GLES2 internalformat tests for sized RGB32F/RGBA32F need an ES3 context, so they fail on the ES2 driver. List them as expected fails, as other ES2 drivers do. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:23 +00:00
Christian Gmeiner	9013b56a22	etnaviv: Limit nir_lower_fragcolor(..) to advertised render targets nir_lower_fragcolor(..) expands a broadcast gl_FragColor into one store per render target. It was passed specs->num_rts, the physical HW count, but on HALTI2 only half of them are advertised (caps.max_render_targets) since the upper half is reserved for float and 128-bit format emulation companions. A broadcast shader thus wrote into the reserved slots. For a 128-bit target the clear meta shader stores to every gl_FragData and overwrote the BA companion plane filled by etna_nir_lower_128bit(..), so the clear came back with the RG half replicated into BA. Pass the advertised count instead to keep the broadcast inside the user visible range. Fixes: `928a276b78` ("etnaviv: Limit max supported render targets") Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:23 +00:00
Christian Gmeiner	d75d08437b	etnaviv: rs: Support 128-bit color clears A 128-bit color level is laid out as two stacked G32R32F planes, so clear it with two 64bpp RS fills, the RG half at the level offset and the BA half at the second-plane offset. A cache flush and stall separate the two fills. etna_clear_rs(..) needs the same flush between its color and depth clears to avoid a GC600 hang, and the blob brackets every RS operation this way. The blob clears RGBA32F render targets through RS with the same plane split, verified with a cmdstream capture on a faked GC7000 rev 6204 identity. Fixes dEQP-GLES3.functional.fbo.color.repeated_clear.* for 128-bit formats on RS-only halti5 hardware. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:22 +00:00
Christian Gmeiner	3d8a718181	etnaviv: Save the framebuffer without 128-bit companion slots etna_blit_save_state(..) saved the expanded framebuffer including the appended companion slots. The util_blitter restore goes through etna_set_framebuffer_state(..), which appends companions again, so every blitter round trip with a 128-bit color buffer bound grew nr_cbufs until the expansion assert fired. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:22 +00:00
Christian Gmeiner	38eb315407	etnaviv: blt: Use block-layout offset for 128-bit second-plane blit The 128-bit emulation stores all RG halves in the first half of the BO and all BA halves in the second half. The sampler descriptors, the CPU upload and the BLT clear all compute the second plane as (size * depth) / 2. etna_try_blt_blit(..) advanced source and destination by layer_stride instead, an interleaved layout nothing else uses. For single-layer 2D targets both formulas coincide, so plain blits worked, but per-layer blits of a multi-layer 128-bit array texture corrupted the BA half of every layer. Use the same (size * depth) / 2 offset as the rest of the emulation. Fixes: `1f60a0397b` ("etnaviv: blt: Support 128 bit blit operations") Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:22 +00:00
Christian Gmeiner	7436c8ad0b	etnaviv: NIR pass to lower 128-bit color RT and texture access The hardware reads and writes the emulated 128-bit formats as two G32R32F planes, so one store or sample in the shader has to become two. Add etna_nir_lower_128bit to do that split at the NIR level, driven by per-slot masks and companion tables in the shader key. A store is split into an RG store to the user output and a BA store to a companion output above the application visible range. A sample is cloned to the companion sampler slot and the two results are reassembled into the full vec4, with textureGather matching the blob's 16-bit halves. The write mask is split alongside the data so partial writes keep their meaning, and missing channels are padded with a typed zero so the integer formats work. set_sampler_views(..) computes the per-stage 128-bit mask once and the per-draw shader key setup reads it back, so draws without 128-bit textures only pay for a key compare. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:22 +00:00
Christian Gmeiner	a499067877	etnaviv: Emit paired 128-bit sampler descriptors Sampling a 128-bit color texture needs two G32R32F reads because the TE knows nothing about the two-plane emulation. Emit a second sampler descriptor at a companion slot that points at the BA plane. The shader lowering later samples both slots and reassembles the vec4. The BA descriptors reuse the descriptor slots of the RB_SWAP native-order descriptors. The two cases cannot collide because no 128-bit format is RB_SWAP. The companion of user slot i is the slot right after the bound views, nr + i. A 128-bit view is only usable when nr + i fits the sampler pool, asserted in debug builds. An out-of-pool companion stays at ~0U so the emit path skips it and the shader lowering leaves the sample untouched, the only fallout being wrong .ba data instead of a wild state write. Dummy descriptors now cover every inactive dirty slot, not only the previously-active ones, because a companion slot from an earlier bind may never have been active. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:22 +00:00
Christian Gmeiner	7467461277	etnaviv: Lay out 128-bit color FBO as paired G32R32F render targets The PE cannot write 128-bit color formats (RGBA32F/UI/SI) natively. A pair of 64-bit G32R32F render targets covers the same memory: the user's RG channels go to RT[i] and the BA channels go to a companion RT that points at the second plane of the same BO. etna_set_framebuffer_state(..) appends one companion RT per 128-bit user RT and records the mapping for the shader lowering and the sampler side. The companion takes an extra resource reference so it is released cleanly on unbind. The companion colormask is derived from the user's blend state in etna_update_blend(..), the upper two bits (B,A) move down to (R,G), so glColorMask works on both halves. The per-RT restructuring drops the RT_CONFIG_UNK27 programming for non-TS secondary RTs on CACHE128B256BPERLINE hardware. The bit must not be set on the 128-bit companion RTs and its exact meaning needs more reverse engineering. It can come back once understood. Drawing still needs the NIR lowering from a later commit to split the shader output into the RG and BA halves. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:22 +00:00
Christian Gmeiner	7d853e01b4	etnaviv: Wrap pipe_framebuffer_state in etna_framebuffer_state The 128-bit color emulation needs driver-private tracking next to the framebuffer state. Introduce an etnaviv-private wrapper struct with pipe_framebuffer_state as its base member. The tracking fields come with the next commit. All ctx->framebuffer_s.X accesses become ctx->framebuffer_s.base.X. No behavior change. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:21 +00:00
Christian Gmeiner	d5c507c6db	etnaviv: Add a helper for the 128-bit second-plane offset The BA plane of an emulated 128-bit color level starts at (size * depth) / 2. That offset is open-coded in the blt clear and the transfer map/unmap paths, and the upcoming 128-bit render target and sampler support adds more sites. Add etna_resource_level_second_plane_offset(..) and use it, so the invariant lives in one place. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:21 +00:00
Christian Gmeiner	695fec92f7	etnaviv: Flush texture caches after clears A clear writes the surface through the BLT or RS engine while the texture caches can still hold texels from an earlier draw that sampled the cleared resource. Nothing invalidates them when the sampler view stays bound, so the next draw samples stale data. Clears of TS'd levels are not affected as sampling them goes through the sampler TS or a resolve in etna_update_sampler_source(..), which flushes the texture caches as a side effect. Mark the texture caches dirty after clearing a samplable surface, like etna_blit(..) and etna_transfer_unmap(..) already do. Fixes dEQP-GLES3.functional.fbo.color.repeated_clear.sample.* with ETNA_MESA_DEBUG=no_ts. Cc: mesa-stable Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Daniel Lang <dalang@gmx.at> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42201>	2026-06-19 10:49:21 +00:00
David Rosca	88baf64496	va: Use RGB format with matching bit depth for YUV->YUV matrices The bit depth used for the matrix calculation is derived from input format. Fixes: `5bc0df5aad` ("vl,frontends/va: Implement YUV->YUV matrix coeff conversion") Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42181>	2026-06-19 10:26:15 +00:00
David Rosca	5d6e5a895f	va: Always reset compositor chroma location Otherwise it would incorrectly use it for RGB formats if there was a previous conversion with YUV formats. Fixes: `d0eec62831` ("frontends/va: Change vlVaPostProcCompositor to take pipe_vpp_desc arg") Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42181>	2026-06-19 10:26:15 +00:00
David Rosca	f13f439049	vl: Skip transfer function and primaries conversion when not needed cs_trc_apply clamps the value to 0 which causes issues in YUV->YUV conversions that can have negative intermediate values after conversion to RGB. If no transfer function and primaries conversion is needed, this step can be skipped. Fixes: `69717c257f` ("vl,frontends/va: Implement gamma and primaries conversion") Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42181>	2026-06-19 10:26:15 +00:00
Collabora's Gfx CI Team	46bf8c2568	Uprev VVL to e17d63f8fcd967b2ff91efcb8607d2c9ab962e23 `2ab77a0165...e17d63f8fc` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42140>	2026-06-19 09:23:36 +00:00
Job Noorman	44b82c7576	ir3/opt_prefetch_descriptors: rematerialize defs at preamble start When rematerializing defs in the preamble, we make sure to insert them in a block dominated by all their sources. When inserting a def that doesn't have any sources we have to make sure to insert them as early as possible. This is important for sequences like this: 32 %34 = load_const (0x00000007 = 0.000000) ... if ... { 32 %184 = @load_preamble (base=8) 32 %185 = @bindless_resource_ir3 (%184) (desc_set=0) 32 %186 = @bindless_resource_ir3 (%34 (0x7)) (desc_set=1) 32x4 %187 = (float32)tex %186 (texture_handle), %185 (sampler_handle), ... ... } %185 has to be rematerialized in control flow since its source is defined there. %186 does not as its source is defined outside control flow. We used to insert %186 as late as possible (nir_after_impl(preamble)) but this causes issues as we cannot find a valid block (i.e., a block that is dominated by both) to insert the descriptor prefetch for (%185, %186). Fix this by setting the default block to insert rematerialized defs as the preamble's start block. 
Totals from 520 (0.30% of 176258) affected shaders: MaxWaves: 5796 -> 5794 (-0.03%) Instrs: 715314 -> 715248 (-0.01%); split: -0.02%, +0.01% CodeSize: 1547680 -> 1547182 (-0.03%); split: -0.17%, +0.14% NOPs: 157057 -> 157005 (-0.03%); split: -0.07%, +0.04% Full: 9399 -> 9415 (+0.17%) (ss): 18718 -> 18715 (-0.02%); split: -0.03%, +0.01% (sy): 8183 -> 8178 (-0.06%) (ss)-stall: 79780 -> 79818 (+0.05%); split: -0.02%, +0.06% (sy)-stall: 221660 -> 221591 (-0.03%); split: -0.05%, +0.02% STPs: 232 -> 234 (+0.86%) LDPs: 232 -> 234 (+0.86%) Preamble Instrs: 242817 -> 242695 (-0.05%); split: -0.79%, +0.74% Cat0: 176573 -> 176523 (-0.03%); split: -0.06%, +0.03% Cat7: 18945 -> 18929 (-0.08%); split: -0.09%, +0.01% Signed-off-by: Job Noorman <job@noorman.info> Fixes: `4e2a0a5ad0` ("ir3: Add descriptor prefetching optimization on a7xx") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42220>	2026-06-19 08:47:11 +00:00
Toshinari Morikawa	0b1479eb6a	virgl: fix memory leak on shader translation virgl_create_compute_state and virgl_shader_encoder were unnecessarily cloning NIR. Since NIR is passed from st/mesa and the driver is responsible for freeing it, the virgl driver doesn't have to preserve the passed NIR and should free it through nir_to_tgsi_options. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13822 Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Toshinari Morikawa <morikawa.toshinari@jp.panasonic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40236>	2026-06-19 08:31:39 +00:00
squidbus	1b48450128	kk: Implement draw-related commands using device addresses Replaces several draw commands with their `VK_KHR_device_address_commands` equivalent, using device addresses directly with Metal 4. The Vulkan runtime handles lifting the older commands up for us. We can also remove most index robustness handling, as Metal 4 provides the necessary guarantees. This does not fully implement `VK_KHR_device_address_commands` yet as the copy/fill/update memory commands do not have as straight-forward Metal 4 or meta equivalents. Reviewed-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42316>	2026-06-19 06:38:20 +00:00
Casey Bowman	5fe14149a9	anv: Set anv_disable_hiz for Sons of the Forest Until we make greater use of the COMMON_SLICE_CHICKEN1 register for Xe1-Xe3 platforms, we'll disable HiZ for SOTF to avoid redundant plane expansions. In turn, this will avoid a major FPS drop when encountering scenes where this condition takes place. A test scene with the player's camera near an environmental rock showed FPS gains greater than 183%. This should keep the FPS stable during gameplay. TODO: Revert once proper solution is in place. See https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11782 Signed-off-by: Casey Bowman <casey.g.bowman@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42206>	2026-06-19 05:58:12 +00:00
Casey Bowman	8c689e9bcb	anv: Add option to disable HiZ via drirc Allows per-app disabling of HiZ if a case arises where HiZ is significantly slowing down a workload. This option should not be the resulting fix for when such cases arise, but can serve as a temporary band-aid while a proper solution is fleshed out. Signed-off-by: Casey Bowman <casey.g.bowman@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42206>	2026-06-19 05:58:12 +00:00
Aitor Camacho	c08dba8302	kk: Move to Metal4 command encoding Tested-by: squidbus <squidbus@proton.me> Signed-off-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42268>	2026-06-19 05:18:55 +00:00
Gu, Wangfeng	56588ef066	radv/sqtt: emit pending barrier end before API markers Flush any delayed RGP barrier-end marker before writing the next general API marker so no-op barriers cannot incorrectly cover subsequent draw or dispatch events. Signed-off-by: Gu, Wangfeng <Wangfeng.Gu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42289>	2026-06-19 04:57:12 +00:00
Valentine Burley	d2ac94d4d3	perfetto: Centralize perfetto header include in u_perfetto.h Move the ANDROID_LIBPERFETTO conditional include (<perfetto.h> vs <perfetto/tracing.h>) into util/perf/u_perfetto.h to eliminate duplicated #ifdef blocks scattered across every driver. Files that need additional pbzero proto headers for Android's modular perfetto still include those individually under #ifdef ANDROID_LIBPERFETTO. This enables ninja-to-soong to generate an Android.bp that builds Mesa against Android's libperfetto_client_experimental library. Following: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36561 Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Acked-by: Rob Clark <rob.clark@oss.qualcomm.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42271>	2026-06-19 02:58:16 +00:00
Karol Herbst	d470487004	nak/instr_sched_prepass: Take predicate spilling into account when scheduling instructions As long as the GPR usage never exceeds the threshold, the instruction scheduling doesn't take the spilling costs of other register files into account. Given we only have 7 available predicates, the overhead of spilling those can be significant in int64 math heavy shaders. I also refactored the code a bit to make it easier to do the same for other register files, however it seems to hurt occupancy when doing so. Fixes a performance regression with `vkpeak int64-scalar`. Totals: CodeSize: 8373514512 -> 8372304992 (-0.01%); split: -0.02%, +0.01% Number of GPRs: 48332322 -> 48328427 (-0.01%); split: -0.02%, +0.01% Static cycle count: 4781363047 -> 4779946979 (-0.03%); split: -0.11%, +0.08% Spills to reg: 197662 -> 159098 (-19.51%); split: -21.19%, +1.68% Fills from reg: 195767 -> 161305 (-17.60%); split: -18.62%, +1.01% Max warps/SM: 53043448 -> 53044352 (+0.00%); split: +0.00%, -0.00% Totals from 15734 (1.30% of 1213129) affected shaders: CodeSize: 465771008 -> 464561488 (-0.26%); split: -0.36%, +0.10% Number of GPRs: 1277105 -> 1273210 (-0.30%); split: -0.75%, +0.44% Static cycle count: 570906432 -> 569490364 (-0.25%); split: -0.90%, +0.65% Spills to reg: 121361 -> 82797 (-31.78%); split: -34.52%, +2.74% Fills from reg: 107186 -> 72724 (-32.15%); split: -34.00%, +1.85% Max warps/SM: 448816 -> 449720 (+0.20%); split: +0.38%, -0.18% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42323>	2026-06-19 02:01:18 +00:00
Ahmed Hesham	13d2b058cd	clc: fix fp16 fallback mask for remquo The fp16 libclc fallback indexes NIR function arguments including the hidden return pointer. For remquo, this makes the parameters: 0: return pointer 1: x 2: y 3: quotient pointer The quotient pointer (3) is an integer pointer and should not be converted when building the wrapper. Update the mask for OpenCLstd_Remquo to exclude the third argument. Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42309>	2026-06-19 01:24:45 +00:00


224626 commits