fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 15:28:18 +02:00

Author	SHA1	Message	Date
Lorenzo Rossi	d2f7b8db9d	pan/compiler: Collect nopersp varyings in lower_noperspective_fs Now that lower_noperspective_fs and varying collection are closer together we can merge nopersp collection in lower_noperspective_fs without fear of desynchronization, making everything also a bit cleaner. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:13 +00:00
Lorenzo Rossi	dfdb9f1d41	pan/compiler: Sort postprocess Now that we removed a lot of upcoming bugs using time-travel, we can reorder the passes in postprocess to be more in-line with modern compilers. We also lift a lot of passes from compile_shader_nir into postprocess. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:13 +00:00
Lorenzo Rossi	312603b2fa	pan/compiler: Rename bifrost_optimize_nir Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:12 +00:00
Lorenzo Rossi	6f05b27b9a	panvk: Remove pan_optimize_nir call The shader will be optimized a few passes later in preprocess, this way we can have the same pipeline as in Gallium Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:12 +00:00
Lorenzo Rossi	39f54ddea2	panvk,panfrost: Pass inputs and info to postprocess This is needed if we want postprocess to decide IDVS and layout later in the series Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:12 +00:00
Lorenzo Rossi	01e6a0555c	pan/compiler: Rework scratch memory strategy Before this commit, all scratch memory was allocated in 16-byte chunks and indirect references were always lowered into if-else trees. This patch tries to clean this up a little bit, by using a more compact layout that is still TLS friendly, allowing indirect accesses and only lowering them for optimizations and using the newer nir_lower_explicit_io. The patches should improve performance on some shaders, but lift a lot of dust off the compiler, uncovering some new bugs. They have been kept at bay by disabling local memory vectorization. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:11 +00:00
Lorenzo Rossi	f0d2ad9840	panvk/jm: Fix tls_size overwrite in indirect draws This only caused problems when the VS/FS has more TLS than our internal shaders, which doesn't usually happen but will cause bugs when we start to compress local memory. Fixes: `005703e5b5` ("panvk: Move TLS preparation logic to cmd_dispatch_prepare_tls") Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:11 +00:00
Lorenzo Rossi	768d7cb149	pan/compiler: Sort preprocess Reorders the preprocess passes to be more in-line with modern compilers. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:11 +00:00
Lorenzo Rossi	dd96a1514b	pan/compiler: Handle ssbo_atomics in lower_vs_atomics This way the pass does not depend on lower_ssbo anymore Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:10 +00:00
Lorenzo Rossi	408d03291d	pan/compiler: Lower unaligned scratch memory accesses Using OpenCL size/alignment requirements we might get some types with a size bigger than their alignment. This breaks the current TLS load/stores that expect 16-byte alignment for 16-byte load/stores. This problem probably hasn't surfaced yet because we reassigned OpenCL scratch in 16-byte slots, but will break if we compact the layout. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:10 +00:00
Lorenzo Rossi	ac23e3c6c5	pan/compiler: Fix WaRaR hazard in pressure scheduler A common memory swap operation might be compiled as: %v1 = LOAD %a1 # L1 %v2 = LOAD %a2 # L2 STORE %v2, %a1 # S1 STORE %v1, %a2 # S2 The current pressure scheduler just records the last load/store operation for dependencies, thus the dependency chain becomes L2 -> S1 -> S2. The compiler might thus reorder them as L2, S1, L1, S2, i.e # L1: %v2 = LOAD %a2 # L2 \| STORE %v2, %a1 # S1 \| %v1 = LOAD %a1 # L1<- STORE %v1, %a2 # S2 This is incorrect as S1 depends on L1 too. The fix makes all loads also depend on each other, restricting load reordering. The proper fix that NAK has is to track all loads and make each store depend on every load, building a more correct DAG. This doesn't matter as much in panfrost since all loads are serialized by the scoreboard. We might still want to implement it for register pressure in the future. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:09 +00:00
Lorenzo Rossi	abde403a7c	pan/compiler: Allow 16-bit alpha for atest_pan We just need to handle it while translating NIR to BIR, the hardware can do automatic widening to 32-bits. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>	2026-04-30 17:33:09 +00:00
Valentine Burley	ca92f8697e	panfrost/ci: Update kernel to pick up ZSTD support for ZRAM No other changes. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15342 Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41217>	2026-04-29 07:24:18 +00:00
Olivia Lee	72e0eda260	pan/bi: fix memory access alignment Memory accesses need to be aligned up to the next power of two of the full access size. Component count and bit-size don't matter to the hardware, only the total size. shader-db results are pretty much what you would expect, there are a few shaders that have increased LS instructions as a result of splitting accesses to satisfy alignment requirements that were previously ignored. The one surprising thing is that there are several shaders that have reduced uniform usage. Looking at some of these individually, what happened is that splitting UBO loads early allowed the compiler to eliminate loads from unused ranges of the access. total instrs in shared programs: 719166 -> 719174 (<.01%) instrs in affected programs: 2355 -> 2363 (0.34%) helped: 4 HURT: 6 helped stats (abs) min: 1.0 max: 9.0 x̄: 3.00 x̃: 1 helped stats (rel) min: 0.36% max: 6.52% x̄: 1.99% x̃: 0.54% HURT stats (abs) min: 1.0 max: 4.0 x̄: 3.33 x̃: 4 HURT stats (rel) min: 0.65% max: 2.13% x̄: 1.38% x̃: 1.48% 95% mean confidence interval for instrs value: -2.14 3.74 95% mean confidence interval for instrs %-change: -1.76% 1.82% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 30210.83 -> 30218.81 (0.03%) cycles in affected programs: 50 -> 57.99 (15.97%) helped: 2 HURT: 6 helped stats (abs) min: 0.0078129999999999589 max: 0.070312000000000041 x̄: 0.04 x̃: 0 helped stats (rel) min: 1.10% max: 10.23% x̄: 5.66% x̃: 5.66% HURT stats (abs) min: 0.03125 max: 5.0 x̄: 1.34 x̃: 1 HURT stats (rel) min: 2.38% max: 25.00% x̄: 13.05% x̃: 14.26% 95% mean confidence interval for cycles value: -0.42 2.41 95% mean confidence interval for cycles %-change: -1.74% 18.49% Inconclusive result (value mean confidence interval includes 0). 
total cvt in shared programs: 2385.91 -> 2385.91 (<.01%) cvt in affected programs: 11.14 -> 11.14 (<.01%) helped: 5 HURT: 4 helped stats (abs) min: 0.0078119999999999301 max: 0.070312000000000041 x̄: 0.02 x̃: 0 helped stats (rel) min: 0.27% max: 10.23% x̄: 2.61% x̃: 0.82% HURT stats (abs) min: 0.01562600000000014 max: 0.03125 x̄: 0.03 x̃: 0 HURT stats (rel) min: 1.31% max: 2.75% x̄: 2.21% x̃: 2.40% 95% mean confidence interval for cvt value: -0.02 0.02 95% mean confidence interval for cvt %-change: -3.51% 2.58% Inconclusive result (value mean confidence interval includes 0). total ls in shared programs: 25871 -> 25879 (0.03%) ls in affected programs: 46 -> 54 (17.39%) helped: 0 HURT: 4 HURT stats (abs) min: 1.0 max: 5.0 x̄: 2.00 x̃: 1 HURT stats (rel) min: 10.00% max: 25.00% x̄: 18.38% x̃: 19.26% 95% mean confidence interval for ls value: -1.18 5.18 95% mean confidence interval for ls %-change: 8.46% 28.30% Inconclusive result (value mean confidence interval includes 0). total code size in shared programs: 6302848 -> 6302976 (<.01%) code size in affected programs: 1536 -> 1664 (8.33%) helped: 0 HURT: 1 total registers used in shared programs: 117324 -> 117329 (<.01%) registers used in affected programs: 45 -> 50 (11.11%) helped: 1 HURT: 2 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 6.25% max: 6.25% x̄: 6.25% x̃: 6.25% HURT stats (abs) min: 2.0 max: 4.0 x̄: 3.00 x̃: 3 HURT stats (rel) min: 12.50% max: 30.77% x̄: 21.63% x̃: 21.63% total uniforms used in shared programs: 78538 -> 78274 (-0.34%) uniforms used in affected programs: 2688 -> 2424 (-9.82%) helped: 104 HURT: 4 helped stats (abs) min: 1.0 max: 18.0 x̄: 2.65 x̃: 2 helped stats (rel) min: 1.96% max: 54.55% x̄: 12.78% x̃: 11.11% HURT stats (abs) min: 1.0 max: 5.0 x̄: 3.00 x̃: 3 HURT stats (rel) min: 3.70% max: 16.13% x̄: 9.92% x̃: 9.92% 95% mean confidence interval for uniforms used value: -3.01 -1.88 95% mean confidence interval for uniforms used %-change: -14.15% -9.74% 
Uniforms used are helped. Total CPU time (seconds): 73.26 -> 74.48 (1.67%) Signed-off-by: Olivia Lee <olivia.lee@collabora.com> Fixes: `2f2738dc90` (pan/bi: Use nir_lower_mem_access_bit_sizes) Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41033>	2026-04-29 01:26:47 +00:00
Faith Ekstrand	11399b15e0	pan/bi: Improve swizzle propagation Instead of only propagating when we have a full word, always attempt to find a propagation, only considering the bytes actually consumed by the instruction. This is especially important for v2i8 sources. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41247>	2026-04-28 20:27:16 +00:00
Yiwei Zhang	ad857ba7cd	panvk: drop panvk_android_create_deferred_image No need for a separate helper anymore since https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41145. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41216>	2026-04-28 17:46:48 +00:00
Christian Gmeiner	7d59c62fde	panvk: Wire up VK_EXT_conservative_rasterization on v11+ Mali >= v11 has a Conservative Rast Mode field in DCD Flags 0 with values Disabled and Over Estimate. Wire it to vk_runtime's rasterization state and expose the extension on PAN_ARCH >= 11, with caps restricted to overestimate only, since the HW has no underestimate value and no overestimation-size granularity. On v11-v13, degenerate triangles produce a wrong fragment w when overestimate is enabled, so cull_zero_area is forced on alongside the mode bit and degenerateTrianglesRasterized is reported as false. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41189>	2026-04-28 09:34:28 +02:00
Erik Faye-Lund	4f2de63a27	pan/ci: add a flake from nightly This failed here: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/98061409 Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41121>	2026-04-27 09:27:02 +00:00
Faith Ekstrand	9c8e8ed655	panvk/csf: Emit INDEX_BUFFER[_SIZE] even for non-indexed draws The index buffer and index buffer size don't affect whether or not we're actually doing indexed rendering so we should just emit them whenever they change. Otherwise, if someone sets an index buffer and then does a non-indexed draw and then an indexed draw, the first draw will clear the dirty bits without setting the index buffer registers and the second draw won't know to re-emit them. Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Marc Alcala Prieto <marc.alcalaprieto@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40997>	2026-04-27 08:40:43 +00:00
Christian Gmeiner	3d7d2115f8	panvk: Implement vkCmdFillBuffer with panlib kernels Replace the vk_meta_fill_buffer call with direct panlib precomp dispatches: a KERNEL(32) uint4 bulk path for 16-byte-aligned fills and a KERNEL(32) uint32 path otherwise, each with a KERNEL(1) scalar tail for sub-workgroup remainders. gpu-ratemeter vk.bufbw on Mali-G610 MC4 shows a 1.15-1.18x median speedup across alignment classes and roughly 5x on fills <= 512 B, thanks to the removed pipeline bind / descriptor-set setup that vk_meta_fill_buffer pays per call. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41079>	2026-04-27 08:19:20 +00:00
Yiwei Zhang	0b99d1db0b	panvk: adopt common ANB helpers Below are adopted: - vk_android_import_anb - vk_android_import_anb_memory Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41145>	2026-04-24 16:25:36 +00:00
Christian Gmeiner	aed60946a1	panvk: Advertise VK_EXT_dynamic_rendering_unused_attachments The Vulkan runtime and panvk already handle unused attachments correctly. Enable the extension and feature flags. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40920>	2026-04-24 07:09:33 +00:00
Valentine Burley	8d4fb52919	panvk: Use vk_android deferred image helper Switch to using the new helper. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40635>	2026-04-23 21:21:31 +00:00
Christoph Pillmayer	b7f9974f3e	pan/bi: Fix format in bi_repair_ssa Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40969>	2026-04-23 15:49:48 +00:00
Christoph Pillmayer	fcfc580f67	pan/bi: Fix source swizzle in bi_repair_ssa Repairing SSA was creating invalid PHI nodes with source swizzles != BI_SWIZZLE_H01. PHI sources can't have non-identity swizzles. In most cases the repair logic only replaces sources, in which case the swizzle is taken from the old source that is getting replaced. However, in add_phi_operands there is no old source because the phi is new, and so the result from resolve_read is assigned directly. This falsely carries over the destination swizzle to the source. Since it never makes sense for resolve_read to carry over the swizzle from the instruction writing the value, we can make it so that resolve_read always returns the identity swizzle on indices. resolve_read returns one of: - An index stored by record_write - An index created by bi_temp_like - The result of a recursive resolve_read call bi_temp_like already correctly sets the swizzle to H01. Setting it in record_write leads to both base cases returning the desired swizzle. Fixes: `dd94d183` ("pan/bi: Fixup bi_repair_ssa.c for bi") Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40969>	2026-04-23 15:49:47 +00:00
Lars-Ivar Hesselberg Simonsen	82592433e6	panvk: Fix debug flag overlap PANVK_DEBUG_HSR_PREPASS and PANVK_DEBUG_NO_EXTENDED_VA_RANGE have the same value, meaning they both get toggled when one is. This commit moves PANVK_DEBUG_HSR_PREPASS to the following value. Fixes: `2d9be41706` ("panvk/v13: Support HSR Prepass") Reviewed-by: John Anthony <john.anthony@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41106>	2026-04-22 15:14:23 +00:00
Lars-Ivar Hesselberg Simonsen	98c298cf4d	pan/va/disasm: Align indentation The disassembly file had a lot of inconsistencies in indentation, so align on the standard IndentWidth: 3 Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>	2026-04-22 08:31:01 +00:00
Lars-Ivar Hesselberg Simonsen	17f1a2c184	pan/va/disasm: Align FAU printing The current implementation prints FAU entries as 32-bit entries. While this works, it does not align with the DDK. Rather than treating FAU as a set of 32-bit entries, treat it as 64-bit entries that can be split in two words. This aligns with the DDK and allows for differentiating 32-bit and 64-bit reads based on whether a word modifier is used. Finally, add entry values to FAU printing to easily look up specific reads. For example: Vertex FAU @ffd93950: 43000000 43000000 3F800000 43000000 43000000 00000000 C7000000 47000000 00000001 00000000 FMAX.f32 r3, r3^, u6 FMIN.f32 r3, r3^, u7 vs Vertex FAU @ffd93950: u0 43000000 43000000 u1 3F800000 43000000 u2 43000000 00000000 u3 C7000000 47000000 u4 00000001 00000000 FMAX.f32 r3, r3^, u3.w0 FMIN.f32 r3, r3^, u3.w1 Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>	2026-04-22 08:31:01 +00:00
Lars-Ivar Hesselberg Simonsen	829eafa076	pan/va/disasm: Print 64 bit src/dest regs as reg pairs This makes it clear that both registers are read/written, and aligns with DDK disassembly. For example: STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, r4^, offset:0 vs STORE.i128.istream.slot2.reconverge @r0:r1:r2:r3, [r4^:r5^], offset:0 Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>	2026-04-22 08:31:01 +00:00
Lars-Ivar Hesselberg Simonsen	9f049032be	pan/genxml: Print shader hex in trace for Valhall Enable verbose disassembly for Valhall in traces, which adds hex values to shader printing. Useful for debugging. For example: Shader 0xffffbe3ec000 (GPU VA ffdd3000) sz 16384 LD_ATTR_IMM.v4.f32.slot0.wait0 @r0:r1:r2:r3, r60^, r61^, index:0x0, table:0x0 FRCP.f32 r3, r3^ FMAX.f32 r3, r3^, u6 vs Shader 0xffffa8bf7000 (GPU VA ffdd3000) sz 16384 7c 7d 00 32 08 80 66 08 LD_ATTR_IMM.v4.f32.slot0.wait0 @r0:r1:r2:r3, r60^, r61^, index:0x0, table:0x0 43 00 00 00 00 c3 9c 00 FRCP.f32 r3, r3^ 43 86 03 00 00 c3 a4 00 FMAX.f32 r3, r3^, u6 Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41062>	2026-04-22 08:31:00 +00:00
Samuel Pitoiset	9d17a7bdb4	spirv,treewide: rework specialization constant With SPV_KHR_constant_data, it's allowed to specialize array of constants. RustiCL changes are from Karol Herbst <kherbst@redhat.com>. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>	2026-04-22 06:57:55 +00:00
Erik Faye-Lund	c8ae72f51d	panvk: do not enable extension without required feature The Vulkan spec states that if VK_KHR_shader_clock is supported, shaderSubgroupClock is a required feature. So let's not enable that extension unless we can... Fixes: `e9c2c32409` ("panvk: enable VK_KHR_shader_clock") Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Ashley Smith <ashley.smith@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40944>	2026-04-20 17:36:20 +00:00
Erik Faye-Lund	8cb89853b8	panvk: do not enable extension without required feature The Vulkan spec states that if VK_ARM_shader_core_builtins is supported, shaderCoreBuiltins is a required feature. So let's not enable that extension unless we can... Fixes: `dff1d91c64` ("panvk: Enable VK_ARM_shader_core_builtins") Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40944>	2026-04-20 17:36:20 +00:00
Erik Faye-Lund	36983b50fe	panvk: increase maxBufferSize on v11 and later The HW supports larger buffer-sizes on v11 and later, so let's bump this up. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40999>	2026-04-20 16:00:56 +00:00
Erik Faye-Lund	bd2646482b	panvk: increase maxResourceSize on v11 and later The HW supports larger resource-sizes on v11 and later, so let's bump this up. But since we have a knob to limit the usable VA space, we need to take that into account here as well. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40999>	2026-04-20 16:00:56 +00:00
Erik Faye-Lund	f3d3102143	panvk: do not artificially limit image dimensions While we used to need this, we no longer do, thanks to handling maxResourceSize. From the Vulkan spec: > If the size of the resultant image would exceed maxResourceSize, > then vkCreateImage must fail and return VK_ERROR_OUT_OF_DEVICE_MEMORY. > This failure may occur even when all image creation parameters satisfy > their valid usage requirements. Handling of this was added in `86068ad1ee` ("panvk: implement sparse resources"), so we no longer need to make sure of this when reporting the limits. The hardware-fields for these are 16 bits, so let's allow the full range for all of these. This is effectively a revert of `e25a91d919` ("panvk: Lower maxImageDimension{2D,3D,Cube} to match the HW caps"). Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40999>	2026-04-20 16:00:56 +00:00
Erik Faye-Lund	d76e4f6054	pan/lib: validate data_size_B in drivers In order to be able to properly check for maxResourceSize on Vulkan, we need to be able to report the size even for resources that overflow that limit. Otherwise we end up failing to find a usable modifier rather than properly report the problem to the application. This means we need to move the check out of the mod-handler. There's no need to validate the slice-stride. The reason is a little bit complicated, but we have two possible cases: 1. V10 and before: the image-size and the slice-stride are both limited to UINT32_MAX. Since the image-size is always at least as large as the slice-stride, it's enough to check the image-size. 2. V11 and later: 37 bits is large enough to store any valid slice-stride. The only way we could blow this one up, would be to pass out-of-range width or height, which is already either validated by higher-level logic (gallium) or UB (vulkan). This is important, because we don't have another mandate to reject large resources on Vulkan; we can only reject due to maxResourceSize, not an individual plane. So let's move this out to the call-site. We don't need to do anything for PanVK, because it already checks for maxResourceSize. To keep the Gallium and Vulkan driver as similar as reasonably possible, check against the whole resource even in Gallium, where we could have gotten away with checking a plane at a time instead. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40999>	2026-04-20 16:00:56 +00:00
Erik Faye-Lund	57a80ff78c	pan/lib: emit high bits of buffer-size We can't expose large texel-buffers if we don't emit the high bits. Whoopsie! Fixes: `4db7958edc` ("pan/bi: Change texel buffer limits") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40999>	2026-04-20 16:00:56 +00:00
Erik Faye-Lund	69b8372fbf	pan/lib: fix up afbc and linear layout A few cases of UINT32_MAX were missed, whoops. Fixes: `c2c91e78fd` ("pan/layout: Allow bigger size/surface stride on v12+") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40999>	2026-04-20 16:00:55 +00:00
Christian Gmeiner	917c3dc77a	panvk: Advertise VK_EXT_shader_uniform_buffer_unsized_array The extension permits a SPIR-V OpTypeRuntimeArray as the trailing member of a UBO block. panvk's compiler path handles this correctly without changes: UBO access goes through nir_lower_explicit_io with address formats that carry no compile-time size (bounds are enforced by the hardware UBO descriptor at runtime), so a runtime array inside a UBO is indistinguishable from any other dynamically-indexed UBO access. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40963>	2026-04-20 12:10:19 +02:00
Erik Faye-Lund	ebe4a56650	panvk: use perf-trilinear when doing anisotropic sampling This should be faster, and matches what the DDK does. Reviewed-by: Marc Alcala Prieto <marc.alcalaprieto@arm.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40869>	2026-04-17 12:52:17 +00:00
Marc Alcala Prieto	a073fb193e	pan/genxml: Add performance-trilinear enum values The HW supports these, so let's define the enum values. Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40869>	2026-04-17 12:52:17 +00:00
Ryan Zhang	62e7120384	panvk: add VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL to host copy layouts Add the missing layout, which does not require implementing anything on Mali GPUs. Fixed: dEQP-VK.image.host_image_copy.properties.properties The unifiedImageLayouts feature is supported, but layout VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL was not included in VkPhysicalDeviceHostImageCopyProperties::pCopySrcLayouts. Fixes: `1cd61ee` ("panvk: implement VK_EXT_host_image_copy for linear color images") Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40899>	2026-04-17 10:19:44 +00:00
Erik Faye-Lund	f137207108	panvk: drop out-of-date TODO We already did this, so let's drop this TODO. Fixes: `d36e6af329` ("panvk: Bump the max image size on v11+") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40990>	2026-04-16 11:21:48 +00:00
Christian Gmeiner	713cecb1df	panvk: Advertise VK_EXT_rgba10x6_formats Map X6R10X6G10X6B10X6A10_UNORM to the native R10X6G10X6B10X6A10X6_UNORM HW format on PAN_ARCH >= 11 where it is supported. Enable the extension with formatRgba10x6WithoutYCbCrSampler in the physical device, allowing VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16 to be used as a regular color format without YCbCr sampler conversion. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>	2026-04-15 12:16:53 +00:00
Lorenzo Rossi	7ccca9f972	pan/compiler: Document compilation pipeline expectations Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:19 +00:00
Lorenzo Rossi	43ba475d4c	panfrost,panvk: Move lower_texture_early inside preproc Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:19 +00:00
Lorenzo Rossi	e24228e327	panfrost,panvk: Move lower_texture_late inside postproc Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:19 +00:00
Lorenzo Rossi	eafc822dbd	panfrost,panvk: Move postprocess near shader_compile Ideally there should be only sysval lowering in the middle. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:18 +00:00
Lorenzo Rossi	83fd45aa5a	pan/compiler: Fix noperspective int varyings Ints and floats do not need to match between VS and FS; some crazy shaders might write a uint from the VS and read a noperspective float from the FS. New conformance tests that check this will land shortly. Is this a performance regression? Yes. Can we fix this easily? No, we'll need dynamic prolog/epilog linking. Since maybe_noperspective is almost useless after this fix, the whole logic has been removed. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>	2026-04-15 10:32:18 +00:00

7723 commits