Commit graph

222075 commits

Author SHA1 Message Date
Lars-Ivar Hesselberg Simonsen
82697cc245 panfrost: Add v15 support to the Gallium driver 2026-05-05 12:11:00 +02:00
Lars-Ivar Hesselberg Simonsen
4789fa6b70 pan: Add v15 support 2026-05-05 12:11:00 +02:00
Lars-Ivar Hesselberg Simonsen
bd52eb4a3a panvk: Add v15 support 2026-05-05 12:11:00 +02:00
Lars-Ivar Hesselberg Simonsen
a986de1f53 pan/clc: Build for v15 2026-05-05 12:11:00 +02:00
Lars-Ivar Hesselberg Simonsen
fb50cac9c6 pan/lib: Build for v15 2026-05-05 12:11:00 +02:00
Lars-Ivar Hesselberg Simonsen
fb3dd5f938 pan/genxml: Build libpanfrost_decode for v15 2026-05-05 12:11:00 +02:00
Lars-Ivar Hesselberg Simonsen
b2671ddcee pan/genxml: Add base v15 definition
This is currently just a copy of v14 except for "arch" being changed to
"15".
2026-05-05 12:10:44 +02:00
Lars-Ivar Hesselberg Simonsen
1f12828011 pan: Add handling for v15+ uapi thread_max_wg_size
thread_max_workgroup_size has been replaced with
thread_num_active_granularity in v15, which requires updated handling
for calculating the max number of threads in a workgroup
2026-05-05 11:53:40 +02:00
Lars-Ivar Hesselberg Simonsen
003becf081 pan: Add handling for v15+ uapi gpu_id
Since v15, gpu_ids are 64 bit, so they need to be handled differently.

To ease this, a compat value of 0xF is found in what previously used to
be ARCH_MAJOR, which we can use to decide whether to read information
from the full 64 bits.

Since we now cannot pass gpu_id directly as deviceID, align with the DDK
on what fields to expose.
2026-05-05 11:53:40 +02:00
Lars-Ivar Hesselberg Simonsen
df8f2d8896 drm-uapi: Add panthor v15 uapi changes
This is currently based on the uapi in the following MR:
https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/65
2026-05-05 11:53:40 +02:00
Lars-Ivar Hesselberg Simonsen
6e8b73ca76 pan/va/compiler: Fix broken ATOM1_RETURN asm/disasm test 2026-05-05 11:53:40 +02:00
Lars-Ivar Hesselberg Simonsen
1a374e1f04 pan/va/ISA: Remove non-existent register_type
Register_type does not exist in Valhall and was currently not actually
packed.
2026-05-05 11:53:40 +02:00
Lars-Ivar Hesselberg Simonsen
769eddfeca pan/va: Use preload abstraction for blend shader regs
A couple of preloads were missed when implementing the preload register
abstraction.

This fix is not required prior to v15, but marking it as a bug fix for
consistency.

Fixes: 1f0370616a ("pan: Centralize preload registers")
2026-05-05 11:53:40 +02:00
Marc Alcala Prieto
2f6a4e7692 docs/panfrost: Advertize Mali-G1-Pro support
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
c9e740a80e panfrost: Advertize Mali-G1-Pro support
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
4abd3ce744 panfrost: Build the Gallium driver for v14
Enable building panfrost for v14.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
94ec179b55 panfrost: Hook up RUN_FRAGMENT2 on the Gallium driver
Set the FBD size/alignment correctly and emit the fragment staging
registers before issuing fragment commands.

Also, move some temporary registers to non-conflicting ones.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
95596dbc0c pan/bi,va: Use dedicated LD_VAR_BUF_FLAT* opcodes on v14+
On v14+, flat source formats are no longer supported by LD_VAR_BUF and
LD_VAR_BUF_IMM opcodes. This patch makes the compiler emit the
dedicated LD_VAR_BUF_FLAT* opcodes instead.

Add the ISA definitions, handle the new opcodes, and add packing tests
for both immediate and indirect forms.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
6dedfd66a4 pan/va: Fix packing test for LdVarBufImmF16 on v11
Encoding for LdVarBufImmF16 on v11 changed compared to v10. Updated the
test to check for the right encoding.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
f11725a219 pan: Add v14 support
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:23 +02:00
Marc Alcala Prieto
74c0426ae7 panvk: Build for v14
Enable building panvk for v14.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
fab9558ab8 panvk: Handle provoking vertex and simultaneous reuse on v14
The provoking vertex bit in RUN_FRAGMENT2 is located in a register
instead of a descriptor stored in memory. That means we don't need to
patch memory, resulting in a much leaner implementation compared to
RUN_FRAGMENT.

Also, implement the simultaneous reuse copy path with the corresponding
tiler pointer patching.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
d425c52a8a panvk: Hook up RUN_FRAGMENT2
Set the FBD size/alignment correctly and emit the fragment staging
registers before issuing fragment commands.

Also, move some temporary registers to non-conflicting ones.

Incremental rendering is left as TODO for later.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
52d6c19293 pan/lib: Build for v14
Enable building libpanfrost for v14. Also, modify format mappings to
account for the new architecture specification.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
1e350ef79c pan/afrc: Add v14+ AFRC YUV compression mappings
v14+ no longer uses specific AFRC compression formats for YUV. Instead,
generic R8/R8G8 and R10/R10G10 formats are used.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
0c162269c3 pan/afbc: Add v14+ AFBC YUV compression mappings
On v14+, many AFBC YUV modes map to generic RGB compression modes.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
198d385535 pan/format: Add v14+ YUV pipe format mappings
Map the multiplane and special internal formats to the new v14+ YUV
formats. Note v14+ has a much simplified list of formats.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
59c6549fc4 pan/texture: Add v14+ YUV pipe format mappings
v14+ no longer uses specific clump formats for YUV.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
589dedf2f2 pan/desc: Implement pan_emit_fbd for RUN_FRAGMENT2
Reuses the same structure that is used by pan_emit_fb_desc.

Also, modify pan_emit_fbd's signature to take a pan_ptr to the
framebuffer memory instead of the CPU-mapped pointer.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
1527d88bc1 pan/fb: Implement pan_emit_fb_desc for RUN_FRAGMENT2
Add a new structure that is used to store per-layer RUN_FRAGMENT2 state.
Any other state will be emitted directly to registers.

Also, modify pan_emit_fb_desc's signature to take a pan_ptr to the
framebuffer memory instead of the CPU-mapped pointer.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
6c89a14e1b pan/clc: Build for v14
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
3687fc515b pan/genxml: Build libpanfrost_decode for v14
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
cb6e788548 pan/decode: Remove progress-related decoding logic
Progress is no longer encoded by the CS builder.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
8c744c5dc0 pan/genxml: Implement RUN_FRAGMENT2
Add support for emitting and decoding RUN_FRAGMENT2 instructions.

Some existing decoding logic from decode.c is modified to be reusable
by the new RUN_FRAGMENT2 decoding logic.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
96cec69ce8 pan/genxml: Add v14 definition
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:46:22 +02:00
Marc Alcala Prieto
661ef96526 pan/genxml: Add base v14 definition
This is just a copy of v13.xml to help spot any missing changes while
working on v14.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:45:39 +02:00
Marc Alcala Prieto
85e211efd8 pan/genxml: Add missing enum values on v9-v13
Note block-linear interleaved clump orderings are not supported on all
v10 architectures.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
2026-05-05 11:45:38 +02:00
Liu, Mengyang
956f4c96e1 amd: disable reset_filter_cam for mec
reset_filter_cam is not supported on mec.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41232>
2026-05-05 08:28:00 +00:00
Roman Stratiienko
60fdab22a5 v3dv: Emulate multi-queue support via vk_queue for Android
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Android14+ relies on at least 2 queues for vulkan skia/UI rendering.
More explained [here][1]

[1]: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11326

Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41213>
2026-05-05 07:03:08 +00:00
Roman Stratiienko
16526e451e v3dv: move noop_job creation to device scope
Preparation step for multiple queue emulation support

Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41213>
2026-05-05 07:03:07 +00:00
Samuel Pitoiset
87be392251 radv: fix determining needed dynamic states when rasterization is disabled
The vertex input state can be NULL if rasterization is disabled with
dynamic vertex inputs.

The input assembly state can be NULL if rasterization is disabled
and both states are dynamic (primive topology and primitive restart
enable).

This fixes a segfault with gpu-ratemeter vk_dyn.prim

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41335>
2026-05-05 06:37:57 +00:00
Valentine Burley
39406b8e83 tu: Add shared image support on Android
ANB shared image is required for KHR_shared_presentable_image support.

https://android.googlesource.com/platform/frameworks/native/+/refs/heads/android16-qpr2-release/vulkan/include/vulkan/vk_android_native_buffer.h#154

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41195>
2026-05-05 06:09:21 +00:00
Valentine Burley
924e86b957 tu: Move Android extensions into main list
No reason for these to be separated or be guarded by DETECT_OS_ANDROID.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41195>
2026-05-05 06:09:21 +00:00
Job Noorman
6d6efc332a ir3: enable opt_offsets for load/store_global_offset
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>
2026-05-05 06:25:49 +02:00
Job Noorman
97edf88d5f ir3: move feature check down in ir3_nir_max_imm_offset
We want to start using this function for non-SSBO intrinsics, so don't
bail out early.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>
2026-05-05 06:25:49 +02:00
Job Noorman
0703f27d6a nir/opt_offsets: add support for @load/store_global_ir3
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>
2026-05-05 06:25:49 +02:00
Job Noorman
c784af5ca0 ir3: always use byte offset for @load/store_global_ir3
Before a7xx, ldg/stg.a use an offset in units of their type size while
on a7xx and later, the offset is always in bytes. Currently,
@load/store_global_ir3 take their offset in dwords (32-bits). This has a
few downsides: offsets need an extra shl during codegen on a7xx and
addressing sub-dword-aligned addresses is only possible by doing 64-bit
math on the base address.

Improve the situation by always using a byte offset for
@load/store_global_ir3 and adding the offset_shift index to support type
units pre-a7xx. While we're at it, add the base index as well to support
all ldg/stg.g features in @load/store_global_ir3.

Supporting these renewed intrinsics consists of two parts:
- ir3_nir_lower_io_offsets legalizes the offset_shift on a6xx: for
  ldg.a/stg.a, the offset has to be in units of the type size so extra
  shifts are inserted to accomplish this if necessary. On a7xx, offsets
  are always in bytes so nothing needs to be done.
- The intrinsics are emitted as ldg/stg if the offset is a small enough
  constant and as ldg.a/stg.a otherwise. a6xx supports an extra shift
  for ldg.a/stg.a that only applies to the GPR offset (not the immediate
  base); NIR is pattern matched at this point to extract this if
  possible.

All users of @load/store_global_ir3 are updated to generate the offset
in units of bytes. ir3_nir_analyze_ubo_ranges is updated to take the new
offset_shift into account.

Totals from 2029 (1.15% of 176266) affected shaders:
MaxWaves: 26728 -> 26660 (-0.25%); split: +0.01%, -0.26%
Instrs: 1314089 -> 1278603 (-2.70%); split: -2.72%, +0.02%
CodeSize: 2739108 -> 2633236 (-3.87%); split: -3.87%, +0.01%
NOPs: 197537 -> 200843 (+1.67%); split: -1.62%, +3.30%
MOVs: 43771 -> 44025 (+0.58%); split: -1.11%, +1.69%
Full: 31849 -> 31948 (+0.31%); split: -0.03%, +0.34%
(ss): 37965 -> 42027 (+10.70%); split: -3.47%, +14.17%
(sy): 13752 -> 13566 (-1.35%); split: -4.04%, +2.68%
(ss)-stall: 154238 -> 170353 (+10.45%); split: -1.72%, +12.16%
(sy)-stall: 804442 -> 806518 (+0.26%); split: -4.65%, +4.91%
Preamble Instrs: 326728 -> 293488 (-10.17%)
Cat0: 217926 -> 220947 (+1.39%); split: -1.58%, +2.96%
Cat1: 50182 -> 50446 (+0.53%); split: -0.97%, +1.49%
Cat2: 460987 -> 452101 (-1.93%); split: -2.26%, +0.33%
Cat3: 390696 -> 361271 (-7.53%)
Cat7: 39148 -> 38688 (-1.18%); split: -1.24%, +0.06%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>
2026-05-05 06:25:49 +02:00
Job Noorman
6158072e6f ir3/isa: use same src for ldg.a OFF field on a6xx/a7xx
This makes it slightly easier to generate ldg.a for the different
generations in the same code path.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>
2026-05-05 06:25:49 +02:00
Job Noorman
53d96aed05 nir/get_io_offset_src_number: support @load/store_global_ir3
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>
2026-05-05 06:25:49 +02:00
Faith Ekstrand
a9b28b9838 pan/nir: Lower texel buffers in nir_lower_tex()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00