Commit graph

223760 commits

Author SHA1 Message Date
Benjamin Cheng
a989ca8c8f mesa/st: run the lower_opcodes pass for draw shaders
Fixes: 5eb0136a3c ("mesa/st: when creating draw shader variants, use the base nir and skip driver opts")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15304
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41302>
2026-06-04 19:29:57 +00:00
Benjamin Cheng
a4a862a605 draw: Add lower_opcodes NIR pass
Gallivm runs shaders that are originally compiled with another backend's
compiler options, which may have optimizations that introduce opcodes
that gallivm does not support. Add a pass to lower these.

Assisted-by: Claude Opus 4.6
Signed-off-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41302>
2026-06-04 19:29:57 +00:00
Faith Ekstrand
364b5f806c compiler/rust/smallvec: Optimize extend()
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42005>
2026-06-04 18:09:19 +00:00
Yiwei Zhang
4e8595da21 venus: let resource_create_blob wait for mem alloc
Previously, the mem alloc wait barrier was issued via a separate renderer
submission (e.g. execbuf for the virtgpu backend). In fact, we can leverage
the cmd payload in resource_create_blob to avoid the extra submission.
This would help the downstream win32 backend as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42003>
2026-06-04 16:33:02 +00:00
Yiwei Zhang
77b73d8595 venus: update create_from_device_memory to take a cmd payload
This is to leverage drm_virtgpu_resource_create_blob::cmd for expressing
the blob mem host resource dependency in the virtgpu backend, which can
avoid the execbuf. Similar for vtest backend.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42003>
2026-06-04 16:33:02 +00:00
Job Noorman
2b37a0b410 vulkan: use consistent module hashing for pipeline stages
Currently, when hashing a pipeline stage, the final hash is different
when the module is passed as VkPipelineShaderStageCreateInfo::module
(the module's hash is hashed) or as a VkShaderModuleCreateInfo in its
pNext chain (the module's code is hashed). This causes unnecessary cache
misses. To prevent this, hash the code first in the latter case and add
that hash to the stage's hash.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42014>
2026-06-04 16:01:55 +00:00
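The cache-key mismatch described in 2b37a0b410 can be illustrated with a minimal sketch. The function names (`module_hash`, `stage_hash_old`, `stage_hash_new`) are hypothetical, and SHA-1 from Python's `hashlib` is only a stand-in for whatever hash the driver actually uses; this is not the Mesa implementation.

```python
import hashlib


def module_hash(code: bytes) -> bytes:
    """Hash of the shader code, as a shader module object would cache it."""
    return hashlib.sha1(code).digest()


def stage_hash_old(entry: str, code: bytes, via_pnext: bool) -> bytes:
    """Old behaviour: the pNext path hashes the raw code, while the
    module path hashes the module's hash, so the results differ."""
    h = hashlib.sha1(entry.encode())
    h.update(code if via_pnext else module_hash(code))
    return h.digest()


def stage_hash_new(entry: str, code: bytes, via_pnext: bool) -> bytes:
    """New behaviour: the pNext path hashes the code first, so both
    paths add the same module hash to the stage hash."""
    h = hashlib.sha1(entry.encode())
    h.update(module_hash(code))
    return h.digest()


code = b"\x03\x02\x23\x07 hypothetical shader words"
print(stage_hash_old("main", code, True) == stage_hash_old("main", code, False))
print(stage_hash_new("main", code, True) == stage_hash_new("main", code, False))
```

With the old scheme the same shader yields two different stage hashes depending on how it was passed in; with the new scheme both paths agree, so the cache lookup can hit.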
Job Noorman
0a60a53c81 vulkan: add vk_shader_module_hash helper
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42014>
2026-06-04 16:01:55 +00:00
Hyunjun Ko
bea1212ee7 anv/video: Change size of the cached array of recently decoded AV1 frames.
prev_refs currently has 8 entries, which matches the number of reference
frames, but it needs to match the full size of the DPB, which is 9.
prev_refs is also now indexed by DPB slot and holds the last intra frame
written to that slot.

This fixes visible artifacts on AV1 streams that mix super-res and
non-super-res frames in a hierarchical reference structure.

Closes: mesa/mesa#15503

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41846>
2026-06-04 15:43:54 +00:00
Hyunjun Ko
11c8930e2b anv/video: define ANV_VIDEO_AV1_MAX_DPB_SLOTS
This is prep work for the following fix.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41846>
2026-06-04 15:43:54 +00:00
Hyunjun Ko
6875286159 anv/video: Add to check size mismatch during motion field estimation.
With super resolution the frame size can change, so we need to keep the
coded size and check whether it changes during motion field estimation.

Closes: mesa/mesa#15503

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41846>
2026-06-04 15:43:54 +00:00
Natalie Vock
1a8953c956 radv: Dump printf buffer after detecting a GPU hang
This allows us to use printf debugging when the GPU hangs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41961>
2026-06-04 15:22:07 +00:00
Natalie Vock
c8518581bf radv/rt: Don't overwrite bvh_base at the start of the traversal loop
This may delete existing pointer flags coming from the instance if the
traversal loop is exited and then restarted, as is done with ray
queries.

Fixes geometry being incorrectly culled due to FLIP_FACING flags going
missing.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41965>
2026-06-04 14:55:30 +00:00
Karmjit Mahil
10c914693d freedreno/computerator: Remove VLA giving a build warning
```
../src/freedreno/computerator/main.cc:327:24: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  327 |       uint64_t results[num_perfcntrs];
      |                        ^~~~~~~~~~~~~
../src/freedreno/computerator/main.cc:327:24: note: read of non-const variable 'num_perfcntrs' is not allowed in a constant expression
../src/freedreno/computerator/main.cc:206:13: note: declared here
  206 |    unsigned num_perfcntrs = 0;
      |             ^
```

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42017>
2026-06-04 14:38:43 +00:00
Jose Maria Casanova Crespo
28e584b687 v3dv: enable lowered shaderFloat16/Int16/Int8 + VK_KHR_shader_float16_int8
V3D 7.1 now exposes shaderFloat16, shaderInt8, shaderInt16 and
VK_KHR_shader_float16_int8.

Partial native Float16 support is already available. The remaining
sub-32-bit ALU operations are widened to 32-bit by nir_lower_bit_size
in v3d_lower_nir(); conversion and pack operations are kept at their
native bit width so the QPU's 16-bit pack/unpack paths on mul/mov can
be used.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo
4c5b0fa7f4 v3d: emit packed-f16 ALU ops natively on V3D 7.1
Keep f16 fadd/fsub/fmul/fmin/fmax/fneg/fabs at 16-bit through
nir_lower_bit_size on V3D 7.1+ and emit the matching VF* op in
nir_to_vir, instead of widening to f32 with f16<->f32 round-trip
movs that pack-fold can absorb into hints. The native path saves
the absorption overhead in f16-heavy shaders.

Only the lower half of each VF* result is consumed; the upper half
is computed but unused.

New VIR helpers vir_VFADD, vir_VFSUB, vir_VFCMP, vir_VFMIN,
vir_VFMUL, vir_VFMOV, vir_VFABS, vir_VFNEG, vir_VFNAB were added.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:39 +00:00
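A rough model of the "only the lower half is consumed" point in 4c5b0fa7f4: a packed 2x16-bit add computes both halves of a 32-bit register, and the scalar f16 result is read from the low half. The helper names (`pack2`, `vfadd`) are hypothetical and the arithmetic uses Python floats rounded to half precision through `struct`, so this only approximates the hardware behaviour and does not model overflow.

```python
import struct


def f16_bits(x: float) -> int:
    """Encode a Python float as an IEEE half-precision bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bits_f16(b: int) -> float:
    """Decode the low 16 bits of b as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", b & 0xFFFF))[0]


def pack2(hi: float, lo: float) -> int:
    """Place two f16 values in the high and low halves of a 32-bit word."""
    return (f16_bits(hi) << 16) | f16_bits(lo)


def vfadd(a: int, b: int) -> int:
    """Lane-wise f16 add on both halves of two 32-bit words."""
    lo = f16_bits(bits_f16(a) + bits_f16(b))
    hi = f16_bits(bits_f16(a >> 16) + bits_f16(b >> 16))
    return (hi << 16) | lo


# The scalar operands live in the low halves; the high halves hold
# whatever happened to be there.
a = pack2(100.0, 1.5)
b = pack2(-3.0, 2.25)
result = vfadd(a, b)
print(bits_f16(result))  # low half: the value the shader consumes
```

The high half of `result` is still computed, but nothing reads it.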
Jose Maria Casanova Crespo
16856adff5 broadcom/qpu: expose V3D 7.1 packed-f16 instructions
Add the V3D 7.1+ 2x16-bit f16 add-pipe ops (VFADD/VFSUB/VFCMP and
the sign-manipulation family VFMOV/VFABS/VFNEG/VFNAB), wire VFMAX
into v3d71_add_ops, and complete the V3D 7.1 decode/encode for
VFMIN/VFMAX/VFMUL.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:39 +00:00
Jose Maria Casanova Crespo
5a575cca8e v3d: improve liveness analysis for packed partial writes
The liveness analysis treated any output-pack write (D.l /
D.h) as a partial definition, refusing to mark the variable as
defined in the block. That extended live ranges all the way to the
top of the program for every f16 temporary, artificially increasing
register pressure.

D.l/h only modifies the written bits, leaving the unwritten half bits
preserved. So a pack write is a full definition whenever no
consumer ever observes the unwritten half, or when both halves are
written before the variable is used.

This scans every instruction into a per-temp read-flag array
(TEMP_READ_LO / TEMP_READ_HI, with FULL = LO | HI) by inspecting
each source's input unpack, and recognizes two patterns as full
definitions:

 * Both PACK_L and PACK_H written unconditionally in the same block.
 * The instruction's pack writes the half that covers every observed
   read of the variable across the program (the unwritten half is never
   read).

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:39 +00:00
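The two full-definition patterns from 5a575cca8e can be sketched as follows. The flag names come from the commit message; everything else (the instruction dictionaries, `scan_reads`, `full_defs`) is a hypothetical simplification that ignores instruction ordering and control flow between blocks.

```python
TEMP_READ_LO = 1
TEMP_READ_HI = 2
TEMP_READ_FULL = TEMP_READ_LO | TEMP_READ_HI

# Map an unpack/pack selector to the half (or halves) it touches.
HALF_FLAGS = {"l": TEMP_READ_LO, "h": TEMP_READ_HI, None: TEMP_READ_FULL}


def scan_reads(program):
    """Collect, per temp, which halves any source in the program reads."""
    reads = {}
    for block in program:
        for inst in block:
            for temp, unpack in inst["srcs"]:
                reads[temp] = reads.get(temp, 0) | HALF_FLAGS[unpack]
    return reads


def full_defs(block, reads):
    """Return the temps that count as fully defined in this block."""
    written = {}
    defined = set()
    for inst in block:
        if inst.get("cond"):
            continue  # conditional writes are not treated as definitions
        temp, pack = inst["dst"]
        half = HALF_FLAGS[pack]
        written[temp] = written.get(temp, 0) | half
        # Pattern 1: both halves written unconditionally in the block.
        both_halves = written[temp] == TEMP_READ_FULL
        # Pattern 2: the written half covers every observed read.
        covers_reads = (reads.get(temp, 0) & ~half) == 0
        if both_halves or covers_reads:
            defined.add(temp)
    return defined


block = [
    {"dst": ("t0", "l"), "srcs": []},
    {"dst": ("t1", "l"), "srcs": []},
    {"dst": ("t2", "l"), "srcs": []},
    {"dst": ("t2", "h"), "srcs": []},
    {"dst": ("t3", None), "srcs": [("t0", "l"), ("t1", None), ("t2", None)]},
]
reads = scan_reads([block])
print(sorted(full_defs(block, reads)))
```

Here `t0` qualifies because only its low half is ever read, `t2` because both halves are written, and `t1` does not qualify because its unwritten high half is observed by a full read.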
Jose Maria Casanova Crespo
66ac3b55af v3d: widen sub-32-bit subgroup arithmetic and vote ops
nir_lower_subgroups lowers reduce/scan to a tree of shuffle + ALU
chains over the source data type. When the source is sub-32-bit
(int8, int16, float16, or vector forms) those new ALU ops escape
the bit_size widening done earlier in v3d_lower_nir, leaving the
QPU codegen to emit raw min/max/etc. on 32-bit channel registers
whose upper bits are unspecified. The result is wrong reductions
for signed integer min/max (the upper bits make a signed int8 look
like a positive int32), wrong unsigned reductions (high-bit garbage
mixes into the result), and wrong f16 reductions.

Re-run nir_lower_bit_size after nir_lower_subgroups so the
generated sub-32-bit ALU ops are widened with the correct
sign/zero extension on inputs and the matching narrow on outputs.

Also widen vote_feq/vote_ieq when the source operand is sub-32-bit:
the V3D backend emits ALLFEQ/ALLEQ on full 32-bit channels (it does
not yet use the f16 vfcmp/vfmin/vfmax HW path), so the comparison input
must be 32-bit.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:39 +00:00
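The signed-min failure described in 66ac3b55af can be reproduced with plain integer arithmetic. This is a sketch with hypothetical helper names: `imin_raw` models a 32-bit min applied directly to registers whose upper bits are unspecified, and `imin_widened` models the same operation after sign-extending the int8 inputs.

```python
def as_int32(reg: int) -> int:
    """Interpret a 32-bit register value as a signed integer."""
    reg &= 0xFFFFFFFF
    return reg - (1 << 32) if reg & (1 << 31) else reg


def sign_extend8(reg: int) -> int:
    """Interpret the low 8 bits of a register as a signed int8."""
    low = reg & 0xFF
    return low - 0x100 if low & 0x80 else low


def imin_raw(a: int, b: int) -> int:
    """32-bit signed min on the raw registers, upper bits included."""
    return a if as_int32(a) <= as_int32(b) else b


def imin_widened(a: int, b: int) -> int:
    """Signed min after widening the int8 inputs with sign extension."""
    return min(sign_extend8(a), sign_extend8(b))


minus_one = 0x000000FF  # int8 -1 with zero upper bits: looks like +255
five = 0x00000005       # int8 5
print(imin_raw(minus_one, five) & 0xFF)   # low byte of the raw result
print(imin_widened(minus_one, five))
```

The raw min selects 5 because the register holding -1 compares as +255; the widened min returns -1.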
Jose Maria Casanova Crespo
54de903ae4 v3dv: lower flrp16 for consistency with flrp32
flrp32 is already lowered; mirror it for flrp16 so V3D's f16 ALU
path doesn't see an unsupported flrp@16 leftover after bit_size
widening. No measurable test impact on the current f16 sweep,
but matches the f32 behaviour and keeps the lowering surface
consistent across bit sizes.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:38 +00:00
Jose Maria Casanova Crespo
0a5200d051 v3d: move nir_lower_frexp after nir_lower_bit_size
The frexp lowering decomposes frexp into bit manipulation (fabs, ushr,
iand, ior) that relies on implicit float-to-int bit reinterpretation.
When lowered at 16-bit, the subsequent nir_lower_bit_size pass widens
float operations with f2f32 (changing the bit pattern to IEEE fp32)
and integer operations with u2u32 (zero-extending 16-bit bits). This
breaks the reinterpretation: ushr on the fabs result gets f2f32-widened
float bits instead of the original fp16 bit pattern, causing the sign
bit to leak into the exponent extraction for negative inputs.

Move nir_lower_frexp into v3d_lower_nir, after nir_lower_bit_size.
This way frexp decomposition operates at 32-bit where float and integer
operations share the same bit width, and the bit manipulation masks use
the correct IEEE fp32 constants.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:38 +00:00
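Why the frexp bit manipulation in 0a5200d051 has to match the float width can be seen by comparing bit patterns directly. This sketch does not reproduce the exact failure from the commit; it only shows that the fp16 and fp32 encodings of one value differ, so shift and mask constants chosen for one width give wrong answers on the other. The helper names are hypothetical.

```python
import struct


def f16_bits(x: float) -> int:
    """IEEE fp16 bit pattern of x."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def f32_bits(x: float) -> int:
    """IEEE fp32 bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def biased_exp_f16(bits: int) -> int:
    """Exponent field using fp16 constants (5 bits above a 10-bit mantissa)."""
    return (bits >> 10) & 0x1F


def biased_exp_f32(bits: int) -> int:
    """Exponent field using fp32 constants (8 bits above a 23-bit mantissa)."""
    return (bits >> 23) & 0xFF


x = -1.5
print(hex(f16_bits(x)), hex(f32_bits(x)))
print(biased_exp_f16(f16_bits(x)) - 15)    # unbiased exponent at 16-bit
print(biased_exp_f32(f32_bits(x)) - 127)   # unbiased exponent at 32-bit
print(biased_exp_f16(f32_bits(x)) - 15)    # fp16 constants on fp32 bits
```

The first two extractions agree on the unbiased exponent; the mismatched one does not, which is why running the lowering at 32-bit with fp32 constants avoids mixing the two representations.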
Jose Maria Casanova Crespo
cac92fecac broadcom/qpu: support output pack on itof/utof
itof and utof natively support packing the f32 result to f16
(.l/.h), but the encode/decode paths fell through to the default
case and rejected any non-NONE pack, breaking nir_op_i2f16 /
nir_op_u2f16 codegen with "Failed to pack instruction: itof rfN.l".

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41810>
2026-06-04 13:29:38 +00:00
Job Noorman
5943d01e86 tu: add option to override the build ID
Add the tu-build-id meson option to force the build ID to a particular
value. This allows us to share the shader cache between different
builds. This enables, for example, sharing the cache between x86
drm-shim and aarch64 native builds.

Also add tu_override_{graphics,compute}_shader_version driconf options
to force recompilation of shaders even when tu-build-id stays the same.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41954>
2026-06-04 12:44:13 +00:00
Job Noorman
59438fba2a tu: use chip_id instead of gpu_id for the cache UUID
gpu_id has been deprecated for a while. Moreover, drm-shim actually sets
a gpu_id for a7xx devices (while native builds do not) making the cache
UUID inconsistent.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41954>
2026-06-04 12:44:13 +00:00
Job Noorman
920a93170d freedreno/drm-shim: allow chip selection by chip_id
gpu_id has been deprecated for a while, so add a new env var (FD_CHIP_ID)
to select a chip by chip_id.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41954>
2026-06-04 12:44:13 +00:00
squidbus
1e08ccf28d kk: Advertise additional tessellation dynamic state
These are already supported by the tessellation implementation.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42016>
2026-06-04 12:29:14 +00:00
squidbus
94295fda67 kk: Support VK_EXT_external_memory_host
Metal does not support importing host memory pointers into MTLHeap,
only MTLBuffer. Buffers can import without issue, and images are
restricted to linear images without flags requiring aliasing.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41894>
2026-06-04 12:12:09 +00:00
Aitor Camacho
68048759f0 kk: Implement tessellation
Same approach as HK for tessellation. It also handles instance_id lowering.
instance_id_includes_base_index is not taken into account in multiple
other passes that use instance id. These passes expect instance id to
actually be instance id. This change adds a pass to work around this.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41038>
2026-06-04 11:11:08 +00:00
Aitor Camacho
84929be129 kk: Rework shader compilation to handle more than 2 stages
Integrate poly module and support tessellation stage compilation

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41038>
2026-06-04 11:11:08 +00:00
Aitor Camacho
7317282488 kk: Rework draw dispatch
Tessellation and geometry stages require emulation by launching
pre-graphics compute workloads, modifying the draw index and switching to
indirect. However, since these emulation steps can only take one draw at
a time (multi draw being the issue), we need to accommodate this limitation
by splitting kk_draw_data into two: a constant structure that holds the
initial values (whether restart is enabled, the index buffer, etc.) and a
second structure containing the modified values used to dispatch the Metal
draw call.

This change also returns early if any of the emulation steps fails, instead
of letting the draw continue, to avoid potential issues.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41038>
2026-06-04 11:11:07 +00:00
squidbus
9405760aad kk: Support VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT
Adds layer size and mip level offset information to image layouts.
With this information, we can calculate the subresource accessed for
block texel view and create an aliased texture in the intended format.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41900>
2026-06-04 03:29:13 -07:00
squidbus
b93602e16a kk: Fence read-write images after write
Metal does not guarantee that image reads after writes will be coherent,
requiring us to insert fences for read-write textures.

Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41900>
2026-06-04 03:24:23 -07:00
Jose Maria Casanova Crespo
03dee27f48 v3dv: expose the full simulator memory to applications
On real hardware compute_heap_size() reserves a fraction of total_ram for
the rest of the system and compute_memory_budget() reports at most 90% of
the available memory, both because that RAM is shared between the GPU and
the CPU. In simulator mode the memory is instead a dedicated GPU pool
allocated by the simulator, so these reservations just hid memory: although
we allocate 1 GiB for the simulator, only 512 MiB was exposed as the heap
and as the budget.

Expose the full simulator allocation as both the heap size and the budget.
The simulator never allocates more than the 4 GiB the GPU MMU can address,
which we assert.

Before:
  memoryHeaps[0]:
    size   = 536870912 (0x20000000) (512.00 MiB)
    budget = 536870912 (0x20000000) (512.00 MiB)

After:
  memoryHeaps[0]:
    size   = 1073741824 (0x40000000) (1024.00 MiB)
    budget = 1073725536 (0x3fffc060) (1023.98 MiB)

Assisted-by: Claude Opus 4.8
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41898>
2026-06-04 09:35:38 +00:00
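The before/after heap figures quoted in 03dee27f48 can be checked with a few lines of arithmetic. `describe` and `check_simulator_size` are hypothetical helpers that only mirror the output format and the 4 GiB limit mentioned in the commit message; they are not part of the driver.

```python
MIB = 1 << 20
GIB = 1 << 30


def describe(size: int) -> str:
    """Format a byte count as decimal, hex, and MiB."""
    return f"{size} (0x{size:08x}) ({size / MIB:.2f} MiB)"


def check_simulator_size(size: int) -> int:
    """Mirror the assertion that the pool fits in the 4 GiB the MMU can address."""
    assert size <= 4 * GIB
    return size


before = check_simulator_size(512 * MIB)
after = check_simulator_size(1 * GIB)
budget_after = 0x3FFFC060  # budget value quoted in the commit message

print("before size  =", describe(before))
print("after size   =", describe(after))
print("after budget =", describe(budget_after))
print("in use       =", after - budget_after, "bytes")
```

The difference between the new heap size and the reported budget is 16288 bytes, which is consistent with the budget reflecting memory already allocated at query time.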
Rob Clark
27a8ca79b1 freedreno/perfcntrs: Expose gen8 counters
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
f7a41c26ff freedreno/a6xx: Program gen8+ slice SEL regs
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
ca616c2b64 tu/gen8: Program slice selector regs
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
68163a732e tu: Disable preemption for counters on gen8
Extend the CP_SCOPE_CNTL to gen8 and newer.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
89b25531cb freedreno: Skip BV perfcntrs
Not useful unless we expose concurrent binning.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
cde274fc6c freedreno/perfcntrs: Use helper for derived counters
Use helper to assign/reserve counters for derived counters.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
5f4bf61653 freedreno/perfcntrs: Refactor derived counter setup
Most of what is done here doesn't need to be duplicated per hw gen.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
71ae168c7a freedreno/a6xx: Use counter allocation helper
If the kernel supports PERFCNTR_CONFIG for counter reservation, we can
expose perfcntrs by default.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
b6938c4c33 tu: Use counter allocation helper
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
57ced39635 freedreno/perfcntrs: Add helper to assign counters
Add a helper to allocate a counter for a requested countable, and (if
supported by KMD) do the PERFCNTR_CONFIG ioctl to reserve the counter
for UMD local (inline) usage.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
4282d1edc9 freedreno/perfcntrs: Add helpers to resolve group and countable
We were duplicating this in a few places.  Add helpers instead.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
7bbdf0ffb7 freedreno/ds: Add a8xx derived counters
Mostly just some counter renames (slice vs unslice, etc)

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
b39cb31e66 freedreno/ds: PERFCNTR_CONFIG support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
8f7f9c0897 freedreno/fdperf: Add PERFCNTR_CONFIG support
Add support for the new ioctl for KMD global counter collection.  This
avoids needing hacks to parse dtb and mmap the GPU's i/o space.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
bd5e3a2675 freedreno/fdperf: Prepare for partial-counter usage
With PERFCNTR_CONFIG, some other process may have already reserved some
counters, so not all will be available to fdperf.  Prepare for this by
using num_counters in counter_group.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
e601474339 freedreno/fdperf: Move where we setup counter groups
Move this earlier so we have the counter config early enough to probe
kernel support for PERFCNTR_CONFIG with a valid config.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
15669a6981 freedreno/common: Add ioctl ptr helpers
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00
Rob Clark
ac76ceafb6 drm-uapi: Sync msm_drm.h
Pull in updated UABI header with PERFCNTR_CONFIG ioctl.  Sync with:

   commit 44c460d2cc8b87c08360fe60f861660c8045ef90
   Merge: 9bb8af2770b7 9a967125427e
   Author: Dave Airlie <airlied@redhat.com>

       Merge tag 'drm-msm-next-2026-05-30' of https://gitlab.freedesktop.org/drm/msm into drm-next

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158>
2026-06-04 08:57:56 +00:00