Commit graph

223059 commits

Author SHA1 Message Date
Samuel Pitoiset
9ab3828d5e radv: close the local fd slightly later when enumerating physical devices
So that it can be used to print GPU info.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41802>
2026-05-27 08:12:37 +00:00
Samuel Pitoiset
8c9995e7fa nir: add nir_lower_abort
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651>
2026-05-27 06:37:03 +00:00
Samuel Pitoiset
88fb73c883 spirv: implement SPV_KHR_abort
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651>
2026-05-27 06:37:03 +00:00
Samuel Pitoiset
f431d6bc87 nir: add new intrinsics for SPV_KHR_abort
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651>
2026-05-27 06:37:03 +00:00
Andrzej Datczuk
691371a176 radv/rra,rmv: fix device id written into trace files
Both RRA and RMV used the PCI bus slot index in the trace device_id
field. On a typical single-GPU system, this resulted in "Device ID =
0000" displayed in RRA and RMV when traces were opened.

Match RGP dump, which reports device ids correctly.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41788>
2026-05-27 06:10:48 +00:00
Calder Young
bec5d3fff5 anv: Add workaround for vertex explosions in Split Fiction
The game tries to use anisotropic filtering deep in some control flow
while updating a procedural displacement map, our sampling hardware
does not check the channel enable mask before calculating the
derivatives for each subspan, which causes it to get garbage for any
subspans that have partially disabled lanes.

This workaround converts any sample messages in fragment shaders that
have divergent control flow into a sample_d message with the derivatives
zero'd by software if some of the lanes are disabled.

Closes: #12796
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41716>
2026-05-26 21:21:55 +00:00
Calder Young
abe41f3acf brw: Add workaround pass for shaders using derivatives in control flow
Using derivatives in control flow that is not uniform across a subspan
will produce "undefined behavior" in GLSL.

On Intel hardware, this means the sampler will just always compute the
derivatives from whatever values are in each lane of a subspan in the
raw payload, regardless of whether some have been disabled and contain garbage.

Unfortunately, some applications seem to expect the sampler to ignore
disabled lanes in these cases instead of computing their derivatives
anyway from garbage, so for those we need a pass that finds any sample
messages in divergent control flow and converts them to a sample_d with
the derivatives zero'd by software if one or more lanes required to
calculate them have been disabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41716>
2026-05-26 21:21:55 +00:00
Olivia Lee
a1d6a34154 panvk: fix executable properties handling for IDVS varying shaders
The previous implementation assumed that VS would have exactly two
executables, the HW variant position shader and the HW variant varying
shader, and that non-VS shaders would only have one executable per
variant. These assumptions are violated by avalon (which only has one
IDVS executable), by SW VS (which is a second VS variant, for three
total variants) and GS rast variants (which execute as a VS on the
hardware and so have two executables pre-avalon).

The new logic allows VS-staged variants to occur as a variant in any API
shader stage, and gives them either one or two executable indices
depending on whether the secondary is used.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: ff9907927f (panvk: Add basic infrastructure for shader variants)
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41631>
2026-05-26 19:57:35 +00:00
Benoît du Garreau
7683d552be docs: Add many missing features
I have only looked for unconditionally enabled features, so some are
probably still missing.

Signed-off-by: Benoît du Garreau <benoit@dugarreau.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40680>
2026-05-26 20:54:24 +02:00
ZhengMing
f0a6360e05 vulkan/wsi/win32: Prefer the more popular surface format on Windows
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15148
Signed-off-by: ZhengMing <zhengming@sanway.tech>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41777>
2026-05-26 15:23:24 +00:00
Valentine Burley
14be25c5fa tu: Merge tu_image_init and tu_image_update_layout
No longer need to split these up after c22e4022a8.
This is essentially a revert of 4b024a15f2.
No functional change.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41674>
2026-05-26 13:51:07 +00:00
Wujian Sun
cf8a61a071 mesa: Fix clipping order in _mesa_clip_blit()
The source and destination clipping were performed in the wrong order.
We should first clip the source rectangle against the source buffer
bounds, then clip the destination rectangle against the destination
buffer bounds (including scissor).

Fixed the webgl 2.0.0 test case:
conformance2/rendering/blitframebuffer-filter-outofbounds.html

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Wujian Sun <wujian.sun_1@nxp.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41718>
2026-05-26 13:15:45 +00:00
Gert Wollny
b22315f9a9 r600/sfn: run nir_opt_idiv_const
Suggested by Emma.

This reduces the number of ALU groups in the query result shader by
more than 50%.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41328>
2026-05-26 13:00:45 +00:00
Gert Wollny
fc582adfcb r600: replace TGSI query shader with nir
v2: - remove a few useless helpers
    - rename some variables
    - use some more nir with immediate codes (Emma)

v3: - use 64 bit integer ops
    - optimize generated code

v4: - fix typo (Emma)
    - Use boolean for available (Emma)
    - simplify some calculations (Emma)
    - replace "if" in timestamp code and bool conversion
      with "bcsel" (Emma)
    - clean up some variable names

v5: - remove iadd3 (Konstantin)

Assisted-by: Copilot (Auto mode)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41328>
2026-05-26 13:00:45 +00:00
Gert Wollny
60daea17ca r600: replace TGSI TCS passthrough with NIR version
We don't actually need to copy the vertex attributes because if no
TCS shader was given by the user TES simply is pointed to the VS
output in LDS that has the same layout the TCS shader would provide.

v2: with the lowering of the relevant intrinsics in place
    use nir_create_passthrough_tcs_impl to create the passthrough
    shader (like suggested by Mareco and Emma)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41328>
2026-05-26 13:00:45 +00:00
Gert Wollny
9e2d961e56 r600/sfn: Add lowering of tess inner and outer default intrinsics
These are UBO loads and so we do the lowering in nir.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41328>
2026-05-26 13:00:45 +00:00
squidbus
2d6ad3cba1 kk: Support VK_KHR_shader_untyped_pointers
With memcpy lowering and fix for infinite optimize loop on 4x16 packs,
passes `dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.*`.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41695>
2026-05-26 11:09:12 +00:00
Arjob Mukherjee
8ee23ece2b pvr: increase maxPerStageDescriptorStorageBuffers to 16
Zink implementation splits `maxPerStageDescriptorStorageBuffers` between
atomic buffers and `MaxShaderStorageBlocks` causing CTS tests to fail
because there are not enough SSBO blocks.

Also updated 'maxPerStageResources' for the current limits.

Fixes the following tests:

* KHR-GLES31.core.program_interface_query.ssb-types
* KHR-GLES31.core.compute_shader.pipeline-compute-chain
* KHR-GLES31.core.shader_storage_buffer_object.advanced-indirectAddressing-case1-cs
* KHR-GLES31.core.shader_storage_buffer_object.advanced-usage-sync-cs
* KHR-GLES31.core.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs

Signed-off-by: Arjob Mukherjee <arjob.mukherjee@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41708>
2026-05-26 10:54:12 +00:00
squidbus
35ac0f78b1 kk: De-duplicate geometry unroll logic
Original poly code supports what we need now, so remove the
duplicated code and switch to that.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41568>
2026-05-26 10:39:00 +00:00
squidbus
7b938e8fe3 kk: Fix compute system value and algebric lowering in pre-compiles
Changes are the result of two issues:

- In library form, workgroup size is not lowered. Only once the
  pre-compiles are distinct variants with entry-points can we
  lower uses of the workgroup size input.

- Some unimplemented instructions like `ufind_msb` would make their
  way through to the final shader, if they are generated by other
  algebraic optimizations. `nir_opt_algebraic` needs to be run in a
  loop to ensure they are eliminated.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41568>
2026-05-26 10:39:00 +00:00
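
The second point of 7b938e8fe3 (running an optimization in a loop until nothing changes) can be illustrated with a toy fixed-point driver. This is only a sketch: `run_to_fixed_point` and `toy_pass` are hypothetical names, and the integer counter stands in for a shader that still contains ops the backend cannot handle. It does not use the NIR API.

```c
#include <assert.h>
#include <stdbool.h>

/* A pass returns true if it changed the (toy) program. */
typedef bool (*pass_fn)(int *unsupported_ops);

/* Re-run the pass until it reports no progress; return the number of runs. */
static unsigned run_to_fixed_point(int *unsupported_ops, pass_fn pass)
{
   assert(pass);
   unsigned runs = 0;
   bool progress;
   do {
      progress = pass(unsupported_ops);
      runs++;
   } while (progress);
   return runs;
}

/* Toy pass: each run rewrites one leftover op, which mimics an
 * optimization whose output only becomes reducible on the next run. */
static bool toy_pass(int *unsupported_ops)
{
   if (*unsupported_ops == 0)
      return false;
   (*unsupported_ops)--;
   return true;
}
```

A single invocation of `toy_pass` would leave ops behind whenever more than one is present, which is the failure mode the commit message describes.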
squidbus
bed2ba22f2 poly: Fix range used for index unroll bounds checks
The index buffer pointer is offset by the draw first index, so the
index buffer range needs to be offset by the same.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41568>
2026-05-26 10:39:00 +00:00
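
As a rough illustration of bed2ba22f2: if the index pointer has already been advanced by the draw's first index, the number of bytes that may still be read shrinks by the same amount. The helper below is a hypothetical sketch of that arithmetic, not the code in poly; the name `remaining_index_range` and its parameters are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes still readable after skipping first_index indices of index_size
 * bytes each. Clamps to 0 if the offset runs past the end of the range. */
static uint32_t remaining_index_range(uint32_t range_bytes,
                                      uint32_t first_index,
                                      uint32_t index_size)
{
   assert(index_size == 1 || index_size == 2 || index_size == 4);
   /* 64-bit multiply so a large first_index cannot wrap around. */
   uint64_t skipped = (uint64_t)first_index * index_size;
   if (skipped >= range_bytes)
      return 0;
   return range_bytes - (uint32_t)skipped;
}
```

Bounds-checking against the unadjusted range would accept reads up to `first_index * index_size` bytes past the end of the buffer.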
squidbus
69a5105aad poly: Refactor poly_unroll_restart for general purpose unrolling
Defines a more general purpose version of `poly_unroll_restart`
named `poly_unroll_geometry`, which allows unrolling without an
input index buffer by separating the input and output index sizes.
This allows it to be used for additional use cases, such as
unrolling triangle fans or changing index types, where the draw
may not necessarily be indexed or the input and output index types
are not the same.

`poly_unroll_restart` remains as an alias with the same declaration
as before.

Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41568>
2026-05-26 10:39:00 +00:00
Samuel Pitoiset
72f02d6e89 radv/amdgpu: fix releasing the mutex for virtio and RADV_PERFTEST=localbos
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:50 +00:00
Samuel Pitoiset
473551ecd0 radv: determine supported syncobj types directly in the physical device
To remove the dependency on the winsys.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:50 +00:00
Samuel Pitoiset
b5403ff331 radv/amdgpu: simplify syncobj verifications during submissions
These are equivalent but do not rely on syncobj_sync_type.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:50 +00:00
Samuel Pitoiset
f8ce76e996 radv: remove declared but unused create_null_physical_device()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:50 +00:00
Samuel Pitoiset
921eedee8b radv: add a separate function to query allocated/usage for each heap
This is just a cleanup that will be useful for upcoming changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:50 +00:00
Samuel Pitoiset
25a53ab412 radv: pre-compute a mask of supported global queue priorities
That removes the winsys dependency when querying that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:49 +00:00
Samuel Pitoiset
c1a3619d2c radv: use radv_device::ws directly for quering sync payloads
Easier to spot the remaining occurrences.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41755>
2026-05-26 10:18:49 +00:00
Christian Gmeiner
99400f272d etnaviv: blt: Add BLT format conversion support
Until now the BLT engine only handled same-format blits. Teach it to
convert between the UNORM formats it can represent, so format-converting
blits no longer fall back to the 3D blitter.

The supported formats are A8R8G8B8, X8R8G8B8, A4R4G4B4, A1R5G5B5,
R5G6B5, R8G8, R8 and A2R10G10B10.

The BLT format names are BGRA-based, matching the PE-internal byte order,
so an identity swizzle is correct for all of these except A2R10G10B10.
That one the PE keeps in RGBA order, so pipe R lands in the BLT B position
and vice versa. blt_conversion_needs_channel_swap() captures this and
find_blt_conversion() derives the per-image swizzle.

SRGB variants share the BLT format of their UNORM sibling. For example
R8G8B8A8_UNORM and R8G8B8A8_SRGB both map to BLT_FORMAT_A8R8G8B8, and
the sRGB handling is carried separately via img->srgb.

No regressions in dEQP-GLES3.functional.fbo.blit.conversion.*.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39879>
2026-05-26 10:03:27 +00:00
Christian Gmeiner
4e5363a66c etnaviv: Map R8G8B8A8_SRGB to BLT_FORMAT_A8R8G8B8
Required for SRGB format conversion blits via the BLT engine.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39879>
2026-05-26 10:03:27 +00:00
Christian Gmeiner
4cb7c63f21 etnaviv: blt: Add sRGB support to blt_imginfo
Add sRGB field to blt_imginfo and use it to conditionally set the
BLT_SRC_IMAGE_CONFIG_SRGB and BLT_DEST_IMAGE_CONFIG_SRGB bits in
the BLT config register setup.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39879>
2026-05-26 10:03:27 +00:00
David Rosca
dfc608260a radv: Add support for timestamps on video queue
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41744>
2026-05-26 09:29:42 +00:00
David Rosca
581bf2e3b0 ac/cmdbuf: Add ac_emit_video_write_timestamp
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41744>
2026-05-26 09:29:42 +00:00
Christian Gmeiner
adad1a7318 st/mesa: Zero MaxTextureImageUnits for unsupported stages
Mesa core pre-seeds VS/TCS/TES/GS/FS in _mesa_init_constants(..) with
MAX_TEXTURE_IMAGE_UNITS. When a driver does not expose a stage, this
seed leaks into the GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS sum. Drivers
that only expose VS+FS (like etnaviv) overcounted by 96. Zero the
field so the sum reflects only the stages the driver advertises.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41746>
2026-05-26 08:52:48 +00:00
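
The overcount in adad1a7318 can be checked with a small model. The sketch assumes a per-stage seed of 32, which matches the reported 96 for three unexposed stages (TCS, TES, GS); `SEED_UNITS`, `combined_units` and `zero_unsupported` are hypothetical names, not Mesa's.

```c
#include <assert.h>
#include <stdbool.h>

enum { STAGE_VS, STAGE_TCS, STAGE_TES, STAGE_GS, STAGE_FS, STAGE_COUNT };

/* Assumed seed value per stage: 3 unexposed stages * 32 = 96. */
#define SEED_UNITS 32u

/* Sum the per-stage limits, as a combined limit would be derived. */
static unsigned combined_units(const unsigned units[STAGE_COUNT])
{
   assert(units);
   unsigned sum = 0;
   for (int s = 0; s < STAGE_COUNT; s++)
      sum += units[s];
   return sum;
}

/* Zero the limit of every stage the driver does not expose. */
static void zero_unsupported(unsigned units[STAGE_COUNT],
                             const bool supported[STAGE_COUNT])
{
   assert(units && supported);
   for (int s = 0; s < STAGE_COUNT; s++) {
      if (!supported[s])
         units[s] = 0;
   }
}
```

With all five stages seeded and only VS and FS marked as supported, the sum drops by exactly 96 once the unsupported stages are zeroed.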
David Rosca
165a0105d3 ac/vcn_dec: Move register defines to ac_vcn_dec.c
Also remove unused struct jpeg_params.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41743>
2026-05-26 08:31:12 +00:00
David Rosca
fa96c3781c radv: Use ac_emit_video_write_memory
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41743>
2026-05-26 08:31:12 +00:00
David Rosca
8f2ee52c0e ac/cmdbuf: Add ac_emit_video_write_memory
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41743>
2026-05-26 08:31:10 +00:00
David Rosca
e1e65e47d4 ac/vcn: Add ac_vcn_sq_header/tail and use it for decode
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41743>
2026-05-26 08:31:10 +00:00
David Rosca
f775ecb143 ac/vcn_dec: Add ac_vcn_dec_init_regs to get register offsets
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41743>
2026-05-26 08:31:10 +00:00
Jose Maria Casanova Crespo
ae604b4bdd v3dv: share zero-fill TFU staging BO at device level
The TFU stride-0 fill path allocates a 64 KiB staging BO
(V3D_TFU_MAX_DIM * cpp = 16384 * 4), maps it, fills it with the
pattern, and caches it on the command buffer. For non-zero patterns
the per-cmd-buffer cache works well, but WebGPU/Dawn workloads
issue many zero-fills (lazy buffer init) across separate command
buffers, so the cache misses almost every time and each fill pays
for a fresh alloc + mmap + memcpy.

Add a device-wide staging BO held in v3dv_device::meta.tfu_fill_zero,
lazily allocated under meta.mtx and used whenever data == 0. The BO
is read-only after init so it can be shared across queues without
extra synchronization, and it is freed in destroy_device_meta.

Measured on a Dawn/WebGPU zero-fill-heavy workload (RPi5, ~60
meta_fill_buffer calls, ~218 MiB total, all zero-fills):

  before: TFU branch total 7.328 ms, avg 115.55 us/call
  after:  TFU branch total 0.296 ms, avg   4.78 us/call  (~24x)

Non-zero patterns continue to use the per-cmd-buffer cache.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:45 +00:00
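
A minimal sketch of the lazy, mutex-guarded allocation described in ae604b4bdd, assuming a plain heap buffer as a stand-in for the staging BO. `struct fake_device` and `get_zero_staging` are hypothetical; only the 16384 * 4 size comes from the commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define TFU_MAX_DIM 16384u /* from the commit message */
#define FILL_CPP    4u     /* from the commit message */

struct fake_device {
   pthread_mutex_t mtx;
   uint8_t *zero_staging; /* stand-in for the shared BO, NULL until first use */
};

/* Return the shared zero-filled staging buffer, allocating it on first use.
 * The buffer is never written after creation, so callers can share it. */
static const uint8_t *get_zero_staging(struct fake_device *dev)
{
   assert(dev);
   pthread_mutex_lock(&dev->mtx);
   if (!dev->zero_staging)
      dev->zero_staging = calloc(TFU_MAX_DIM, FILL_CPP); /* 64 KiB of zeros */
   const uint8_t *ptr = dev->zero_staging;
   pthread_mutex_unlock(&dev->mtx);
   return ptr;
}
```

Every call after the first returns the same pointer, so repeated zero-fills skip the allocation, mapping and copy that the per-command-buffer cache paid on each miss.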
Jose Maria Casanova Crespo
2a62490fa7 v3dv: relax buffer padding in TFU buffer<->image copy
Adjust eligibility check on imageExtent vs slice dimensions
rather than on the buffer addressing dimensions. The TFU codepath
here always writes/reads the full slice from its origin, so the
required invariant is 'imageExtent == slice'; bufferRowLength and
bufferImageHeight may be larger than imageExtent (the spec permits
this for non-zero values), in which case the TFU reads/writes at the
buffer's row/layer stride but only touches slice->width pixels per
row and slice->height rows per layer, leaving the trailing padding
untouched.

The previous combined check (width == slice->width && height ==
slice->height applied to the buffer dimensions) would reject any
caller that set bufferRowLength or bufferImageHeight larger than the
image (this is common for buffers shared across mip levels or
for alignment requirements like Dawn aligning bufferRowLength to 2
for 1-pixel-wide textures), forcing those copies through the slower
TLB / blit / compute paths.

For compressed formats, keep the strict equality check since
block-level stride semantics are more complex.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:44 +00:00
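
The relaxed eligibility test in 2a62490fa7 might look roughly like the following. This is a sketch under assumptions: `struct copy_region` and `tfu_copy_eligible` are hypothetical, and a buffer dimension of 0 is treated as tightly packed, as in the Vulkan buffer-image copy rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct copy_region {
   uint32_t buffer_row_length;   /* 0 means tightly packed */
   uint32_t buffer_image_height; /* 0 means tightly packed */
   uint32_t extent_width;
   uint32_t extent_height;
};

static bool tfu_copy_eligible(const struct copy_region *r,
                              uint32_t slice_width, uint32_t slice_height,
                              bool compressed)
{
   assert(r);
   uint32_t row = r->buffer_row_length ? r->buffer_row_length
                                       : r->extent_width;
   uint32_t rows = r->buffer_image_height ? r->buffer_image_height
                                          : r->extent_height;

   /* Compressed formats keep the strict check on the buffer dimensions. */
   if (compressed)
      return row == slice_width && rows == slice_height;

   /* Otherwise only the image extent has to match the slice; a larger
    * row length or image height is padding that is left untouched. */
   return r->extent_width == slice_width && r->extent_height == slice_height;
}
```

For a 1-pixel-wide texture with `bufferRowLength` aligned to 2, the old check on the buffer dimensions would reject the copy, while the extent-based check accepts it.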
Jose Maria Casanova Crespo
99bce54daa v3dv: implement TFU image-to-buffer copy on V3D 7.1
Generalize copy_buffer_image_tfu with a to_buffer flag selecting which
side is the raster destination, and wire it into v3dv_CmdCopyImageToBuffer2
before the TLB path.

The to_buffer=true direction has the same eligibility constraints as
buffer-to-image, except that V3D 4.2 is unsupported as its TFU cannot
produce raster output, and for image-to-buffer the destination is
always a raster buffer.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:44 +00:00
Jose Maria Casanova Crespo
0054ff2cb7 v3dv: rename copy_buffer_to_image_tfu to copy_buffer_image_tfu
Drop the direction from the function name in preparation for sharing
this implementation with image-to-buffer copies in the next commit.

Pure rename, no functional change.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:43 +00:00
Jose Maria Casanova Crespo
43ddd0c96f v3dv: extract TFU helpers for format-plane and slice-stride args
Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:43 +00:00
Jose Maria Casanova Crespo
8e294e6aee v3dv: use TFU copy with stride-0 for vkCmdFillBuffer
Replace the TLB-based meta_fill_buffer path on V3D 7.1+ with a TFU
raster-to-raster copy that broadcasts a single staging row across
the output via iis=0 (stride-0 input). This eliminates the per-fill
CL render job and its tile_alloc/TSDA BO overhead, which is
substantial on workloads that issue many small fills (e.g. WebGPU
lazy buffer initialization in Dawn).

The staging BO holding one row of the fill pattern is cached on the
command buffer and reused across fills with the same data value, so
sequences of identical-pattern fills share a single staging BO.

The existing TLB-based fill is kept as a fallback and is also used
when V3D_DEBUG=disable_tfu is set, or on V3D simulator builds where
the stride-0 TFU input mode is not supported and would assert.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:43 +00:00
Jose Maria Casanova Crespo
ed9fea6045 v3dv: move destroy_update_buffer_cb to a generic helper
Move from v3dv_meta_copy.c to a generic v3dv_cmd_buffer_destroy_bo_cb
in the cmd buffer module. This makes it reusable for different callers
that want to attach a v3dv_bo to a command buffer's private_objs list.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:42 +00:00
Jose Maria Casanova Crespo
9b131eb86e v3dv: Enable meta_copy_buffer with TFU for V3D 7.1
Buffer-to-buffer copies on V3D 7.1+ can be served by the TFU as a
raster-to-raster copy, avoiding the per-copy CL render job and
tile_alloc/TSDA BO overhead of the TLB-based path.

Treat the buffer as a raster texture and chunk the copy into TFU
jobs of up to 16384x16384 pixels. Pick the largest pixel size
(cpp in {4,2,1}) such that src/dst offsets and size are all
cpp-aligned: cpp=4 (R8G8B8A8_UINT) is the expected common case;
cpp=2 (R8G8_UINT) and cpp=1 (R8_UINT) handle Vulkan-permitted
unaligned vkCmdCopyBuffer regions that would otherwise fall back
to the slow TLB path. Skipped when V3D_DEBUG=disable_tfu is set;
emits perf_debug when the cpp=1/2 fallback is taken.

Drop the `if (copy_job)` guard on src_bo cleanup registration in
v3dv_CmdUpdateBuffer: the TFU path queues jobs without returning a
v3dv_job*, so the staging BO must be tracked unconditionally to
avoid leaking once the cmd buffer is submitted.

Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41725>
2026-05-26 07:50:42 +00:00
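
The pixel-size selection described in 9b131eb86e can be sketched as below. `pick_copy_cpp` is a hypothetical helper, not the function in v3dv; it only shows the "largest cpp in {4, 2, 1} that divides both offsets and the size" rule.

```c
#include <assert.h>
#include <stdint.h>

/* Return the largest bytes-per-pixel in {4, 2, 1} such that the source
 * offset, destination offset and size are all multiples of it. */
static uint32_t pick_copy_cpp(uint64_t src_offset, uint64_t dst_offset,
                              uint64_t size)
{
   assert(size > 0);
   /* OR-ing the values keeps any low bit that is set in at least one of
    * them, so a single mask test covers all three alignments. */
   const uint64_t bits = src_offset | dst_offset | size;
   if ((bits & 3) == 0)
      return 4;
   if ((bits & 1) == 0)
      return 2;
   return 1;
}
```

A fully 4-byte-aligned region takes the cpp=4 path, while a single odd offset forces cpp=1, which is the case the commit reports through perf_debug.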
Collabora's Gfx CI Team
ff6f82c834 Uprev ANGLE to a793c75398c746f3f8a08fd2e74dfc4dff07a0c9
7772c5602d...a793c75398

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41612>
2026-05-26 06:58:28 +00:00
Lishin
c41f88fb35 v3d/v3dv: use common compute limits
Move the compute workgroup count and shared memory limits shared by
v3d and v3dv to v3d_limits.h.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41791>
2026-05-26 07:13:22 +01:00