Commit graph

208791 commits

Author SHA1 Message Date
Faith Ekstrand
829bdc2fdb nir/instr_set: Rework tex instr hash/compare
We were missing a couple bits from hash and a bunch of stuff from the
comparison.  This puts most of nir_tex_instr into a single pack_tex
helper that's used by both and grabs everything we were missing.

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36234>
(cherry picked from commit 557ac588e4)
2025-07-23 14:34:42 +02:00
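A minimal sketch of the idea in 829bdc2fdb, assuming a simplified instruction record: when hashing and comparison both go through one packing helper, a field cannot be covered by one and missed by the other. `TexInstr` and its fields are hypothetical stand-ins, not the real nir_tex_instr layout, and `pack_tex` here only borrows the helper's name from the commit message.

```python
from dataclasses import dataclass


@dataclass
class TexInstr:
    # Hypothetical subset of fields, for illustration only.
    op: int
    sampler_dim: int
    is_array: bool
    is_shadow: bool
    texture_index: int


def pack_tex(instr):
    """Collect every identity-relevant field in exactly one place."""
    return (instr.op, instr.sampler_dim, instr.is_array,
            instr.is_shadow, instr.texture_index)


def hash_tex(instr):
    # The hash is derived only from the packed form.
    return hash(pack_tex(instr))


def tex_equal(a, b):
    # The comparison uses the same packed form, so it can never
    # disagree with the hash about which fields matter.
    return pack_tex(a) == pack_tex(b)
```

Adding a field then means touching only `pack_tex`, which keeps the hash and the comparison in sync.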
Vasily Khoruzhick
eec4d7bb69 lima: ppir: index SSA nodes the same way as we index registers
var_nodes size is x4 of nir defs count, since we need to track a node
for each individual channel of a register write. We don't need that for
SSA, but we used non-shifted indices for SSA, which made the compiler
reliant on reg nir def indices starting after all the SSA indices.

That has changed with 7b70b419b528("nir: always index SSA defs before
printing").

Fix that by shifting the SSA index as well, which removes any reliance
on assumptions about nir def indices.

Backport-to: 25.2
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36206>
(cherry picked from commit 2e38cbc40c)
2025-07-23 14:34:42 +02:00
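The indexing problem described in eec4d7bb69 can be sketched as follows. This is an illustrative model, not the lima code: the function names are hypothetical, and placing an SSA def in channel slot 0 of its group is an assumption. Only the x4 sizing comes from the commit message.

```python
CHANNELS = 4  # var_nodes holds four slots per nir def


def reg_slot(def_index, channel):
    # Register writes track one node per channel.
    return def_index * CHANNELS + channel


def ssa_slot_unshifted(def_index):
    # Old behaviour: SSA defs used the raw index.
    return def_index


def ssa_slot(def_index):
    # Fixed behaviour: SSA defs use the same shifted indexing
    # (slot 0 of the group, by assumption).
    return def_index * CHANNELS
```

With the unshifted scheme, SSA def 4 lands on the same slot as channel 0 of register def 1, so correctness depended on how the indices happened to be ordered. With the shifted scheme, two distinct def indices can never share a slot.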
Ruijing Dong
02a6e9137f radeonsi/vcn: vcn5 av1 decoding context buffer fix
In VCN5, the AV1 context buffer is bigger than in VCN4. Using the
larger size fixes an AV1 decoding issue on VCN5.

Cc: mesa-stable

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36208>
(cherry picked from commit 32a2012975)
2025-07-23 14:34:42 +02:00
Yiwei Zhang
132471c61c lavapipe: do not short-circuit AHB export alloc (non-import)
Per spec VUID-VkMemoryAllocateInfo-pNext-01874:

If the parameters do not define an import operation, and the pNext chain
includes a VkExportMemoryAllocateInfo structure with
VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
included in its handleTypes member, and the pNext chain includes a
VkMemoryDedicatedAllocateInfo structure with image not equal to
VK_NULL_HANDLE, then allocationSize must be 0

- before: total 116, skip 66, pass 36, fail 14
- after:  total 116, skip 66, pass 50, fail 0

Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit 209e402720)
2025-07-23 14:34:42 +02:00
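The condition quoted from VUID-VkMemoryAllocateInfo-pNext-01874 in 132471c61c can be written as a predicate. This is a sketch of the rule's logic only; the function and parameter names are hypothetical, and `None` stands in both for an absent structure and for VK_NULL_HANDLE.

```python
# VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
AHB_BIT = 0x400


def allocation_size_must_be_zero(is_import, export_handle_types,
                                 dedicated_image):
    """Return True when the quoted rule requires allocationSize == 0.

    export_handle_types: handleTypes of VkExportMemoryAllocateInfo,
        or None if that structure is not in the pNext chain.
    dedicated_image: image of VkMemoryDedicatedAllocateInfo, or None
        if the structure is absent or the image is VK_NULL_HANDLE.
    """
    return (not is_import
            and export_handle_types is not None
            and bool(export_handle_types & AHB_BIT)
            and dedicated_image is not None)
```

For example, `allocation_size_must_be_zero(False, AHB_BIT, "image")` is true, while the same call with `is_import=True` is false because the rule applies only to non-import allocations.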
Yiwei Zhang
e88186b30f lavapipe: populate AHB memory mapping
- before: total 116, skip 66, pass 36, fail 14
- after:  total 116, skip 66, pass 38, fail 12

Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit 91c8372c67)
2025-07-23 14:34:42 +02:00
Yiwei Zhang
66ec573903 lavapipe: properly handle AHB release
Need to release the AHB ref upon vkFreeMemory.

Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit faa71af431)
2025-07-23 14:34:42 +02:00
Yiwei Zhang
f634471a7c lavapipe: do not close import fd on error and amend an error code
The implementation only takes the ownership after a successful import.
On import failure, the caller is going to handle the fd. Meanwhile,
amend a missing error code on an error path.

Fixes: 895d3399f7 ("lavapipe: add support for KHR_external_memory_fd")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit 160cd3a317)
2025-07-23 14:34:42 +02:00
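The ownership rule in f634471a7c (the fd belongs to the implementation only after a successful import) can be illustrated with a small model. `import_fd` is a hypothetical function, and closing the fd immediately on success is a simplification of "the importer now owns it".

```python
import os


def import_fd(fd, valid):
    """Hypothetical importer that takes ownership of fd only on success."""
    if not valid:
        # Failure: do not close fd. The caller still owns it and is
        # responsible for cleaning it up.
        return False
    # Success: ownership has transferred, so the importer closes it
    # (here right away, to keep the sketch short).
    os.close(fd)
    return True
```

If the failure path closed the fd as well, the caller's own cleanup would close it a second time, possibly hitting an unrelated descriptor that reused the same number.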
Yiwei Zhang
b3285b1212 lavapipe: implement GetMemoryAndroidHardwareBufferANDROID
lvp hasn't used common device memory obj, and it allocates and imports
ahb on its own. Thus it has to implement the AHB export api itself.

- before: total 116, skip 66, pass 24, fail 26
- after:  total 116, skip 66, pass 36, fail 14

Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit f1af533b2c)
2025-07-23 14:34:42 +02:00
Yiwei Zhang
c5aaeb3c4d lavapipe: allow AHB export allocation
This fix came from the error log below:

> E MESA    : lavapipe: unimplemented external memory type 1024

Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit 3167e30ee2)
2025-07-23 14:34:42 +02:00
David Rosca
509db43546 radeonsi/vcn: Correctly handle tile swizzle
Currently tile swizzle can only be non zero for single plane
formats, for multi plane formats we always set PIPE_BIND_SHARED.

Luma only (Y400) JPG decode and encode with RGB input surface (EFC)
are the only two cases where we can get surface with tile swizzle
and ignoring it would result in corrupted output.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13346
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35647>
(cherry picked from commit b665bd21cb)
2025-07-23 14:34:42 +02:00
Karol Herbst
3ea04c268d rusticl/mem: relax flags validation for clGetSupportedImageFormats
While the API spec does describe which flags _may_ be passed in, the
overall CL working group agreement is, that implementations should expect
random flags to be passed in as other implementations _may_ use them to
further restrict or allow image formats.

Also fix validation for importing GL objects while at it.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36216>
(cherry picked from commit d8793e3874)
2025-07-23 14:34:42 +02:00
Mike Blumenkrantz
77db8c83a4 lavapipe: call nir_lower_int64
otherwise the 64bit ops unsupported by llvmpipe will not be lowered

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35743>
(cherry picked from commit 6d2d4e9bbf)
2025-07-23 14:34:42 +02:00
Mike Blumenkrantz
0b31752560 zink: fix valid contents check for adding new bind
the previous one didn't account for buffers

Fixes: b022cdc8a1 ("zink: only copy resource during add_bind if it is valid")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36201>
(cherry picked from commit c768699a73)
2025-07-23 14:34:42 +02:00
Danylo Piliaiev
f81588fb51 tu: Use safe-const binning VS when safe-const full VS is used
Otherwise we can have a case where binning VS uses more consts than
full VS (when safe variant is used for full VS), that will result in
a rendering issue because SP_VS_CONST_CONFIG.CONSTLEN is shared between
full and binning VS in PROGRAM_CONFIG state and gets the value from the
full VS.

There are two alternative solutions that can allow binning VS to always
use maximum constlen:
- Move constlen emission to per-XS config. This interferes with the
  PROGRAM_CONFIG state, which uploads consts and does SP_UPDATE_CNTL.
  Consts would need to be uploaded after constlen is defined, while
  SP_UPDATE_CNTL must be done before per-XS state is emitted.
  Also having SP_UPDATE_CNTL in a draw state that is always DIRTY
  isn't great.
  Something didn't work out on A6XX, so this idea was dropped.
- Emit constlen again in VS_BINNING draw state. This seems to work
  but is also likely undefined behaviour, since constlen is changed
  after some consts are uploaded.

Cc: mesa-stable

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36203>
(cherry picked from commit 6003a89b89)
2025-07-23 14:34:42 +02:00
Karol Herbst
66c92d0542 zink: disable shader images for intensity formats
Vulkan only allows identity remapping on storage images descriptors.

Fixes: 475c43cf8a ("zink: translate intensity formats")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36199>
(cherry picked from commit e03d23ddc9)
2025-07-23 14:34:42 +02:00
Karol Herbst
d448930c16 zink: disallow intensity buffer images
Fixes: 475c43cf8a ("zink: translate intensity formats")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36199>
(cherry picked from commit 146e843254)
2025-07-23 14:34:42 +02:00
Karol Herbst
3c48152cb1 vtn/opencl: set exact on all ffmas and mads
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36175>
(cherry picked from commit 3a2f38cd44)
2025-07-23 14:34:42 +02:00
Valentine Burley
4fd5967d9a ci: Always save the artifacts for performance traces
We need the job artifacts in all cases to gather performance data.
Since commit b723bc80d2, we were only saving them on failures.

Fixes: b723bc80d2 ("ci/lava: inherit .piglit-traces-test in .lava-piglit-traces and deduplicate configs")

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36195>
(cherry picked from commit b7f1f40bf4)
2025-07-23 14:34:42 +02:00
Antonio Ospite
4658dcae7f ci/android: update comment about ANDROID_CTS_MODULES
After commit 545727f97c (ci/android: Move ANDROID_CTS_MODULES to build
script, 2025-06-24) the comment about ANDROID_CTS_MODULES in
lvp-android-angle-android-cts-include.txt has become inaccurate.

Update the instructions to reflect the latest status.

Fixes: 545727f97c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36187>
(cherry picked from commit 0f09c9436b)
2025-07-23 14:34:42 +02:00
Mike Blumenkrantz
0554de6016 gallium/hud: set the framebuffer texture when drawing
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13545

Fixes: 2eb45daa9c ("gallium: de-pointerize pipe_surface")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36167>
(cherry picked from commit ce09d80698)
2025-07-23 14:34:42 +02:00
Dave Airlie
7f32e1c4bd nak: disable imma 8x8x16 on Blackwell+
It's not supported anymore

Fixes: 669c8a5145 ("nvk: Advertise VK_KHR_cooperative_matrix")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36163>
(cherry picked from commit da61149b2a)
2025-07-23 14:34:42 +02:00
Faith Ekstrand
264ff0528c nak: Wire up the mma predicate on Hopper+
Fixes: 90438bae51 ("nir: Add NVIDIA-specific muladd intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36163>
(cherry picked from commit 4bb67cacba)
2025-07-23 14:34:42 +02:00
Eric Engestrom
e0d3b2a086 .pick_status.json: Mark 80be9153f9 as denominated 2025-07-23 14:34:42 +02:00
Eric Engestrom
f88117ec3d .pick_status.json: Update to f4166ab1e1 2025-07-23 14:34:42 +02:00
Eric Engestrom
d277c284e4 VERSION: bump for 25.2.0-rc1 2025-07-16 16:42:48 +02:00
David Rosca
bc11dc72c1 Revert "radeonsi/vcn: Stop using stream handle for decode"
Caused issues on VCN5.

This reverts commit 46d5926d83.

Fixes: 46d5926d83 ("radeonsi/vcn: Stop using stream handle for decode")
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36154>
2025-07-16 13:25:31 +00:00
Eric Engestrom
8d61c2751f ci: move script: override from .piglit-traces-test to llvmpipe-traces
This is already being set as needed everywhere else, and would cause
issues in future work.

Use the relative `install/` path for `HWCI_TEST_SCRIPT` as that's
supported by both HW runners and FDo runners.
A separate MR will fix the `/install/` vs `install/` mess.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
2025-07-16 12:54:06 +00:00
Eric Engestrom
b723bc80d2 ci/lava: inherit .piglit-traces-test in .lava-piglit-traces and deduplicate configs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
2025-07-16 12:54:06 +00:00
Eric Engestrom
5ebe02db30 ci/piglit: provide default results file name
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
2025-07-16 12:54:06 +00:00
Eric Engestrom
9c42e66de1 ci/piglit: provide default device name
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
2025-07-16 12:54:06 +00:00
Eric Engestrom
8e568a1ed3 ci/piglit: drop LAVA variable from non-LAVA jobs
LAVA jobs use `.lava-piglit-traces` which also sets it (with the same
comment), so this doesn't affect lava jobs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
2025-07-16 12:54:05 +00:00
Eric Engestrom
3dc28c9e55 zink+radv/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36160>
2025-07-16 12:39:53 +00:00
Eric Engestrom
8808f039cc freedreno/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36160>
2025-07-16 12:39:53 +00:00
Eric Engestrom
e1277100df zink+nvk/ci: fix mistake in yesterday's crash->fail improvement update
Fixes: e703847410 ("zink+nvk/ci: document crash->fail change from !36031")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36160>
2025-07-16 12:39:53 +00:00
Eric Engestrom
6f9fcfb0ad nvk/ci: document vkd3d regression
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36160>
2025-07-16 12:39:53 +00:00
Eric R. Smith
65bcae096a panfrost: fix SSA register allocation
We were allocating a fixed number of temporary registers; this isn't
always enough, and in fact we should have calculated the number of
temporaries required.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36135>
2025-07-16 12:16:50 +00:00
Georg Lehmann
497f607c8e radv/nir/lower_cmat: vectorize GFX11 B -> ACC conversion
Foz-DB Navi31:
Totals from 7 out of 14 FSR4 shaders:
MaxWaves: 50 -> 52 (+4.00%)
Instrs: 44951 -> 44516 (-0.97%); split: -1.00%, +0.03%
CodeSize: 309176 -> 305500 (-1.19%); split: -1.23%, +0.04%
VGPRs: 1464 -> 1416 (-3.28%)
SpillVGPRs: 188 -> 92 (-51.06%)
Scratch: 24064 -> 11776 (-51.06%)
Latency: 171318 -> 163663 (-4.47%); split: -4.51%, +0.04%
InvThroughput: 178796 -> 178956 (+0.09%); split: -0.04%, +0.13%
VClause: 769 -> 730 (-5.07%); split: -6.50%, +1.43%
Copies: 3149 -> 3261 (+3.56%); split: -1.21%, +4.76%
PreVGPRs: 1607 -> 1467 (-8.71%)
VALU: 37715 -> 37744 (+0.08%); split: -0.11%, +0.18%
SALU: 754 -> 753 (-0.13%)
VMEM: 2813 -> 2621 (-6.83%)
VOPD: 1674 -> 1685 (+0.66%); split: +1.55%, -0.90%

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
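The percentages in the Foz-DB totals above are relative changes from the "before" value to the "after" value. A short sketch that reproduces a few of them; `pct_change` is a hypothetical helper, not part of any Mesa tooling.

```python
def pct_change(before, after):
    """Relative change in percent, rounded to two decimals."""
    return round((after - before) / before * 100, 2)


print(pct_change(50, 52))      # MaxWaves
print(pct_change(1464, 1416))  # VGPRs
print(pct_change(188, 92))     # SpillVGPRs
```

Whether a positive number is good depends on the metric: more MaxWaves is an improvement, while for Instrs, VGPRs or Scratch a negative change is the improvement.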
Georg Lehmann
7546169e1c radv/nir/lower_cmat: vectorize GFX11 ACC -> B conversion
Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 64204 -> 60749 (-5.38%)
CodeSize: 439052 -> 417668 (-4.87%)
SpillVGPRs: 186 -> 188 (+1.08%)
Scratch: 23808 -> 24064 (+1.08%)
Latency: 208878 -> 202903 (-2.86%)
InvThroughput: 232898 -> 225688 (-3.10%)
VClause: 902 -> 907 (+0.55%); split: -1.55%, +2.11%
Copies: 6418 -> 3762 (-41.38%)
Branches: 55 -> 37 (-32.73%)
PreSGPRs: 297 -> 298 (+0.34%)
PreVGPRs: 2299 -> 2303 (+0.17%)
VALU: 54762 -> 51489 (-5.98%)
SALU: 956 -> 938 (-1.88%)
VMEM: 3469 -> 3473 (+0.12%)
VOPD: 3895 -> 2126 (-45.42%)

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
d672737372 nir,aco: add byte_perm_amd
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
56d93c40ea radv/nir/lower_cmat: convert matrix use in smaller type
Fewer conversions, and less data to move around.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 65443 -> 64204 (-1.89%); split: -1.93%, +0.04%
CodeSize: 441884 -> 439052 (-0.64%); split: -1.21%, +0.57%
Latency: 213374 -> 208878 (-2.11%); split: -2.17%, +0.07%
InvThroughput: 236922 -> 232898 (-1.70%); split: -1.77%, +0.08%
VClause: 935 -> 902 (-3.53%); split: -3.74%, +0.21%
Copies: 5064 -> 6418 (+26.74%); split: -13.35%, +40.09%
Branches: 54 -> 55 (+1.85%)
VALU: 55700 -> 54762 (-1.68%); split: -1.85%, +0.16%
VOPD: 3459 -> 3895 (+12.60%); split: +16.88%, -4.28%

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
f2846b936a radv/nir/lower_cmat: use v_permlanex16_b32 instead of ds_swizzle_b32 for GFX11 ACC->B
ds_swizzle is slower than I expected.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 68802 -> 65443 (-4.88%)
CodeSize: 458000 -> 441884 (-3.52%)
Latency: 218147 -> 213374 (-2.19%); split: -3.17%, +0.99%
InvThroughput: 230190 -> 236922 (+2.92%); split: -0.25%, +3.18%
VClause: 922 -> 935 (+1.41%); split: -0.98%, +2.39%
Copies: 5877 -> 5064 (-13.83%); split: -15.74%, +1.91%
Branches: 37 -> 54 (+45.95%)
VALU: 53441 -> 55700 (+4.23%); split: -0.55%, +4.77%
SALU: 872 -> 956 (+9.63%)
VOPD: 1767 -> 3459 (+95.76%)

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:51 +00:00
Olivia Lee
5ee3c10d1e panvk: advertise vulkan 1.4 on v10+
VK_EXT_host_image_copy was the last extension needed.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:03 +00:00
Olivia Lee
3894f58914 panvk: implement VK_EXT_host_image_copy for depth/stencil images
Copy between memory and a depth/stencil image requires copying the depth
and stencil aspects in separate calls. For D32S8, this needs to be
special cased in order to handle (de)interleaving.

For image->image copies, deinterleaving is not supported. Aspects must
match between src and dest for non-planar images.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:03 +00:00
Olivia Lee
91c037f228 panfrost: add support for (de)interleaving Z24S8 in pan_tiling
This is needed for VK_EXT_host_image_copy which, like the buffer<->image
copy commands, treats depth/stencil like separate image planes and
requires copying each separately.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:03 +00:00
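What (de)interleaving means for a packed depth/stencil format, as in 91c037f228, can be sketched on plain integers. The bit layout is an assumption made for illustration (depth in the low 24 bits, stencil in the high 8 bits of each 32-bit word), and the function names are hypothetical, not the pan_tiling API.

```python
DEPTH_MASK = 0x00FFFFFF  # assumed: depth occupies the low 24 bits


def split_z24s8(words):
    """Deinterleave packed 32-bit words into depth and stencil planes."""
    depth = [w & DEPTH_MASK for w in words]
    stencil = [(w >> 24) & 0xFF for w in words]
    return depth, stencil


def merge_z24s8(depth, stencil):
    """Interleave separate depth and stencil planes back into words."""
    return [((s & 0xFF) << 24) | (d & DEPTH_MASK)
            for d, s in zip(depth, stencil)]
```

A copy that targets only one aspect has to leave the other aspect's bits untouched, which is why the two planes cannot be moved with a plain memcpy of the packed words.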
Olivia Lee
93c5d1be94 pan/shared: refactor pan_tiling
We don't need to use fixed-size pixel_t types and put the tiling loop in
a macro in order to get good codegen for this. Replacing the fixed-size
types with memcpy/__builtin_assume_aligned, the compiler is still able
to generate multi-word load/store instructions. Without the fixed-size
types, the only advantage of putting this in a macro is to ensure the
code is specialized on size/is_store/shift, but we can get the same
specialization by making the functions ALWAYS_INLINE.

Measured performance in VK_EXT_host_image_copy benchmarks is unchanged,
and generated assembly looks effectively identical to the previous version.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:02 +00:00
Olivia Lee
476fb5c5cf panvk: implement VK_EXT_host_image_copy for tiled images
Since we don't have a CPU implementation of AFBC compression, host copy
is only implemented for u-interleaved tiling.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:02 +00:00
Olivia Lee
0f6a06bbba pan/shared: add function to copy between two tiled images
This is needed for VK_EXT_host_image_copy.

Most other mesa drivers use a similar approach to implement tiled->tiled
copy, with a few differences. They use a temp buffer sized for only one
tile, don't attempt to tile-align the copies in either the src or dest,
and they don't have the memcpy fast path. I measured performance of a
variety of implementations on a rock5b, and found:

 - The fast path for when the copy region is tile-aligned is a 167%
   improvement.
 - Aligning the temp buffer chunks to src tiles is a 20% improvement.
 - Using a 64k buffer instead of a tile-sized buffer is a 14%
   improvement. This buffer size appears optimal in my benchmark,
   smaller and larger buffers are both slower. Skipping the chunk
   approach and just (de)tiling to a temp buffer that fits the whole
   image (what NVK does) is also slower.
 - I had no luck with attempts at a direct tiled->tiled copy algorithm
   that didn't need a temp buffer. The fastest I got was ~1/4 the speed
   of the temp buffer implementation.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:01 +00:00
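The temp-buffer approach to tiled->tiled copies described in 0f6a06bbba can be sketched as: detile a chunk of the source region into a linear buffer, then tile that buffer into the destination. This is a simplified model under stated assumptions: the tile size and the row-major order inside a tile are hypothetical (not u-interleaved), and neither the tile-aligned memcpy fast path nor the src-tile alignment of chunks is modelled.

```python
TILE = 4  # hypothetical tile edge, in pixels


def tiled_index(x, y, width):
    """Offset of pixel (x, y) in a tiled image, one list element per pixel."""
    tiles_per_row = width // TILE
    tile = (y // TILE) * tiles_per_row + (x // TILE)
    return tile * TILE * TILE + (y % TILE) * TILE + (x % TILE)


def copy_tiled(src, src_w, sx, sy, dst, dst_w, dx, dy, w, h,
               chunk_rows=TILE):
    """Copy a w x h region between tiled images via a linear temp buffer."""
    for row0 in range(0, h, chunk_rows):
        rows = min(chunk_rows, h - row0)
        # Detile one chunk of the source region into the temp buffer.
        temp = [src[tiled_index(sx + x, sy + row0 + y, src_w)]
                for y in range(rows) for x in range(w)]
        # Tile the temp buffer into the destination region.
        for y in range(rows):
            for x in range(w):
                dst[tiled_index(dx + x, dy + row0 + y, dst_w)] = temp[y * w + x]
```

`chunk_rows` plays the role of the temp buffer size here; the commit message reports that a 64k buffer performed best in the author's benchmark, which this sketch does not attempt to reproduce.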
Olivia Lee
d3150006be panvk: split out helper function for checking AFBC support
To support VK_EXT_host_image_copy for tiled images, we need to be
able to determine whether AFBC may be supported in
vkGetPhysicalDeviceImageFormatProperties2.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:01 +00:00
Olivia Lee
1cd61ee948 panvk: implement VK_EXT_host_image_copy for linear color images
Depth/stencil and tiled images require some additional complexity, so
will be implemented in later commits.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:01 +00:00
Olivia Lee
adb85dc307 panvk: store BO offset in panvk_image_plane
For VK_EXT_host_image_copy, we need to access image memory from the CPU
after mapping the BO. The existing base field in pan_image_plane doesn't
work for this because it's a GPU address and we don't have a mechanism
to recover the GPU base address of an image's BO to calculate the offset.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
2025-07-16 10:56:00 +00:00