We were missing a couple bits from hash and a bunch of stuff from the
comparison. This puts most of nir_tex_instr into a single pack_tex
helper that's used by both and grabs everything we were missing.
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36234>
(cherry picked from commit 557ac588e4)
var_nodes size is x4 of nir defs count, since we need to track a node
for each individual channel of a register write. We don't need that for
SSA, but we used non-shifted indices for SSA, which made the compiler
reliant of reg nir def indeces to start after all the SSA indices.
That has changed with 7b70b419b528("nir: always index SSA defs before
printing").
Fix that by shifting SSA index as well, that would allow not to rely on
any assumptions on nir def indices.
Backport-to: 25.2
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36206>
(cherry picked from commit 2e38cbc40c)
In VCN5, the AV1 context buffer has changed to a bigger
one than VCN4. It fixed an AV1 decoding issue on VCN5.
Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36208>
(cherry picked from commit 32a2012975)
Per spec VUID-VkMemoryAllocateInfo-pNext-01874:
If the parameters do not define an import operation, and the pNext chain
includes a VkExportMemoryAllocateInfo structure with
VK_EXTERNAL_MEMORY_HANDLE_TYPE_ANDROID_HARDWARE_BUFFER_BIT_ANDROID
included in its handleTypes member, and the pNext chain includes a
VkMemoryDedicatedAllocateInfo structure with image not equal to
VK_NULL_HANDLE, then allocationSize must be 0
- before: total 116, skip 66, pass 36, fail 14
- after: total 116, skip 66, pass 50, fail 0
Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit 209e402720)
The implementation only takes the ownership after a successful import.
On import failure, the caller is going to handle the fd. Meanwhile,
amend a missing error code on an error path.
Fixes: 895d3399f7 ("lavapipe: add support for KHR_external_memory_fd")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit 160cd3a317)
lvp hasn't used common device memory obj, and it allocates and imports
ahb on its own. Thus it has to implement the AHB export api itself.
- before: total 116, skip 66, pass 24, fail 26
- after: total 116, skip 66, pass 36, fail 14
Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit f1af533b2c)
Currently tile swizzle can only be non zero for single plane
formats, for multi plane formats we always set PIPE_BIND_SHARED.
Luma only (Y400) JPG decode and encode with RGB input surface (EFC)
are the only two cases where we can get surface with tile swizzle
and ignoring it would result in corrupted output.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13346
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35647>
(cherry picked from commit b665bd21cb)
While the API spec does describe which flags _may_ be passed in, the
overall CL working group agreement is, that implementations should expect
random flags to be passed in as other implementations _may_ use them to
further restrict or allow image formats.
Also fix validation for importing GL objects while at it.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36216>
(cherry picked from commit d8793e3874)
Otherwise we can have a case where binning VS uses more consts than
full VS (when safe variant is used for full VS), that will result in
a rendering issue because SP_VS_CONST_CONFIG.CONSTLEN is shared between
full and binning VS in PROGRAM_CONFIG state and gets the value from the
full VS.
There are two alternative solutions that can allow binning VS to always
use maximum constlen:
- Move constlen emission to per-XS config. This interferes
PROGRAM_CONFIG state which uploads consts and does SP_UPDATE_CNTL.
Consts would need to be uploaded after constlen is defined, while
SP_UPDATE_CNTL must be done before per-XS state is emitted.
Also having SP_UPDATE_CNTL in a draw state that is always DIRTY
isn't great.
Something didn't work out on A6XX, so this idea was dropped.
- Emit constlen again in VS_BINNING draw state. This seem to work
but also likely an undefined behaviour since constlen is changed
after some consts are uploaded.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36203>
(cherry picked from commit 6003a89b89)
We need the job artifacts in all cases to gather performance data.
Since commit b723bc80d2, we were only saving them on failures.
Fixes: b723bc80d2 ("ci/lava: inherit .piglit-traces-test in .lava-piglit-traces and deduplicate configs")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36195>
(cherry picked from commit b7f1f40bf4)
After commit 545727f97c (ci/android: Move ANDROID_CTS_MODULES to build
script, 2025-06-24) the comment about ANDROID_CTS_MODULES in
lvp-android-angle-android-cts-include.txt has become inaccurate.
Update the instructions to reflect the latest status.
Fixes: 545727f97c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36187>
(cherry picked from commit 0f09c9436b)
This is already being set as needed everywhere else, and would cause
issues in future work.
Use the relative `install/` path for `HWCI_TEST_SCRIPT` as that's
supported by both HW runners and FDo runners.
A separate MR will fix the `/install/` vs `install/` mess.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
We were allocating a fixed number of temporary registers; this isn't
always enough, and in fact we should have calculated the number of
temporaries required.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36135>
Copy between memory and a depth/stencil image requires copying the depth
and stencil aspects in separate calls. For D32S8, this needs to be
special cased in order to handle (de)interleaving.
For image->image copies, deinterleaving is not supported. Aspects must
match between src and dest for non-planar images.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
This is needed for VK_EXT_host_image_copy which, like the buffer<->image
copy commands, treats depth/stencil like separate image planes and
requires copying each separately.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
We don't need to use fixed-size pixel_t types and put the tiling loop in
a macro in order to get good codegen for this. Replacing the fixed-size
types with memcpy/__builtin_assume_aligned, the compiler is still able
to generate multi-word load/store instructions. Without the fixed-size
types, the only advantage of putting this in a macro is to ensure the
code is specialized on size/is_store/shift, but we can get the same
specialization by making the functions ALWAYS_INLINE.
Measured performance in VK_EXT_host_image_copy benchmraks is unchanged,
and generated assembly looks effectively identical to the previous version.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
Since we don't have a CPU implementation of AFBC compression, host copy
is only implemented for u-interleaved tiling.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
This is needed for VK_EXT_host_image_copy.
Most other mesa drivers use a similar approach to implement tiled->tiled
copy, with a few differences. They use a temp buffer sized for only one
tile, don't attempt to tile-align the copies in either the src or dest,
and they don't have the memcpy fast path. I measured performance of a
variety of implementations on a rock5b, and found:
- The fast path for when the copy region is tile-aligned is a 167%
improvement.
- Aligning the temp buffer chunks to src tiles is a 20% improvement.
- Using a 64k buffer instead of a tile-sized buffer is a 14%
improvement. This buffer size appears optimal in my benchmark,
smaller and larger buffers are both slower. Skipping the chunk
approach and just (de)tiling to a temp buffer that fits the whole
image (what NVK does) is also slower.
- I had no luck with attempts at a direct tiled->tiled copy algorithm
that didn't need a temp buffer. The fastest I got was ~1/4 the speed
of the temp buffer implementation.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
For supporting VK_EXT_host_image_copy for tiled images, we need to be
to determine whether AFBC may be supported in
vkGetPhysicalDeviceImageFormatProperties2.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
Depth/stencil and tiled images require some additional complexity, so
will be implemented in later commits.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
For VK_EXT_host_image_copy, we need to access image memory from the CPU
after mapping the BO. The existing base field in pan_image_plane doesn't
work for this because it's a GPU address and we don't have a mechanism
to recover the GPU base address of an image's BO to calculate the offset.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>