lvp doesn't use the common device memory object; it allocates and
imports AHBs on its own. Thus it has to implement the AHB export API
itself.
- before: total 116, skip 66, pass 24, fail 26
- after: total 116, skip 66, pass 36, fail 14
Fixes: cebb2bf266 ("lavapipe: Add AHB extension")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36204>
(cherry picked from commit f1af533b2c)
Currently tile swizzle can only be non-zero for single-plane formats;
for multi-plane formats we always set PIPE_BIND_SHARED.
Luma-only (Y400) JPG decode and encode with an RGB input surface (EFC)
are the only two cases where we can get a surface with tile swizzle,
and ignoring it would result in corrupted output.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13346
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35647>
(cherry picked from commit b665bd21cb)
While the API spec does describe which flags _may_ be passed in, the
overall CL working group agreement is that implementations should expect
random flags to be passed in, as other implementations _may_ use them to
further restrict or allow image formats.
Also fix validation for importing GL objects while at it.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36216>
(cherry picked from commit d8793e3874)
Otherwise we can have a case where the binning VS uses more consts than
the full VS (when the safe variant is used for the full VS). That
results in a rendering issue because SP_VS_CONST_CONFIG.CONSTLEN is
shared between the full and binning VS in the PROGRAM_CONFIG state and
gets its value from the full VS.
There are two alternative solutions that can allow binning VS to always
use maximum constlen:
- Move constlen emission to per-XS config. This interferes with the
PROGRAM_CONFIG state which uploads consts and does SP_UPDATE_CNTL.
Consts would need to be uploaded after constlen is defined, while
SP_UPDATE_CNTL must be done before per-XS state is emitted.
Also having SP_UPDATE_CNTL in a draw state that is always DIRTY
isn't great.
Something didn't work out on A6XX, so this idea was dropped.
- Emit constlen again in VS_BINNING draw state. This seems to work
but is also likely undefined behaviour, since constlen is changed
after some consts are uploaded.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36203>
(cherry picked from commit 6003a89b89)
We need the job artifacts in all cases to gather performance data.
Since commit b723bc80d2, we were only saving them on failures.
Fixes: b723bc80d2 ("ci/lava: inherit .piglit-traces-test in .lava-piglit-traces and deduplicate configs")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36195>
(cherry picked from commit b7f1f40bf4)
After commit 545727f97c (ci/android: Move ANDROID_CTS_MODULES to build
script, 2025-06-24) the comment about ANDROID_CTS_MODULES in
lvp-android-angle-android-cts-include.txt has become inaccurate.
Update the instructions to reflect the latest status.
Fixes: 545727f97c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36187>
(cherry picked from commit 0f09c9436b)
This is already being set as needed everywhere else, and would cause
issues in future work.
Use the relative `install/` path for `HWCI_TEST_SCRIPT` as that's
supported by both HW runners and FDo runners.
A separate MR will fix the `/install/` vs `install/` mess.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36145>
We were allocating a fixed number of temporary registers; this isn't
always enough, and in fact we should have calculated the number of
temporaries required.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36135>
Copy between memory and a depth/stencil image requires copying the depth
and stencil aspects in separate calls. For D32S8, this needs to be
special cased in order to handle (de)interleaving.
For image->image copies, deinterleaving is not supported. Aspects must
match between src and dest for non-planar images.
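As a rough illustration of the (de)interleaving, here is a minimal C
sketch. It assumes, purely for illustration, an interleaved texel of
8 bytes (a 32-bit float depth value, an 8-bit stencil value and 24 bits
of padding) and tightly packed per-aspect data on the memory side. The
struct and function names are hypothetical and are not taken from the
driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed interleaved texel layout (illustrative only, 8 bytes). */
struct sketch_d32s8_texel {
   float depth;
   uint8_t stencil;
   uint8_t pad[3];
};

/* Image -> memory, depth aspect: gather depth into a packed array. */
void
sketch_deinterleave_depth(float *dst, const struct sketch_d32s8_texel *src,
                          size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i] = src[i].depth;
}

/* Image -> memory, stencil aspect: gather stencil into a packed array. */
void
sketch_deinterleave_stencil(uint8_t *dst,
                            const struct sketch_d32s8_texel *src,
                            size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i] = src[i].stencil;
}

/* Memory -> image, depth aspect: scatter depth, stencil is untouched. */
void
sketch_interleave_depth(struct sketch_d32s8_texel *dst, const float *src,
                        size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i].depth = src[i];
}

/* Memory -> image, stencil aspect: scatter stencil, depth is untouched. */
void
sketch_interleave_stencil(struct sketch_d32s8_texel *dst, const uint8_t *src,
                          size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i].stencil = src[i];
}
```

Each aspect gets its own call, so writing one aspect never clobbers the
other half of the interleaved texel.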
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
This is needed for VK_EXT_host_image_copy which, like the buffer<->image
copy commands, treats depth/stencil like separate image planes and
requires copying each separately.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
We don't need to use fixed-size pixel_t types and put the tiling loop in
a macro in order to get good codegen for this. Replacing the fixed-size
types with memcpy/__builtin_assume_aligned, the compiler is still able
to generate multi-word load/store instructions. Without the fixed-size
types, the only advantage of putting this in a macro is to ensure the
code is specialized on size/is_store/shift, but we can get the same
specialization by making the functions ALWAYS_INLINE.
Measured performance in VK_EXT_host_image_copy benchmarks is unchanged,
and generated assembly looks effectively identical to the previous version.
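The pattern can be sketched roughly as follows. This is a simplified
illustration, not the driver code: SKETCH_ALWAYS_INLINE stands in for
Mesa's ALWAYS_INLINE (GCC/Clang attribute assumed), all function names
are hypothetical, and the real tiling address computation (including
the shift parameter) is replaced by a plain linear walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ALWAYS_INLINE (GCC/Clang). */
#define SKETCH_ALWAYS_INLINE static inline __attribute__((always_inline))

/* Generic helper: pixel_size and is_store are ordinary parameters, but
 * forced inlining into the wrappers below turns them into constants, so
 * each wrapper gets its own specialized copy of the loop. */
SKETCH_ALWAYS_INLINE void
sketch_access_pixels(void *tiled, void *linear, unsigned count,
                     size_t pixel_size, bool is_store)
{
   uint8_t *t = tiled;
   uint8_t *l = linear;

   for (unsigned i = 0; i < count; i++) {
      if (is_store)
         memcpy(t + i * pixel_size, l + i * pixel_size, pixel_size);
      else
         memcpy(l + i * pixel_size, t + i * pixel_size, pixel_size);
   }
}

/* Specialized for 4-byte pixels, linear -> tiled. */
void
sketch_store_32bpp(void *tiled, void *linear, unsigned count)
{
   /* Check the alignment promise made to the compiler below. */
   assert(((uintptr_t)tiled % 4) == 0 && ((uintptr_t)linear % 4) == 0);
   sketch_access_pixels(__builtin_assume_aligned(tiled, 4),
                        __builtin_assume_aligned(linear, 4),
                        count, 4, true);
}

/* Specialized for 4-byte pixels, tiled -> linear. */
void
sketch_load_32bpp(void *tiled, void *linear, unsigned count)
{
   assert(((uintptr_t)tiled % 4) == 0 && ((uintptr_t)linear % 4) == 0);
   sketch_access_pixels(__builtin_assume_aligned(tiled, 4),
                        __builtin_assume_aligned(linear, 4),
                        count, 4, false);
}
```

With a constant size and a known alignment, the compiler is free to
turn each memcpy into a plain load/store pair, which is the property the
fixed-size pixel types were providing before.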
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
Since we don't have a CPU implementation of AFBC compression, host copy
is only implemented for u-interleaved tiling.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
This is needed for VK_EXT_host_image_copy.
Most other Mesa drivers use a similar approach to implement tiled->tiled
copy, with a few differences. They use a temp buffer sized for only one
tile, don't attempt to tile-align the copies in either the src or dest,
and they don't have the memcpy fast path. I measured performance of a
variety of implementations on a rock5b, and found:
- The fast path for when the copy region is tile-aligned is a 167%
improvement.
- Aligning the temp buffer chunks to src tiles is a 20% improvement.
- Using a 64k buffer instead of a tile-sized buffer is a 14%
improvement. This buffer size appears optimal in my benchmark;
smaller and larger buffers are both slower. Skipping the chunk
approach and just (de)tiling to a temp buffer that fits the whole
image (what NVK does) is also slower.
- I had no luck with attempts at a direct tiled->tiled copy algorithm
that didn't need a temp buffer. The fastest I got was ~1/4 the speed
of the temp buffer implementation.
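The chunked temp-buffer idea can be sketched roughly as below. This is
a simplified illustration under stated assumptions, not the driver
implementation: all names are hypothetical, the detile/tile steps are
passed in as callbacks, the stand-in callbacks treat both images as
linear, and the tile alignment of chunks and the memcpy fast path
described above are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TEMP_SIZE (64 * 1024)

/* Hypothetical callback: move `rows` rows starting at row `y` between
 * an image and a linear buffer. */
typedef void (*sketch_rows_fn)(void *image, void *linear, unsigned y,
                               unsigned rows, size_t row_bytes);

/* Copy src_img to dst_img through a fixed 64 KiB temp buffer, one chunk
 * of rows at a time. Returns false if a single row does not fit. */
bool
sketch_copy_via_temp(void *dst_img, void *src_img, unsigned height,
                     size_t row_bytes, sketch_rows_fn detile,
                     sketch_rows_fn tile)
{
   uint8_t temp[SKETCH_TEMP_SIZE];

   if (row_bytes == 0 || row_bytes > SKETCH_TEMP_SIZE)
      return false;

   unsigned rows_per_chunk = (unsigned)(SKETCH_TEMP_SIZE / row_bytes);

   for (unsigned y = 0; y < height; y += rows_per_chunk) {
      unsigned rows = height - y;
      if (rows > rows_per_chunk)
         rows = rows_per_chunk;

      detile(src_img, temp, y, rows, row_bytes); /* src image -> temp */
      tile(dst_img, temp, y, rows, row_bytes);   /* temp -> dst image */
   }
   return true;
}

/* Stand-in callbacks that treat the image as linear, for demonstration. */
void
sketch_linear_read(void *image, void *linear, unsigned y, unsigned rows,
                   size_t row_bytes)
{
   memcpy(linear, (uint8_t *)image + (size_t)y * row_bytes,
          (size_t)rows * row_bytes);
}

void
sketch_linear_write(void *image, void *linear, unsigned y, unsigned rows,
                    size_t row_bytes)
{
   memcpy((uint8_t *)image + (size_t)y * row_bytes, linear,
          (size_t)rows * row_bytes);
}
```

A real implementation would replace the stand-in callbacks with actual
(de)tiling routines and pick chunk boundaries that line up with source
tiles.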
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
To support VK_EXT_host_image_copy for tiled images, we need to be
able to determine whether AFBC may be supported in
vkGetPhysicalDeviceImageFormatProperties2.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
Depth/stencil and tiled images require some additional complexity, so
will be implemented in later commits.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
For VK_EXT_host_image_copy, we need to access image memory from the CPU
after mapping the BO. The existing base field in pan_image_plane doesn't
work for this because it's a GPU address and we don't have a mechanism
to recover the GPU base address of an image's BO to calculate the offset.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
Advertising SAMPLED_IMAGE_DEPTH_COMPARISON is a no-op for images that
don't have SAMPLED_IMAGE_BIT, but it's confusing and results in us
advertising a lot of formats with only the
SAMPLED_IMAGE_DEPTH_COMPARISON feature that aren't usable for anything.
For R32_UINT and R32_SINT, the change is just a cleanup, because we
always support these for storage images.
When we implement VK_EXT_host_image_copy, advertising unusable formats
triggers failures in dEQP-VK.api.image_clearing.*, so it's convenient to
have features==0 for all unusable formats.
Fixes: 70b8056df1 ("panvk: Enable KHR_format_feature_flags2 and use them")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35910>
The maximum number of DWORDS per IB is limited by the hardware, so
when the number of sequences is too high, the GPU would just hang.
The solution here is to implement IB chaining inside the DGC cmdbuf
itself, so that each sequence chains to the next one.
In practice, games only use up to 4K sequences and they aren't affected
by this change.
This fixes dEQP-VK.dgc.ext.misc.properties.maxIndirectSequenceCount.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13536
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36062>
The workflow rules added in 7bfb51a7e6 match too early in the `if`
ladder and bypass the `is-scheduled-pipeline`,
`is-push-to-upstream-default-branch`, and
`is-push-to-upstream-staging-branch` rules that set the correct job
priority.
Fixes: 7bfb51a7e6 ("ci: Fix missing pipelines on user pipelines in MRs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35501>