Refactor the intermediate buffer copy path to use a generic callback
approach, making the code more maintainable and easier to extend with
new format conversions.
The core copy_intermediate() function is now format-agnostic, accepting
a conversion callback that handles the actual data transformation. This
moves format-specific logic (RGB<->RGBA conversion and ASTC
decompression) into dedicated callback functions, making the conversion
path explicit at each call site rather than hidden inside the copy
function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37691>
Instead of allocating a buffer for the entire RGB->RGBA conversion
process. Just allocate a smaller buffer that is the size of a tile and
do the conversion one tile at a time.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37691>
Move ASTC decompression code from mesa/main to src/util to make it
available for use by Vulkan drivers. This allows the Intel ANV driver
to use CPU-based ASTC decompression for host image copy operations
on hardware that doesn't natively support ASTC formats.
The _mesa_unpack_astc_2d_ldr() function signature is updated to use
pipe_format instead of mesa_format for better integration with the
util format system.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37691>
The needs_temp_copy() function was incorrectly identifying some
depth/stencil formats as needing RGB<->RGBA conversion.
VK_FORMAT_D32_SFLOAT_S8_UINT maps to PIPE_FORMAT_Z32_FLOAT_S8X24_UINT,
which has 3 channels (F32 depth, UP8 stencil, X24 padding). The
component count check (== 3) was matching this as an RGB color format,
causing depth/stencil images to incorrectly use the RGB conversion path.
Add an explicit vk_format_is_depth_or_stencil() check before the
component count test to ensure depth/stencil formats always use the
direct copy path.
Fixes: f97b51186f ("anv: intermediate RGB <-> RGBX copy for HIC")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37691>
At least some drivers need a full modeset to change the Colorspace
property or to en-/disable HDR mode. E.g., at least amdgpu-kms as
tested under Linux 6.8 on Polaris needs it. Otherwise the atomic
commit for disabling HDR in _wsi_display_cleanup_state() will fail,
and the connector stays stuck in HDR mode after vkDestroySwapchainKHR().
Fixes: 1ed78dd7ec ("wsi/display: Clean up DRM hdr/color state on swapchain destruction")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37880>
For a selected non-default imageColorSpace during swapchain creation,
make sure that proper HDR setup also works even if a client app does not
explicitly call vkSetHdrMetadataEXT() in time.
Assign the EDID provided metadata here, so the 1st atomic commit will
set Colorspace and HDR metadata properties on the connector, to make sure
HDR or other wide color gamut modes get enabled.
Without this, the chain->color_outcome_serial would stay at zero and
the properties would not ever get assigned during drm_atomic_commit(),
leaving HDR disabled on the display sink.
Fixes: 13137393f6 ("wsi/display: Expose HDR10 colorspace based on EDID")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37880>
CTA-861-G section 6.9.1 Static Metadata Type 1 declares that zero values
for different groups of HDR Metadata properties are allowed, including
zero nits values for max display mastering luminance, max content light
level, max frame-average light level and min display mastering luminance.
A zero value is meant to be treated by the video sink as "undefined" /
"unknown", and handled accordingly. This is common for dynamically
generated visual content.
Therefore don't assert on some minimum nits level > 0, but only check for
a non-negative level.
Fixes: b4176393a0 ("wsi/display: Implement VK_EXT_hdr_metadata on KHR_display swapchain")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37880>
The quality level may change every frame, but we can only enable
pre-encode on first frame because it changes context buffer layout
and currently we only allow the context buffer to grow (append recon
pics at the end).
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38257>
We've been building the Piglit replayer in the test-vk container/rootfs,
but all trace-replay testing in CI is actually done using the test-gl
rootfs.
Despite the naming, the "gl" and "vk" rootfs variants don't correspond to
the graphics API being tested - just the different sets of tools
bundled.
The required tools for trace replay are already included in the test-gl
rootfs, so there's no need to build or use the test-vk variant for this
purpose.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38282>
fixes O(N^2) memory usage and runtime around liveness/scheduling/spilling/RA,
and proves out the design for the common code sparse bitsets (I did need to make
an adjustment for this - worth the effort).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37908>
Some shaders, especially RTPSO shaders that have parts of the PSO
inlined, can become absolutely huge. Using a sparse bitset avoids
quadratic complexity in memory consumption for the liveness information.
This reduces peak memory usage in worst-case tests (hammering
compilation of many huge RTPSOs on 32 threads concurrently) by ~60%,
from 43GB to 18GB.
CPU time (seconds) differences for a workload with mostly small shaders:
Difference at 95.0% confidence
-5.27 +/- 1.08963
-0.88811% +/- 0.183626%
(Student's t, pooled s = 0.629735)
Peak resident set usage for the mostly-small workload:
Difference at 95.0% confidence
30809 +/- 13394.3
1.59276% +/- 0.69246%
(Student's t, pooled s = 7741.09)
CPU time for the heavy workload did not show any difference.
Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37908>
Useful for potentially huge bitsets that are expected to be mostly
filled with zeroes, reducing memory consumption by assuming bits being
zero by default (without wasting memory to store zeroes).
Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37908>
The code in question multiplies `uint32_t`s together and assigns them to
a `uint64_t`. It seems rather unlikely at there would be an overflow,
but we might as well do the cast.
CID: 1649587
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38289>
Our exposed limits say we shouldn't be able to, but let's add an assert
in case something changes, and to help Coverity out.
CID: 1662103
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37583>
For AHB VkBuffer import, the allocationSize comes from the raw external
AHB props query and it can be larger than the underlying buffer memory
requirement. So we must respect the allocationSize for the actual mem
import to support mapping the whole AHB size, and the dedicated buffer
info has to be stripped to obey the spec.
Test: CtsNativeHardwareTestCases no longer crashes on debug build panvk
Fixes: 66bbd9eec8 ("panvk: implement AHB image deferred init and memory alloc")
Tested-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38274>
I routinely don't update the docs because the build (hawkmoth in
particular) is a pain. The prior instructions almost worked, except that
on Debian you need to use a venv (which is a good idea to do for py3-clang
as well for reproducibility, anyway), and the versions that were listed
didn't actually run any more due to a revoked dependency.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38266>
We want to know whether or not we successfully initialized before
filling out physical device properties.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38284>
CTS regressed due to c2a6fb6419 not considering cmd stages for
signaling and waiting. Before this patch panvk did consider cmd stages,
but not semaphore stage masks.
With this fix we consider both which is what the spec requires
Fixes: c2a6fb6419 ("panvk: cull semaphores in unrelated subqueues")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38138>
Setting MESA_VK_VALIDATE_SHADER_BINARIES will cause the shader code to
round-trip every shader through [de]serialize and only ever use the
deserialized version. This catches bugs where the driver may drop
things in the [de]serialization process. It also deserializes the new
shader again and compares it against the original to ensure that
deserialize -> serialize is idempotent.
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36647>
This is just a wrapper around ops->compile() for now but we'll extend it
in the next commit to add some validation.
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36647>
To match the VK_MAX_PIPELINE_BINARY_KEY_SIZE_KHR of only 32B.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36647>