From be75ece0952322444483e372b73c380cf7caff45 Mon Sep 17 00:00:00 2001
From: squidbus
Date: Mon, 27 Apr 2026 02:22:00 -0700
Subject: [PATCH] kk: Workaround for GPU capture under Rosetta 2.

GPU capture fails if heap sizes are not aligned to at least 16K.
Aligning them is not expected to impact memory usage, since the actual
internal memory allocation already appears to be aligned to 16K; the
issue is only with how the heap reports its size versus the allocation
size that capture uses.

Reviewed-by: Aitor Camacho
Part-of:
---
 docs/drivers/kosmickrisp/workarounds.rst | 25 ++++++++++++++++++++++++
 src/kosmickrisp/vulkan/kk_bo.c           |  8 ++++++++
 2 files changed, 33 insertions(+)

diff --git a/docs/drivers/kosmickrisp/workarounds.rst b/docs/drivers/kosmickrisp/workarounds.rst
index 85ee8ce14a6..cface470f4f 100644
--- a/docs/drivers/kosmickrisp/workarounds.rst
+++ b/docs/drivers/kosmickrisp/workarounds.rst
@@ -79,6 +79,31 @@ presumably because it re-ordered the operations to after the loop.
 To work around this, we add a trivial, always-true runtime condition to the
 break to ensure that the prior logic is not re-ordered.
 
+KK_WORKAROUND_8
+---------------
+| macOS version: 26.4.1
+| Metal ticket: FB22579201 (@squidbus)
+| Metal ticket status: Waiting resolution
+| CTS test failure: N/A
+| Comments:
+
+Metal GPU capture uses ``currentAllocatedSize`` to create an internal buffer
+over ``MTLHeap`` for the purpose of capturing its contents.
+
+Suppose we have a heap whose size is under the memory page size. Under native
+ARM execution, both the heap ``size`` and ``currentAllocatedSize`` will be
+aligned up to 16K. However, it has been observed that under Rosetta 2, ``size``
+will be aligned up to 4K but ``currentAllocatedSize`` will still be aligned up
+to 16K.
+
+These two in combination mean that, when GPU capture attempts to create buffers
+for these small heaps, it will fail, as ``currentAllocatedSize`` is larger than
+the heap ``size``. This will cause Metal validation layer errors if they are
+enabled, and attempting to take a GPU capture will crash the application.
+
+This workaround ensures that under Rosetta 2, heap sizes will be aligned to a
+minimum of 16K, preventing this scenario from occurring.
+
 | Log:
 | 2026-04-27: Workaround implemented
 
diff --git a/src/kosmickrisp/vulkan/kk_bo.c b/src/kosmickrisp/vulkan/kk_bo.c
index eb9b507b86f..60d451f060c 100644
--- a/src/kosmickrisp/vulkan/kk_bo.c
+++ b/src/kosmickrisp/vulkan/kk_bo.c
@@ -23,6 +23,14 @@ kk_alloc_bo(struct kk_device *dev, struct vk_object_base *log_obj,
    mtl_heap_buffer_size_and_align_with_length(dev->mtl_handle, &size_B,
                                               &minimum_alignment);
    minimum_alignment = MAX2(minimum_alignment, align_B);
+
+#if DETECT_ARCH_X86_64
+   /* KK_WORKAROUND_8 */
+   if (!(dev->disabled_workarounds & BITFIELD64_BIT(8))) {
+      minimum_alignment = MAX2(minimum_alignment, 16384);
+   }
+#endif
+
    size_B = align64(size_B, minimum_alignment);
    mtl_heap *handle =
       mtl_new_heap(dev->mtl_handle, size_B, KK_MTL_RESOURCE_OPTIONS);
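
The size mismatch described under KK_WORKAROUND_8 can be illustrated with a small standalone sketch. It is not part of the patch: the helper names (`align_up_u64`, `reported_heap_size`, `capture_buffer_fits`) are hypothetical, and the 4K/16K figures simply model the behaviour the workaround notes report observing under Rosetta 2, not anything queried from Metal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round size up to the next multiple of alignment (must be a power of two). */
static uint64_t
align_up_u64(uint64_t size, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical model of the heap size the driver ends up requesting.
 * metal_alignment stands in for the alignment reported by Metal (assumed
 * to be 4K under Rosetta 2); workaround mirrors the 16K minimum that the
 * patch applies on x86_64 builds. */
static uint64_t
reported_heap_size(uint64_t requested, uint64_t metal_alignment,
                   bool workaround)
{
   uint64_t alignment = metal_alignment;
   if (workaround && alignment < 16384)
      alignment = 16384;
   return align_up_u64(requested, alignment);
}

/* Assumed failure condition: capture cannot create its buffer when the
 * allocated size exceeds the size the heap reports. */
static bool
capture_buffer_fits(uint64_t reported_size, uint64_t allocated_size)
{
   return allocated_size <= reported_size;
}
```

For a 5000-byte request in this model, the reported size without the workaround is 8192 bytes while the 16K-aligned allocation is 16384 bytes, so the capture buffer does not fit. With the workaround both values are 16384 bytes and the check passes.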