Cache for vkGetPhysicalDeviceImageFormatProperties2 as it is observed
to be called repeatedly with zink/proton layers.
Cache design is the same as the image requirements cache, generating
a hash key from pImageFormatInfo and storing pImageFormatProperties
into a hash table.
There are a couple of differences though:
- VkResult gets cached when the query returns NOT_SUPPORTED.
- Unlike pMemoryRequirements that returns VkMemoryRequirements2 and
possibly VkMemoryDedicatedRequirements, VkImageFormatProperties2
has various pNext chains that can be optionally passed in. Hash
the existence of these pNext so that they are considered different
queries and the underlying pNext struct can be optionally cached.
The alternative would be to modify the query to always chain these
pNext so all of them would be cached, but it is unlikely for queries
to only differ in pNext chains.
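
A minimal sketch of folding pNext presence into the cache key. The struct,
the presence bits and the FNV-1a hash are all hypothetical stand-ins chosen
to keep the example dependency-free; they are not the driver's actual types
or hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the queried image format info. */
struct fmt_query {
   uint32_t format;
   uint32_t type;
   uint32_t tiling;
   uint32_t usage;
   uint32_t flags;
};

/* Hypothetical bits recording which optional output structs were chained. */
enum {
   CHAINED_EXTERNAL_PROPS = 1u << 0,
   CHAINED_YCBCR_PROPS = 1u << 1,
};

/* FNV-1a over a byte range (an assumption made for this sketch only). */
static uint64_t
fnv1a(uint64_t h, const void *data, size_t size)
{
   const uint8_t *bytes = data;
   for (size_t i = 0; i < size; i++) {
      h ^= bytes[i];
      h *= 0x100000001b3ull;
   }
   return h;
}

static uint64_t
fmt_query_key(const struct fmt_query *q, uint32_t chained_outputs)
{
   assert(q != NULL);

   uint64_t h = 0xcbf29ce484222325ull;
   h = fnv1a(h, &q->format, sizeof(q->format));
   h = fnv1a(h, &q->type, sizeof(q->type));
   h = fnv1a(h, &q->tiling, sizeof(q->tiling));
   h = fnv1a(h, &q->usage, sizeof(q->usage));
   h = fnv1a(h, &q->flags, sizeof(q->flags));
   /* Hashing the presence mask makes otherwise identical queries with
    * different chained outputs map to different cache entries.
    */
   h = fnv1a(h, &chained_outputs, sizeof(chained_outputs));
   return h;
}
```

With this shape, the same query issued with and without an optional output
struct yields two keys, so each entry only needs to store what was asked for.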
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27401>
Lock the entire scope when storing an image reqs cache entry to prevent
an entry from being added between the split locks.
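
A minimal sketch of the single-lock pattern, assuming a hypothetical
fixed-size cache in place of the real hash table. The lookup and the insert
share one critical section, so no other thread can add the same key between
them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 64

/* Hypothetical cache layout, for illustration only. */
struct reqs_cache {
   pthread_mutex_t mutex;
   uint64_t keys[CACHE_SLOTS];
   uint64_t values[CACHE_SLOTS];
   uint32_t count;
};

static void
reqs_cache_init(struct reqs_cache *cache)
{
   assert(cache != NULL);
   pthread_mutex_init(&cache->mutex, NULL);
   cache->count = 0;
}

/* Returns true if a new entry was added, false if the key was already
 * present or the cache is full.
 */
static bool
reqs_cache_store(struct reqs_cache *cache, uint64_t key, uint64_t value)
{
   bool found = false;
   bool added = false;

   pthread_mutex_lock(&cache->mutex);
   for (uint32_t i = 0; i < cache->count; i++) {
      if (cache->keys[i] == key) {
         found = true;
         break;
      }
   }
   if (!found && cache->count < CACHE_SLOTS) {
      cache->keys[cache->count] = key;
      cache->values[cache->count] = value;
      cache->count++;
      added = true;
   }
   pthread_mutex_unlock(&cache->mutex);

   return added;
}
```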
Fixes: b51ff22fbe ("venus: support caching image memory requirements")
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27401>
This works around some Unity engine behavior with ANGLE-on-Venus, when
cmd pools are created on main thread once while the render thread only
does descriptor pool creation for set allocations during recording time.
This change also explicitly forces async pipeline create for threads
creating the device instead of implicitly via feedback cmd pool create.
This ensures intended behavior when feedback is disabled.
Fixes: d17ddcc847 ("venus: dispatch background shader tasks to secondary ring")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27347>
On the off chance the combined list resolves to empty due to resets,
skip adding query feedback by not increasing the total cmd buffer
count for query feedback.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27163>
Instead of recycling just one linked query feedback cmd for use and
deferring the actual recycle, immediately recycle all linked cmds found
when setting up the submission.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27163>
The list free_query_feedback_cmds for recycling query feedback cmds was
only used in vn_command_pool when it was a vn_feedback_cmd_pool.
For clarity, refactor and store this list in vn_feedback_cmd_pool
instead and introduce a new struct vn_query_feedback_cmd that references
the feedback cmd and the feedback cmd pool for tracking.
Refactor out the allocation portion of query feedback cmds into its own
function for allocating the new vn_query_feedback_cmd struct.
Fixes: 5b24ab91e4 ("venus: switch to unconditionally deferred query feedback")
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27163>
Ring submissions on the tls ring are synchronous and single threaded, thus
a single cmd can use the entire ring shmem without perf degradation.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27054>
This fixes VUID-vkCmdDraw-None-08600 violation when running gpl cts:
dEQP-VK...graphics_library.misc.bind_null_descriptor_set.*, where the
final pipeline layout is falsely dropped, leading to incompatibility with
the pipeline layout of the bound descriptor set.
Fixes: a65ac274ac ("venus: Do pipeline fixes for VK_EXT_graphics_pipeline_library")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27054>
The render pass (legacy or dynamic) can be ignored only in a pipeline
library with just Vertex Input State. For other cases, even when raster
has been discarded, it is still needed at the api level to avoid
violating a bunch of VUs which validate against attachments. The legacy
pass by itself is also necessary to tell whether it's legacy or dynamic.
So venus, being implemented at the VK api level, should not drop the
render pass in those cases.
The layout to be ref'ed is the one to be used, so we don't care about
those being ignored, which has already been removed in the pipeline info
fix.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27054>
Currently driver side heap alloc obj ptr is used as object id, which is
used on the renderer side for actual vk obj mapping. However, this adds
an implicit dependency between any driver obj destroy/free and new obj
create/allocate because the heap obj freed up can be immediately
reallocated out.
With venus moving to multi-ring, the ordering between asynchronous obj
destroy/free and new obj create/allocate has to be guaranteed via driver
side non-primary ring submission always waiting for primary ring idle.
This can defeat the purpose of multi-ring in certain scenarios. So this
change adds a way to assign unique id to object.
Even before multi-ring, the unique object id can make device and queue
object alloc/free more robust without hidden ordering requirements. This
also fixes some oom cts which can intentionally fail the submission of
an object destroy (renderer side obj is still present) while the freed
driver side object ptr is reused for another object creation, causing
object id reuse in the renderer side object table.
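
A minimal sketch of one way to hand out unique ids, assuming a monotonically
increasing atomic counter. The names are hypothetical, and the change itself
may assign ids differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical id allocator; ids are never derived from heap pointers,
 * so freeing and reallocating memory cannot produce a repeated id.
 */
struct id_allocator {
   atomic_uint_fast64_t next_id;
};

static void
id_allocator_init(struct id_allocator *alloc)
{
   /* Start at 1 so that 0 can stand for "no object". */
   atomic_init(&alloc->next_id, 1);
}

static uint64_t
id_allocator_get(struct id_allocator *alloc)
{
   uint64_t id =
      atomic_fetch_add_explicit(&alloc->next_id, 1, memory_order_relaxed);
   assert(id != 0);
   return id;
}
```

Since an id is never handed out twice, a failed or delayed destroy on one ring
cannot collide with a create on another ring.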
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27083>
ring_seqno_valid indicates a successful ring cmd submission, and can be
used to avoid invalid reply decoding due to failed submit alloc.
Otherwise, the garbled VkResult will be misreported as an initialization
failure instead of oom.
The following cts failure is fixed:
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
Fixes: ec131c6e55 ("venus: use instance allocator for ring allocs")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27026>
vdrm_execbuf() failed to set the seqno field for requests sent to the host.
This causes vdrm_host_sync() to lock up due to the unset seqno in a case
where two or more threads are using vdrm_execbuf() and vdrm_send_req()
concurrently, like in this scenario:
thread1: vdrm_send_req() shmem->seqno=1 req->seqno=2
thread2: vdrm_execbuf() shmem->seqno=1 req->seqno=0
thread1: vdrm_host_sync() shmem->seqno=0 req->seqno=2
Fix the lockup by setting the seqno in vdrm_execbuf().
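
A minimal sketch of the idea that every request path has to stamp a seqno
before the request reaches the host. The types, the shared counter and the
wrap-safe comparison are assumptions for illustration, not the actual vdrm
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request header and connection state. */
struct req_hdr {
   uint32_t seqno;
};

struct conn {
   uint32_t next_seqno;
};

/* Shared by every submission path in this sketch, so that no request can
 * reach the host with an unset seqno.
 */
static uint32_t
stamp_request(struct conn *conn, struct req_hdr *req)
{
   assert(conn != NULL && req != NULL);
   req->seqno = ++conn->next_seqno;
   return req->seqno;
}

/* Wrap-safe check that the host has processed up to req_seqno. */
static bool
seqno_reached(uint32_t host_seqno, uint32_t req_seqno)
{
   return (int32_t)(host_seqno - req_seqno) >= 0;
}
```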
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27021>
This change adds a new venus feature: TLS ring
- co-owned by TLS and VkInstance
- initialized in TLS upon request
- teardown happens upon thread exit or instance destroy
- teardown is split into 2 stages:
1. one owner locks and destroys the ring and marks it destroyed
2. the other owner locks and frees up the tls ring storage
TLS ring supersedes the prior secondary ring and enables multi-thread
shader compilation and reduces the loading time of ROTTR from ~110s to
~21s (native is ~19s).
TLS ring is in fact a synchronous ring by design, and can be used to
redirect all existing synchronous submissions trivially. e.g. upon any
vn_call_*, request a TLS ring, wait for deps and then submit.
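
The two-stage teardown described above can be sketched as follows. The
struct and function names are hypothetical, and the ring is reduced to a
single heap allocation for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical co-owned TLS ring storage. */
struct tls_ring {
   pthread_mutex_t mutex;
   bool destroyed; /* set by whichever owner tears down first */
   int *ring;      /* stand-in for the real ring resources */
};

static struct tls_ring *
tls_ring_create(void)
{
   struct tls_ring *tls = calloc(1, sizeof(*tls));
   if (!tls)
      return NULL;

   tls->ring = malloc(sizeof(*tls->ring));
   if (!tls->ring) {
      free(tls);
      return NULL;
   }
   pthread_mutex_init(&tls->mutex, NULL);
   return tls;
}

/* Called once by each owner (thread exit and instance destroy).
 * Returns true when the storage itself has been freed.
 */
static bool
tls_ring_release(struct tls_ring *tls)
{
   assert(tls != NULL);

   pthread_mutex_lock(&tls->mutex);
   if (!tls->destroyed) {
      /* Stage 1: the first owner destroys the ring and marks it. */
      free(tls->ring);
      tls->ring = NULL;
      tls->destroyed = true;
      pthread_mutex_unlock(&tls->mutex);
      return false;
   }
   pthread_mutex_unlock(&tls->mutex);

   /* Stage 2: the other owner frees the storage. */
   pthread_mutex_destroy(&tls->mutex);
   free(tls);
   return true;
}
```

Whichever owner arrives first only destroys the ring; the storage stays valid
until the second owner has also let go of it.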
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26838>
This is to prepare for a new multi-ring design. A preview is as below:
- primary ring will migrate to be asynchronous only
- synchronous commands will be via thread local rings
- pipeline creations will be synchronous and dispatched to thread local
rings unless being forced to be async on primary ring
- perf option no_multi_ring is made generic to force a single ring
Pipeline cache retrieval is temporarily moved back to primary ring, but
will be moved to thread local later since it's a synchronous command.
Dependency resolution will follow the same approach as pipeline creation,
with a detailed rationale to come later.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26838>
Similar idea to buffer memory requirements cache but CreateImage has
many more params that may affect the memory requirements.
Instead of a sparse array, generate a SHA1 hash of all the relevant
VkImageCreateInfo params including relevant pNext structures and use
part of the hash as a key to a hash table that stores the cache entries.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26118>
That enables testing and development of the Venus-based Vulkan HAL
on a wider range of Android systems - flavors of "Cuttlefish" are
of particular practical interest. At this point, only two gralloc
variants are supported: CrOS and IMapper v4. The fallback gralloc
and any gralloc adapter modules relying on it (GBM, QCOM) are out
of scope for Android Vulkan HAL now.
Signed-off-by: VladimirTechMan <VladimirTechMan@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26858>
also remove a redundant trace point and adjust the position for another
to better tell whether fixes have been applied to the pipeline info
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26751>
Copy sanitization incorrectly included +1 range of the reset.
E.g. a reset with Query=0 and QueryCount=5 covers [0,5) exclusive, not [0,5] inclusive.
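
A minimal sketch of the half-open range check, using hypothetical helper
names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One past the last query covered by the reset. */
static uint32_t
reset_range_end(uint32_t first_query, uint32_t query_count)
{
   assert(query_count <= UINT32_MAX - first_query);
   return first_query + query_count;
}

/* True if query lies in [first_query, first_query + query_count). */
static bool
query_in_reset_range(uint32_t query, uint32_t first_query,
                     uint32_t query_count)
{
   return query >= first_query &&
          query < reset_range_end(first_query, query_count);
}
```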
Fixes: 5b24ab91e4 ("venus: switch to unconditionally deferred query feedback")
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26603>
Summary:
- Add a perf option to force primary ring submission
- Let device own secondary ring(s) for ad-hoc spawn
- For threads where swapchain and command pool are created, track with
TLS to instruct ring dispatch.
- If the pipeline creation or cache retrieval happens on the background
threads not on the hot paths, force synchronous and dispatch to the
secondary ring after waiting for the primary ring to become current.
- If the pipeline creation or cache retrieval happens on the hot path
threads, dispatch to the primary ring to avoid being blocked by those
tasks on the secondary ring.
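
The dispatch policy summarized above can be sketched as a small decision
function. The names are hypothetical, and treating a thread as hot path when
it created either a swapchain or a command pool is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ring_choice {
   RING_PRIMARY,
   RING_SECONDARY,
};

/* Hypothetical per-thread hints, tracked in TLS in the real design. */
struct thread_hints {
   bool created_swapchain;
   bool created_cmd_pool;
};

static enum ring_choice
choose_pipeline_ring(bool force_primary, const struct thread_hints *hints)
{
   assert(hints != NULL);

   /* The perf option overrides everything else. */
   if (force_primary)
      return RING_PRIMARY;

   /* Hot path threads stay on the primary ring so they are not blocked
    * behind background tasks queued on the secondary ring.
    */
   if (hints->created_swapchain || hints->created_cmd_pool)
      return RING_PRIMARY;

   /* Background threads go synchronous on the secondary ring. */
   return RING_SECONDARY;
}
```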
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
Sync protocol and fix all the interfaces, otherwise we have to generate
two sets of headers with both interfaces to separate protocol sync and
the driver side adaptation.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>