Venus implements guest WSI on host external memory and thus cannot
transition guest wsi images to/from VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.
Thus, when a client would attempt to transition a Venus wsi image
to/from VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, Venus instead transitions
to/from VK_IMAGE_LAYOUT_GENERAL and performs an explicit ownership
transfer to/from VK_QUEUE_FAMILY_FOREIGN_EXT. Unfortunately, the
read-only guarantee of VK_IMAGE_LAYOUT_PRESENT_SRC_KHR is lost.
Upon the "acquire from foreign queue" side of that symmetry, when a
client would attempt to retain the contents of the image (i.e.
transition from VK_IMAGE_LAYOUT_PRESENT_SRC_KHR instead of
VK_IMAGE_LAYOUT_UNDEFINED), Venus knows that the image's backing memory
has not been modified. Thus, when those "acquire from FOREIGN queue"
ownership transfers flow to the native driver, Venus can signal it to
skip any acquisition-time validation of an image's internal data,
obtaining the same optimization as native WSI.
This is useful for drivers such as ARM's Mali (with Transaction
Elimination) that would otherwise need to recompute costly per-tile
checksums (CRCs) to ensure that they haven't gone stale during FOREIGN
ownership of the image's memory.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29777>
Image memory barriers don't need to be fixed when Venus' internal
"presentable" layout is PRESENT_SRC (generally only in specific types of
debugging). In that case, skip barrier fixes as early as possible and
remove early returns from procedures deeper in the call stack.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29777>
Prepare to allocate VkExternalMemoryAcquireUnmodifiedEXT structs from
command pool cached storage with the same lifetime as
VkImageMemoryBarrier(2) structs.
Also use common parameter naming and function call signatures for the
both the barrier and barrier2 variants.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29777>
No need to track is_push_descriptor in templ. No need to conditionally
decide to use set or NULL handle since we pass NULL handle from the cmd
side. Also fixed the arg type mismatch in the template helper.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28686>
The check won't reduce much of the overhead but also adds more when
something is to be fixed (mostly the case for push descriptor).
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28563>
For guest vram, there's already roundtrip to protect device memory alloc
ordering. This change adds the same protection for shmem used in below
scenarios and optimize to wait for new shmem only.
- reply shmem
- indirect upload shmem
- cmd stream shmem
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28147>
Add a new free helper while renaming the alloc one as well. During query
record resolving, use a dropped list to store those records being reset.
This is to prepare for later further query record resolving.
This change also simplifies a query pool compare.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28112>
No behavior change, and below is the summary:
1. simplify to drop _timeline_ from semaphore feedback naming
2. update feedback structs to use obj_handle naming
3. for vn_feedback_cmd_pool, use fb_cmd_pool variable naming
4. for vn_feedback_buffer, use fb_buf variable naming
5. for query_feedback_cmd, use qfb_cmd variable naming (already use ffb)
6. s/submit_batches2/submit2_batches/
7. s/cmd_buffer_count/cmd_count/
8. use total_cmd_size instead of cmd_buffer_size if applicable
9. update vn_queue_submission's feedback_cmd_count to cmd_count
10. update setup time local feedback_cmd_count to extra_cmd_count
11. update feedback_event_cmd to event_feedback_cmd
12. other trivial renames
Most semaphore and query feedback cmd renamings are deferred to later
commits.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27758>
Instead of just recyling 1 linked query feedback cmd for use and
defering the actualy recycle, recycle all linked cmds found when
setting up submission immediately.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27163>
The list free_query_feedback_cmds for recycling query feedback cmds was
only used in vn_command_pool when it was a vn_feedback_cmd_pool.
For clarity, refactor and store this list in vn_feedback_cmd_pool
instead and introduce a new struct vn_query_feedback_cmd that references
the feedback cmd and the feedback cmd pool for tracking.
Refactor out the allocation portion of query feedback cmds into its own
function for allocating the new vn_query_feedback_cmd struct.
Fixes: 5b24ab91e4 ("venus: switch to unconditionally deferred query feedback")
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27163>
Currently driver side heap alloc obj ptr is used as object id, which is
used on the renderer side for actual vk obj mapping. However, this adds
an implicit dependency between any driver obj destroy/free and new obj
create/allocate because the heap obj freed up can be immediately
reallocated out.
With venus moving to multi-ring, the ordering between asynchronous obj
destroy/free and new obj create/allocate has to be guaranteed via driver
side non-primary ring submission always waiting for primary ring idle.
This can defeat the purpose of multi-ring in certain scenarios. So this
change adds a way to assign unique id to object.
Even before multi-ring, the unique object id can make device and queue
object alloc/free more robust without hidden ordering requirements. This
also fixes some oom cts which can intentionally fail the submission of
an object destroy (renderer side obj is still present) while the driver
side freed object ptr being reused for another object creating, causing
object id reuse at renderer side object table.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27083>
This is to prepare for a new multi-ring design. A preview is as below:
- primary ring will migrate to be asynchronous only
- synchronous commands will be via thread local rings
- pipeline creations will be synchronous and dispatched to thread local
rings unless being forced to be async on primary ring
- perf option no_multi_ring is made generic to force a single ring
Pipeline cache retrieval is temporarily moved back to primary ring, but
will be moved to thread local later since it's a synchronous command.
The dependency resolving will follow the same with pipeline create with
detailed rationale later.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26838>
Summary:
- Add a perf option to force primary ring submission
- Let device own secondary ring(s) for ad-hoc spawn
- For threads where swapchain and command pool are created, track with
TLS to instruct ring dispatch.
- If the pipeline creation or cache retrieval happens on the background
threads not on the hot paths, force synchronous and dispatch to the
secondary ring after waiting for primary ring becoming current.
- If the pipeline creation or cache retrieval happens on the hot paths
threads, dispatch to the primary ring to avoid being blocked by those
tasks on the secondary ring.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
Sync protocol and fix all the interfaces, otherwise we have to generate
two sets of headers with both interfaces to separate protocol sync and
the driver side adaptation.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
All commands that make queries available have feedback cmds batched
and stored during recording. At submission time, for each batch
(SubmitInfo) these feedback cmds are recorded in a cmd buffer that is
appended after the last original cmd buffer (but before
semaphore/fence feedback).
Query reset cmds are deferred as well and also remove any prior feedback
cmds for queries its resetting within the batch.
Cc: 23.3 <mesa-stable>
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25413>
Link the query feedback cmd lifecycle to a cmd in the batch so that when
that last cmd gets reset/freed, we assert its safe to reset the query
feedback cmd. The cmd is then placed on the free list for reuse.
Some edge cases if the the last cmd is simultaneous or gets resubmitted.
Cc: 23.3 <mesa-stable>
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25413>
Add function to alloc a cmd buffer and record all the deferred query
feedback cmds into it at submission time.
Cc: 23.3 <mesa-stable>
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25413>
Should only log when there's batched query feedbacks in the suspended
render pass instance. Additionally gate behind debug option.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24379>
For secondary command buffers, vn_cmd_begin_render_pass was only used to
track inherited render pass previously. So we clean it up.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Reset cmd states during vkBeginCommandBuffer regardless of the
VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT for simplicity.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
There's no functional changes:
1. remove unused function arg and use snake case
2. do early return for direct recording (avoid dup feedback checks)
3. use vk_alloc instead of vk_zalloc if applicable
4. move local struct closer to usage, and use assignment
5. convert secondary cmd in_render_pass condition check to assert
6. avoid redundant list_del upon freeing up
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24009>