Sync protocol and fix all the interfaces, otherwise we have to generate
two sets of headers with both interfaces to separate protocol sync and
the driver side adaptation.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
At first, no behavior change in this CL.
The instance level helper for normal command submission is left to work
with the current venus protocol. Meanwhile, we leave the helper to
submit recorded command buffer inside instance to it can later redirect
to the primary ring.
We've internalized a few ring helpers that no longer need to be exposed.
Besides, indirect submission decision is on per-ring basis since the
ring buffer can vary later.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
This change only moves the fields without changing the accessors. It's
better to let ring own its own upload cs encoder (which is backed by
shmem array) to avoid lock contention between indirect submissions
across rings.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
Now we are able to break up the original lock to allow shmem alloc to be
outside the ring mutex, as long as the reply shmem set is still coupled
with ring submission.
Add and expose vn_instance_reply_shmem_alloc helper which will be used
by rings separately later.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
The encoder must not be empty by then so switch to an assert. Failing to
get a reply shmem would end up with VK_ERROR_OUT_OF_HOST_MEMORY, thus
there's no need to track either.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
This can be thread-safe only because we have dropped seeking command
stream offset, which requires comparing pool shmem to decide conditional
set stream.
This is to prepare for later sharing reply shmem pool across rings.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
More considerations and details here:
- The seek is a bit lighter than set, since it assumes renderer side
resource being immutable. It does affect perf when Venus is still
making verbose synchronous calls at runtime (e.g. descriptor set,
buffer, device memory, etc).
- Seek still requires lock protection as the reply shmem must be
immutable before the seek and the followed cmd are committed to the
ring.
- Removing seek without doing set requires renderer change to always
bump the encoder end position according to what the original request
is instead of being ad-hoc upon what the host driver tells to write.
The overhead and extra complexity there isn't negligible.
- Further, removing seek requires each ring to track the prior reply
pool shmem in the multi-ring scenario. While the additional host side
resource lookup isn't costy as the number of resources is must less
than the vk object table.
- The nice thing is that we can make shmem pool thead safe to be more
easily shared across rings.
So we just drop it.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
The ring wake up is no longer costy as the other notifies followed by
the initial call won't be blocked by ring cmd execution anymore
(without vkr side big context lock). Reducing the timeout can help cpu
bound scenarios.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
This avoids a potential race condition if two threads are competing for
the monitor with the initial states, and the losing one may run into
alive status being false and abort.
Fixes: 4a4b05869a ("venus: check and configure new ringMonitoring feature")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reported-by: Lina Versace <lina@kiwitree.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
When calculating the system memory heap size, we report only 3/4 of
the total RAM size (or 1/2 for systems with less than 4GB of RAM).
In the memory budget extension query, we were reporting 90% of the
available system memory. If most of the memory in the system is free,
this could result in the total heap size being 3/4 of RAM, but the
memory available being 9/10 of RAM. But if the application tried to
allocate the memory reported as "available", it would exceed the heap
size. This can confuse some applications.
This patch makes the memory budget query clamp the available RAM to
the heap size, so it will never report more available than the heap
can provide. Unfortunately, this means that we'll report only 67.5%
of system memory as available (3/4 * 9/10). We may want to adjust
this estimate in the future.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26553>
This was mainly useful for older Gen7.x GPUs with 32-bit PPGTT, which
are now supported by hasvk rather than anv. The remaining platforms
which anv supports have 36, 47, or 48-bit PPGTT, which imposes a 3/4
limit of 48GB, 96TB, and 192TB of memory.
The GPUs with 36-bit PPGTT are Elkhart Lake and Jasper Lake, which
appear to be Atom CPUs that have a maximum supported memory
configuration of 32GB or less, so this limit should not matter there.
Nor is a multi-TB limit likely to matter on our other parts.
Drop this check to simplify the heap and memory budget calculations.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26553>
The code treated 0x00 and 0xff the same, but they aren't,
port over the codegen code.
Fixes GTF-GL45.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_components
with zink on nvk
v2: drop padding to 0, tests still pass.
Fixes: 30f01c47c2 ("nak: Translate XFB info")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26559>
The SET_STREAM_OUT_BUFFER_LOAD_WRITE_POINTER registers have an
8 dword stride, but the code is only adding one dword between them
then the MME is calling an illegal method.
This is the simple fix, otherwise I think we'd have to multiply
somehow in the MME which seems pointless.
Fixes KHR-GL45.transform_feedback* on zink on nvk.
Fixes: 5fd7df4aa2 ("nvk: Support for vertex shader transform feedback")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26558>