With this patch, VM binds remain synchronous in relation to vm_bind()
KMD backend calls. However, the syscalls required for VM bind is
reduce in 2(in the optimal cases), the syncobj create and destroy
syscall are replaced by he usage a timeline syncobj.
Next step will be make this completely asynchronous.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26805>
This is really the right thing to do; it makes libva pick up the driver
we just built in a meson devenv. Not doing so can be kinda fatal;
otherwise we can end up loading different gallium-drivers for OpenGL and
VA-API, which can end up being pretty fatal.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27514>
Apparently, Midgard GPUs don't like when the last 2 words of
compute/vertex jobs contain garbage. Extend the compute job definition
to include a padding section thus aligning the job on a 64-byte boundary,
and add the according pan_section_pack() calls where we have a
compute job filled.
Fixes: b76420be1f ("panfrost: Split command stream descriptor definitions per-gen")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10558
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Anton Bambura <jenneron@postmarketos.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27515>
Xe KMD don't refcount anything, so resources could be freed while they
are still in use if we don't wait for exec_queue to be idle.
This issue was found with Xe KMD error capture, VM was already
destroyed when it attemped to capture error state but it can also
happen in applications that did not hang.
This fixed the '*ERROR* GT0: TLB invalidation' errors when running
piglit all test list.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27500>
iris_wait_syncobj() succeed if IOCTL return is 0 otherwise it failled.
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27500>
There are only three very simple tests but several hundreds of lines of
corresponding infrastructure. The tests never helped me to catch a real
issue and in fact I've only seen them fail when I changed the
corresponding pass semantics and forgot to update the tests (or like
when we enabled r300 in debian-testing and AddressSanitizer complained
that there was not a single free in the tests).
We now have a real CI and both piglit and dEQP contain tests that cover
the regalloc and presub testing here and a shader-db run should catch
if we fail the one omod optimization the optimize test is testing. So I
think its time we just get rid of the tests.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27501>
The upside is that this removes 600 lines of code. The downside is
that if instance divisors are used, we will compile the VS on demand.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27120>
This removes the take_ownership parameter and defines the behavior as if
take_ownership was always true, which is the fast path. This way, we don't
have to have 2 codepaths in set_vertex_buffers of every driver.
The old behavior is optionally available through util_set_vertex_buffers.
It also documents a new constraint that count in set_vertex_buffers must be
equal to the number of vertex buffers referenced by vertex elements or 0.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27492>
Move the execute function pointers to struct threaded_context, so that
drivers can change it. Also move struct tc_vertex_buffers into the header
file, so that drivers can implement their own function.
This allows drivers to inline pipe_context::set_vertex_buffers for TC.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27492>
It has no effect other than unreferencing buffers, which are then
immediatelly re-bound by st/mesa, so it doesn't do anything.
The name is also incorrect because it unbound all vertex buffers.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27492>
It's pointless to increase the refcount of the backing BO and then decrease
it in the same function when we already reference the slab entry BO that
holds the reference of the backing BO.
amdgpu_do_add_buffer is inlined in amdgpu_cs_submit_ib, so the new parameter
doesn't cost anything.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
We unset RADEON_USAGE_SYNCHRONIZED, but we still checked it.
Iterate over such buffers separately without checking
RADEON_USAGE_SYNCHRONIZED.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
The first loop updates sequence numbers. The second loop creates the BO list.
Do both in the same loop.
Each loop has to reload the whole BO list from the L2 cache or higher, so we
do it twice. By merging the loops, we only load the BO list from the L2
cache once.
The final result is actually 2 loops, but they iterate over different ranges
of the BO list, so each element is read only once.
If global_bo_list is enabled, it overwrites the BO list after all is done.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
The first loop is inside amdgpu_add_sparse_backing_buffers. The second loop
iterates over sparse BOs to update sequence numbers.
Each loop has to reload the whole BO list from the L2 cache or higher, so we
do it twice. By merging the loops, we only load the BO list from the L2
cache once.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
The first loop is inside amdgpu_add_slab_backing_buffers. The second loop
iterates over BOs to update sequence numbers.
Each loop has to reload the whole BO list from the L2 cache or higher, so we
do it twice. By merging the loops, we only load the BO list from the L2
cache once.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
the lowering for is_sparse_texels_resident must run after all the instances of
sparse_residency_code_and have been eliminated in order to effectively guarantee
that there are no remaining cases of tex.e (sparse) component access
fixes#10548
Fixes: f0ca477b500 ("zink: run sparse lowering after all optimization passes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27474>