It can be the case that a collect and one of its sources are assigned
to non-overlapping parts of the same merge set, for example:
ssa_1 = ...
ssa_2 = ...
ssa_3 = ...
ssa_4 = collect ssa_1, ssa_2 (kill), ssa_3
... = ssa_4 (kill)
ssa_5 = collect ssa_1, ssa_3
... = ssa_1 (kill)
... = ssa_3 (kill)
... = ssa_5 (kill)
If we merge the first collect first, we get a merge set:
ssa_1 (offset 0)
ssa_2 (offset 2)
ssa_3 (offset 4)
ssa_4 (offset 0)
Now, we decide to merge ssa_1 and ssa_5:
ssa_1 (offset 0)
ssa_2 (offset 2)
ssa_3 (offset 4)
ssa_4 (offset 0)
ssa_5 (offset 0)
ssa_3 cannot become a child of ssa_5 in the interval tree, just like a
source not in the same merge set, so we should not remove it and then
reinsert it assuming that RA will make it a child of ssa_5.
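The containment argument above can be checked with a minimal sketch. The struct and function names are hypothetical and are not the ir3 RA code. The sizes are an assumption read off the offsets in the example (each scalar source occupies 2 units, so ssa_5 covers [0, 4) and ssa_3 covers [4, 6)).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a value's placement inside a merge set. */
struct toy_interval {
   unsigned offset; /* start within the merge set */
   unsigned size;   /* assumed size in the same units as the offsets */
};

/* A value can only be a child of a collect in the interval tree if its
 * range lies entirely inside the collect's range. */
static bool
toy_can_be_child(struct toy_interval parent, struct toy_interval child)
{
   assert(parent.size > 0 && child.size > 0);
   return child.offset >= parent.offset &&
          child.offset + child.size <= parent.offset + parent.size;
}
```

With the assumed sizes, ssa_1 at [0, 2) is contained in ssa_5 at [0, 4), while ssa_3 at [4, 6) is not, which is why it has to be treated like a source from a different merge set.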
This fixes an RA validation error in Farming Simulator.
Fixes: 0ffcb19 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27497>
The upside is that this removes 600 lines of code. The downside is
that if instance divisors are used, we will compile the VS on demand.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27120>
This removes the take_ownership parameter and defines the behavior as if
take_ownership were always true, which is the fast path. This way, drivers
don't need 2 codepaths in their set_vertex_buffers implementations.
The old behavior is optionally available through util_set_vertex_buffers.
It also documents a new constraint that count in set_vertex_buffers must be
equal to the number of vertex buffers referenced by vertex elements or 0.
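As a rough illustration of the two behaviors, here is a sketch with a toy reference-counted buffer. All names (toy_buffer, toy_context, toy_bind_*) are hypothetical and do not reflect the real pipe_context or util_set_vertex_buffers signatures. It only shows the reference-counting difference between taking ownership and keeping the caller's reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a refcounted resource and a context slot. */
struct toy_buffer { int refcount; };
struct toy_context { struct toy_buffer *bound; };

static void
toy_unref(struct toy_buffer *buf)
{
   if (buf) {
      assert(buf->refcount > 0);
      buf->refcount--;
   }
}

/* Fast path: the context takes over the caller's reference, so the
 * refcount of the new buffer is not touched. */
static void
toy_bind_take_ownership(struct toy_context *ctx, struct toy_buffer *buf)
{
   toy_unref(ctx->bound);
   ctx->bound = buf;
}

/* Old behavior as a wrapper: the caller keeps its reference, so one
 * extra reference is added before handing the buffer over. */
static void
toy_bind_keep_reference(struct toy_context *ctx, struct toy_buffer *buf)
{
   if (buf)
      buf->refcount++;
   toy_bind_take_ownership(ctx, buf);
}
```

Layering the old behavior on top of the fast path is one way to keep a single code path in the binding function itself.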
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27492>
Move the execute function pointers to struct threaded_context, so that
drivers can change them. Also move struct tc_vertex_buffers into the header
file, so that drivers can implement their own function.
This allows drivers to inline pipe_context::set_vertex_buffers for TC.
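The idea of a per-context table of execute callbacks that a driver may override can be sketched as follows. Every identifier here is hypothetical. The real struct threaded_context layout and callback signature are not shown in this message and are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical call IDs and callback type. */
enum toy_call_id { TOY_CALL_DRAW, TOY_CALL_SET_VERTEX_BUFFERS, TOY_NUM_CALLS };
typedef int (*toy_execute_func)(int payload);

/* The table lives in the context, so each driver can patch its own copy. */
struct toy_threaded_context {
   toy_execute_func execute[TOY_NUM_CALLS];
};

static int toy_default_draw(int payload) { return payload + 1; }
static int toy_default_set_vertex_buffers(int payload) { return payload; }

/* A driver-specific replacement for one entry. */
static int toy_driver_set_vertex_buffers(int payload) { return payload * 2; }

static void
toy_tc_init(struct toy_threaded_context *tc)
{
   tc->execute[TOY_CALL_DRAW] = toy_default_draw;
   tc->execute[TOY_CALL_SET_VERTEX_BUFFERS] = toy_default_set_vertex_buffers;
}

static int
toy_tc_dispatch(const struct toy_threaded_context *tc, enum toy_call_id id,
                int payload)
{
   assert(id < TOY_NUM_CALLS && tc->execute[id] != NULL);
   return tc->execute[id](payload);
}
```

After toy_tc_init, a driver would assign its own function to the execute entry it wants to replace, and the other entries keep their defaults.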
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27492>
It has no effect other than unreferencing buffers, which are then
immediately re-bound by st/mesa, so it doesn't do anything.
The name is also incorrect because it unbound all vertex buffers.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27492>
I somehow thought this list was sorted roughly alphabetically, but it's
actually sorted by the extension number. Whoops, let's fix that.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27452>
It's pointless to increase the refcount of the backing BO and then decrease
it in the same function when we already reference the slab entry BO that
holds the reference of the backing BO.
amdgpu_do_add_buffer is inlined in amdgpu_cs_submit_ib, so the new parameter
doesn't cost anything.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
We unset RADEON_USAGE_SYNCHRONIZED, but we still checked it.
Iterate over such buffers separately without checking
RADEON_USAGE_SYNCHRONIZED.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
The first loop updates sequence numbers. The second loop creates the BO list.
Do both in the same loop.
Each loop has to reload the whole BO list from the L2 cache or higher, so we
do it twice. By merging the loops, we only load the BO list from the L2
cache once.
The final result is actually 2 loops, but they iterate over different ranges
of the BO list, so each element is read only once.
If global_bo_list is enabled, it overwrites the BO list after all is done.
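A minimal sketch of the loop merge, using hypothetical types rather than the amdgpu winsys structures: both the sequence number update and the BO list entry are produced in one pass, so each element is read once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-buffer record. */
struct toy_bo {
   unsigned handle;
   unsigned seq_no;
};

/* Fused loop: update the sequence number and emit the list entry while
 * the element is already loaded, instead of walking the array twice. */
static unsigned
toy_update_and_build_list(struct toy_bo *bos, unsigned num_bos,
                          unsigned new_seq_no, unsigned *handles_out)
{
   assert(bos != NULL && handles_out != NULL);
   for (unsigned i = 0; i < num_bos; i++) {
      bos[i].seq_no = new_seq_no;
      handles_out[i] = bos[i].handle;
   }
   return num_bos;
}
```

Splitting this into two loops over disjoint index ranges, as described above, keeps the same property because no element is visited by both loops.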
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
The first loop is inside amdgpu_add_sparse_backing_buffers. The second loop
iterates over sparse BOs to update sequence numbers.
Each loop has to reload the whole BO list from the L2 cache or higher, so we
do it twice. By merging the loops, we only load the BO list from the L2
cache once.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
The first loop is inside amdgpu_add_slab_backing_buffers. The second loop
iterates over BOs to update sequence numbers.
Each loop has to reload the whole BO list from the L2 cache or higher, so we
do it twice. By merging the loops, we only load the BO list from the L2
cache once.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27408>
Ref: 0a52002a1c ("anv: disable reset query pools using blorp opt on MTL")
Ref: b3b12c2c27 ("anv: enable CmdCopyQueryPoolResults to use shader for copies")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27352>
Ref: bspec 55414
Ref: 951e08fc18 ("intel/compiler: Disable DPAS instructions on MTL")
Suggested-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27352>
MTL and ARL share many code paths, and this macro will make it easier
to check for either of them.
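For illustration, such a macro could look like the sketch below. The enum, struct, macro, and function names are all hypothetical, since the actual macro name and device info layout are not given in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform enum and device info. */
enum toy_platform { TOY_PLATFORM_DG2, TOY_PLATFORM_MTL, TOY_PLATFORM_ARL };
struct toy_devinfo { enum toy_platform platform; };

/* One check for both platforms, so shared code paths need one condition. */
#define toy_device_info_is_mtl_or_arl(devinfo)     \
   ((devinfo)->platform == TOY_PLATFORM_MTL ||     \
    (devinfo)->platform == TOY_PLATFORM_ARL)

/* Example caller: a made-up workaround that applies to both platforms. */
static bool
toy_needs_shared_workaround(const struct toy_devinfo *devinfo)
{
   assert(devinfo != NULL);
   return toy_device_info_is_mtl_or_arl(devinfo);
}
```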
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27352>