Runs the gpu tests, runtime is around 25 minutes. Contrary to dEQP which
is quite stable with the disabled HiZ, piglit flakes a lot more, but
the flakes seems mostly limited to tex-miplevel-selection.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26931>
This was held back by the issue fixed in the previous patch. Let's
enable it again!
There's a bunch of failures due to a bug in Piglit, where undefined
behavior gets invoked. Let's just mark them as expected failures for now
and move on.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24942>
If we legalize AFBC late, we end up in a situation while we might need
to do a blit while inside a previous blit operation, but u_blitter
state isn't saved recursively, and that leads to crashes.
This patch solves this issue by splitting panfrost_blit into two
functions and legalizing AFBC early.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24942>
mesa/main/draw.c calls threaded_context to add a draw call, but the caller
fills it manually.
This way we don't have to fill pipe_draw_info in a local variable and later
copy it to tc_batch. tc_batch is filled from draw.c directly.
It also eliminates a few conditional jumps thanks to assumptions we can make
in DrawElements but not tc_draw_vbo.
This decreases the overhead of the GL frontend thread by 1.1%, which has
CPU usage of 26%, so it decreases the overhead for that thread by 4.2%.
(1.1 / 26)
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26584>
Instead of a series of if-else statements, use a function table containing
all the draw variants, which is indexed by:
[is_indirect * 8 + index_size_and_has_user_indices * 4 +
is_multi_draw * 2 + non_zero_draw_id]
This decreases the overhead of tc_draw_vbo by 0.7% (from 4% to 3.3%)
for the GL frontend thread in VP2020/Catia1, which has CPU usage of 26%,
so it decreases the overhead for that thread by 2.7%. (0.7 / 26)
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26584>
flags=0 is used for e.g., glFenceSync, which apps use to insert sync points
to determine when all prior work has completed. eliding these flushes into no-ops
is fine for all scenarios except when the last op was a present, in which
case the no-op (previous) fence will not sync as expected for the present and
graphical artifacts will result
in the future, this may be changed back to the previous behavior if/when presentation
gains timeline semaphore capabilities by providing the last timeline id
as a fence instead of the last batch
fixes#10386
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26935>
Fix defect reported by Coverity Scan.
Dereference null return value
If the function actually returns a null value, a null pointer dereference will occur.
CID: 1492763
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26924>
Update the sampler views with the color channels, that fixes the issue
caused by: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20815
It fixes the case:
`gst-launch-1.0 -v -v filesrc location=file.jpg ! jpegparse ! vaapijpegdec ! imagefreeze ! vaapisink`
Fixes: dc2119bf3f ("util/format: Fix wrong colors when importing YUYV and UYVY")
Cc: mesa-stable
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26911>
This decreases the time spent in amdgpu_cs_submit_ib from 15.4% to 8.3%
in VP2020/Catia1, which is a decrease of CPU load for that thread by 46%.
Overall, it increases performance by a small number in CPU-bound benchmarks.
The biggest improvement I have seen is VP2020/Catia2, where it increases
FPS by 12%.
It no longer stores pipe_fence_handle references inside amdgpu_winsys_bo.
The idea is to have a global fixed list of queues (only 1 queue per IP
for now) where each queue generates its own sequence numbers (generated
by the winsys, not the kernel). Each queue also has a ring of fences.
The sequence numbers are used as indices into the ring of fences, which
is how sequence numbers are converted to fences.
With that, each BO only has to keep a list of sequence numbers, 1 for each
queue. The maximum number of queues is set to 6. Since the system can
handle integer wraparounds of sequence numbers correctly, we only need
16-bit sequence numbers in BOs to have accurate busyness tracking. Thus,
each BO uses only 12 bytes to represent all its fences for all queues.
There is also a 1-byte bitmask saying which sequence numbers are
initialized.
amdgpu_winsys.h contains the complete description. It has several
limitations that exist to minimize the memory footprint and updating of
BO fences.
Acked-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26643>