This disable hierachy levels in a more generic way to handle v12+
differences.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34032>
A few function scope traces are added to instrument blits, clears,
flushes, BOs and resources handling, shader compilation and cache,
draws, compute shader job emissions and AFBC packing.
Give the panfrost_flush_all_batches() call from panfrost_flush() a
more specific reason ("Gallium flush") to improve traces.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385>
This can be useful to track different values like buffer sizes, ioctl
ops, etc.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34385>
Lower the concurrency for the zink-anv-tgl job to avoid the out-of-memory
issues seen recently.
Remove the flakes that were added as the previous workaround.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34505>
We can't have merged shaders where the first part is compiled using ACO
and the second part is compiled using LLVM.
Add ACO-specific main shader parts to fix that.
This happens when ACO is enabled for gfx12 streamout where GS can be paired
with a previous shader compiled by LLVM.
Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>
Read-only ZS optimization can only happen if the ZS tile buffer is not
written, which can only be known when the fixed-function settings is
set.
Change pan_earlyzs_get() to take an enum instead of a boolean and
differentiate ZS-read and ZS-read-with-readonly-optimization-allowed.
Fixes: 25a993731087 ("pan/earlyzs: Support the shader ZS read-only case and its optimization on v10+")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34480>
Log all operations with the information used to decide whether they
are supported or unsupported. Include tensor data types, conv2d fused
activation and dilation parameters to debug output.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34472>
This verifies that the requested allocation doesn't exceed the maximum
in cases where the size passed to `clSVMAlloc()` isn't a multiple of the
provided alignment. It also clamps the maximum allocation to `i32::MAX`,
which prevents overflowing `pipe_box`'s `width` field.
Both of these changes prevent possible undefined behavior on 32-bit
systems due to violation of `Layout` prerequisites.
v2: use safe layout creation for maintainability, add a few comments
v3: use Layout utils for aligned size calc, split out max alloc changes
v4: use `checked_compare()` for alloc/size comparison
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34166>
On v5, as well as v7 onwards, we can disable pipelining in order to fit
more data into the tile-memory. This is important in order to support
multiple, large color buffers with high MSAA sample counts.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
We also have a budget for the tile size for depth-buffers. It's
currently hard to trigger issues with this than for color-buffers,
but this becomes important when we support larger MSAA counts.
We also need to take a bit of care for stencil-only attachments, because
they also count against a limit here. We really only care about the
sample counts here, because the stencil buffer budget is always a
quarter of the depth-buffer budget, and always uses a single byte per
sample.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
We are about to allow ZS tile buffer reads in panvk in order to support
VK_KHR_dynamic_rendering_local_read, and this requires dealing with
a new case in the early ZS logic.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
We just opened the GEM handle a few line before (or used drmPrimeFDToHandle to
acquire it), on failure it is just better to close it.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34437>
Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for
Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it
does this with a pass that pushes UBOs. That's good!
The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend
at this point. This is where the trouble comes in:
- True UBOs are allocated in GPU BOs, not CPU allocated buffers. That means it's
write-combine memory, which we cannot read from efficiently (at least
depending on coherency details that were never plumbed through panfrost.ko and
unlikely to be replumbed now that panthor is the new hot stuff). So, pushing
true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is
dubious... When I benchmarked this on MT8192 in early 2023, this pushing
improved FPS in SuperTuxKart but hurt FPS in Dolphin.
- True UBOs can be written on the GPU. In OpenGL, we have batch tracking
infrastructure to sort this mess out in theory. What this means is that
pushing UBOs requires us to flush writers AND STALL at draw-time. If this is
ever hit, our performance is utterly trashed. But it gets worse.
- True UBOs can be written in the same batch that reads them. For example, we
could bind a buffer as a transform feedback buffer, do a draw with XFB, then
rebind as a UBO and do a draw reading. This is where we collapse -- our logic
will flush the writer, which is the same batch we were in the middle of
enqueueing a draw to. When we try to push words, we'll crash with theatrics.
This could be solved by smartening the batch tracking logic but it's not
trivial by any means.
So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing
it!
Long term, the solution will be to push on the GPU instead. This avoids all of
these issues. This can be done with a compute kernel or with CSF instructions.
The Vulkan driver will likely have to do this for performance, since pushing
UBOs from the CPU is utterly broken in Vulkan for the above reasons.
I have a branch somewhere doing this on v9 but I'm doing this on NIR time to
unblock a core change that was crashing piglit due to this pile of unsoundness.
Let's fix the correctness issues first, then someone can look at recovering
performance later when we're not blocking unrelated work.
Fixes corruption in Piglit test
gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with
transform feedback. (I suspect the test still doesn't pass for the same reason
it's broken on other tilers. But that's a better place to be than oodles of
memory corruption.)
According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>