Freedreno's derived counters combine multiple perfcntrs into a more
sensible, human-friendly metric. This change picks up the counters
currently used in Freedreno's Perfetto producer and rolls them into a
more genericallly usable form.
First place of their use will be through VK_KHR_performance_query, but
the Perfetto producer should also be able to use this interface instead
of having the logic duplicated. For now the counters are available only
for a7xx devices.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
Use CP_SCOPE_CNTL to disable preemption when beginning performance query
and enable it back when that performance query is ended. This way the
collected perfcounter measurements will only cover work that's encompassed
by the query.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
When using vkGetQueryPoolResults for performance queries, the result of
each query is the VkPerformanceCounterResultKHR union, and the result of
each query should be written into that union according to the storage type
of the corresponding counter.
This fixes current behavior of using the generic write procedure, where
either 32-bit or 64-bit value is written depending on the specified query
result flags.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
Query index should be used to calculate the correct iova for various
performance query fields used to store counter values at the beginning
and ending of performance queries.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
This avoids reading past the buffer’s end in the client afterward, because the
drmFormatModifierCount hasn’t been changed from what the client passed, if it
wasn’t zero at first.
GTK triggers that bug by setting it to the length of the static array (see this
bug[0] though), but other Vulkan programs might have the same issue if they
don’t first query the count before allocating the array.
This has been tested on a Radxa ROCK 5B board running a Mali-G610 GPU.
[0] https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/8222
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 252ddaf51b ("panvk: fix VkDrmFormatModifierPropertiesListEXT query")
Fixes: https://gitlab.freedesktop.org/mstoeckl/waypipe/-/issues/127
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33657>
We currently keep a0.x/a1.x unnecessarily blocked until we have to spill
it, even if all its uses have been scheduled already. This causes
unnecessary spills and potentially schedules address writes later than
we'd like.
Fix this by keeping track of the number of uses that are still
unscheduled, unblocking address writes when it drops to zero.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33816>
The current requirement for sync is only to support WSI, and it is not
necessarily needed at all per the comment added. Will drop it later.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33829>
This step is needed to support the same GPU model, but with a
different revision. A good example is the gc7000.
- imx8mp: model gc7000 with revision 6204 (uses RS)
- imx8mq: model gc7000 with revision 6214 (uses BLT)
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33811>
every vk driver wants these macros for executable statistics, so make them
common. there are two variants floating in-tree, a pure copy (radv) and a
formatted print (everyone else). we add both variants and then convert most
prints to copies where formatting isn't actually used. that has the benefit of
cleaning up trivial "%s" format strings in a bunch of places.
I didn't bother cleaning up the formatting in non-automatic-formatted drivers
because it's tedious and I'm planning to delete a lot of this driver code with
upcoming runtime work anyway. This is a step towards those runtime improvements.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826>
Arm and NVIDIA hardware both have this as a bit you can set on the
texture instruction so we may as well have a shared pass for it.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33402>
Previously, the caps were not queried until after the first DPB buffer
was created. That causes the AOT path to be triggered even when the
device does report a requirement for texture array.
Tested-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33824>
Android's Vulkan loader provides KHR_android_surface and intercepts
during instance creation; it does not implement KHR_present_[wait/id]
nor does it pass the enablement of KHR_android_surface to the driver
so this check wasn't actually disabling KHR_present_[wait/id] for
Android.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33677>
> Allowing 1D and 3D images, and array images too, with DRM_FORMAT_MOD_LINEAR, is ok
> because VkImageDrmFormatModifierExplicitCreateInfoEXT::pPlaneLayouts is able to fully
> describe the image layout. IF miplevels == 1, which this patch continues to enforce.
Reviewed-by: Lina Versace <lina@kiwitree.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33668>
this is a nice idea, but there are apps/games that do not respect
hardware capabilities and yolo-bind fixed size buffers
fixes Ballionaire (2667120) launch on non-desktop drivers
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33819>
Beyond that, monolithic pipelines just bloat to incredible sizes,
destroying compile times for questionable, if any, runtime perf benefit.
Indiana Jones: The Great Circle has more than 100 stages and takes
several minutes to compile its RT pipeline on Deck when using monolithic
compilation, and yet separate shaders still end up faster (probably
because instruction cache coherency in traversal is better).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33818>
The first test verifies that, if possible, we don't emit unnecessary
renames/copies for temporaries where it's possible for them to stay
in their current register (if an operand is precolored to the register
the temporary is currently residing in).
The second test verifies that we correctly choose a non-clobbered
operand even if there is one fixed to the temporary's current register.
To minimize copies, we'll want to have the live copy of
%tmp0 in v[2] there, because v[0-1] gets overwritten.
The third test verifies that we add a copy to another free register and
rename if all possible precolored operands are clobbered.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576>