Skip the zero-index perfcntr collection only if CP_ALWAYS_COUNT is used,
since we want to collect that at the very end/start of collection. But
don't unwittingly ignore that perfcntr if CP_ALWAYS_COUNT isn't being
collected.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 86456cf0e6 ("tu: support exposing derived counters through VK_KHR_performance_query")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33852>
Turnip's current VK_KHR_performance_query implementation only exposed raw
perfcounters. These aren't exactly trivial to evaluate on their own.
This mode can still be used with the new TU_DEBUG_PERFCRAW flag. Existing
TU_DEBUG_PERFC now enables performance query mode where Freedreno's derived
counters are exposed instead. These default to using the command scope,
making them usable with RenderDoc's performance counter capture.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
Freedreno's derived counters combine multiple perfcntrs into a more
sensible, human-friendly metric. This change picks up the counters
currently used in Freedreno's Perfetto producer and rolls them into a
more genericallly usable form.
First place of their use will be through VK_KHR_performance_query, but
the Perfetto producer should also be able to use this interface instead
of having the logic duplicated. For now the counters are available only
for a7xx devices.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
Use CP_SCOPE_CNTL to disable preemption when beginning performance query
and enable it back when that performance query is ended. This way the
collected perfcounter measurements will only cover work that's encompassed
by the query.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
When using vkGetQueryPoolResults for performance queries, the result of
each query is the VkPerformanceCounterResultKHR union, and the result of
each query should be written into that union according to the storage type
of the corresponding counter.
This fixes current behavior of using the generic write procedure, where
either 32-bit or 64-bit value is written depending on the specified query
result flags.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
Query index should be used to calculate the correct iova for various
performance query fields used to store counter values at the beginning
and ending of performance queries.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33208>
We currently keep a0.x/a1.x unnecessarily blocked until we have to spill
it, even if all its uses have been scheduled already. This causes
unnecessary spills and potentially schedules address writes later than
we'd like.
Fix this by keeping track of the number of uses that are still
unscheduled, unblocking address writes when it drops to zero.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33816>
We would set the `src` flag on the interval of reloaded sources.
However, the interval might be merged with its parent when inserted and
the parent wouldn't have this flag set. This caused the parent interval
to potentially be reused to reload later sources. Fix this by setting
the `src` flag on the top-level interval after insertion.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33810>
Now that every ANGLE use is covered by tag consistency checks
(structured tagging), we don't need the USE_ANGLE flag anymore, because
if we have ANGLE_TAG set, it means that ANGLE is required in this job.
In detail, it means that the test job has inherited ANGLE_TAG from
`.container-builds-angle`.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33421>
Use .test-angle as a full-featured job to be extended to enable angle
usage in the job. Right now, it comes with USE_ANGLE=1 flag and the
respective structural tag.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33421>
No need to burn an entire PAGE_SIZE BO for an event. And in particular
the pattern of allocate + immediate mmap is expensive in a VM.
Suballocating cuts down the # of times we do this in
dEQP-VK.api.command_buffers.execute_large_primary from 10000 to 157,
avoiding problems with the test running up against watchdog timeout.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33721>
Soon tu_debug_flags will overgrow its 32-bit capacity. To avoid issues the
enum is resized to 64 bits and handling of these flag values is adjusted
accordingly.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33663>
Turns out we were missing the glapi bits, making it impossible to use get
the function pointers for this extension. Whoops?!
[daniels: Squashed in a618 SkQP fails, presumably caused by these not
being skipped anymore.]
Fixes: 9f5af68995 ("mesa/main: expose `EXT_multi_draw_indirect`")
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33546>
It seems like (ss) is not enough to resolve WAR hazards for
ray_intersection.
Fixes CTS tests:
- dEQP-VK.ray_query.stress.fragment_shader.aabbs
- dEQP-VK.ray_query.stress.fragment_shader.triangles
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33597>
We would create an immed 0 SRC2 for, for example, load_uav. Even though
this src would be dismissed in the final assembly, it would still waste
a register or alias.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33596>
The intent here was to check if the tile we're trying to merge
vertically (prev_y_tile) has already been merged horizontally into a
neighboring tile, but I used the slot_mask which also contains the tiles
that have been merged into the prev_y_tile, so the check was too
conservative and would fail even if another tile had been merged into
prev_y_tile. This meant that we would fail to ever create 2x2 regions of
tiles. Fix this by just testing prev_y_tile's bit in the mask.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33534>
Even though we always try to merge a horizontally or vertically adjacent
tile, when we try to merge a vertically adjacent tile it may not
actually be adjacent because it was merged horizontally and the current
tile wasn't or vice versa. We have to detect this and reject merging it.
Fixes: 3fdaad0948 ("tu: Implement bin merging for fragment density map")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33534>
KGSL unconditionally supports preemption so we cannot ignore it.
On a6xx, we have to emit VSC addresses per-bin or make the amble include
these registers, because CP_SET_BIN_DATA5_OFFSET will use the
register instead of the pseudo register and its value won't survive
across preemptions. The blob seems to take the second approach and
emits the preamble lazily. We chose the per-bin approach but blob's
should be a better one.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12627
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33580>
Apparently fast path cannot handle mismatched mutability and we
should use CP_BLIT which has SP_PS_2D_SRC_INFO.MUTABLEEN to signal
src mutability. Previously it was partially handled by
tu_attachment_store_mismatched_swap.
Fixes: a104a7ca1a
("tu: Handle non-identity GMEM swaps when resolving")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33514>
The valve-freedreno-turnip-manual-rules naming would suggest
freedreno + turnip rules, but in fact this rule is meant to only
impact turnip.
Change the name to match .google-turnip-manual-rules and
.collabora-turnip-manual-rules.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33542>
We currently assume the implicit offset shift is always 2. However, this
shift is actually based on the type, making sure the offset fields are
in units of the type size. The full offset calculation is as follows:
((SRC2<<SRC2_SHIFT) + OFF)<<TYPE_SHIFT
Where SRC2, SRC2_SHIFT, and OFF are instruction fields while TYPE_SHIFT
is implicit and derived from the TYPE field.
This commit implements (dis)assembly support for this, adopting the
syntax used by the blob.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33503>
The assembly syntax of certain instructions differs significantly
between generations (e.g., ldg.a/stg.a) so it's useful to be able to
generate syntax error based on the generation we are assembling for.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33503>