Use the new event-based tracing system to capture IDVS/COMPUTE/FRAGMENT
jobs and their context.
When tracing is enabled, the descriptor ring buffer is replaced by
a bigger linear buffer such that descriptors are not recycled before
we get a change to decode the trace.
If the decode buffer is too small and a OOB is detected, the driver will
suggest the user to allocate a bigger buffer with the
PANVK_{DESC,CS}_TRACEBUF_SIZE env vars.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
Interpreting the command buffer only really works if everything is
static, but panvk started to make extensive use of loops, and
conditionals which depends on memory values that get updated by the
command stream itself. This makes it impossible to walk back to the
original state in order to replay the CS actions.
Move away from this approach in favor of an event-based tracing
mechanism recording particular CS commands and their context at
execution time. Of course, that means the auxiliary descriptors
shouldn't be recycled until the traces are decoded, but that's more
tractable. We just need to turn the descriptor ring buffers into
linear buffers with a guard page, and crash on OOB, with a message
suggesting the user to tweak the maximum trace buffer sizes.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
Will be useful if we want to be able to make the trace events point to
the instruction they are recording.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
Just a wrapper around pandecode_log() taking the lock and making sure
the dump stream is opened.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
In panvk, we want to switch from interpretation-based decoding to
event-tracing based decoding, so we no longer depend on the memory state
to get accurate job information.
Even if we're not interested in interpreting the CS, we still want to
dump CS binaries so developers can know what's passed to the GPU.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
pandecode_cs() does both the decoding and the interpretation.
Rename the function to avoid the confusion.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
Everything else is prefixed cs, not ceu, so let's drop the remaining
ceu occurrences.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
Despite the name, disassemble_ceu_instr() does more than disassembling
the instruction, it also partially interpret it.
Add a print_cs_instr() helper that does just the disassembling/printing
part, and move the remaining of disassemble_ceu_instr() to
interpret_ceu_instr().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32284>
Similar to commit 087e9a96d1 ("venus: make cross-device optional"),
make VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE use optional, because qemu does
not support this.
Fixes: 06e57e3231 ("virtio: Add vdrm native-context helper")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32392>
The previous code not only left them undefined, but also
didn't increment the array index, so subsequent PS inputs
would be broken after the undefined one.
Note that this doesn't affect any valid Vulkan apps, but it makes
the code a bit simpler and it makes undefined inputs a little more
forgiving, at no expense for valid PS.
This code actually uncovers a bug in Zink, so I'm also documenting
the failing Zink test case.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
GFX10.3 keeps track of per-vertex and per-primitive PS inputs
separately in NUM_INTERP / NUM_PRIM_INTERP,
which we only really know when emitting the inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
num_inputs contains the total number of FS inputs.
Note that this also fixes a bug where some calculations in RADV
and ACO were missing the per-primitive attributes from the LDS
usage of PS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
Add export_prim_id_per_primitive for mesh shaders.
This prepares to also configure some of these to be per-primitive
in the future, even in the traditional pipeline.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
There are some FS built-ins that can be per-vertex or
per-primitive depending on whether a mesh shader is used:
primitive ID (implicit in VS), layer and viewport.
However, the HW requires per-primitive FS inputs to be ordered last.
This causes bugs when the same unlinked FS is used together
with VS/TES/GS and MS (with unlinked ESO or fast-linked GPL).
To solve this problem, we reorder the FS inputs so that these
potentially per-primitive inputs go after per-vertex inputs but
before per-primitive inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
switch libagx to the precompilation pipeline. see the big comment in the
previous commit for why we're doing this.
while doing so, we move some dispatch stuff. there was so much churn from
precompile that this avoids doing the churn twice. that new header will be used
for DGC down the road.
there's also a small vtn/bindgen patch in here to skip bindgen'ing entrypoints,
as that conflicts with the new dispatch macros. this is the sane behaviour, we
just need to do the full precomp switch across the tree at once.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32339>
This seems to be a little faster.
insert_NOPs (navi31):
Difference at 95.0% confidence
-11.484 +/- 6.13377
-1.62767% +/- 0.860593%
(Student's t, pooled s = 5.71913)
insert_NOPs (gfx1200):
Difference at 95.0% confidence
-35.6745 +/- 4.97972
-8.1236% +/- 1.10453%
(Student's t, pooled s = 4.6431)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32374>
This is a simple way for drivers to enable uniform expression propagation
without having to set any callbacks for it. It replaces the old option.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32390>
After the breaking commit, gbm_bo_create_with_modifiers({LINEAR}) returns
a BO with gbm_bo_get_modifier() = INVALID. This restores the functionality
and fixes most notably, hardware cursors for cards without modifiers.
Fixes#12039.
Fixes: 361f362258 ("dri: Unify createImage and createImageWithModifiers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31725>
This fixes:
spec@glsl-1.20@execution@clipping@vs-clip-vertex-enables
with the latest nir_lower_clip changes.
The driver breaks when POS is stored before CLIP_DIST.
That's the only change caused by previous commits according to
VC4_DEBUG=nir.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
This is currently needed to fix d3d12 for st_unlower_io_to_vars.
The idea is to track the current value of ClipVertex in a temporary
variable, and for every emit_vertex, we load the ClipVertex value from
the temporary (which matches the stored value) and insert new CLIP_DIST
stores before emit_vertex.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
The code for IO variables was interleaved with code for IO intrinsics,
which was difficult to follow.
lower_clip_outputs is split and replaced by more accurate names:
lower_clip_vertex_var and lower_clip_vertex_intrin
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32363>
Predication instructions sometimes need extra nops to workaround what
seems to be a hardware bug: prede needs 6 nops and the second
predt/predf of a predt/predf pair needs 4 nops.
The prede workaround is enabled starting from a6xx gen3 and the
predf/predt workaround from a6xx gen4, following the blob.
Fixes rendering corruption in God of War (2018).
Signed-off-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32366>
This function returns an uint64_t, but returns the result of two
uint32_t values. If we don't widen at least one of them before
returning, the multiplication wraps large results.
So let's widen the type first, so we can preserve large offsets.
Fixes: d1934e44fc ("panvk: Implement occlusion queries for JM")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
CID: 1634943
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32380>
These functions returns signed values, so we shouldn't use an unsigned
variable to store one of them in.
Fixes: d1934e44fc ("panvk: Implement occlusion queries for JM")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
CID: 1635021
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32380>
The page-alignment should always be a positive power of two. Let's
assert that, to avoid confusion.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
CID: 1605086
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32380>
If this could be zero, we'd end up with divisions by zero here, which
uh... would be bad? I don't think that can happen, so let's assert about
this, to make it clear what's going on.
CID: 1444660
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32381>
- Immediates are not allowed in 2nd src.
- Immediates are 12 bits, not sign-extended.
- Only one of the first two sources can be shared when not scalar ALU.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>