Wire up the Mesa shader disk cache into Panfrost. Coupled with the
precompiles from the previous patch, this should greatly reduce shader
recompile jank.
This is a bare bones implementation. Obvious future work includes:
- Caching internal (outside of Gallium) shaders
- Implement finalize_nir to reduce on disk size of shaders
That doesn't need to come in this patch.
This patch does shuffle some allocation patterns around to avoid extra
nir_shader_clones, but the result should be pretty clean.
---
Consider dEQP-GLES31.functional.ssbo.layout.basic_unsized_array.* in the CTS.
With a cold cache:
44.11user 0.66system 0:45.44elapsed 98%CPU (0avgtext+0avgdata 267804maxresident)
k 0inputs+0outputs (130major+74725minor)pagefaults 0swaps
But with this commit and a warm cache:
4.07user 0.35system 0:04.56elapsed 96%CPU (0avgtext+0avgdata 211012maxresident)
k0inputs+0outputs (1major+49489minor)pagefaults 0swaps
That's an 11x improvement!
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
We have no vertex shader key, and unless legacy GL features are used, the
fragment shader key is known ahead-of-time. That means we can precompile shaders
at CSO create time, hopefully avoiding some draw-time jank.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
This avoids the weird compiled_shader pointer inside of compiled_shader. Because
we don't have a nonempty vertex shader key, there will only ever be a single
transform feedback program per CSO.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
No need to open code our own "special" dynarray. Unify the graphics/compute CSO
creation to make this work without duplicating more code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
The active compiled shader (variant) is context state, it is inappropriate to
stash it on the uncompiled shader. Add compiled shader pointers to the context
and get rid of the active_variant mutation. Names from iris.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
We now have a common place for the driver side of shader compilation. As a bonus
this gets rid of the old "assemble" name which hasn't been accurate since 2018
or so.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Compute and graphics shaders will need similar paths for the disk cache. Let's
consolidate the code to make it easier to work with.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
NIR deemphasizes nir_variable. We want to transition off it. Instead of walking
the list of variables and playing games with the GLSL types to collect varying
information, walk the list of instructions and use the I/O semantics to collect
similar information.
In addition to avoiding the reliance on nir_variable, this fixes handling of
struct varyings under certain circumstances. Such programs are compiled by the
GLES3.1 CTS but not used, so without this fix, the affected tests would regress
when precompiling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
PIPE_FORMAT_NONE has a block size of 1, oddly, but we don't actually
need to allocate any space for it. This acts as a small optimization for
a few shaders with the new varying linker.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Move the pass from the Bifrost compiler to the Midgard/Bifrost common code
directory, and take advantage of it on Midgard, where it fixes the same
tests as it fixed originally on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Rename the existing one to make it clear that it is per-tile, and add a
new one that runs after all the tile passes. Will be needed in the next
commit.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
For PIPE_CAP_QUERY_BUFFER_OBJECT we'll need to write on the GPU a flag
when the query result is available, which means the buffers used for
query results should have a header with availability flag.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
WFI is not a strong enough barrier, which shows up in piglit qbo tests
which do a single draw.
Fixes: 13fc03f4c0 ("freedreno/a6xx: Avoid stalling for occlusion queries")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
For devices that don't support getfiberid, we force the subgroup size
to 1 for things other than compute stage. This matches what zink does.
And fixes spec@arb_shader_group_vote@vs-eq-uniform once we expose
ARB_shader_group_vote.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
Simplifies the interface slightly and makes it possible to re-use the
path for pctx->clear_texture() in the next commit. The z dimensions
still come from the surface.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
It's undefined behavior in C to read a union member if another member
has been written to more recently. Let's be more careful here!
Fixes: a9d2b86c2c ("zink: store the spirv_shader to the zink_shader struct for generated tcs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19457>
But because polygon-offset needs to consider the primitive-type *before*
overriding the type, add a zink_prim_type()-helper for the partially
resolved state.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
This might seem like a premature optimization, but it's going to make a
bit more sense with the next commit, to prevent needlessly regressing
performance.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
This should be based on the fill_mode, not on the primitive type. We
*also* need to check if we'll rasterize triangles in the end, though.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
6698753cdb switched our GS output stores to use MUBUF.
The stride doesn't matter for the ESGS descriptor (because idxen=false and
the index stride is 64), but this fixes it anyway.
This also changes ACO to use MUBUF store too, since MTBUF doesn't seem to
work correctly with an invalid data format in the descriptor.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6698753cdb ("ac/llvm: don't use tbuffer_store as a fallback for swizzled stores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18885>