Wire up the Mesa shader disk cache into Panfrost. Coupled with the
precompiles from the previous patch, this should greatly reduce shader
recompile jank.
This is a bare bones implementation. Obvious future work includes:
- Caching internal (outside of Gallium) shaders
- Implement finalize_nir to reduce on disk size of shaders
That doesn't need to come in this patch.
This patch does shuffle some allocation patterns around to avoid extra
nir_shader_clones, but the result should be pretty clean.
---
Consider dEQP-GLES31.functional.ssbo.layout.basic_unsized_array.* in the CTS.
With a cold cache:
44.11user 0.66system 0:45.44elapsed 98%CPU (0avgtext+0avgdata 267804maxresident)
k 0inputs+0outputs (130major+74725minor)pagefaults 0swaps
But with this commit and a warm cache:
4.07user 0.35system 0:04.56elapsed 96%CPU (0avgtext+0avgdata 211012maxresident)
k0inputs+0outputs (1major+49489minor)pagefaults 0swaps
That's an 11x improvement!
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
We have no vertex shader key, and unless legacy GL features are used, the
fragment shader key is known ahead-of-time. That means we can precompile shaders
at CSO create time, hopefully avoiding some draw-time jank.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
This avoids the weird compiled_shader pointer inside of compiled_shader. Because
we don't have a nonempty vertex shader key, there will only ever be a single
transform feedback program per CSO.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
No need to open code our own "special" dynarray. Unify the graphics/compute CSO
creation to make this work without duplicating more code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
The active compiled shader (variant) is context state, it is inappropriate to
stash it on the uncompiled shader. Add compiled shader pointers to the context
and get rid of the active_variant mutation. Names from iris.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
We now have a common place for the driver side of shader compilation. As a bonus
this gets rid of the old "assemble" name which hasn't been accurate since 2018
or so.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Compute and graphics shaders will need similar paths for the disk cache. Let's
consolidate the code to make it easier to work with.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
PIPE_FORMAT_NONE has a block size of 1, oddly, but we don't actually
need to allocate any space for it. This acts as a small optimization for
a few shaders with the new varying linker.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Rename the existing one to make it clear that it is per-tile, and add a
new one that runs after all the tile passes. Will be needed in the next
commit.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
For PIPE_CAP_QUERY_BUFFER_OBJECT we'll need to write on the GPU a flag
when the query result is available, which means the buffers used for
query results should have a header with availability flag.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
WFI is not a strong enough barrier, which shows up in piglit qbo tests
which do a single draw.
Fixes: 13fc03f4c0 ("freedreno/a6xx: Avoid stalling for occlusion queries")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
Simplifies the interface slightly and makes it possible to re-use the
path for pctx->clear_texture() in the next commit. The z dimensions
still come from the surface.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19400>
It's undefined behavior in C to read a union member if another member
has been written to more recently. Let's be more careful here!
Fixes: a9d2b86c2c ("zink: store the spirv_shader to the zink_shader struct for generated tcs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19457>
But because polygon-offset needs to consider the primitive-type *before*
overriding the type, add a zink_prim_type()-helper for the partially
resolved state.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
This might seem like a premature optimization, but it's going to make a
bit more sense with the next commit, to prevent needlessly regressing
performance.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
This should be based on the fill_mode, not on the primitive type. We
*also* need to check if we'll rasterize triangles in the end, though.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19438>
This is required when crocus is enabled in rusticl,
the lack of it contributes to this error:
thread '<unnamed>' panicked at 'Context missing features. This should never happen!', ../src/gallium/frontends/rusticl/mesa/pipe/context.rs:44:13
Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19001>
This does the exact opposite of 06e96074 from !16129.
Before LLVM commit 702d5de4 opaque pointers were supported but not enabled
by default when building LLVM. They were made default in commit 702d5de4.
LLVM commit d69e9f9d introduced -opaque-pointers/-no-opaque-pointers cc1
options to enable or disable them whatever the LLVM default is.
Those two commits follow llvmorg-15-init and precede llvmorg-15.0.0-rc1 tags.
Since LLVM commit d785a8ea, the CLANG_ENABLE_OPAQUE_POINTERS build option of
LLVM is removed, meaning there is no way to build LLVM with opaque pointers
enabled by default.
It was said at the time it was still possible to explicitly disable opaque
pointers via cc1 -no-opaque-pointers option, but it is known a later commit
broke backward compatibility provided by -no-opaque-pointers as verified with
arbitrary commit d7d586e5, so there is no way to use opaque pointers starting
with LLVM 16.
Those two commits follow llvmorg-16-init and precede llvmorg-16.0.0-rc1 tags.
Since Mesa commit 977dbfc9 opaque pointers are properly implemented in Clover
and used.
If we don't pass -opaque-pointers to Clang on LLVM versions supporting opaque
pointers but disabling them by default, there will be an API mismatch between
Mesa and LLVM and Clover will not work.
Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19290>
This reverts commit 06e9607478 from !16129.
Clover passed -no-opaque-pointers option to Clang to workaround the fact
the Clover code was not ported to opaque pointers yet.
Opaque pointers are now implemented thanks to !19103 so passing this
option to tell Clang to not do opaque pointers while Clover does
is actually breaking Clover.
Here is an example of what happens when using opaque pointers while
passing -no-opaque-pointers at the same time:
fatal error: cannot open file 'hawaii-amdgcn-mesa-mesa3d.bc':
Opaque pointers are only supported in -opaque-pointers mode
This fixes one of the last remaining bits to fully support opaque pointers
in Mesa as referenced in #7468, this is the last remaining bit to fully support
opaque points in Clover.
Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19290>
We can provide better guidance on when to (un-)set this given that
everyone's on NIR now.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
We want to use nir_opt_large_constants() instead (which is already
enabled), since that doesn't involve uploading the large immediate data
array again on each CB0 update. The downside is a bit of addressing math,
since constant_data is accessed using 64-bit global addresses.
The shader-db results are a bit all over:
All Iris driver platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19910185 -> 19913931 (0.02%)
instructions in affected programs: 225374 -> 229120 (1.66%)
helped: 3 / HURT: 348
total cycles in shared programs: 856004856 -> 855016808 (-0.12%)
cycles in affected programs: 22832422 -> 21844374 (-4.33%)
helped: 277 / HURT: 101
total spills in shared programs: 6580 -> 6609 (0.44%)
spills in affected programs: 516 -> 545 (5.62%)
helped: 1 / HURT: 4
total fills in shared programs: 8235 -> 8267 (0.39%)
fills in affected programs: 1022 -> 1054 (3.13%)
helped: 1 / HURT: 3
total sends in shared programs: 1039347 -> 1039095 (-0.02%)
sends in affected programs: 16367 -> 16115 (-1.54%)
helped: 251 / HURT: 0
LOST: 5
GAINED: 2
LOST:
- 3 SIMD16 fragment shaders (Superposition)
- 2 SIMD16 compute shaders (Aztec Ruins)
GAINED:
- fake news... 2 SIMD8 compute shaders that replace the lost SIMD16
compute shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
We were using the old load_global intrinsic still, which can't be
reordered, limiting optimization opportunities. We know the data here
is constant, so we can use the newer load_global_constant intrinsic.
This doesn't seem to have any impact on shader-db or fossil-db on any
Intel platform.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16539>
[problem]
When decoding an av1 bitstream, it shows image corruption
in the middle of the bitstream around key frames.
[analysis]
in av1_spec.pdf page 38/669, there is a sentence below:
if ( frame_type == KEY_FRAME && show_frame ) {
for ( i = 0; i < NUM_REF_FRAMES; i++) {
RefValid[ i ] = 0
......
}
......
}
This shows that the condition of invalidating current
DPB frames should be the coming frame_type is KEY_FRAME plus
show_frame is equal to 1. Otherwise, some of the frames
in sequence after KEY_FRAME still refer to the reference frames
before KEY_FRAME, and if these before KEY_FRAME reference
frames were invalidated, these frames could not find their
reference frames, and it could cause image corruption.
[solution]
Add condition of show_frame, with the corresponding fix
in ffmpeg, we cannot see this issue any longer.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19386>
Allows to make Clover devices appearing as cpu, gpu or accelerator
by setting the CLOVER_DEVICE_TYPE environment variable like
the RUSTICL_DEVICE_TYPE environment variable does.
For example it can make the CPU llvmpipe device appear as GPU or GPU devices
appear as CPU. This is useful for testing OpenCL with applications that may
use different code path given the OpenCL device is a CPU or a GPU.
The initial motivation for RUSTICL_DEVICE_TYPE implementation was to test
rusticl with llvmipe on applications ignoring CPU devices.
This brings Clover on par with rusticl on that topic.
CL_DEVICE_TYPE_CUSTOM isn't implemented or applications may crash when
iterating devices because CL_DEVICE_TYPE_CUSTOM is OpenCL 1.2 and Clover
is OpenCL 1.1.
Signed-off-by: Thomas Debesse <dev@illwieckz.net>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18931>
Supported in all hardware and software drivers. Only that don't support
are virgl and svga, depending on host capabilities. I don't think
there's anything to be done there. This does give fewer places to screw
up the CAPs, though, because everyone wants ARB_buffer_storage.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Ol<C5><A1><C3><A1>k <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19392>