Copied incorrectly from radeonsi. The registers following the scratch
buffer address are the shader rsrc1/rsrc2, not the user SGPR0
containing the ring resource word 1.
Fixes: 278e533ec9 ("radv: update scratch buffer registers on GFX11")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19488>
(cherry picked from commit b8865ad046)
The entire point of resource shadowing is to avoid unnecessary flushing.
Flushing readers after shadowing is counterproductive. A refresher on
how resource shadowing is supposed to work:
First, we determine if it's beneficial to shadow resources. If so, we
create a new backing buffer object. We flush the current writer of the
resource, if there is one, so the current contents become known to the
CPU. If we are not discarding the original resource, we then copy the
existing contents of the buffer to the new shadow buffer on the CPU.
Finally, we swap the resource's backing buffer for our shadow. Any batch
that reads the resource will continue to read the old copy of the
resource, and any future draw calls will see the new copy with the
change implemented.
Where did we go wrong?
In 988d5aae74 ("panfrost: Flush resources when shadowing"), we started
flushing all readers. We didn't actually need to flush; we just needed
to avoid dangling references on the batches reading the old copy of the
resource. But that's easily enough avoided: just remove the references.
The batches still hold a reference to the underlying BO, which will be
freed at the right time regardless.
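The scheme above can be sketched as follows. This is a minimal illustration, not the Panfrost code: `struct bo`, `struct resource`, `struct batch` and every function name here are hypothetical, and the writer flush is only marked by a comment. It shows that dropping the resource's reference to the old BO is enough, because a reading batch holds its own reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted buffer object; not the Panfrost structures. */
struct bo {
    int refcnt;
    size_t size;
    unsigned char *data;
};

/* A resource points at its current backing storage. */
struct resource {
    struct bo *bo;
};

/* A batch that reads a resource keeps its own BO reference. */
struct batch {
    struct bo *read_bo;
};

static struct bo *bo_create(size_t size)
{
    struct bo *bo = calloc(1, sizeof(*bo));
    assert(bo != NULL);
    bo->data = calloc(1, size);
    assert(bo->data != NULL);
    bo->size = size;
    bo->refcnt = 1;
    return bo;
}

static void bo_ref(struct bo *bo)
{
    bo->refcnt++;
}

/* Drops one reference; returns true if this call freed the BO. */
static bool bo_unref(struct bo *bo)
{
    if (--bo->refcnt > 0)
        return false;
    free(bo->data);
    free(bo);
    return true;
}

/* Record that the batch reads the resource's current BO. */
static void batch_read(struct batch *batch, struct resource *rsrc)
{
    batch->read_bo = rsrc->bo;
    bo_ref(batch->read_bo);
}

/* Shadow the resource: swap in a fresh BO without flushing readers. */
static void resource_shadow(struct resource *rsrc, bool discard)
{
    struct bo *old = rsrc->bo;
    struct bo *shadow = bo_create(old->size);

    /* A real driver would flush any pending writer first (omitted here). */
    if (!discard)
        memcpy(shadow->data, old->data, old->size);

    rsrc->bo = shadow;

    /* Readers keep the old BO alive through their own reference. */
    bo_unref(old);
}
```

In this sketch, a batch that called `batch_read()` before `resource_shadow()` still sees the old contents, and the old BO is freed only when that batch drops its reference.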
Originally motivated by glmark2 -bbuffer:update-method=subdata, which
has some pathological access patterns.
Firefox is anecdotally a lot faster (now scrolling at 60fps).
But what actually motivated this is an apitrace from Duckstation's GLES
renderer. With this patch, the in-game portion improves from 3fps to 21fps.
Closes: #4028
Fixes: 988d5aae74 ("panfrost: Flush resources when shadowing")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19361>
(cherry picked from commit 2d8f28df73)
If a synchronized transfer_map is going to overwrite an entire resource,
there's no need to memcpy in the original contents ahead-of-time. This
memcpy is particularly bad for large buffers where it's copying WC->WC,
although that could be mitigated with threaded_context's cpu_storage in
the future if needed.
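As a rough sketch of the check described above, assuming a simplified 1D buffer (real transfer maps use a 3D box) and hypothetical names `struct range` and `shadow_needs_copy`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 1D mapped range; real transfer maps use a 3D box. */
struct range {
    size_t offset;
    size_t size;
};

/*
 * Returns true if the original contents must be copied into the new
 * backing storage before mapping. When the map covers the whole
 * resource, every byte is about to be overwritten, so no copy is needed.
 */
static bool shadow_needs_copy(size_t resource_size, struct range map)
{
    bool covers_all = map.offset == 0 && map.size >= resource_size;
    return !covers_all;
}
```

A full-size map at offset 0 skips the copy; any partial map still needs it, because the bytes outside the mapped range must survive.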
Prevents a performance regression in glmark2's buffer scenes from the
next patch, hence the Cc.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19361>
(cherry picked from commit 0b26a9f773)
This reverts commit 044b238507 and extends it with
- putting the comments directly in front of the ifs
- not supporting 2x MSAA on SMALL_MSAA hardware
- checking if blt/rs supports the format
MSAA should work as expected now. Tested with kmscube and qt5.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19013>
If the GPU supports SMALL_MSAA and does not support CACHE128B256BPERLINE we
need to set the cache mode to CACHE_MODE_256B.
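The condition can be sketched as below. The feature bit values, the `enum cache_mode` type, and `CACHE_MODE_DEFAULT` are hypothetical stand-ins for illustration only; the message above names only the two feature flags and CACHE_MODE_256B.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the real driver queries hardware feature words. */
enum {
    FEATURE_SMALL_MSAA = 1u << 0,
    FEATURE_CACHE128B256BPERLINE = 1u << 1,
};

/* CACHE_MODE_DEFAULT is an assumed placeholder for "leave the mode alone". */
enum cache_mode {
    CACHE_MODE_DEFAULT,
    CACHE_MODE_256B,
};

static enum cache_mode pick_cache_mode(uint32_t features)
{
    bool small_msaa = (features & FEATURE_SMALL_MSAA) != 0;
    bool per_line = (features & FEATURE_CACHE128B256BPERLINE) != 0;

    /* SMALL_MSAA without CACHE128B256BPERLINE requires the 256B mode. */
    return (small_msaa && !per_line) ? CACHE_MODE_256B : CACHE_MODE_DEFAULT;
}
```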
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19013>
MSAA surfaces are always used for rendering, and only as blit sources,
so they need to be allocated with PE-compatible tiling.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19013>
st/mesa does not know anything about the forced MSAA and we end up with
the following assertion failure:
etna_try_rs_blit: Assertion `(blit_info->src.box.x + blit_info->src.box.width) * msaa_xscale <= src_lev->padded_width' failed
Let the application do its thing regarding MSAA and remove this 'debug'
feature.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19013>
Wire up the Mesa shader disk cache into Panfrost. Coupled with the
precompiles from the previous patch, this should greatly reduce shader
recompile jank.
This is a bare bones implementation. Obvious future work includes:
- Caching internal (outside of Gallium) shaders
- Implementing finalize_nir to reduce the on-disk size of shaders
That doesn't need to come in this patch.
This patch does shuffle some allocation patterns around to avoid extra
nir_shader_clones, but the result should be pretty clean.
---
Consider dEQP-GLES31.functional.ssbo.layout.basic_unsized_array.* in the CTS.
With a cold cache:
44.11user 0.66system 0:45.44elapsed 98%CPU (0avgtext+0avgdata 267804maxresident)k
0inputs+0outputs (130major+74725minor)pagefaults 0swaps
But with this commit and a warm cache:
4.07user 0.35system 0:04.56elapsed 96%CPU (0avgtext+0avgdata 211012maxresident)k
0inputs+0outputs (1major+49489minor)pagefaults 0swaps
That's an 11x improvement!
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
We have no vertex shader key, and unless legacy GL features are used, the
fragment shader key is known ahead-of-time. That means we can precompile shaders
at CSO create time, hopefully avoiding some draw-time jank.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
This avoids the weird compiled_shader pointer inside of compiled_shader. Because
we don't have a nonempty vertex shader key, there will only ever be a single
transform feedback program per CSO.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
No need to open code our own "special" dynarray. Unify the graphics/compute CSO
creation to make this work without duplicating more code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
The active compiled shader (variant) is context state, so it is inappropriate to
stash it on the uncompiled shader. Add compiled shader pointers to the context
and get rid of the active_variant mutation. Names from iris.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
We now have a common place for the driver side of shader compilation. As a bonus,
this gets rid of the old "assemble" name which hasn't been accurate since 2018
or so.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Compute and graphics shaders will need similar paths for the disk cache. Let's
consolidate the code to make it easier to work with.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
NIR deemphasizes nir_variable. We want to transition off it. Instead of walking
the list of variables and playing games with the GLSL types to collect varying
information, walk the list of instructions and use the I/O semantics to collect
similar information.
In addition to avoiding the reliance on nir_variable, this fixes handling of
struct varyings under certain circumstances. Such programs are compiled by the
GLES3.1 CTS but not used, so without this fix, the affected tests would regress
when precompiling.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
PIPE_FORMAT_NONE has a block size of 1, oddly, but we don't actually
need to allocate any space for it. This acts as a small optimization for
a few shaders with the new varying linker.
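A minimal sketch of the idea, using a hypothetical three-entry format enum and hypothetical helper names rather than the real Gallium format tables. The byte sizes for the two float formats are illustrative; the block size of 1 for the NONE format mirrors the quirk described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for pipe formats. */
enum varying_format {
    FMT_NONE,
    FMT_R32_FLOAT,
    FMT_RGBA32_FLOAT,
};

/* Block size in bytes; FMT_NONE reports 1, like PIPE_FORMAT_NONE. */
static unsigned format_block_size(enum varying_format fmt)
{
    switch (fmt) {
    case FMT_R32_FLOAT:
        return 4;
    case FMT_RGBA32_FLOAT:
        return 16;
    case FMT_NONE:
        return 1;
    }
    return 1;
}

/* Sum the space needed for all varyings, skipping unused (NONE) slots. */
static unsigned varying_buffer_size(const enum varying_format *formats,
                                    size_t count)
{
    unsigned total = 0;

    for (size_t i = 0; i < count; i++) {
        if (formats[i] == FMT_NONE)
            continue; /* nothing to store, so allocate nothing */
        total += format_block_size(formats[i]);
    }
    return total;
}
```

Without the `FMT_NONE` check, each unused slot would still consume one byte of varying space.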
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>
Move the pass from the Bifrost compiler to the Midgard/Bifrost common code
directory, and take advantage of it on Midgard, where it fixes the same
tests as it fixed originally on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>