Just always use layered, like AGXV. This was a pointless bit of optimization
that only affects render target spilling with neglible impact.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26963>
Rather than hardcoding u0_u1, this lets drivers map texture handles in whatever
way is convenient. In particular, this makes textures work properly with merged
shader stages (provided one of the stages is forced to use bindless access), by
giving each stage an independent texture heap.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26056>
If we spill render targets with a layered framebuffer, our spilled targets are
assumed to be 2D Arrays (in general). We need to use arrayed image operations to
load/store from these. The layer is given by the layer as read in the fragemnt
shader. This handles the eMRT portion of layered rendering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
If we bound 4 render targets but we only write to 1 of them, the other 3 need
their contents preserved. This requires either properly configuring HSR to
implement colour masking (TODO) or using the big hammer of setting TRANSLUCENT.
This patch picks the latter for now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25052>
Doing a descriptor crawl with binding tables requires a real binding table in
the shader, which won't work for VK or merged shader stages in GL. Instead,
let's lower anything that needs a crawl to bindless in the driver, so the
compiler code doesn't need to know anything about descriptor binding models.
That gets rid of the texture_base_agx sysval, which is problematic when there
are multiple descriptor sets worth of textures.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
Instead, we replace every use of it with nir_def. Most of this commit
was generated by sed:
sed -i -e 's/dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
A few manual fixups were required in lima and the nir_legacy code.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
We could add a nir_def_bit_size() helper but we use ->bit_size about 3x
as often as nir_dest_bit_size() today so that's a major Coccinelle
refactor anyway and this doesn't make it much worse. Most of this
commit was generated byt the following semantic patch:
@@
expression D;
@@
<...
-nir_dest_bit_size(D)
+D.ssa.bit_size
...
Some manual fixup was needed, especially in cpp files where Coccinelle
tends to give up the moment it sees any interesting C++.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
To accommodate framebuffers which exceed tilebuffer limits, we'll need to spill
render targets to main memory. In effect, we need to emulate an immediate-mode
renderer for some render targets. This decision is made on a per-render target
basis. In our tilebuffer layout calculation, rather than asserting that all
render targets fit, introduce a notion of spilling.
This doesn't actually implement spilling -- it just pushes the assert failure
down to the users. But it's progress.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
We now have support for baseline MSAA, except for support for eMRT. But hey,
this gets us 99% of the way there, so it's worth flipping on at least in
agx/next.
We can also advertise dual-source blending again. It was reverted since Chromium
freaks out with dual-source blending on a GL 2.1 driver, but since we're
advertising GL 3.1 now, it's ok.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23480>
Blend states can require masking colour. Currently, this is handled by
nir_lower_blend, which lowers masks to a read-modify-write operation as required
on Mali hardware. However, our "tilebuffer store" instruction supports a write
mask, allowing us to write only a subset of channels to the tilebuffer. It's
more efficient to use that than to emit pointless tilebuffer loads.
Note that even without tilebuffer loads, non-opaque masks don't work with opaque
pass types. Here, we handle this with a translucent pass type, which gets HSR
to do the right thing and is consistent with the pass type used previously.
However, it's a bit heavy handed -- Apple manages to use an opaque pass type
with masking but with some unknown HSR fields twiddled. IMO reverse-engineering
those details shouldn't block this because this gets us closer to optimal (just
not all the way there) and is strictly better than what we had before.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
The hardware doesn't extend in this case, we need to extend for it. This
fixes 32-bit render target formats with lower_mediump_io.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21082>
If a render target isn't written to, we don't use the sample mask. Avoid
generating the intermediate instructions, common with gl_FragColor. It will get
DCE'd, but this means less work for DCE, which should help for shader jank since
this pass gets called per-variant.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
See 0afd691f29 ("panfrost: clang-format the tree") for why I'm doing this.
Asahi already mostly follows Mesa style so this doesn't do much. But this means
we can all stop thinking about formatting and trust the robot poets to do that
for us.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
The compiler can't handle load/store_output directly for nontrivial tilebuffer
layouts. Add a NIR pass to lower these intrinsics, applying a given layout.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19871>