I'm not sure if this was always broken downstream or just got dropped at
some point, but it's definitely UAPI-agnostic and missing now that we
have all the non-UAPI bits upstream.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21677>
We expect to forward GPU fault information to userspace. Since Mesa can
get that information, we can look up the fault address to log what was
the containing or nearest BO. Add a helper for that, so it can be called
from the driver.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
With macOS support out of the way, we can start implementing a lot of
the Linux driver interface and bookkeeping without actually adding the
UAPI proper. Let's do that to reduce the size of the UAPI patchset.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
Nothing implemented, but this lets us get the batch tracking bits in,
including explicit sync/DMA-BUF integration which uses generic ioctls.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
This might be useful in the future, but it is best reimplemented in
terms of the upcoming Linux UAPI instead of having parallel codepaths.
Let's drop it.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
G13 does not support sampler descriptor LOD biasing, so this needs to be lowered
to shader code for APIs that require this functionality. Add an option to do
this lowering while doing our other backend texture lowerings. This generates
lod_bias_agx texture instructions which the driver is expected to lower
according to its binding model.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21276>
Blend states can require masking colour. Currently, this is handled by
nir_lower_blend, which lowers masks to a read-modify-write operation as required
on Mali hardware. However, our "tilebuffer store" instruction supports a write
mask, allowing us to write only a subset of channels to the tilebuffer. It's
more efficient to use that than to emit pointless tilebuffer loads.
Note that even without tilebuffer loads, non-opaque masks don't work with opaque
pass types. Here, we handle this with a translucent pass type, which gets HSR
to do the right thing and is consistent with the pass type used previously.
However, it's a bit heavy handed -- Apple manages to use an opaque pass type
with masking but with some unknown HSR fields twiddled. IMO reverse-engineering
those details shouldn't block this because this gets us closer to optimal (just
not all the way there) and is strictly better than what we had before.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
ralloc is not thread-safe, so we can't use dev->memctx for allocating
context-specific things without locking. On top of that, we always
need to explicitly clean up pools anyway since we need to unref the BOs,
so there is no point to using a memctx.
And since pools need to be explicitly cleaned up, the meta cache code
needs explicit cleanup, so add that and drop memctx from there too.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21348>
This splits up the CDM commands into their subparts, after which
indirect dispatch is straightforward.
Also fix the pipeline bits.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21272>
Implement custom border colours, as required by OpenGL's CLAMP_TO_BORDER and
Vulkan with customBorderColor. This uses an extended sampler descriptor, which
has space for the custom border values. The trouble is that the border must be
packed into an internal interchange format that depends on the original format
in a complex way. That said, we're not solving NP-complete problems here, and it
passes the tests (dEQP-GLES31.functional.texture.border_clamp.* and piglit
texwrap).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20570>
The hardware doesn't extend in this case, we need to extend for it. This
fixes 32-bit render target formats with lower_mediump_io.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21082>
We only need 4 byte alignment, not 8 bytes. This isn't a big difference in
practice, but it probably reduces padding in some cases. More importantly, it
corrects our XML to match what the hardware actually does, which is great.
(There is exactly enough room for a 40-bit address with 4 byte alignment.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21118>
Check the size explicitly, instead of just implicitly in the GenXML pack: it is
the responsibility of the caller to split up larger uploads. While this is
nominally more complicated, agx_usc_uniform is called in the draw hot path
whereas the actual splitting decision can usually be done at compile-time.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21118>
All the ifdef __APPLE__ is getting really silly. Let's split off the
macOS UAPI abstraction into its own file, so we can have parallel
implementations.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21058>
In preparation for splitting off the macOS backend implementation into
its own file, pull out the shared BO code from agx_device.c into
agx_bo.c.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21058>
So we're not tied to the macOS or Linux UAPIs and are not translating awkwardly
from one to the other when creating BOs. They're not quite equivalent -- macOS
doesn't include writeback information in this flag field, and Linux doesn't have
a executable flag. (Maybe we should add one, though? Then we can enforce W^X.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21058>
Instead of smashing unconditionally to 1. Not sure if this fixes anything but it
gets rid of an unknown at least. Possibly slightly faster.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20561>
If a render target isn't written to, we don't use the sample mask. Avoid
generating the intermediate instructions, common with gl_FragColor. It will get
DCE'd, but this means less work for DCE, which should help for shader jank since
this pass gets called per-variant.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
See 0afd691f29 ("panfrost: clang-format the tree") for why I'm doing this.
Asahi already mostly follows Mesa style so this doesn't do much. But this means
we can all stop thinking about formatting and trust the robot poets to do that
for us.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
Fixes
dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_nearest
when run with a GLES2 version.
We wire up seamless cube maps for GLES3+ only, working around an obscure
mesa/st limitation. See 6148e3aae7 ("mesa: Fix
ctx->Texture.CubeMapSeamless") for the full context.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
The hash table needs a key pointer with at least the lifetime of the
hash entry, which the key pointer we get does not have (since it is
stack-allocated by agx_build_meta). Copy it into the shader struct
itself and use that for the hash table.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
It seems triangle merging is incompatible with calculating derivatives
along primitive edges correctly. Take the appropriate NIR shader info
flags in the compiler and pass them down as a flag to the driver, so it
can set the disable triangle merging flag (formerly called "lines or
points").
TODO: Is this what macOS does when you set a sample mask there (which
apparently fixes the same bug on the Darwinia Metal backend)? Do we
also need to set this when sample masks are used?
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes Darwinia and dEQP2 projected tests.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>