Commit graph

740 commits

Author SHA1 Message Date
Rose Hudson
5e9538c12e agx: isolate compiler debug flags
The gallium disk cache is about to depend on these, and I don't want to
create a dependency on agx_opcodes.h.py for that. So, make a new header
for them that doesn't have build dependencies.
Rename them to agx_compiler_* too, to avoid collisions with the other
driver debug flags.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21776>
2023-03-08 02:07:44 +00:00
Alyssa Rosenzweig
66f806d01d agx: Assert that memory index is 32-bit reg
Semantics will be wrong otherwise (reading garbage).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
Alyssa Rosenzweig
2a174f0019 agx/lower_address: Handle 16-bit offsets
These need to be upconverted for correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
Alyssa Rosenzweig
9f5a4a9604 agx/lower_address: Fix handling of 64-bit immediates
We can't add a 64-bit immediate with the hardware iadd. What we can do is add
a 32-bit immediate, derived as the low 32 bits of a 64-bit nir_ssa_def.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
Alyssa Rosenzweig
4bd0e1d097 agx/lower_address: Handle 8-bit load/store
Should work ok with the implicit up-conversion that the backend does.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
Alyssa Rosenzweig
5865e23a07 agx/lower_address: Handle large shifts
If we manage to fold in a left shift that's bigger than the hardware can do, we
should at least avoid generating a useless right shift to feed the hardware
rather than bailing completely.

For motivation, this form of address arithmetic is encountered when indexing
into arrays with large power-of-two element sizes (array-of-structs).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
Alyssa Rosenzweig
6203503196 agx/lower_address: Optimize "shift + constant"
Optimize address arithmetic of the form

   base + u2u64((index << shift) + const)

into hardware operands

   base, index << (shift - format_shift) + const'

which (if format_shift = shift) can be simply

   base, index + const'

rather than the current naive translation

   base, ((index << shift) + const) >> format_shift

This saves at least one pointless shift. We can't do this optimization with
nir_opt_algebraic, because explicitly optimizing "(a << #b) >> #b" to "a" isn't
sound due to overflow. But there's no overflow issue here, which is what this
whole pass is designed around.

For motivation, this address arithmetic implements "dynamically indexing into an
array inside of a C structure", where the const is the offset of the array
relative to the structure.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
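
The "shift + constant" rewrite described in the commit above can be checked with a small arithmetic sketch. This is not the pass itself: the function names are hypothetical, const' is assumed to be const >> format_shift, the constant is assumed to be a multiple of the element size, and nothing is assumed to overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Naive translation: build the full byte offset, then shift it right so the
 * hardware can shift it left again by format_shift. */
uint32_t naive_index(uint32_t index, unsigned shift, uint32_t offset,
                     unsigned format_shift)
{
   return ((index << shift) + offset) >> format_shift;
}

/* Folded form: index << (shift - format_shift) + const'.
 * ASSUMPTION: const' = offset >> format_shift, which is exact only when the
 * offset is aligned to the element size. */
uint32_t folded_index(uint32_t index, unsigned shift, uint32_t offset,
                      unsigned format_shift)
{
   assert(shift >= format_shift);
   assert((offset & ((1u << format_shift) - 1)) == 0);

   return (index << (shift - format_shift)) + (offset >> format_shift);
}
```

With shift equal to format_shift the left shift disappears entirely, which matches the "base, index + const'" case in the message. The two functions agree only under the stated no-overflow assumption, which is why a generic algebraic rule cannot do the same thing.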
Alyssa Rosenzweig
dccf6f569b agx/lower_address: Break on match
Once we've matched a summand, commit to it. This avoids needlessly checking the
second source if the first matched, and removes some indentation/funny control
flow.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21643>
2023-03-07 02:58:35 +00:00
Alyssa Rosenzweig
f92738eaaa agx: Handle fragment shader side effects
Fragment shaders with side effects need to be lowered to ensure they execute for
all shaded pixels but no helper threads. Add a lowering pass to handle this.

Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_literal_fragment

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21712>
2023-03-05 19:12:35 +00:00
Alyssa Rosenzweig
290f3b76f3 agx: Disable tri merging with side effects
As Metal does.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21712>
2023-03-05 19:12:35 +00:00
Alyssa Rosenzweig
ed587ae6ac asahi/meta: Use lowered I/O
No point in creating a variable when we can just synthesize the store_output
directly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
c7b5f01461 agx: Only lower int64 late
This is required for address arithmetic to be lowered properly for compute
kernels, which may have u2u64 in the source NIR.

No shader-db changes (for GLES3.0).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
811f8b899d agx: Don't print pre-optimization shader
It's usually too noisy to be useful, especially before DCE. The optimized (but
pre-RA) shader is usually the useful bit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
ea37d7f81f agx: Use agx_emit_collect for st_tile
Instead of open coding.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
7bb8112fd1 agx: Refactor vector creation
agx_vec4 is unused, drop it, and split out the common logic since we'll use it
in a new helper.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
037609f1dc agx: Constify agx_print
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
a9c5956f2f agx: Inline 16-bit load/store offsets
Most integer immediates are only 8-bit, but load/store instructions allow their
immediate offsets to be 16-bit instead. Take advantage of this in the optimizer.
This eliminates 36% of the instructions in
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.36, a fitting
percentage.

Insignificant effect on dEQP-GLES31.functional.ssbo.* performance... Only a
small % of our compile-time pie is actually spent in the backend anyway (as
opposed to NIR passes or GLSL IR).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
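
As a rough illustration of the immediate-width rule in the commit above, the check might look like the following. The function name and signature are hypothetical and are not taken from the compiler; only the 8-bit versus 16-bit limits come from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a constant can be encoded directly in the instruction.
 * ASSUMPTION: immediates are unsigned; most instructions take 8 bits, while
 * load/store offsets may take 16 bits. */
bool can_inline_immediate(uint64_t value, bool allows_16bit)
{
   uint64_t limit = allows_16bit ? UINT16_MAX : UINT8_MAX;

   return value <= limit;
}
```

An offset such as 256 would need a separate move instruction under the 8-bit rule but can be inlined into a load/store, which is where the instruction savings come from.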
Alyssa Rosenzweig
c9728b41d5 agx: Factor out allows_16bit_immediate check
The optimizer needs this information to inline immediates effectively.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
445ca949cd agx: Clean up after lowering address arithmetic
This avoids creating silly preambles that don't actually do anything except push
a constant that we could've inlined for cheaper anyway, since nir_opt_preamble's
cost model is sensitive to e.g. constant folding.

This avoids a pointless preamble in split-hell.

As a nice bonus, this also improves compile-time on address-heavy shaders. With
a release build, CPU time in dEQP-GLES31.functional.ssbo.* reduces from 12.87s
to 10.77s... a 16% improvement is nothing to sneeze at.

shader-db results are mostly noise.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
4b1f4b86ea agx: Add AGX_MESA_DEBUG=nopreamble option
Useful both for ruling out issues with shader preambles as well as (in some
cases) making for a nicer reading experience of the compiled assembly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21430>
2023-03-05 09:27:02 +00:00
Alyssa Rosenzweig
c22a18c9af agx: Don't write sample mask from preambles
It doesn't make sense, they're basically little compute kernel environments.
Noticed while debugging dEQP-GLES31.functional.fbo.no_attachments.multisample.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21710>
2023-03-05 08:20:09 +00:00
Alyssa Rosenzweig
8bb40ce4ad agx: Fix 2D MSAA array texture register allocation
Sample index and layer index are both 16-bit, even though they are zero
extended for compiler simplicity in some cases. In particular this means that 2D
MSAA arrays consume 6 half-regs for their coordinates, not 8. This is what the
IR translation (actually agx_nir_lower_texture) produces; we just need to fix
the calculation in agx_read_registers to agree.

Fixes validation failure in tests like
dEQP-GLES31.functional.texture.multisample.samples_4.use_texture_color_2d_array

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21708>
2023-03-05 08:06:43 +00:00
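
The register count in the commit above can be reproduced with a one-line calculation. The helper below is hypothetical, and it assumes that the x and y coordinates are 32-bit values occupying two 16-bit half-registers each; the message only states the sizes of the sample and layer indices.

```c
/* Number of 16-bit half-registers consumed by a coordinate vector.
 * ASSUMPTION: each 32-bit component takes two half-registers and each
 * 16-bit component takes one. */
unsigned coord_half_regs(unsigned nr_32bit, unsigned nr_16bit)
{
   return (nr_32bit * 2) + nr_16bit;
}
```

Under these assumptions, a 2D MSAA array (x, y, layer, sample) gives coord_half_regs(2, 2) = 6, whereas treating all four components as 32-bit gives coord_half_regs(4, 0) = 8, the overestimate the commit fixes.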
Alyssa Rosenzweig
3032e3ad23 agx: Mask shifts in the backend
This gives our shifts SM5 behaviour at the cost of a little extra ALU. That way,
we match NIR's shifts.

This fixes unsoundness of GLSL expressions like "a << (b & 31)", where the &
would mistakenly get optimized away.

Closes: #8181
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reported-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21673>
2023-03-05 07:52:22 +00:00
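
A minimal sketch of the masked-shift semantics the commit above refers to, assuming 32-bit operands where only the low 5 bits of the shift count are used. The function name is hypothetical.

```c
#include <stdint.h>

/* Left shift with the count wrapped to the bit size, as SM5 and NIR define it.
 * The explicit mask also avoids undefined behaviour in C for counts >= 32. */
uint32_t shl_masked(uint32_t a, uint32_t b)
{
   return a << (b & 31);
}
```

If the optimizer assumes shifts already behave this way, it may remove an explicit "& 31" from the source expression. The backend then has to apply the mask itself, otherwise a count such as 33 would reach hardware that does not wrap it.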
Alyssa Rosenzweig
f4e2b22646 asahi: Advertise dual-source blending
This is handled entirely in common code.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21545>
2023-03-05 07:38:36 +00:00
Asahi Lina
1dd872ec17 asahi: Assert on TIB strides > 64
These just don't seem to work. macOS falls back to eMRT here...

dEQP-GLES3.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.13
from Fail -> Crash. Proper solution will come when we implement eMRT later on.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21705>
2023-03-04 10:58:10 -05:00
Asahi Lina
26c51bb8d8 asahi: clang-format the world again
Some things were missed (like winsys) and there's still some bad include
orders lying around and some other randomness.

We should set up CI checks for this soon... ^^;;

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21687>
2023-03-03 22:55:59 +00:00
Asahi Lina
0a5f3556a1 asahi: Fix device fd leak in agx_close_device
I'm not sure if this was always broken downstream or just got dropped at
some point, but it's definitely UAPI-agnostic and missing now that we
have all the non-UAPI bits upstream.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21677>
2023-03-03 21:11:47 +00:00
Alyssa Rosenzweig
f083e1807d asahi/decode: Handle VDM barriers
We emit these now (for transform feedback).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21675>
2023-03-03 20:54:18 +00:00
Asahi Lina
798fc2730b asahi: Add agx_debug_fault() helper
We expect to forward GPU fault information to userspace. Since Mesa can
get that information, we can look up the fault address to log what was
the containing or nearest BO. Add a helper for that, so it can be called
from the driver.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
2023-03-03 00:28:48 +00:00
Asahi Lina
240e9dc5dc asahi: Add APIs for DMA-BUF sync file import/export
These are generic ioctls, so it is safe to add them now.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
2023-03-03 00:28:48 +00:00
Asahi Lina
d610f40e17 asahi: Implement Linux driver scaffolding, sans UAPI
With macOS support out of the way, we can start implementing a lot of
the Linux driver interface and bookkeeping without actually adding the
UAPI proper. Let's do that to reduce the size of the UAPI patchset.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
2023-03-03 00:28:48 +00:00
Asahi Lina
942d9cc17b asahi: Align device submission API with upcoming UAPI
Nothing implemented, but this lets us get the batch tracking bits in,
including explicit sync/DMA-BUF integration which uses generic ioctls.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
2023-03-03 00:28:48 +00:00
Asahi Lina
7f2e24d2ef asahi: Add nocluster,sync,stats debug flags
These are only useful with the upcoming Linux UAPI, but there's no harm
in getting the debug scaffolding in now.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
2023-03-03 00:28:48 +00:00
Asahi Lina
afe134a49c asahi: Drop macOS backend
This might be useful in the future, but it is best reimplemented in
terms of the upcoming Linux UAPI instead of having parallel codepaths.
Let's drop it.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21662>
2023-03-03 00:28:48 +00:00
Asahi Lina
70169c7488 asahi: Identify USC cache invalidate
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21538>
2023-03-01 01:04:29 +00:00
Asahi Lina
860ac5c149 asahi: Add readonly BO flag
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21538>
2023-03-01 01:04:29 +00:00
Asahi Lina
0498ad3e26 asahi: Add BO_SHAREABLE flag
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21538>
2023-03-01 01:04:29 +00:00
Alyssa Rosenzweig
760f367386 agx: Lower sampler LOD bias
G13 does not support sampler descriptor LOD biasing, so this needs to be lowered
to shader code for APIs that require this functionality. Add an option to do
this lowering while doing our other backend texture lowerings. This generates
lod_bias_agx texture instructions which the driver is expected to lower
according to its binding model.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21276>
2023-02-27 02:35:41 +00:00
Daniel Schürmann
2bb369dd8d nir: add assertions that loops don't have a Continue Construct
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.

Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
2023-02-21 10:41:11 +00:00
Alyssa Rosenzweig
029c686c6d asahi: Implement color masks with masked stores
Blend states can require masking colour. Currently, this is handled by
nir_lower_blend, which lowers masks to a read-modify-write operation as required
on Mali hardware. However, our "tilebuffer store" instruction supports a write
mask, allowing us to write only a subset of channels to the tilebuffer. It's
more efficient to use that than to emit pointless tilebuffer loads.

Note that even without tilebuffer loads, non-opaque masks don't work with opaque
pass types.  Here, we handle this with a translucent pass type, which gets HSR
to do the right thing and is consistent with the pass type used previously.
However, it's a bit heavy handed -- Apple manages to use an opaque pass type
with masking but with some unknown HSR fields twiddled. IMO reverse-engineering
those details shouldn't block this because this gets us closer to optimal (just
not all the way there) and is strictly better than what we had before.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
2023-02-21 08:10:15 +00:00
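
The difference between the two colour-mask strategies in the commit above can be sketched on a plain four-channel array standing in for one tilebuffer pixel. Both functions are hypothetical models, not driver code, and the mask is assumed to use bit c for channel c.

```c
/* Read-modify-write: load the old pixel, blend per channel, store all four. */
void store_rmw(float dst[4], const float src[4], unsigned mask)
{
   float old[4];

   for (unsigned c = 0; c < 4; ++c)
      old[c] = dst[c]; /* models the extra tilebuffer load */

   for (unsigned c = 0; c < 4; ++c)
      dst[c] = (mask & (1u << c)) ? src[c] : old[c];
}

/* Masked store: write only the enabled channels, with no load at all. */
void store_masked(float dst[4], const float src[4], unsigned mask)
{
   for (unsigned c = 0; c < 4; ++c) {
      if (mask & (1u << c))
         dst[c] = src[c];
   }
}
```

Both produce the same pixel; the masked variant simply never reads the destination, which is the saving the commit describes.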
Alyssa Rosenzweig
3084e6e689 agx: Add agx_internal_format_supports_mask helper
Not all formats can be masked, so add a query to check which can be.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
2023-02-21 08:10:15 +00:00
Alyssa Rosenzweig
5e031867fe agx: Handle ssa_undef as zero
Masked stores may result in undefs after optimization. Rather than call
lower_undef_to_zero late (but get no benefit), we may as well handle it
ourselves to prepare for proper undef support down the line.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
2023-02-21 08:10:15 +00:00
Alyssa Rosenzweig
eab4d6a96f agx: Add and use agx_nir_ssa_index helper
Common subexpression that we'll repeat once more in the next patch.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>
2023-02-21 08:10:15 +00:00
Alyssa Rosenzweig
e93a221024 agx: Handle group_memory_barrier
A combination of control_barrier + memory_barrier but it's always seen with
those. This would be safer with scoped barriers...

Fixes dEQP-GLES31.functional.synchronization.inter_invocation.ssbo

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:40 +00:00
Alyssa Rosenzweig
e9cec96633 agx: Implement b2b32
Shows up with store_shared.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:40 +00:00
Alyssa Rosenzweig
955797bb00 agx: Pack local atomics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:39 +00:00
Alyssa Rosenzweig
14f546726e agx: Lower shared memory offsets to 16-bit
Per the hardware requirement. This simplifies instruction selection (it avoids
the need to constant fold u2u16 in the backend).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:39 +00:00
Alyssa Rosenzweig
a21f6f8cb0 agx: Translate load/store_shared
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:39 +00:00
Alyssa Rosenzweig
f8b9dfbbad agx: Translate NIR atomics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:39 +00:00
Alyssa Rosenzweig
2a021b1818 agx: Pack local load/store instructions
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>
2023-02-20 18:50:39 +00:00