Commit graph

368 commits

Author SHA1 Message Date
Alyssa Rosenzweig
3a6a5281b3 agx: Lower global loads/stores to AGX versions
This lets us do all the needed address arithmetic in a central place.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20558>
2023-01-11 20:36:51 +00:00
Alyssa Rosenzweig
90dea84ef6 agx: Remove dead arg
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20559>
2023-01-10 05:19:25 +00:00
Alyssa Rosenzweig
17d1559036 agx: Use i0/i1 variables
Now that we've defined them.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20559>
2023-01-10 05:19:25 +00:00
Alyssa Rosenzweig
1e61f13ffd agx: Get rid of emit_alu_bool
Deduplicate lots of cases. Splitting this out was silly, bools aren't that
special.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20559>
2023-01-10 05:19:25 +00:00
Alyssa Rosenzweig
5b25ee6cc7 agx: Use agx_subdivide_to for umul_high
Helpers!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20559>
2023-01-10 05:19:25 +00:00
Alyssa Rosenzweig
f6c5b2a5a3 agx: Remove dead code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20559>
2023-01-10 05:19:25 +00:00
Alyssa Rosenzweig
fa96dfb2d7 agx: Lower discard to zs_emit when zs_emit used
It is invalid to use both sample_mask and zs_emit in the same shader. We'll need
to do something similar for sample mask writes.

Fixes Dolphin ubershaders.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:23 -05:00
Alyssa Rosenzweig
ebe40b15ea agx: Fix discard with MRT
The exact semantics of sample_mask aren't quite clear to me yet, but executing
multiple sample_mask instructions seems to raise a fault :|

Fixes SuperTuxKart's advanced renderer.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:23 -05:00
Alyssa Rosenzweig
2b5519e865 agx: Introduce "no_varyings" instruction
Must be used at the end of a vertex shader that does NOT write any varyings, has
rasterizer discard enabled, and is run only for its side effects.

The encoding looks like st_var, but I don't know what this actually *does*. I
just know that the GPU faults if this is omitted.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:23 -05:00
Alyssa Rosenzweig
33e3418cfe agx: Consider "stop" a control flow instruction
...and therefore it needs to be after a "logical end". This means that
"after_block_logical" will do the right thing for the last block.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
f6aa43cf42 agx: Optimize waits locally
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
a01680b979 agx: Remove logical_end later
So we can use after_block_logical in the wait insertion pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
73ac73308b agx: Validate widths of vectors
Check the invariant that the widths of vectors in the IR are consistent, by
checking that write registers and read registers match up between the writers
and readers respectively.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
6685dba75e agx: Add agx_read_registers helper
To be used for inserting waits post-RA.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
e6631ba5af agx: Compact st_tile argument per mask
Otherwise the number of read registers won't match the vector we input, which
will trigger validation errors.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
545a3eb601 agx: Insert waits post-RA
This is the first step towards reducing stalling.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
463744e4f9 agx: Pack texture scoreboard slots
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
01f948ee13 agx: Pack wait instructions
For different scoreboard slots.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
640afb33b9 agx: Remove unused idiv const func
This was used for instancing, but has been unused since 8dcf7648f1 ("agx: Lower VBOs in NIR")

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
44925a142e agx: Use metadata for VS varying linking
Rather than variables. This gets rid of all backend nir_variable use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
617f2f7a02 agx: Don't use nir_variable when gathering flat varyings
Walk the IR instead. This happens when preprocessing so it doesn't really
matter, but it complicates the nir_variable audit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
d00a43f682 agx: Hash agx_instr faster
Prior to this change, agx_opt_cse is our most expensive backend pass, due to the
time spent hashing instructions. hash_instr was calling into XXH32 a massive
number of times, often to hash only a single bit. It's much faster to hash
entire blocks of memory at a time. Optimize to do just that.

With this change, agx_opt_cse is now cheaper than instruction selection as
it should be.

No shader-db changes (except CPU time decrease).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
f44afe766f agx: Use texture write mask
We do need to use undefs instead of zeroes in this internal collect. While this
vector gets copypropped out, it'd cause us to fail compilation if noopt is on.
Fix that.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
7284e4967c agx: Note that textures clobber even masked
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
ddbec45b6f agx: Plumb in store instruction
This will be used for compute kernels (and transform feedback) in the (near)
future. For now, let's get the opcode plumbed in the backend to reduce some of
the rebase pain.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20446>
2023-01-05 11:49:22 -05:00
Alyssa Rosenzweig
f603d8ce9e asahi: Clang-format the subtree
See 0afd691f29 ("panfrost: clang-format the tree") for why I'm doing this.
Asahi already mostly follows Mesa style so this doesn't do much. But this means
we can all stop thinking about formatting and trust the robot poets to do that
for us.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
2022-12-27 22:46:29 +00:00
Alyssa Rosenzweig
d9dc77f068 asahi: Add some clang-format commas
Otherwise clang-format will mangle this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
2022-12-27 22:46:29 +00:00
Alyssa Rosenzweig
c1f175c9fa asahi: Manually format some parts of the code
clang-format will mangle these.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
2022-12-27 22:46:29 +00:00
Alyssa Rosenzweig
680c873b35 agx: Undo sed fail
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20434>
2022-12-27 22:46:29 +00:00
Alyssa Rosenzweig
9578b47af3 agx: Implement depth and stencil export
Lower FRAG_RESULT_DEPTH and FRAG_RESULT_STENCIL writes to a combnied zs_emit
instruction with a multisampling index. To be used in the following commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
2022-12-17 18:10:28 +00:00
Asahi Lina
c12153cd89 asahi: Identify & disable triangle merging for shaders using derivatives
It seems triangle merging is incompatible with calculating derivatives
along primitive edges correctly. Take the appropriate NIR shader info
flags in the compiler and pass them down as a flag to the driver, so it
can set the disable triangle merging flag (formerly called "lines or
points").

TODO: Is this what macOS does when you set a sample mask there (which
apparently fixes the same bug on the Darwinia Metal backend)? Do we
also need to set this when sample masks are used?

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes Darwinia and dEQP2 projected tests.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
2022-12-17 18:10:28 +00:00
Asahi Lina
b80fb31678 asahi: Allocate enough push ranges for the worst possible case
We need one for every possible sysval, plus up to 16 VBOs.

Fixes plasma-systemmonitor.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
2022-12-17 18:10:28 +00:00
Alyssa Rosenzweig
eba2b182c8 agx: Fix packing of extension for block image stores
Probably impossible to hit in practice but let's get it right. Found when
forcing RA to use the upper half of the reg file.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
2022-12-17 18:10:28 +00:00
Alyssa Rosenzweig
ef23bbfdbd agx: Coalesce i2i16 and u2u16
Extract out the code for unpack_64_2x32_split_x and use it for other integer
downcasts too to coalesce out a move. Pointless, but I wanted to have a little
RA fun after getting stencil export working.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
2022-12-17 18:10:28 +00:00
Ian Romanick
eb76cee9f8 nir: Eliminate nir_op_i2b
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b.  Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.

At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.

The failing test on d3d12 is a pre-existing bug that is triggered by
this change.  I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.

v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.

v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.

v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress.  The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).

v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.

v6: Just eliminate the i2b instruction.

v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷

No shader-db changes on any Intel platform.

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2

Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2

The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.

Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
2022-12-14 06:23:21 +00:00
Alyssa Rosenzweig
8dcf7648f1 agx: Lower VBOs in NIR
Now we support all the vertex formats! This means we don't hit u_vbuf for format
translation, which helps performance in lots of applications. By doing the
lowering in NIR, the vertex fetch code itself can be optimized by NIR (e.g.
nir_opt_algebraic) which can improve generated code quality.

In my first implementation of this, I had a big switch statement mapping format
enums to interchange formats and post-processing code. This ends up being really
unwieldly, the combinatorics of bit packing + conversion + swizzles is
enormous and for performance we want to support everything (no u_vbuf
fallbacks). To keep the combinatorics in check, we rely on parsing the
util_format_description to separate out the issues of bit packing, conversion,
and swizzling, allowing us to handle bizarro formats like B10G10R10A2_SNORM with
no special casing.

In an effort to support everything in one shot, this handles all the formats
needed for the extensions EXT_vertex_array_bgra, ARB_vertex_type_2_10_10_10_rev,
and ARB_vertex_type_10f_11f_11f_rev.

Passes dEQP-GLES3.functional.vertex_arrays.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
2022-12-02 06:25:20 +00:00
Alyssa Rosenzweig
fb49715a2c agx: Lower UBOs in NIR
Simpler than lowering in the backend and makes the sysvals obvious in the NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
2022-12-02 06:25:20 +00:00
Alyssa Rosenzweig
6b4ed663a8 agx: Implement 8-bit sign extensions
Long term, I think having i2i16 and i2i32 available with 8-bit sources should
make lowering the rest of 8-bit away a bit easier. Short term, this avoids
special casing 8-bit in the VBO lowering code.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
2022-12-02 06:25:20 +00:00
Alyssa Rosenzweig
8127737c1e agx: Allow some 8-bit sources
8-bit sources are useful for int8->float32 conversions, which we can do in a
single hardware instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
2022-12-02 06:25:20 +00:00
Alyssa Rosenzweig
ba209fe493 agx: Implement formatted loads
These will be generated by the UBO and VBO lowerings. (and eventually by other
lowerings too?)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
2022-12-02 06:25:20 +00:00
Alyssa Rosenzweig
580f25a266 agx: Add shift to device_load
We'll use this as an optimization soon. This acts in addition to the format's
shift.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19996>
2022-12-02 06:25:20 +00:00
Alyssa Rosenzweig
1555ac6f0b agx: Clamp point sizes
Fixes vs-point_size-zero.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20017>
2022-12-01 05:58:30 +00:00
Alyssa Rosenzweig
7108619c0d agx: Handle 32-bit gl_FragCoord.zw
The coefficient register is 16-bit so our builder will make the iter 16-bit too
(maybe not the best design...), force fp32 to match the NIR intrinsic.

Fixes glsl-fs-fragcoord-zw-ortho

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20017>
2022-12-01 05:58:30 +00:00
Alyssa Rosenzweig
eb4187b02d agx: Handle large varying indices
Fixes glsl-max-varyings.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20017>
2022-12-01 05:58:30 +00:00
Alyssa Rosenzweig
6de5bd5f41 agx: Fix signedness issues packing
UBSan complains otherwise:

../src/asahi/compiler/agx_pack.c:701:21: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
../src/asahi/compiler/agx_pack.c:534:18: runtime error: left shift of 8 by 28 places cannot be represented in type 'int'

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
2022-11-24 23:37:48 +00:00
Alyssa Rosenzweig
d608ca0363 agx: Handle vertex shaders that use <= 8 halfregs
r5 and r6 are always getting lowered. Will prevent a regression with VBO
lowering on a shader which has stride=0 and hence gets the vertex ID read
optimized out with NIR:

   dEQP-GLES2.functional.draw.random.50

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
2022-11-24 23:37:48 +00:00
Alyssa Rosenzweig
94124925ca agx: Try to align sources of pack_64_2x32_split
Helps with coalescing the pack.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
2022-11-24 23:37:48 +00:00
Alyssa Rosenzweig
442e29890d agx: Implement nir_op_pack_64_2x32_split
This maps to a collect where the dest size is 64 and the src size is 32.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19971>
2022-11-24 23:37:48 +00:00
Yonggang Luo
40a9fc57aa tree-wide: Use __func__ instead of __FUNCTION__ in non-gallium code
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19861>
2022-11-22 06:53:46 +00:00
Alyssa Rosenzweig
74e92274af asahi,agx: Use new tilebuffer infrastructure
Flag day change to replace the previous hardcoded background/end-of-tile shaders
and the API-style load/store_output in fragment shaders with the generated
shaders and lowered *_agx intrinsics. This gets us working non-UNORM8 render
targets and working MRT. It's also a step in the direction of working MSAA but
that needs a lot more work, since the multisampling programming model on AGX is
quite different from any of the APIs (including Metal).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19871>
2022-11-19 20:25:41 +00:00