Commit graph

187067 commits

Author SHA1 Message Date
Alyssa Rosenzweig
90b4e27bb2 agx: use funop short form
noticed comparing asm with the blob

total bytes in shared programs: 14112726 -> 13986278 (-0.90%)
bytes in affected programs: 10848000 -> 10721552 (-1.17%)
helped: 9115
HURT: 0
Bytes are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
42a43bbdad libagx: parallelize prefix sum over 1024 threads
instead of 32. small % win on a synthetic GS benchmark

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
3319d4fdba libagx: deal with silly NIR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
13ecef56d0 libagx: accelerate prim restart unroll across wg
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
890a96e2a7 libagx: accelerate restart unroll across a subgroup
before implementing hard stream compaction algorithms, let's do the easy
acceleration.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
b51282500d libagx: polyfill glsl ballot()
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
4a586c7e87 agx: implement load_subgroup_id
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
c274566bbf agx: test constant compaction
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
fb785a5503 agx: compact 32-bit constants
we can convert f16->f32 for free on read, so we can compact constants to reduce
register pressure. this makes constant promotion more effective.
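
As a rough illustration of the compaction test described above, the sketch below checks whether a 32-bit float constant survives a round trip through f16. It is not the driver's code: the function name `f32_fits_f16` is hypothetical, and the sketch conservatively rejects f16 subnormals, infinities and NaNs.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns true if `f` is exactly representable as a
 * normal IEEE 754 half-precision value (or +/-0), so that storing it as f16
 * and widening on read would reproduce the same f32 bits. */
bool f32_fits_f16(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));

   uint32_t exp = (bits >> 23) & 0xff; /* biased f32 exponent */
   uint32_t mant = bits & 0x7fffff;    /* 23-bit f32 mantissa */

   /* +0.0 and -0.0 always fit. */
   if (exp == 0 && mant == 0)
      return true;

   /* f16 normals cover unbiased exponents -14..15, i.e. biased f32
    * exponents 113..142. Everything else is rejected conservatively. */
   if (exp < 113 || exp > 142)
      return false;

   /* f16 keeps 10 mantissa bits, so the low 13 bits of f32 must be zero. */
   return (mant & 0x1fff) == 0;
}
```

A constant that passes this check needs one 16-bit slot instead of two, which is where the reduced pressure would come from under these assumptions.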

this saves a few instructions in "wall and chimney".

total instructions in shared programs: 2039709 -> 2039862 (<.01%)
instructions in affected programs: 12585 -> 12738 (1.22%)
helped: 0
HURT: 3

total bytes in shared programs: 14111800 -> 14112726 (<.01%)
bytes in affected programs: 102778 -> 103704 (0.90%)
helped: 7
HURT: 4
Inconclusive result (value mean confidence interval includes 0).

total uniforms in shared programs: 1533232 -> 1532271 (-0.06%)
uniforms in affected programs: 60255 -> 59294 (-1.59%)
helped: 481
HURT: 0
Uniforms are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
61b74894a9 agx: promote constants to uniforms
Add an optimization pass to promote constants loaded in the shader to dedicated
uniform registers preloaded before the shader. This is beneficial for two
reasons:

* fewer mov_imm instructions
* less GPR pressure (uniforms have dedicated registers)

The latter can significantly improve occupancy since we don't remat constants
for occupancy. We do remat to avoid spilling so it won't affect spilling,
although it can still be a win by reducing remat when a shader would otherwise
spill.

The problem is that we have limited uniform registers so can't promote
everything that we would want to. We model this as a 0-1 knapsack problem and
use the well-known heuristic to prioritize frequently used constants. This is
not optimal but works quite well in practice.
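
The selection step can be sketched as a greedy knapsack approximation. This is a minimal sketch, assuming the heuristic in question is the usual "sort by value density, take what fits" approach; the struct, field names and `promote_constants` are hypothetical and not taken from the pass itself.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical candidate: `uses` is the benefit of promoting the constant,
 * `size` is its cost in uniform slots (must be nonzero). */
struct constant {
   unsigned uses;
   unsigned size;
   bool promoted;
};

/* Order by uses/size, densest first, without dividing. */
static int cmp_density(const void *a, const void *b)
{
   const struct constant *x = a, *y = b;
   unsigned long long lhs = (unsigned long long)x->uses * y->size;
   unsigned long long rhs = (unsigned long long)y->uses * x->size;
   return (lhs < rhs) - (lhs > rhs);
}

/* Greedily promote constants until `capacity` uniform slots are used up.
 * Returns the number of slots consumed. Not optimal, only a heuristic. */
unsigned promote_constants(struct constant *c, size_t n, unsigned capacity)
{
   qsort(c, n, sizeof(*c), cmp_density);

   unsigned used = 0;
   for (size_t i = 0; i < n; i++) {
      c[i].promoted = c[i].size <= capacity - used;
      if (c[i].promoted)
         used += c[i].size;
   }
   return used;
}
```

The greedy pass keeps scanning after the first constant that does not fit, so a small, less frequently used constant can still claim leftover slots.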

This gives a nice fps win in some complex shaders, including:

* Dolphin ubers from 10.25fps to 10.85fps at 4K in MMG.
* "Wall and chimney" shadertoy from 24.8fps to 29.5fps at 1188x658.

shader-db results are excellent as well.

total instructions in shared programs: 2088290 -> 2039709 (-2.33%)
instructions in affected programs: 1478061 -> 1429480 (-3.29%)
helped: 8246
HURT: 85
Instructions are helped.

total bytes in shared programs: 14321004 -> 14111800 (-1.46%)
bytes in affected programs: 10108742 -> 9899538 (-2.07%)
helped: 7999
HURT: 1416
Bytes are helped.

total regs in shared programs: 602415 -> 590371 (-2.00%)
regs in affected programs: 92177 -> 80133 (-13.07%)
helped: 1887
HURT: 209
Regs are helped.

total uniforms in shared programs: 1457531 -> 1533232 (5.19%)
uniforms in affected programs: 835522 -> 911223 (9.06%)
helped: 0
HURT: 11042
Uniforms are HURT.

total threads in shared programs: 20325824 -> 20329216 (0.02%)
threads in affected programs: 29632 -> 33024 (11.45%)
helped: 41
HURT: 0
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
2a97657792 agx: extract agx_is_float_src
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
6e2cc790eb agx: model 64-bit uniform restriction on ALU
this one is annoying!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
5b6640d013 agx: extract "accepts uniform?" ISA query
we'll want this in a second place, so invert it and export it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
51d3a376bc agx: restrict high uniforms with textures
seems to cause brokenness in blender, guess this is a new ISA corner we just
found.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
51eba1c38e agx: fix lowering uniforms with abs/neg
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
17e05c2f21 agx: add more shaderdb stats
relevant to spilling and promotion

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
36491b6e0b asahi: use less bindless samplers
before/after:

wanderer/386.shader_test - MESA_SHADER_FRAGMENT shader: 16404 inst, 111604 bytes, 255 halfregs, 384 threads, 5 loops, 444:736 spills:fills
wanderer/386.shader_test - MESA_SHADER_FRAGMENT shader: 16268 inst, 110728 bytes, 255 halfregs, 384 threads, 5 loops, 436:720 spills:fills

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
55d7267d6c agx: implement get_sr remat
wanderer/386.shader_test - MESA_SHADER_FRAGMENT shader: 16268 inst, 110728 bytes, 255 halfregs, 384 threads, 5 loops, 436:720 spills:fills
wanderer/386.shader_test - MESA_SHADER_FRAGMENT shader: 16255 inst, 110670 bytes, 255 halfregs, 384 threads, 5 loops, 435:719 spills:fills

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
03000030d0 agx: generalize remat code
so we can remat more things

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
e2ce64d013 agx: enable indirect temps
we support scratch now.

total spills in shared programs: 32764 -> 990 (-96.98%)
spills in affected programs: 32764 -> 990 (-96.98%)

total fills in shared programs: 38694 -> 639 (-98.35%)
fills in affected programs: 38694 -> 639 (-98.35%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
e80d451e55 agx: move spill/fills accounting to shaderdb
don't bother the compiler proper about it. this now counts NIR scratch access as
spills/fills, which I think is probably the right call

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
43e804b0e4 agx: add tests for SSA repair
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
0dbf1b48d1 agx: add helpers for multiblock unit tests
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
a181f3caf4 agx: make add_successor public
for unit testing

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
7c147ae448 agx: use dense reg_to_ssa map
ssa_to_reg is necessarily sparse, and since it's allocated per block, it's
tremendously memory intensive for shaders with thousands of blocks (which
can easily happen with if-ladders)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
034f369ddf agx: switch to Braun-Hack spiller
instructions in affected programs: 771842 -> 261686 (-66.10%)
bytes in affected programs: 5286320 -> 1981896 (-62.51%)
spills in affected programs: 134070 -> 32764 (-75.56%)
fills in affected programs: 341356 -> 38694 (-88.66%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
fe8b245cc4 agx: add Braun-Hack spiller pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
a6e9f707f4 agx: add SSA repair pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
450e79c1e6 agx: add SSA reindexing pass
spilling and SSA repair will generate piles of dead SSA defs. add a reindexing
pass to keep memory usage manageable on large shaders.
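
The idea of reindexing can be sketched as compacting the indices that are still referenced into a dense range. The sketch below is an assumption about the general technique, not the pass itself; `reindex_ssa`, `refs` and `remap` are hypothetical names, and a real pass would walk instructions rather than a flat array.

```c
#include <stddef.h>
#include <stdint.h>

#define UNUSED_INDEX UINT32_MAX

/* Rewrite every SSA index in `refs` to a dense index in first-seen order.
 * `remap` must have room for `old_alloc` entries; every value in `refs`
 * must be below `old_alloc`. Returns the new number of SSA indices, which
 * is what per-index side tables would then be sized by. */
uint32_t reindex_ssa(uint32_t *refs, size_t num_refs, uint32_t *remap,
                     uint32_t old_alloc)
{
   for (uint32_t i = 0; i < old_alloc; i++)
      remap[i] = UNUSED_INDEX;

   uint32_t next = 0;
   for (size_t i = 0; i < num_refs; i++) {
      uint32_t old = refs[i];

      /* First reference to this value: hand out the next dense index. */
      if (remap[old] == UNUSED_INDEX)
         remap[old] = next++;

      refs[i] = remap[old];
   }
   return next;
}
```

Dead defs never appear in `refs`, so they receive no new index and stop contributing to the allocation count.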

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
20409b6bae agx: validate phi sources for consistency
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
d553af7f8b agx: drop scratch regs for spilling
remnant of an old approach.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
b48f2d0ebc agx: try to coalesce moves
No shader-db changes but meh.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
fe612acb8f agx: coalesce phi webs
This massively improves our coalescing of phis by considering not just single
phi instructions but entire webs of phi-related SSA values. We do this with a
union-find data structure, which is effectively constant time thanks to union by
rank and path compression. Phi related SSA values are unioned and we try to
assign the same register to everything in the union. Boissinot might be better
but this is delightfully simple.
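
The union-find structure mentioned above can be sketched as follows. This is a generic textbook version with union by rank and path compression, not the code from this commit; `struct phi_webs`, `MAX_SSA` and the function names are hypothetical.

```c
#include <stdint.h>

#define MAX_SSA 64 /* illustrative bound on SSA indices */

struct phi_webs {
   uint32_t parent[MAX_SSA];
   uint8_t rank[MAX_SSA];
};

/* Every SSA value starts in its own web. */
void webs_init(struct phi_webs *w)
{
   for (uint32_t i = 0; i < MAX_SSA; i++) {
      w->parent[i] = i;
      w->rank[i] = 0;
   }
}

/* Find the web representative, compressing the path on the way. */
uint32_t webs_find(struct phi_webs *w, uint32_t x)
{
   uint32_t root = x;
   while (w->parent[root] != root)
      root = w->parent[root];

   while (w->parent[x] != root) {
      uint32_t next = w->parent[x];
      w->parent[x] = root;
      x = next;
   }
   return root;
}

/* Merge the webs of a phi destination and one of its sources. */
void webs_union(struct phi_webs *w, uint32_t a, uint32_t b)
{
   a = webs_find(w, a);
   b = webs_find(w, b);
   if (a == b)
      return;

   /* Union by rank: attach the shallower tree under the deeper one. */
   if (w->rank[a] < w->rank[b]) {
      uint32_t t = a;
      a = b;
      b = t;
   }
   w->parent[b] = a;
   if (w->rank[a] == w->rank[b])
      w->rank[a]++;
}
```

For each phi, one would union the destination with every source; a register allocator could then look up `webs_find` for a value and prefer the register already chosen for its representative.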

total instructions in shared programs: 2910655 -> 2883792 (-0.92%)
instructions in affected programs: 1295671 -> 1268808 (-2.07%)
helped: 1129
HURT: 34
Instructions are helped.

total bytes in shared programs: 19417970 -> 19255234 (-0.84%)
bytes in affected programs: 8790112 -> 8627376 (-1.85%)
helped: 1129
HURT: 34
Bytes are helped.

total halfregs in shared programs: 517813 -> 517867 (0.01%)
halfregs in affected programs: 751 -> 805 (7.19%)
helped: 2
HURT: 15
Halfregs are HURT.

total spills in shared programs: 135918 -> 134070 (-1.36%)
spills in affected programs: 135918 -> 134070 (-1.36%)
helped: 6
HURT: 0

total fills in shared programs: 343204 -> 341356 (-0.54%)
fills in affected programs: 343204 -> 341356 (-0.54%)
helped: 6
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
afdcee6a0a agx: add limit for max sources per non-phi
used to bound stack allocations.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
3684c9ebbd agx: add before_function cursor
needs care around preloads.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
6bff596505 agx: add temp_like helper
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
ffd2b846c4 agx: add more iterator macros
will be used for spiller.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
645f5187ed agx: don't leak shuffle copies
==42579==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 9408 byte(s) in 147 object(s) allocated from:
    #0 0xffff244c4828 in __interceptor_realloc.part.0 (/usr/lib64/libasan.so.8.0.0+0xc4828) (BuildId: 3109905b64795755dad05d7bb29ad23633a06660)
    #1 0xffff1fc71fe0 in util_dynarray_ensure_cap ../src/util/u_dynarray.h:109
    #2 0xffff1fc71fe0 in util_dynarray_grow_bytes ../src/util/u_dynarray.h:157
    #3 0xffff1fc71fe0 in assign_regs_by_copying ../src/asahi/compiler/agx_register_allocate.c:440
    #4 0xffff1fc73858 in find_regs ../src/asahi/compiler/agx_register_allocate.c:648
    #5 0xffff1fc77d3c in pick_regs ../src/asahi/compiler/agx_register_allocate.c:1010
    #6 0xffff1fc77d3c in agx_ra_assign_local ../src/asahi/compiler/agx_register_allocate.c:1098
    #7 0xffff1fc77d3c in agx_ra ../src/asahi/compiler/agx_register_allocate.c:1355
    #8 0xffff1fc3b6c4 in agx_compile_function_nir ../src/asahi/compiler/agx_compile.c:2840

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
106da137e3 agx: implement live range splits of phis
this requires some special handling but closes the last soundness gap (I hope)
in our RA. with later patches in this series, we actually hit this (50+ tests on
the CTS even) so I can be sure this actually works ^^

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
a7f01d8ca5 agx: sink harder
cuts a few spills in blender

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
4024a82aa3 agx: fix bogus implicit cast with 2d msaa arrays
causes heartburn for spiller with KHR-GL46.multi_bind.dispatch_bind_image_textures

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
c41c6ff27f agx: assert phis don't have .kill set
it's meaningless for phis but would cause soundness problems.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
6b878c5b57 agx: allow vector phis to pass validation
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
f2b344a041 agx: scalarize vector phis
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
2fc2a45c8f agx: fix 16-bit mem swaps
don't clobber r1l

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
5bfdb20dac agx: add num_successors helper
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
da18ac5dfa agx: add more asserts
sigh, C

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
c0d47d827a agx: fix allocating phi sources past the reg file
count can differ.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
3a3f9de486 agx: fix stack smash with spilling
ASAN saves the day! Was stuck on this for hours.

==37495==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xfffff29ecdbc at pc 0xffff7c0751f4 bp 0xfffff29eca30 sp 0xfffff29eca48
READ of size 4 at 0xfffff29ecdbc thread T0
    #0 0xffff7c0751f0 in __bitset_set_range ../src/util/bitset.h:249
    #1 0xffff7c0751f0 in find_regs ../src/asahi/compiler/agx_register_allocate.c:642
    #2 0xffff7c077d2c in pick_regs ../src/asahi/compiler/agx_register_allocate.c:1008
    #3 0xffff7c077d2c in agx_ra_assign_local ../src/asahi/compiler/agx_register_allocate.c:1096
    #4 0xffff7c077d2c in agx_ra ../src/asahi/compiler/agx_register_allocate.c:1353
    #5 0xffff7c03b6c4 in agx_compile_function_nir ../src/asahi/compiler/agx_compile.c:2840

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00
Alyssa Rosenzweig
9ca5778f3e agx/opt_cse: alloc less
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
2024-03-30 00:26:18 +00:00