We don't want the zs_emit dance emitted; all we care about is making sure tag
writes are enabled, as if we had a regular tib store.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
I'm not measuring a significant perf difference in
-bshading:shading=phong:model=bunny -bideas -brefract so this seems Good Enough
For Me.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
Lower store_output to store_uvs_agx + math. Link UVS indices at draw-time
instead of compile-time to get efficient separate shaders. Also picks up varying
compaction along the way.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
I added this hack to avoid WaW hazards with iter instructions. Now that we know
about the iter elide bit and are not setting it improperly, we can drop the
complexity and just allow the hazard.
total instructions in shared programs: 2039480 -> 2038792 (-0.03%)
instructions in affected programs: 123441 -> 122753 (-0.56%)
helped: 811
HURT: 124
Instructions are helped.
total bytes in shared programs: 13983802 -> 13977870 (-0.04%)
bytes in affected programs: 806882 -> 800950 (-0.74%)
helped: 823
HURT: 117
Bytes are helped.
total regs in shared programs: 590670 -> 592862 (0.37%)
regs in affected programs: 8585 -> 10777 (25.53%)
helped: 29
HURT: 398
Regs are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
This keeps us from hoisting piles of iadd for no benefit with the new vertex
path. Results on shader-db without HW VS:
total bytes in shared programs: 13975632 -> 13975666 (<.01%)
bytes in affected programs: 3298 -> 3332 (1.03%)
helped: 0
HURT: 3
total uniforms in shared programs: 1516540 -> 1516522 (<.01%)
uniforms in affected programs: 234 -> 216 (-7.69%)
helped: 3
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
total instructions in shared programs: 2038338 -> 2038217 (<.01%)
instructions in affected programs: 32658 -> 32537 (-0.37%)
helped: 9
HURT: 4
Instructions are helped.
total alu in shared programs: 1593474 -> 1593094 (-0.02%)
alu in affected programs: 110828 -> 110448 (-0.34%)
helped: 315
HURT: 9
Alu are helped.
total fscib in shared programs: 1589634 -> 1589254 (-0.02%)
fscib in affected programs: 110828 -> 110448 (-0.34%)
helped: 315
HURT: 9
Fscib are helped.
total ic in shared programs: 477960 -> 477948 (<.01%)
ic in affected programs: 12 -> 0
helped: 3
HURT: 0
total bytes in shared programs: 13975162 -> 13975632 (<.01%)
bytes in affected programs: 978988 -> 979458 (0.05%)
helped: 14
HURT: 313
Inconclusive result (value mean confidence interval includes 0).
total uniforms in shared programs: 1516534 -> 1516540 (<.01%)
uniforms in affected programs: 4278 -> 4284 (0.14%)
helped: 5
HURT: 6
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
so it runs on the results of b2f lowering.
total instructions in shared programs: 2039862 -> 2039473 (-0.02%)
instructions in affected programs: 12348 -> 11959 (-3.15%)
helped: 84
HURT: 0
Instructions are helped.
total bytes in shared programs: 13986278 -> 13983778 (-0.02%)
bytes in affected programs: 141748 -> 139248 (-1.76%)
helped: 84
HURT: 122
Bytes are helped.
total regs in shared programs: 590371 -> 590373 (<.01%)
regs in affected programs: 195 -> 197 (1.03%)
helped: 5
HURT: 6
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
We can convert f16->f32 for free on read, so we can compact constants to reduce
register pressure. This makes constant promotion more effective.
This saves a few instructions in "wall and chimney".
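As an illustration of the compaction test, here is a minimal sketch of one way
to decide whether a 32-bit float constant can be stored as f16 without losing
bits. The helper name f32_fits_f16 is hypothetical and not taken from the
driver, and the check is deliberately conservative: it accepts only zeros and
values that are normal in f16, and rejects subnormals, infinities and NaNs even
though some of those could be represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the driver's API: returns true and writes the f16
 * bit pattern if `f` converts to f16 exactly. Conservative: only zero and
 * f16-normal values are accepted. */
static bool
f32_fits_f16(float f, uint16_t *half_bits)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));

   uint32_t sign = bits >> 31;
   uint32_t exp = (bits >> 23) & 0xff;  /* biased by 127 */
   uint32_t mant = bits & 0x7fffff;     /* 23 explicit mantissa bits */

   assert(half_bits != NULL);

   /* +0.0 and -0.0 */
   if (exp == 0 && mant == 0) {
      *half_bits = (uint16_t)(sign << 15);
      return true;
   }

   /* f16 normals have unbiased exponents in [-14, 15], which corresponds to
    * biased f32 exponents in [113, 142]. */
   if (exp < 113 || exp > 142)
      return false;

   /* f16 keeps 10 mantissa bits, so the low 13 bits must already be zero. */
   if (mant & 0x1fff)
      return false;

   /* Rebias the exponent from 127 to 15 and drop the zero mantissa bits. */
   *half_bits = (uint16_t)((sign << 15) | ((exp - 112) << 10) | (mant >> 13));
   return true;
}
```

With this sketch, 1.0f compacts to 0x3c00, while 0.1f is rejected because its
low mantissa bits are nonzero and so it has to stay 32-bit.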
total instructions in shared programs: 2039709 -> 2039862 (<.01%)
instructions in affected programs: 12585 -> 12738 (1.22%)
helped: 0
HURT: 3
total bytes in shared programs: 14111800 -> 14112726 (<.01%)
bytes in affected programs: 102778 -> 103704 (0.90%)
helped: 7
HURT: 4
Inconclusive result (value mean confidence interval includes 0).
total uniforms in shared programs: 1533232 -> 1532271 (-0.06%)
uniforms in affected programs: 60255 -> 59294 (-1.59%)
helped: 481
HURT: 0
Uniforms are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
Add an optimization pass to promote constants loaded in the shader to dedicated
uniform registers preloaded before the shader. This is beneficial for two
reasons:
* fewer mov_imm instructions
* less GPR pressure (uniforms have dedicated registers)
The latter can significantly improve occupancy, since we don't rematerialize
constants to improve occupancy. We do rematerialize to avoid spilling, so
promotion won't change whether a shader spills, although it can still be a win
by reducing rematerialization when a shader would otherwise spill.
The problem is that we have limited uniform registers so can't promote
everything that we would want to. We model this as a 0-1 knapsack problem and
use the well-known heuristic to prioritize frequently used constants. This is
not optimal but works quite well in practice.
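A greedy knapsack heuristic of this kind can be sketched as follows. This is an
illustration under assumptions, not the pass itself: the struct
const_candidate, the function promote_greedy, and the choice of "uses per
uniform slot" as the priority are all hypothetical names and choices made for
this example. Each constant has a use count (the knapsack value) and a size in
uniform slots (the knapsack weight); candidates are sorted by density and taken
while they still fit in the budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one candidate constant; not the driver's struct. */
struct const_candidate {
   uint64_t value; /* the immediate itself */
   unsigned uses;  /* knapsack "value": how often the constant is read */
   unsigned size;  /* knapsack "weight": uniform slots it would occupy */
   bool promoted;  /* output: chosen for a uniform register */
};

/* Order by uses per slot, densest first. Cross-multiplying avoids division:
 * a.uses / a.size > b.uses / b.size  <=>  a.uses * b.size > b.uses * a.size */
static int
cmp_density(const void *pa, const void *pb)
{
   const struct const_candidate *a = pa, *b = pb;
   uint64_t lhs = (uint64_t)a->uses * b->size;
   uint64_t rhs = (uint64_t)b->uses * a->size;

   return (lhs < rhs) - (lhs > rhs);
}

/* Greedily promote candidates until `budget` uniform slots are exhausted.
 * Sorts the array in place and returns the number of slots consumed. */
static unsigned
promote_greedy(struct const_candidate *c, size_t n, unsigned budget)
{
   unsigned used = 0;

   qsort(c, n, sizeof(*c), cmp_density);

   for (size_t i = 0; i < n; ++i) {
      assert(c[i].size > 0);

      /* Take the candidate only if it fits in the remaining budget. */
      c[i].promoted = c[i].size <= budget - used;
      if (c[i].promoted)
         used += c[i].size;
   }

   return used;
}
```

The greedy choice is what makes this fast but not optimal: a large, dense
candidate taken early can crowd out two smaller ones whose combined use count
would have been higher.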
This gives a nice fps win in some complex shaders, including:
* Dolphin ubers from 10.25fps to 10.85fps at 4K in MMG.
* "Wall and chimney" shadertoy from 24.8fps to 29.5fps at 1188x658.
shader-db results are excellent as well.
total instructions in shared programs: 2088290 -> 2039709 (-2.33%)
instructions in affected programs: 1478061 -> 1429480 (-3.29%)
helped: 8246
HURT: 85
Instructions are helped.
total bytes in shared programs: 14321004 -> 14111800 (-1.46%)
bytes in affected programs: 10108742 -> 9899538 (-2.07%)
helped: 7999
HURT: 1416
Bytes are helped.
total regs in shared programs: 602415 -> 590371 (-2.00%)
regs in affected programs: 92177 -> 80133 (-13.07%)
helped: 1887
HURT: 209
Regs are helped.
total uniforms in shared programs: 1457531 -> 1533232 (5.19%)
uniforms in affected programs: 835522 -> 911223 (9.06%)
helped: 0
HURT: 11042
Uniforms are HURT.
total threads in shared programs: 20325824 -> 20329216 (0.02%)
threads in affected programs: 29632 -> 33024 (11.45%)
helped: 41
HURT: 0
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>