Commit graph

52 commits

Author SHA1 Message Date
Timur Kristóf
04f90db9a0 aco: Use Operand instead of Temp for the exec mask stack.
This will enable us to store non-temporary values,
such as constant operands there.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Rhys Perry
961361cdc9 aco: ensure loops nested in a WQM loop are in WQM
Fixes a potential empty exec mask in this situation:
enter_wqm()
loop {
   ... wqm code ...
   enter_exact()
   loop {
      ... no wqm code ...
   }
}

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f0074a6f05 ("aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4546
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10075>
2021-04-08 09:56:25 +00:00
Timur Kristóf
8205cce007 aco: Use ASSERTED to avoid unused variable warning.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9632>
2021-03-16 21:46:52 +00:00
Rhys Perry
5f1b354472 aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM
We should avoid a situation where a v_readfirstlane_b32 is in WQM but it's
source is calculated in Exact.

Fixes hang when running Assassin's Creed: Valhalla benchmark.

fossil-db (GFX10.3):
Totals from 1021 (0.70% of 146267) affected shaders:
CodeSize: 7835228 -> 7842992 (+0.10%); split: -0.00%, +0.10%
Instrs: 1519208 -> 1521149 (+0.13%); split: -0.00%, +0.13%
SClause: 78921 -> 78920 (-0.00%)
Copies: 44456 -> 45421 (+2.17%); split: -0.05%, +2.22%
Branches: 12987 -> 13933 (+7.28%)
PreSGPRs: 47599 -> 47813 (+0.45%)
Cycles: 10037540 -> 10045304 (+0.08%); split: -0.00%, +0.08%
VMEM: 538381 -> 538777 (+0.07%); split: +0.11%, -0.03%
SMEM: 84553 -> 84554 (+0.00%); split: +0.01%, -0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>
2021-02-26 13:33:56 +00:00
Daniel Schürmann
29b866fef6 aco: remove special handling of load_helper_invocation
These should now behave the same as is_helper_invocation.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>
2021-02-17 21:53:52 +00:00
Daniel Schürmann
fc6b5be666 aco: fix assertion in insert_exec_mask pass
Fixes: a56ddca4e8 ('aco: make all exec accesses non-temporaries ')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9047>
2021-02-15 19:50:16 +00:00
Rhys Perry
ddce1ec5f5 aco: fix transition_to_{WQM,Exact} if exec.back() is not in exec
This can happen at merge blocks.

fossil-db (GFX10.3):
Totals from 25229 (17.25% of 146267) affected shaders:
CodeSize: 58575920 -> 58571376 (-0.01%); split: -0.01%, +0.00%
Instrs: 10979245 -> 10978109 (-0.01%); split: -0.01%, +0.00%
SClause: 591817 -> 591816 (-0.00%)
Copies: 604987 -> 603851 (-0.19%); split: -0.19%, +0.00%
Cycles: 96088796 -> 96084252 (-0.00%); split: -0.00%, +0.00%
VMEM: 10470372 -> 10470368 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: a56ddca4e8 ("aco: make all exec accesses non-temporaries")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4299
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9047>
2021-02-15 19:50:16 +00:00
Daniel Schürmann
8b793f9567 aco: remove dead code for the handling of exec temporaries
Totals from 26026 (18.67% of 139391) affected shaders (Navi10):
PreSGPRs: 370993 -> 326177 (-12.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Daniel Schürmann
a56ddca4e8 aco: make all exec accesses non-temporaries
So that they are not counted into the register demand.

Totals from 107336 (77.00% of 139391) affected shaders (Navi10):
VGPRs: 4023452 -> 4023248 (-0.01%); split: -0.01%, +0.01%
SpillSGPRs: 14088 -> 12571 (-10.77%); split: -11.03%, +0.26%
CodeSize: 266816164 -> 266765528 (-0.02%); split: -0.04%, +0.02%
MaxWaves: 1553339 -> 1553374 (+0.00%); split: +0.00%, -0.00%
Instrs: 50977701 -> 50973093 (-0.01%); split: -0.02%, +0.01%
Cycles: 1733911128 -> 1733605320 (-0.02%); split: -0.05%, +0.03%
VMEM: 40867650 -> 40900204 (+0.08%); split: +0.13%, -0.05%
SMEM: 6835980 -> 6829073 (-0.10%); split: +0.10%, -0.20%
VClause: 1032783 -> 1032788 (+0.00%); split: -0.01%, +0.01%
SClause: 2103705 -> 2104115 (+0.02%); split: -0.09%, +0.11%
Copies: 3195658 -> 3193656 (-0.06%); split: -0.30%, +0.24%
Branches: 1140213 -> 1140120 (-0.01%); split: -0.05%, +0.04%
PreSGPRs: 3603785 -> 3437064 (-4.63%); split: -5.13%, +0.50%
PreVGPRs: 3321996 -> 3321850 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Daniel Schürmann
e663a15098 aco: don't create unnecessary exec phi on merge blocks
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Rhys Perry
e2608312d3 aco: remove loop to flag loop blocks as WQM
This is no longer necessary.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
2021-02-09 17:52:17 +00:00
Rhys Perry
ed020008b5 aco: rewrite setting of Exact_Branch
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
2021-02-09 17:52:17 +00:00
Rhys Perry
f0074a6f05 aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM
This should no longer be necessary since the mark_block_wqm() we use to
flag break conditions as WQM now adds block to the worklist. With them
added to the worklist, get_block_needs() will add WQM to block_needs.

Adding WQM to block_needs here without adding the block to the worklist
(like we do here) can cause issues because it does not ensure that the
predecessors' branches are in WQM (needed for it to be possible to
transition to WQM in the block). This happened in an Overwatch shader.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 661922f6ac ("aco: add block to worklist in mark_block_wqm()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4066
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
2021-02-09 17:52:17 +00:00
Rhys Perry
e115b01948 aco: return references in instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry
1d245cd18b aco: use format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry
70dbcfa1c9 aco: use instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Daniel Schürmann
4a70c4d383 aco: make pred_by_exec_mask() accessible in other files
and rename to needs_exec_mask().

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
2116b4504e aco: don't emit parallelcopy when switching to WQM.
The reason was an RA bug which has been fixed a while ago.
This change fixes some register demand miscalculations.

Totals from 1013 (0.73% of 139391) affected shaders (NAVI10):
CodeSize: 6050408 -> 6047504 (-0.05%); split: -0.05%, +0.00%
Instrs: 1160533 -> 1159765 (-0.07%); split: -0.07%, +0.00%
Cycles: 8027212 -> 8024140 (-0.04%); split: -0.04%, +0.00%
VMEM: 296195 -> 296091 (-0.04%)
SMEM: 73003 -> 73011 (+0.01%); split: +0.05%, -0.04%
SClause: 37221 -> 37222 (+0.00%)
Copies: 70931 -> 70166 (-1.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Rhys Perry
661922f6ac aco: add block to worklist in mark_block_wqm()
Since we're requiring the branch condition to be in WQM, we have to ensure
that the block is in the worklist.

Fixes Trials Fusion hang at 4K and High settings.

fossil-db (Sienna):
Totals from 216 (0.15% of 139391) affected shaders:
SGPRs: 13392 -> 13360 (-0.24%)
CodeSize: 1321184 -> 1318592 (-0.20%)
Instrs: 255310 -> 254662 (-0.25%)
Cycles: 2178360 -> 2174652 (-0.17%)

Affected fossils in fossil-db are dirt4, nier and youngblood.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8145>
2020-12-17 19:22:55 +00:00
Daniel Schürmann
c4217ef2fc aco: don't create dead exec mask phis on merge blocks
Avoids some unnecessary exec copies and allows for a bit more
jump threading.

Totals from 112 (0.08% of 139391) affected shaders (NAVI10):
SpillSGPRs: 3084 -> 3050 (-1.10%)
CodeSize: 2657516 -> 2652376 (-0.19%)
Instrs: 492074 -> 490824 (-0.25%)
Cycles: 40369704 -> 40317052 (-0.13%)
VMEM: 24212 -> 24128 (-0.35%)
SClause: 12018 -> 12010 (-0.07%)
Copies: 72950 -> 72275 (-0.93%)
Branches: 13249 -> 12701 (-4.14%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8059>
2020-12-14 15:58:24 +00:00
Tony Wasserka
2bb8874320 aco: Fix -Wshadow warnings
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Rhys Perry
867323379e aco: don't use SMEM for SSBO stores
fossil-db (Navi):
Totals from 70 (0.05% of 138791) affected shaders:
SGPRs: 2324 -> 2097 (-9.77%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 157872 -> 154836 (-1.92%); split: -1.93%, +0.01%
MaxWaves: 1288 -> 1260 (-2.17%)
Instrs: 29730 -> 29108 (-2.09%); split: -2.13%, +0.04%
Cycles: 394944 -> 391280 (-0.93%); split: -0.94%, +0.01%
VMEM: 5288 -> 5695 (+7.70%); split: +11.97%, -4.27%
SMEM: 2680 -> 2444 (-8.81%); split: +1.34%, -10.15%
VClause: 291 -> 502 (+72.51%)
SClause: 1176 -> 918 (-21.94%)
Copies: 3549 -> 3517 (-0.90%); split: -1.80%, +0.90%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1675 -> 1491 (-10.99%)
PreVGPRs: 1101 -> 1223 (+11.08%)

Totals from 70 (0.05% of 139517) affected shaders (RAVEN):
SGPRs: 2368 -> 2121 (-10.43%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 156664 -> 153252 (-2.18%)
MaxWaves: 636 -> 622 (-2.20%)
Instrs: 29968 -> 29226 (-2.48%)
Cycles: 398284 -> 393492 (-1.20%)
VMEM: 5544 -> 5930 (+6.96%); split: +11.72%, -4.76%
SMEM: 2752 -> 2502 (-9.08%); split: +1.20%, -10.28%
VClause: 292 -> 504 (+72.60%)
SClause: 1236 -> 940 (-23.95%)
Copies: 3907 -> 3852 (-1.41%); split: -2.20%, +0.79%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1671 -> 1487 (-11.01%)
PreVGPRs: 1102 -> 1225 (+11.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6143>
2020-11-16 15:52:22 +00:00
Rhys Perry
d20a752c0d aco: use Builder::copy more
fossil-db (Navi):
Totals from 6973 (5.07% of 137413) affected shaders:
SGPRs: 381768 -> 381776 (+0.00%)
VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00%
CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01%
MaxWaves: 86581 -> 86583 (+0.00%)
Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00%
Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05%

fossil-db (Polaris):
Totals from 8154 (5.87% of 138881) affected shaders:
VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00%
CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00%
MaxWaves: 49090 -> 49091 (+0.00%)
Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00%
Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00%

Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp
shaders. They appear to be improved because the p_create_vector from
lower_subdword_phis() was blocking constant propagation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
e54c111c45 aco: always use p_parallelcopy for pre-RA copies
Most fossil-db changes are because literals are applied earlier
(in label_instruction), so use counts are more accurate and more literals
are applied.

fossil-db (Navi):
Totals from 79551 (57.89% of 137413) affected shaders:
SGPRs: 4549610 -> 4542802 (-0.15%); split: -0.19%, +0.04%
VGPRs: 3326764 -> 3324172 (-0.08%); split: -0.10%, +0.03%
SpillSGPRs: 38886 -> 34562 (-11.12%); split: -11.14%, +0.02%
CodeSize: 240143456 -> 240001008 (-0.06%); split: -0.11%, +0.05%
MaxWaves: 1078919 -> 1079281 (+0.03%); split: +0.04%, -0.01%
Instrs: 46627073 -> 46528490 (-0.21%); split: -0.22%, +0.01%

fossil-db (Polaris):
Totals from 98463 (70.90% of 138881) affected shaders:
SGPRs: 5164689 -> 5164353 (-0.01%); split: -0.02%, +0.01%
VGPRs: 3920936 -> 3921856 (+0.02%); split: -0.00%, +0.03%
SpillSGPRs: 56298 -> 52259 (-7.17%); split: -7.22%, +0.04%
CodeSize: 258680092 -> 258692712 (+0.00%); split: -0.02%, +0.03%
MaxWaves: 620863 -> 620823 (-0.01%); split: +0.00%, -0.01%
Instrs: 50776289 -> 50757577 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Tony Wasserka
34bc9477de aco: Clean up symbol names and comments related to NGG
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>
2020-10-21 09:49:38 +00:00
Tony Wasserka
86c227c10c aco: Use strong typing to model SW<->HW stage mappings
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>
2020-10-21 09:49:38 +00:00
Rhys Perry
156fd58cda aco: reserve 2 sgprs for each branch
We'll need two sgprs for the possibility of a long jump.

fossil-db (Navi):
Totals from 10197 (7.50% of 135946) affected shaders:
SGPRs: 946268 -> 946468 (+0.02%)
VGPRs: 705884 -> 707956 (+0.29%); split: -0.00%, +0.30%
SpillSGPRs: 31485 -> 36212 (+15.01%); split: -0.04%, +15.05%
CodeSize: 88296484 -> 88384604 (+0.10%); split: -0.01%, +0.11%
MaxWaves: 81379 -> 81171 (-0.26%)
Instrs: 17219111 -> 17231682 (+0.07%); split: -0.03%, +0.10%
Cycles: 1594875900 -> 1596450136 (+0.10%); split: -0.05%, +0.15%
VMEM: 1687263 -> 1689080 (+0.11%); split: +0.14%, -0.03%
SMEM: 657726 -> 660262 (+0.39%); split: +0.61%, -0.22%
VClause: 294806 -> 294638 (-0.06%); split: -0.08%, +0.02%
SClause: 556702 -> 556210 (-0.09%); split: -0.12%, +0.03%
Copies: 1466323 -> 1469349 (+0.21%); split: -0.57%, +0.78%
Branches: 619793 -> 618556 (-0.20%); split: -0.28%, +0.08%
PreSGPRs: 806364 -> 811477 (+0.63%); split: -0.14%, +0.77%
PreVGPRs: 655845 -> 657174 (+0.20%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: 20.2 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>
2020-08-26 13:26:58 +00:00
Rhys Perry
a537c9e73f aco: don't fix break condition for break+discard to exec
This would move the old exec mask back into exec. This also fixes the
live_out_exec.

Issue found in dEQP-VK.graphicsfuzz.cosh-return-inf-unused

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: 20.2 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>
2020-08-26 13:26:58 +00:00
Daniel Schürmann
fdb97d3d29 aco: execute branch instructions in WQM if necessary
It could happen that only the branch condition was computed in WQM
and not the branch instruction.
There is now some rendundancy which should be cleaned up.

Fixes: 3817fa7a4d ('aco: fix WQM handling in nested loops')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6260>
2020-08-11 15:35:59 +00:00
Daniel Schürmann
3817fa7a4d aco: fix WQM handling in nested loops
If on a nested loop
- the outer loop needs WQM but
- the inner loop doesn't need WQM and
- the break condition of the inner loop is computed in the outer loop
then it could happen that we transitioned to Exact before entering the inner loop
which could create an empty exec mask and lead to an infinite loop.

Fixes a GPU hang with RDR2

Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5518>
2020-06-18 13:40:15 +00:00
Timur Kristóf
ec72c504c6 aco/ngg: Initialize exec mask for NGG VS and TES.
They behave like merged ESGS shaders, so the exec mask needs
to be manually initialized for these NGG shaders too.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576>
2020-04-07 11:29:35 +00:00
Rhys Perry
4a909068ad aco: look at p_{extract,split}_vector's definitions in pred_by_exec_mask()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333>
2020-03-30 17:34:46 +00:00
Eric Anholt
b9773631d3 aco: Fix signed-vs-unsigned warning.
The previous instance of this comparision was 1u to avoid the warning, fix
this one too.

Fixes: dba71de5c6 ("aco: only create parallelcopy to restore exec at loop exit if needed")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
2020-02-27 21:59:31 -08:00
Rhys Perry
db19e96c8c aco: fix exec mask consistency issues
There seems to be more, these are just the ones found in
Detroit: Become Human shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
c7d0514168 aco: parallelcopy exec mask before s_wqm
It can be used later and we want any uses to not be fixed to exec, so it's
definition can't be fixed to exec because of how exec masks interact with
register demand calculation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
dba71de5c6 aco: only create parallelcopy to restore exec at loop exit if needed
The operand isn't fixed to exec, which can mess up the spiller. This also
adds a new situation where a phi is needed.

Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion
when compiling a Detroit: Become Human shader.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
404818dd28 aco: run p_wqm instructions in WQM
If the p_wqm ends up creating copies, these need to be in WQM. Helps (but
doesn't completely fix) artifacts in Strange Brigade. The actual issue
still exists and is harder to fix.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Rhys Perry
2d7386a2d0 aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in WQM
We want any copies to be in WQM. I don't know if this fixes any real
application, but I can create a vkrunner test than reproduces the issue.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Rhys Perry
40bb81c9dd radv/aco,aco: implement GS on GFX9+
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Timur Kristóf
1c9ecb2123 aco: Fix signedness compare warning.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>
2020-01-22 11:09:17 +01:00
Daniel Schürmann
05c81875d7 aco: fix unconditional demote_to_helper
This patch fixes an out-of-bounds access on p_exit_early
and binds the exec register to the correct operand.

Fixes: 2ea9e59e8d ('aco: move s_andn2_b64 instructions out of the p_discard_if')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
2020-01-13 21:08:41 +00:00
Daniel Schürmann
28c95cc402 aco: return to loop_active mask at continue_or_break blocks
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
6a586a6006 aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Timur Kristóf
e0bcefc3a0 aco/wave32: Use lane mask regclass for exec/vcc.
Currently all usages of exec and vcc are hardcoded to use s2 regclass.
This commit makes it possible to use s1 in wave32 mode and
s2 in wave64 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Rhys Perry
01cacdb71e aco: fix block_kind_discard s_andn2 definition to exec
Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop
because a loop exit phi can then be fixed to exec, removing copies and
improving jump threading.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-02 16:56:24 +00:00
Rhys Perry
cc742562c1 aco: don't enable store_global for helper invocations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:02 +00:00
Daniel Schürmann
8657eede8a aco: check if SALU instructions are predeceeded by exec when calculating WQM needs
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-14 17:27:10 +01:00
Daniel Schürmann
655a703349 aco: remove potential critical edge on loops.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Rhys Perry
2ea9e59e8d aco: move s_andn2_b64 instructions out of the p_discard_if
And use a new p_discard_early_exit instruction. This fixes some cases
where a definition having the same register as an operand causes issues.

v2: rename instruction to p_exit_early_if
v2: modify the existing instruction instead of creating a new one
v3: merge the "i == num - 1" IFs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 16:19:02 +00:00
Rhys Perry
1f2813e103 aco: don't remove the loop exec mask in transition_to_Exact()
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00