Commit graph

77 commits

Author SHA1 Message Date
Rhys Perry
d2d94b62f2 aco: initialize scratch base registers on GFX9-GFX10.3
fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)

fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
d068eb53e8 aco/insert_exec_mask: optimize top-level transition to exact before demote
fossil-db (Sienna Cichlid):
Totals from 5767 (3.55% of 162293) affected shaders:
Instrs: 3264949 -> 3257527 (-0.23%); split: -0.23%, +0.00%
CodeSize: 17835692 -> 17806004 (-0.17%); split: -0.17%, +0.00%
Latency: 45990060 -> 45987924 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 7643850 -> 7643835 (-0.00%); split: -0.00%, +0.00%
Copies: 193641 -> 186219 (-3.83%); split: -3.84%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
2022-03-08 12:49:59 +00:00
Rhys Perry
42a5be975a aco/insert_exec_mask: use get_exec_op
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
2022-03-08 12:49:59 +00:00
Rhys Perry
aa55ecc296 aco/insert_exec_mask: fix top-level to-exact with non-global exact mask
After transitioning to exact after a discard, the exec stack might be:
[exact|global, wqm, exact]

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
2022-03-08 12:49:59 +00:00
Rhys Perry
ceca5e68c4 aco: remove vcc hint from branch definitions
This doesn't seem to have much benefit anymore.

fossil-db (Sienna Cichlid):
Totals from 198 (0.15% of 134913) affected shaders:
CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02%
Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03%
Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00%
SClause: 14760 -> 14722 (-0.26%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Daniel Schürmann
1bbbabedb7 aco/insert_exec_mask: refactor and remove some unnecessary WQM handling code
Some cases cannot happen and don't need to be handled anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
d7d7b9974a aco/insert_exec_mask: refactor and simplify get_block_needs()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
fcc5dec8d6 aco/insert_exec_mask: remove ever_again_needs and Exact_Branch
This information is not required anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
cbb1b095ca aco/insert_exec_mask: remove some unnecessary WQM loop handling code
These workarounds are were necessary to prevent infinite loops
with helper lane registers containing wrong data.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
580a63b4ac aco/insert_exec_mask: remove Preserve_WQM flag
If WQM is needed anywhere after discard_if(), it will also
be flagged as WQM. We can rely on that to preserve the WQM mask.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
f816dd1be7 aco: don't propagate WQM for p_as_uniform
This was needed, so that in case of active helper lanes,
these contain the correct value. It is now handled implicitly.

Totals from 1004 (0.74% of 134913) affected shaders: (GFX10.3)
CodeSize: 7581020 -> 7580892 (-0.00%); split: -0.00%, +0.00%
Instrs: 1454940 -> 1454908 (-0.00%); split: -0.00%, +0.00%
Latency: 12984953 -> 12984894 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 3173037 -> 3173049 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 47498 -> 47273 (-0.47%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
825cd696dc aco/insert_exec_mask: stay in WQM while helper lanes are still needed
This patch flags all instructions WQM which don't require
Exact mode, but depend on the exec mask as long as WQM
is needed on any control flow path afterwards.
This will mostly prevent accidental copies of WQM values
within Exact mode, and also makes a lot of other workarounds
unnecessary.

Totals from 17374 (12.88% of 134913) affected shaders: (GFX10.3)
VGPRs: 526952 -> 527384 (+0.08%); split: -0.01%, +0.09%
CodeSize: 33740512 -> 33766636 (+0.08%); split: -0.06%, +0.14%
MaxWaves: 488166 -> 488108 (-0.01%); split: +0.00%, -0.02%
Instrs: 6254240 -> 6260557 (+0.10%); split: -0.08%, +0.18%
Latency: 66497580 -> 66463472 (-0.05%); split: -0.15%, +0.10%
InvThroughput: 13265741 -> 13264036 (-0.01%); split: -0.03%, +0.01%
VClause: 122962 -> 122975 (+0.01%); split: -0.01%, +0.02%
SClause: 334805 -> 334405 (-0.12%); split: -0.51%, +0.39%
Copies: 275728 -> 282341 (+2.40%); split: -0.91%, +3.31%
Branches: 92546 -> 90990 (-1.68%); split: -1.68%, +0.00%
PreSGPRs: 504119 -> 504352 (+0.05%); split: -0.00%, +0.05%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
5e9df85b1a aco: optimize discard_if when WQM is not needed afterwards
Totals from 11560 (8.57% of 134913) affected shaders: (GFX10.3)
CodeSize: 12092560 -> 11997652 (-0.78%)
Instrs: 2205325 -> 2181598 (-1.08%)
Latency: 15376048 -> 15356958 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 3526105 -> 3525120 (-0.03%); split: -0.03%, +0.00%
Copies: 98543 -> 87601 (-11.10%)
Branches: 16919 -> 16873 (-0.27%)
PreSGPRs: 291584 -> 291532 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
13c3137960 aco: merge block_kind_uses_[demote|discard_if]
These serve the same purpose. The new name is
block_kind_uses_discard.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
e7d1c8cc5e aco: make Preserve_WQM independent from block_kind_uses_discard_if
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
08b8500dfb aco: remove block_kind_discard
This case doesn't seem to happen in practice.
No need to micro-optimize it.

This patch merges instruction selection for discard/discard_if.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
b67092e685 aco: emit nir_intrinsic_discard() as p_discard_if()
This simplifies the code and emits a slightly better
sequence in some cases.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Timur Kristóf
e66f54e5c8 aco: Allow elect to take advantage of knowing when all lanes are active.
Implement elect using a pseudo-op which is lowered during the
insert_exec_mask pass. This makes it possible to emit a more
optimal sequence when the exec mask is constant.

Fossil DB results on Sienna Cichlid:
Totals from 211 (0.16% of 128647) affected shaders:
CodeSize: 2254356 -> 2240468 (-0.62%); split: -0.62%, +0.00%
Instrs: 438471 -> 434996 (-0.79%); split: -0.80%, +0.01%
Latency: 2717082 -> 2709400 (-0.28%); split: -0.28%, +0.00%
InvThroughput: 566987 -> 566342 (-0.11%); split: -0.11%, +0.00%
Copies: 40058 -> 40162 (+0.26%)
Branches: 31209 -> 31211 (+0.01%)
PreSGPRs: 9927 -> 10125 (+1.99%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>
2021-07-16 14:31:54 +00:00
Tony Wasserka
66e51dc474 aco: Remove use of deprecated Operand constructors
This migration was done with libclang-based automatic tooling, which
performed these replacements:
* Operand(uint8_t) -> Operand::c8
* Operand(uint16_t) -> Operand::c16
* Operand(uint32_t, false) -> Operand::c32
* Operand(uint32_t, bool) -> Operand::c32_or_c64
* Operand(uint64_t) -> Operand::c64
* Operand(0) -> Operand::zero(num_bytes)

Casts that were previously used for constructor selection have automatically
been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)).

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Daniel Schürmann
1e2639026f aco: Format.
Manually adjusted some comments for more intuitive line breaks.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>
2021-07-12 21:27:31 +00:00
Daniel Schürmann
59fdaa1985 aco: reorder and cleanup #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Daniel Schürmann
32c7d17120 aco: remove condition operand from branch in invert block
As value numbering only handles logical blocks, this
could lead to invalid IR until insert_exec_mask().
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10894>
2021-05-20 17:44:20 +00:00
Timur Kristóf
c4f6e4d6b0 aco/insert_exec_mask: Fixed unused variable warning in release build.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10806>
2021-05-20 17:11:22 +00:00
Timur Kristóf
25a7947da7 aco: Don't use s_and_saveexec with branches when exec is constant.
When exec is constant, we can remember the constant as the old exec,
and just copy the condition and use it as the new exec. There is no
need to save the constant.

Due to using p_parallelcopy which is lowered to s_mov_b64 (or 32),
many exec restores now become copies, hence the increase in the copy
stats.

Fossil DB changes on Sienna Cichlid:

Totals from 73969 (49.37% of 149839) affected shaders:
SpillSGPRs: 1768 -> 1610 (-8.94%)
CodeSize: 99053892 -> 99047884 (-0.01%); split: -0.02%, +0.01%
Instrs: 19372852 -> 19370398 (-0.01%); split: -0.02%, +0.01%
VClause: 515154 -> 515142 (-0.00%); split: -0.00%, +0.00%
SClause: 719236 -> 718395 (-0.12%); split: -0.14%, +0.02%
Copies: 1109770 -> 1254634 (+13.05%); split: -0.07%, +13.12%
Branches: 374338 -> 374348 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1776481 -> 1653761 (-6.91%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
c850af936a aco: Remember when exec mask is const, and restore the const then.
Previously, we would store even the constant -1 exec mask from the
beginning of every merged shader. With this change it is no longer
necessary because we can restore to constant exec mask directly.

Hence, this frees up a register pair (single register for Wave32)
in every merged shader.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
04f90db9a0 aco: Use Operand instead of Temp for the exec mask stack.
This will enable us to store non-temporary values,
such as constant operands there.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Rhys Perry
961361cdc9 aco: ensure loops nested in a WQM loop are in WQM
Fixes a potential empty exec mask in this situation:
enter_wqm()
loop {
   ... wqm code ...
   enter_exact()
   loop {
      ... no wqm code ...
   }
}

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f0074a6f05 ("aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4546
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10075>
2021-04-08 09:56:25 +00:00
Timur Kristóf
8205cce007 aco: Use ASSERTED to avoid unused variable warning.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9632>
2021-03-16 21:46:52 +00:00
Rhys Perry
5f1b354472 aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM
We should avoid a situation where a v_readfirstlane_b32 is in WQM but it's
source is calculated in Exact.

Fixes hang when running Assassin's Creed: Valhalla benchmark.

fossil-db (GFX10.3):
Totals from 1021 (0.70% of 146267) affected shaders:
CodeSize: 7835228 -> 7842992 (+0.10%); split: -0.00%, +0.10%
Instrs: 1519208 -> 1521149 (+0.13%); split: -0.00%, +0.13%
SClause: 78921 -> 78920 (-0.00%)
Copies: 44456 -> 45421 (+2.17%); split: -0.05%, +2.22%
Branches: 12987 -> 13933 (+7.28%)
PreSGPRs: 47599 -> 47813 (+0.45%)
Cycles: 10037540 -> 10045304 (+0.08%); split: -0.00%, +0.08%
VMEM: 538381 -> 538777 (+0.07%); split: +0.11%, -0.03%
SMEM: 84553 -> 84554 (+0.00%); split: +0.01%, -0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>
2021-02-26 13:33:56 +00:00
Daniel Schürmann
29b866fef6 aco: remove special handling of load_helper_invocation
These should now behave the same as is_helper_invocation.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>
2021-02-17 21:53:52 +00:00
Daniel Schürmann
fc6b5be666 aco: fix assertion in insert_exec_mask pass
Fixes: a56ddca4e8 ('aco: make all exec accesses non-temporaries ')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9047>
2021-02-15 19:50:16 +00:00
Rhys Perry
ddce1ec5f5 aco: fix transition_to_{WQM,Exact} if exec.back() is not in exec
This can happen at merge blocks.

fossil-db (GFX10.3):
Totals from 25229 (17.25% of 146267) affected shaders:
CodeSize: 58575920 -> 58571376 (-0.01%); split: -0.01%, +0.00%
Instrs: 10979245 -> 10978109 (-0.01%); split: -0.01%, +0.00%
SClause: 591817 -> 591816 (-0.00%)
Copies: 604987 -> 603851 (-0.19%); split: -0.19%, +0.00%
Cycles: 96088796 -> 96084252 (-0.00%); split: -0.00%, +0.00%
VMEM: 10470372 -> 10470368 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: a56ddca4e8 ("aco: make all exec accesses non-temporaries")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4299
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9047>
2021-02-15 19:50:16 +00:00
Daniel Schürmann
8b793f9567 aco: remove dead code for the handling of exec temporaries
Totals from 26026 (18.67% of 139391) affected shaders (Navi10):
PreSGPRs: 370993 -> 326177 (-12.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Daniel Schürmann
a56ddca4e8 aco: make all exec accesses non-temporaries
So that they are not counted into the register demand.

Totals from 107336 (77.00% of 139391) affected shaders (Navi10):
VGPRs: 4023452 -> 4023248 (-0.01%); split: -0.01%, +0.01%
SpillSGPRs: 14088 -> 12571 (-10.77%); split: -11.03%, +0.26%
CodeSize: 266816164 -> 266765528 (-0.02%); split: -0.04%, +0.02%
MaxWaves: 1553339 -> 1553374 (+0.00%); split: +0.00%, -0.00%
Instrs: 50977701 -> 50973093 (-0.01%); split: -0.02%, +0.01%
Cycles: 1733911128 -> 1733605320 (-0.02%); split: -0.05%, +0.03%
VMEM: 40867650 -> 40900204 (+0.08%); split: +0.13%, -0.05%
SMEM: 6835980 -> 6829073 (-0.10%); split: +0.10%, -0.20%
VClause: 1032783 -> 1032788 (+0.00%); split: -0.01%, +0.01%
SClause: 2103705 -> 2104115 (+0.02%); split: -0.09%, +0.11%
Copies: 3195658 -> 3193656 (-0.06%); split: -0.30%, +0.24%
Branches: 1140213 -> 1140120 (-0.01%); split: -0.05%, +0.04%
PreSGPRs: 3603785 -> 3437064 (-4.63%); split: -5.13%, +0.50%
PreVGPRs: 3321996 -> 3321850 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Daniel Schürmann
e663a15098 aco: don't create unnecessary exec phi on merge blocks
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Rhys Perry
e2608312d3 aco: remove loop to flag loop blocks as WQM
This is no longer necessary.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
2021-02-09 17:52:17 +00:00
Rhys Perry
ed020008b5 aco: rewrite setting of Exact_Branch
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
2021-02-09 17:52:17 +00:00
Rhys Perry
f0074a6f05 aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM
This should no longer be necessary since the mark_block_wqm() we use to
flag break conditions as WQM now adds block to the worklist. With them
added to the worklist, get_block_needs() will add WQM to block_needs.

Adding WQM to block_needs here without adding the block to the worklist
(like we do here) can cause issues because it does not ensure that the
predecessors' branches are in WQM (needed for it to be possible to
transition to WQM in the block). This happened in an Overwatch shader.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 661922f6ac ("aco: add block to worklist in mark_block_wqm()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4066
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8446>
2021-02-09 17:52:17 +00:00
Rhys Perry
e115b01948 aco: return references in instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry
1d245cd18b aco: use format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry
70dbcfa1c9 aco: use instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Daniel Schürmann
4a70c4d383 aco: make pred_by_exec_mask() accessible in other files
and rename to needs_exec_mask().

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Daniel Schürmann
2116b4504e aco: don't emit parallelcopy when switching to WQM.
The reason was an RA bug which has been fixed a while ago.
This change fixes some register demand miscalculations.

Totals from 1013 (0.73% of 139391) affected shaders (NAVI10):
CodeSize: 6050408 -> 6047504 (-0.05%); split: -0.05%, +0.00%
Instrs: 1160533 -> 1159765 (-0.07%); split: -0.07%, +0.00%
Cycles: 8027212 -> 8024140 (-0.04%); split: -0.04%, +0.00%
VMEM: 296195 -> 296091 (-0.04%)
SMEM: 73003 -> 73011 (+0.01%); split: +0.05%, -0.04%
SClause: 37221 -> 37222 (+0.00%)
Copies: 70931 -> 70166 (-1.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903>
2020-12-22 15:08:40 +01:00
Rhys Perry
661922f6ac aco: add block to worklist in mark_block_wqm()
Since we're requiring the branch condition to be in WQM, we have to ensure
that the block is in the worklist.

Fixes Trials Fusion hang at 4K and High settings.

fossil-db (Sienna):
Totals from 216 (0.15% of 139391) affected shaders:
SGPRs: 13392 -> 13360 (-0.24%)
CodeSize: 1321184 -> 1318592 (-0.20%)
Instrs: 255310 -> 254662 (-0.25%)
Cycles: 2178360 -> 2174652 (-0.17%)

Affected fossils in fossil-db are dirt4, nier and youngblood.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8145>
2020-12-17 19:22:55 +00:00
Daniel Schürmann
c4217ef2fc aco: don't create dead exec mask phis on merge blocks
Avoids some unnecessary exec copies and allows for a bit more
jump threading.

Totals from 112 (0.08% of 139391) affected shaders (NAVI10):
SpillSGPRs: 3084 -> 3050 (-1.10%)
CodeSize: 2657516 -> 2652376 (-0.19%)
Instrs: 492074 -> 490824 (-0.25%)
Cycles: 40369704 -> 40317052 (-0.13%)
VMEM: 24212 -> 24128 (-0.35%)
SClause: 12018 -> 12010 (-0.07%)
Copies: 72950 -> 72275 (-0.93%)
Branches: 13249 -> 12701 (-4.14%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8059>
2020-12-14 15:58:24 +00:00
Tony Wasserka
2bb8874320 aco: Fix -Wshadow warnings
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Rhys Perry
867323379e aco: don't use SMEM for SSBO stores
fossil-db (Navi):
Totals from 70 (0.05% of 138791) affected shaders:
SGPRs: 2324 -> 2097 (-9.77%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 157872 -> 154836 (-1.92%); split: -1.93%, +0.01%
MaxWaves: 1288 -> 1260 (-2.17%)
Instrs: 29730 -> 29108 (-2.09%); split: -2.13%, +0.04%
Cycles: 394944 -> 391280 (-0.93%); split: -0.94%, +0.01%
VMEM: 5288 -> 5695 (+7.70%); split: +11.97%, -4.27%
SMEM: 2680 -> 2444 (-8.81%); split: +1.34%, -10.15%
VClause: 291 -> 502 (+72.51%)
SClause: 1176 -> 918 (-21.94%)
Copies: 3549 -> 3517 (-0.90%); split: -1.80%, +0.90%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1675 -> 1491 (-10.99%)
PreVGPRs: 1101 -> 1223 (+11.08%)

Totals from 70 (0.05% of 139517) affected shaders (RAVEN):
SGPRs: 2368 -> 2121 (-10.43%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 156664 -> 153252 (-2.18%)
MaxWaves: 636 -> 622 (-2.20%)
Instrs: 29968 -> 29226 (-2.48%)
Cycles: 398284 -> 393492 (-1.20%)
VMEM: 5544 -> 5930 (+6.96%); split: +11.72%, -4.76%
SMEM: 2752 -> 2502 (-9.08%); split: +1.20%, -10.28%
VClause: 292 -> 504 (+72.60%)
SClause: 1236 -> 940 (-23.95%)
Copies: 3907 -> 3852 (-1.41%); split: -2.20%, +0.79%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1671 -> 1487 (-11.01%)
PreVGPRs: 1102 -> 1225 (+11.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6143>
2020-11-16 15:52:22 +00:00
Rhys Perry
d20a752c0d aco: use Builder::copy more
fossil-db (Navi):
Totals from 6973 (5.07% of 137413) affected shaders:
SGPRs: 381768 -> 381776 (+0.00%)
VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00%
CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01%
MaxWaves: 86581 -> 86583 (+0.00%)
Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00%
Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05%

fossil-db (Polaris):
Totals from 8154 (5.87% of 138881) affected shaders:
VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00%
CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00%
MaxWaves: 49090 -> 49091 (+0.00%)
Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00%
Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00%

Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp
shaders. They appear to be improved because the p_create_vector from
lower_subdword_phis() was blocking constant propagation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
e54c111c45 aco: always use p_parallelcopy for pre-RA copies
Most fossil-db changes are because literals are applied earlier
(in label_instruction), so use counts are more accurate and more literals
are applied.

fossil-db (Navi):
Totals from 79551 (57.89% of 137413) affected shaders:
SGPRs: 4549610 -> 4542802 (-0.15%); split: -0.19%, +0.04%
VGPRs: 3326764 -> 3324172 (-0.08%); split: -0.10%, +0.03%
SpillSGPRs: 38886 -> 34562 (-11.12%); split: -11.14%, +0.02%
CodeSize: 240143456 -> 240001008 (-0.06%); split: -0.11%, +0.05%
MaxWaves: 1078919 -> 1079281 (+0.03%); split: +0.04%, -0.01%
Instrs: 46627073 -> 46528490 (-0.21%); split: -0.22%, +0.01%

fossil-db (Polaris):
Totals from 98463 (70.90% of 138881) affected shaders:
SGPRs: 5164689 -> 5164353 (-0.01%); split: -0.02%, +0.01%
VGPRs: 3920936 -> 3921856 (+0.02%); split: -0.00%, +0.03%
SpillSGPRs: 56298 -> 52259 (-7.17%); split: -7.22%, +0.04%
CodeSize: 258680092 -> 258692712 (+0.00%); split: -0.02%, +0.03%
MaxWaves: 620863 -> 620823 (-0.01%); split: +0.00%, -0.01%
Instrs: 50776289 -> 50757577 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Tony Wasserka
34bc9477de aco: Clean up symbol names and comments related to NGG
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>
2020-10-21 09:49:38 +00:00