Commit graph

67 commits

Author SHA1 Message Date
Daniel Schürmann
ea765162c3 aco/ssa_elimination: create a single parallelcopy instruction for linear and logical phis
Totals from 6651 (8.38% of 79377) affected shaders: (Navi31)

Instrs: 14722896 -> 14722290 (-0.00%); split: -0.01%, +0.00%
CodeSize: 77992072 -> 77989284 (-0.00%); split: -0.01%, +0.00%
Latency: 160542885 -> 160541215 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 24543177 -> 24542710 (-0.00%); split: -0.00%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
2025-02-24 13:11:20 +00:00
Daniel Schürmann
0e98388614 aco/ssa_elimination: refactor scratch_sgpr handling
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
2025-02-24 13:11:20 +00:00
Daniel Schürmann
302678df91 aco/ssa_elimination: insert parallelcopies for p_phi immediately before branch
Totals from 2499 (3.15% of 79377) affected shaders: (Navi31)
Instrs: 6011729 -> 6011761 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31573216 -> 31574236 (+0.00%); split: -0.00%, +0.00%
Latency: 83364734 -> 83365781 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 13545643 -> 13545783 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
2025-02-24 13:11:20 +00:00
Daniel Schürmann
1ff9a0fe80 aco: remove Pseudo_instruction::tmp_in_scc
This information is redundant, now.

No fossil-changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
a1a4a6061c aco/ra: explicitly assign scratch SGPR for linear phis
We are about to remove the branch definitions which previously
served this purpose. Also remove Block::scc_live_out.
Some changes due to round-robin RA.

Totals from 939 (1.18% of 79395) affected shaders: (Navi31)

Instrs: 5038786 -> 5038611 (-0.00%); split: -0.01%, +0.00%
CodeSize: 26153412 -> 26152904 (-0.00%); split: -0.00%, +0.00%
Latency: 41649989 -> 41650120 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 6447508 -> 6447536 (+0.00%); split: -0.00%, +0.00%
SClause: 131319 -> 131276 (-0.03%); split: -0.03%, +0.00%
Copies: 359362 -> 359256 (-0.03%); split: -0.05%, +0.02%
SALU: 639275 -> 639169 (-0.02%); split: -0.03%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
21ceeb22ed aco: move jump threading optimization into separate pass
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Daniel Schürmann
87a3c08df1 aco/ssa_elimination: remove some redundant checks during jump threading
Since phis got already lowered to parallelcopies by this point,
there is no need to cross-check.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Daniel Schürmann
a6c38f706d aco/ssa_elimination: perform jump threading after parallelcopy insertion
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>
2024-10-30 09:23:54 +00:00
Georg Lehmann
d01c1ba939 aco: move exec copy out of waterfall loops
Foz-DB Navi21:
Totals from 348 (0.44% of 79395) affected shaders:
CodeSize: 17944800 -> 17946268 (+0.01%); split: -0.02%, +0.03%
Latency: 29775973 -> 29774369 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 10233380 -> 10232801 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>
2024-10-25 16:47:32 +00:00
Georg Lehmann
6c73a8a7f2 aco: optimize conditional divergent breaks at the end of loops
Removes one branch and one s_mov.

Foz-DB Navi21:
Totals from 1483 (1.87% of 79395) affected shaders:
Instrs: 6424114 -> 6373084 (-0.79%)
CodeSize: 35309320 -> 35091084 (-0.62%); split: -0.63%, +0.01%
Latency: 87950935 -> 88030841 (+0.09%); split: -0.03%, +0.12%
InvThroughput: 24784756 -> 24799536 (+0.06%); split: -0.02%, +0.08%
Copies: 588743 -> 561805 (-4.58%)
Branches: 242521 -> 215578 (-11.11%)
SALU: 877856 -> 850918 (-3.07%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>
2024-10-25 16:47:32 +00:00
Georg Lehmann
075c5818cb aco/ssa_elimination: don't assume exec writes can be removed based on block kind
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>
2024-10-25 16:47:32 +00:00
Georg Lehmann
61ab33c883 aco/ssa_elimination: add instr_accesses helper
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>
2024-10-25 16:47:32 +00:00
Georg Lehmann
a4b179e445 aco/ssa_elimination: don't avoid saving exec when optimizing branching sequence
insert_exec_mask will no longer use s_and_saveexec if there was a previous copy
from a sgpr to exec, so this code path is no longer taken.

No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567>
2024-10-23 19:34:53 +00:00
Georg Lehmann
0338bb9ae8 aco/ssa_elimination: also optimize branching sequence with s_and without saveexec
insert_exec will start using this in the future, handle it the same
just without the path to save exec before the v_cmpx instruction.

No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567>
2024-10-23 19:34:53 +00:00
Georg Lehmann
63b45767f8 aco/ssa_elimination: optimize branching sequence with SALU that has multiple definitions
Foz-DB Navi31:
Totals from 1801 (2.27% of 79395) affected shaders:
Instrs: 1595030 -> 1591942 (-0.19%); split: -0.19%, +0.00%
CodeSize: 8442656 -> 8430140 (-0.15%); split: -0.15%, +0.00%
Latency: 12885611 -> 12879201 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 2420596 -> 2419800 (-0.03%); split: -0.03%, +0.00%
Copies: 125726 -> 123572 (-1.71%)
SALU: 249990 -> 247836 (-0.86%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
f129ae647a aco/ssa_elimination: don't check for VALU limitation when optimizing branching sequence
These instructions need exec, so we would never see them here because
try_optimize_branching_sequence will not be called.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Rhys Perry
9f1a5645cf aco: completely skip branches if they're never taken
fossil-db (navi21):
Totals from 196 (0.25% of 79395) affected shaders:
Instrs: 101902 -> 101706 (-0.19%)
CodeSize: 576988 -> 576232 (-0.13%)
Latency: 750344 -> 750280 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 119170 -> 119161 (-0.01%)
Branches: 3933 -> 3737 (-4.98%)

fossil-db (vega10):
Totals from 585 (0.93% of 63053) affected shaders:
Instrs: 346877 -> 346292 (-0.17%)
CodeSize: 1810600 -> 1808260 (-0.13%)
Latency: 1817743 -> 1814233 (-0.19%)
InvThroughput: 652142 -> 651944 (-0.03%)
Branches: 5087 -> 4502 (-11.50%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30321>
2024-08-15 16:00:19 +00:00
Georg Lehmann
818ff03865 aco: optimize branching sequence with p_create_vector exec producer
This happens with inverse_ballot and wave64.

Foz-DB Navi21:
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 2689 -> 2683 (-0.22%)
CodeSize: 14988 -> 14972 (-0.11%)
Latency: 20207 -> 20204 (-0.01%)
Copies: 144 -> 141 (-2.08%)
Branches: 76 -> 73 (-3.95%)
SALU: 241 -> 238 (-1.24%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502>
2024-06-04 15:40:57 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Daniel Schürmann
a863c7951e aco: remove create_instruction() template parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
9b0ebcc39b aco: change return type of create_instruction() to Instruction*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
9bbb9f1104 aco: use small_vec as Block::edge_vec for predecessors and successors
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 13:06:58 +00:00
Georg Lehmann
482137402a aco/ssa_elimination: check if pseudo scratch reg overwrittes regs used for v_cmpx opt
Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27855>
2024-03-05 20:16:21 +00:00
Georg Lehmann
1eb067ee9f aco: store if pseudo instr needs scratch reg
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27855>
2024-03-05 20:16:21 +00:00
Qiang Yu
80728a2e71 aco: do not eliminate final exec write when p_end_with_regs block
p_end_with_regs just partially end the program, next part need
exec mask to be set correctly. For example p_end_wqm will generate
a exec restore from WQM mode after p_end_with_regs.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:33 +00:00
Rhys Perry
6dd751b3b9 aco: remove unused p_logical_end check when optimizing branching sequence
I don't see why a p_logical_end is expected or required. It might not be
present in some situations, which causes an assertion failure:
 s2: %19646:s[0-1] = p_reload %19701:v[8], 11
 s2: %0:exec,  s1: %8817:scc = s_andn2_b64 %19646:s[0-1], %0:exec
 s2: %8818:s[20-21] = p_cbranch_z %0:exec BB1116, BB1114

No fossil-db changes (gfx1100).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25244>
2023-09-18 11:19:28 +00:00
Rhys Perry
8d5bd3ca48 aco: check logical_phi_info at p_logical_end when eliminating exec writes
This is when the copies actually happen, not at the branch.

fossil-db (gfx1100):
Totals from 1 (0.00% of 79332) affected shaders:
Instrs: 424 -> 423 (-0.24%)
CodeSize: 2172 -> 2168 (-0.18%)
Latency: 2899 -> 2896 (-0.10%)
Copies: 24 -> 23 (-4.17%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25244>
2023-09-18 11:19:28 +00:00
Timur Kristóf
67a0f2532f aco: Mark exec write used when it writes other registers.
When an exec write isn't used but writes other registers
besides exec, and also reads exec (such as s_and_saveexec),
we would mistakenly delete the previous instruction that
writes the exec value that this instruction uses.

No Fossil DB changes on Rembrandt (GFX10.3).

Fixes: 0211e66f65
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9036
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23576>
2023-06-12 15:07:55 +02:00
Timur Kristóf
0eb7c49c7f aco: Pop branch operands when targets are same in SSA elimination.
The branch instruction is no longer conditional when the targets are the
same, so the operand is not necessary and can be removed.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
2023-04-03 14:36:07 +00:00
Timur Kristóf
739bd03c37 aco: Don't verify branch exec read when eliminating exec writes.
Verifying that the branch instruction reads exec is not actually
necessary because the pattern that we look for already implies that.

This prepares for the next commit which will remove the exec operand
from branches that have the same target. These branches will no
longer read exec, but they should still get the same optimization.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
2023-04-03 14:36:07 +00:00
Timur Kristóf
0211e66f65 aco: Don't remove exec writes that also write other registers.
Don't eliminate an instruction that writes registers other than exec and scc.
It is possible that this is eg. an s_and_saveexec and the saved value is
used by a later branch.

Fixes: bc13049747
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
2023-04-03 14:36:07 +00:00
Georg Lehmann
60cd3ba39f aco: copy abs/neg with assignment
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21766>
2023-03-09 14:15:13 +00:00
Georg Lehmann
097a97cc42 aco: remove VOP[123C]P? structs
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
2023-03-07 11:53:23 +00:00
Georg Lehmann
4fbcd046ce aco: Don't use vcmpx with DPP.
V_CMPX+DPP returns 0 with reads from disabled lanes, unlike V_CMP+DPP (RDNA3 ISA doc, 7.7)

Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20537>
2023-02-14 19:15:17 +00:00
Georg Lehmann
e21448d0d3 aco: Don't create useless exec movs while creating v_cmpx.
In a lot of situations the previous exec value was already copied from the
same registers that exec should be saved to. In that case we don't have to
insert an extra copy to save exec.

This breaks ssa but this pass is going out of ssa anyway.

Foz-DB Navi21:
Totals from 16129 (11.96% of 134913) affected shaders:
CodeSize: 128184044 -> 128054468 (-0.10%)
Instrs: 23902694 -> 23870325 (-0.14%)
Latency: 387124324 -> 387095955 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 79949118 -> 79945859 (-0.00%); split: -0.01%, +0.00%
Copies: 1620768 -> 1588752 (-1.98%); split: -1.99%, +0.01%

Foz-DB Vega10:
Totals from 15546 (11.51% of 135041) affected shaders:
CodeSize: 120322524 -> 120200568 (-0.10%)
Instrs: 23448344 -> 23417855 (-0.13%)
Latency: 414018749 -> 413639289 (-0.09%); split: -0.09%, +0.00%
InvThroughput: 183819363 -> 183726539 (-0.05%); split: -0.05%, +0.00%
Copies: 2194937 -> 2164448 (-1.39%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18528>
2022-11-11 16:02:12 +00:00
Georg Lehmann
a653a390e1 aco: Make vcmpx definition handling clearer.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18528>
2022-11-11 16:02:12 +00:00
Georg Lehmann
28a69b72d8 aco: Use plain VOPC for vcmpx when possible.
Foz-DB Navi21:
Totals from 66947 (49.62% of 134913) affected shaders:
CodeSize: 210383024 -> 210033376 (-0.17%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18417>
2022-09-07 16:08:36 +00:00
Rhys Perry
f7d02a9b5e aco: test for one and_savexec opcode in try_optimize_branching_sequence
A situation where it doesn't match is probably not possible, so this
probably doesn't fix anything.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
2587e75ee1 aco: improve vcc check for instructions between exec_val and exec_copy
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
faf8038254 aco: remove val_and_copy_adjacent
If this is true, then the only instruction the loops visit is
p_logical_end and the loops are no-ops.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
cfb306b4e4 aco: test branch opcode if removing it in try_optimize_branching_sequence
We shouldn't remove a p_cbranch_nz branch in this situation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: b731be2e96 ("aco: Remove branch instruction when exec is constant non-zero.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
21cb002b60 aco: fix re-write of uses of exec_val's lo/hi half
The isConstant() check isn't useful. If it's a constant, then the
physReg() check will fail.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
e493aab3c3 aco: fix consecutive exec writes when finding exec_copy instruction
This can happen with transitions to exact:
 s2: %0:exec = p_parallelcopy %622:s[0-1]
 s2: %625:s[0-1],  s1: %624:scc,  s2: %0:exec = s_and_saveexec_b64 %141:vcc, %0:exec
 s2: %626:s[12-13] = p_cbranch_z %0:exec BB2, BB1

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 410eff4d2f ("aco: Fix optimizing branching sequence with s_and_saveexec.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Georg Lehmann
7b9d3ebe42 aco: Use v_cmpx pre GFX10.
Foz-DB Vega10:
Totals from 29508 (21.85% of 135041) affected shaders:
CodeSize: 184345656 -> 184345820 (+0.00%)
Instrs: 35906154 -> 35906195 (+0.00%)
Latency: 581696114 -> 581530021 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 245625572 -> 245561351 (-0.03%); split: -0.03%, +0.00%
Copies: 3134925 -> 3278672 (+4.59%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
2022-08-15 13:25:38 +00:00
Georg Lehmann
410eff4d2f aco: Fix optimizing branching sequence with s_and_saveexec.
This optimization was broken for two reasons:
- s_and_saveexec has two operands, the copy value and exec
- s_and_saveexec has an exec read, so exec_write_used will always be true
  before we find branch_exec_val_idx

Foz-DB Navi21:
Totals from 31453 (23.31% of 134913) affected shaders:
CodeSize: 204831260 -> 204831156 (-0.00%)
Instrs: 38157117 -> 38157091 (-0.00%)
Latency: 533708882 -> 531211721 (-0.47%); split: -0.47%, +0.00%
InvThroughput: 107088408 -> 106719188 (-0.34%); split: -0.35%, +0.00%
Copies: 2326179 -> 2502490 (+7.58%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
2022-08-15 13:25:38 +00:00
Georg Lehmann
9e5f311efe aco: Check that we don't override exec_val operands during branching sequence optimization.
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
2022-08-15 13:25:38 +00:00
Timur Kristóf
baf314e2c0 aco: Check for instructions that inhibit the branching sequence optimization.
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
2022-08-15 13:25:38 +00:00
Timur Kristóf
d88b2e4ab5 aco: Fix invalidated reference in branching sequence optimization.
Inserting in the instructions vector may invalidate the exec_val reference,
so do that last.

Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
2022-08-15 13:25:38 +00:00
Timur Kristóf
b731be2e96 aco: Remove branch instruction when exec is constant non-zero.
This mainly helps the "if (elect())" that is used in
NGG culling shaders, effectively removing a useless branch
from every culling shader.

Totals from 58346 (45.35% of 128653) affected shaders:
CodeSize: 153238668 -> 153005284 (-0.15%)
Instrs: 29066198 -> 29007852 (-0.20%)
Latency: 133626003 -> 133598182 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 20208765 -> 20208689 (-0.00%)
Branches: 1190209 -> 1131863 (-4.90%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13383>
2022-08-05 23:58:57 +00:00
Timur Kristóf
baab6f18c9 aco: Optimize branching sequence during SSA elimination.
Totals from 63245 (49.16% of 128653) affected shaders:
CodeSize: 166703680 -> 166436164 (-0.16%)
Instrs: 31618383 -> 31551504 (-0.21%)
Latency: 149404811 -> 149337729 (-0.04%); split: -0.05%, +0.00%
InvThroughput: 23996388 -> 23994734 (-0.01%); split: -0.01%, +0.00%
Copies: 2794107 -> 2727228 (-2.39%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13383>
2022-08-05 23:58:57 +00:00