insert_exec_mask will no longer use s_and_saveexec if there was a previous copy
from a sgpr to exec, so this code path is no longer taken.
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567>
insert_exec will start using this in the future, handle it the same
just without the path to save exec before the v_cmpx instruction.
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567>
p_end_with_regs just partially end the program, next part need
exec mask to be set correctly. For example p_end_wqm will generate
a exec restore from WQM mode after p_end_with_regs.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
I don't see why a p_logical_end is expected or required. It might not be
present in some situations, which causes an assertion failure:
s2: %19646:s[0-1] = p_reload %19701:v[8], 11
s2: %0:exec, s1: %8817:scc = s_andn2_b64 %19646:s[0-1], %0:exec
s2: %8818:s[20-21] = p_cbranch_z %0:exec BB1116, BB1114
No fossil-db changes (gfx1100).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25244>
This is when the copies actually happen, not at the branch.
fossil-db (gfx1100):
Totals from 1 (0.00% of 79332) affected shaders:
Instrs: 424 -> 423 (-0.24%)
CodeSize: 2172 -> 2168 (-0.18%)
Latency: 2899 -> 2896 (-0.10%)
Copies: 24 -> 23 (-4.17%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25244>
When an exec write isn't used but writes other registers
besides exec, and also reads exec (such as s_and_saveexec),
we would mistakenly delete the previous instruction that
writes the exec value that this instruction uses.
No Fossil DB changes on Rembrandt (GFX10.3).
Fixes: 0211e66f65
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9036
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23576>
The branch instruction is no longer conditional when the targets are the
same, so the operand is not necessary and can be removed.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
Verifying that the branch instruction reads exec is not actually
necessary because the pattern that we look for already implies that.
This prepares for the next commit which will remove the exec operand
from branches that have the same target. These branches will no
longer read exec, but they should still get the same optimization.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
Don't eliminate an instruction that writes registers other than exec and scc.
It is possible that this is eg. an s_and_saveexec and the saved value is
used by a later branch.
Fixes: bc13049747
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
In a lot of situations the previous exec value was already copied from the
same registers that exec should be saved to. In that case we don't have to
insert an extra copy to save exec.
This breaks ssa but this pass is going out of ssa anyway.
Foz-DB Navi21:
Totals from 16129 (11.96% of 134913) affected shaders:
CodeSize: 128184044 -> 128054468 (-0.10%)
Instrs: 23902694 -> 23870325 (-0.14%)
Latency: 387124324 -> 387095955 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 79949118 -> 79945859 (-0.00%); split: -0.01%, +0.00%
Copies: 1620768 -> 1588752 (-1.98%); split: -1.99%, +0.01%
Foz-DB Vega10:
Totals from 15546 (11.51% of 135041) affected shaders:
CodeSize: 120322524 -> 120200568 (-0.10%)
Instrs: 23448344 -> 23417855 (-0.13%)
Latency: 414018749 -> 413639289 (-0.09%); split: -0.09%, +0.00%
InvThroughput: 183819363 -> 183726539 (-0.05%); split: -0.05%, +0.00%
Copies: 2194937 -> 2164448 (-1.39%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18528>
A situation where it doesn't match is probably not possible, so this
probably doesn't fix anything.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
If this is true, then the only instruction the loops visit is
p_logical_end and the loops are no-ops.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
We shouldn't remove a p_cbranch_nz branch in this situation.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: b731be2e96 ("aco: Remove branch instruction when exec is constant non-zero.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
The isConstant() check isn't useful. If it's a constant, then the
physReg() check will fail.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
This optimization was broken for two reasons:
- s_and_saveexec has two operands, the copy value and exec
- s_and_saveexec has an exec read, so exec_write_used will always be true
before we find branch_exec_val_idx
Foz-DB Navi21:
Totals from 31453 (23.31% of 134913) affected shaders:
CodeSize: 204831260 -> 204831156 (-0.00%)
Instrs: 38157117 -> 38157091 (-0.00%)
Latency: 533708882 -> 531211721 (-0.47%); split: -0.47%, +0.00%
InvThroughput: 107088408 -> 106719188 (-0.34%); split: -0.35%, +0.00%
Copies: 2326179 -> 2502490 (+7.58%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
Inserting in the instructions vector may invalidate the exec_val reference,
so do that last.
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>