Commit graph

71 commits

Author SHA1 Message Date
Georg Lehmann
3386ea09d4 aco/opt_postRA: split try_optimize_scc_nocompare in two functions
These are two independent steps, no real reason why they should be in the same
function.

No FOZ-DB changes.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33734>
2025-03-04 21:36:17 +00:00
Friedrich Vock
71392fff25 aco: Fix dead instruction/index handling for try_insert_saveexec_out_of_loop
The loop checking if exec is overwritten didn't check for NULL
instructions, and didn't fix up reg write indices after inserting
instructions.

Fixes: fcd94a8c ("aco: move try_optimize_branching_sequence() to postRA optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32746>
2025-01-08 10:48:01 +00:00
Georg Lehmann
6a6b26dfa5 aco: create v_cmpx with s_andn2(exec, v_cmp)
Foz-DB Navi21:
Totals from 3928 (4.95% of 79395) affected shaders:
Instrs: 1155370 -> 1151154 (-0.36%)
CodeSize: 6332192 -> 6314616 (-0.28%)
Latency: 11955231 -> 11933281 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 1842283 -> 1841822 (-0.03%); split: -0.03%, +0.00%
SALU: 175431 -> 171215 (-2.40%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32731>
2024-12-30 13:05:23 +00:00
Daniel Schürmann
fcd94a8ca7 aco: move try_optimize_branching_sequence() to postRA optimizations
Totals from 196 (0.25% of 79206) affected shaders: (Navi31)

Instrs: 534343 -> 534438 (+0.02%); split: -0.00%, +0.02%
CodeSize: 2774852 -> 2775420 (+0.02%); split: -0.00%, +0.02%
Latency: 7103512 -> 7103021 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 959477 -> 959447 (-0.00%)
Copies: 42646 -> 42648 (+0.00%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
95d44c7ce0 aco/optimizer_postRA: set branch()->never_taken if exec is constant non-zero
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
1ff9a0fe80 aco: remove Pseudo_instruction::tmp_in_scc
This information is redundant, now.

No fossil-changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
a1a4a6061c aco/ra: explicitly assign scratch SGPR for linear phis
We are about to remove the branch definitions which previously
served this purpose. Also remove Block::scc_live_out.
Some changes due to round-robin RA.

Totals from 939 (1.18% of 79395) affected shaders: (Navi31)

Instrs: 5038786 -> 5038611 (-0.00%); split: -0.01%, +0.00%
CodeSize: 26153412 -> 26152904 (-0.00%); split: -0.00%, +0.00%
Latency: 41649989 -> 41650120 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 6447508 -> 6447536 (+0.00%); split: -0.00%, +0.00%
SClause: 131319 -> 131276 (-0.03%); split: -0.03%, +0.00%
Copies: 359362 -> 359256 (-0.03%); split: -0.05%, +0.02%
SALU: 639275 -> 639169 (-0.02%); split: -0.03%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
e2705a9d85 aco: set Precolored flag before register allocation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387>
2024-10-22 12:29:18 +00:00
Rhys Perry
39270a8be3 aco: preserve SSA in try_eliminate_scc_copy
Otherwise, there is no definition of this temporary. Fixes fail in future
validation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30440>
2024-08-21 10:47:20 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Timur Kristóf
b9b557f2e7 aco/optimizer_postRA: Remove a check from SCC no-compare optimization.
Not all code paths of this optimization depend on there being only
one user of the first operand; and those code paths already have
their own check for this.

Fossil DB stats on Navi 21:

Totals from 477 (0.60% of 79395) affected shaders:
Instrs: 995901 -> 995341 (-0.06%); split: -0.06%, +0.00%
CodeSize: 5218856 -> 5216816 (-0.04%); split: -0.04%, +0.00%
Latency: 16340256 -> 16338799 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3044975 -> 3044871 (-0.00%); split: -0.00%, +0.00%
Copies: 95047 -> 95071 (+0.03%)
SALU: 150345 -> 149785 (-0.37%); split: -0.38%, +0.01%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28545>
2024-04-04 00:50:11 +00:00
Daniel Schürmann
a863c7951e aco: remove create_instruction() template parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
1187189235 aco: unify different SALU types into single struct SALU_instruction
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction

and their corresponding methods.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Timur Kristóf
58e3b1f930 aco: Allow passing constant operand to is_overwritten_since.
This is to make it more intuitive and also consistent
with last_writer_idx which does allow constant operands.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28046>
2024-03-19 20:50:12 +00:00
Daniel Schürmann
9bbb9f1104 aco: use small_vec as Block::edge_vec for predecessors and successors
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27984>
2024-03-19 13:06:58 +00:00
Rhys Perry
dea8b02b03 aco: don't pass constant to is_overwritten_since()
Fixes usage of uninitialised value in dead_space/065515347dca4851 and
other dead_space shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: def0c275c4 ("aco: Eliminate SCC copies when possible.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28011>
2024-03-07 12:17:39 +00:00
Timur Kristóf
def0c275c4 aco: Eliminate SCC copies when possible.
Foz-DB Navi31:
Totals from 2517 (3.22% of 78112) affected shaders:
Instrs: 5992126 -> 5972611 (-0.33%); split: -0.33%, +0.00%
CodeSize: 30986404 -> 30914536 (-0.23%); split: -0.23%, +0.00%
Latency: 43221112 -> 43217422 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 6675983 -> 6674598 (-0.02%); split: -0.02%, +0.00%
SClause: 181987 -> 181976 (-0.01%); split: -0.01%, +0.00%
Copies: 538852 -> 519419 (-3.61%)

Co-authored-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27072>
2024-03-05 22:51:01 +00:00
Georg Lehmann
e7d6cd9216 aco/post-ra: track pseudo scratch sgpr/scc clobber
Foz-DB Navi31:
Totals from 1439 (1.84% of 78112) affected shaders:
Instrs: 1994854 -> 1996650 (+0.09%)
CodeSize: 11376864 -> 11383384 (+0.06%)
Latency: 14996299 -> 14999317 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 2061294 -> 2061518 (+0.01%); split: -0.00%, +0.01%

Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27855>
2024-03-05 20:16:21 +00:00
Georg Lehmann
bd93e8372d aco/post-ra: assume scc is going to be overwritten by phis at end of blocks
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27855>
2024-03-05 20:16:21 +00:00
Georg Lehmann
a5056b2f93 aco/post-ra: rename overwritten_subdword to allow additional uses
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27855>
2024-03-05 20:16:21 +00:00
Georg Lehmann
ad5fbc4407 aco: use fmamk/ak instead of fma with inline constant for more VOPD
Foz-DB navi31, forced wave32:
Totals from 24438 (31.29% of 78112) affected shaders:
Instrs: 21632788 -> 21551766 (-0.37%); split: -0.38%, +0.01%
CodeSize: 126181860 -> 126083848 (-0.08%); split: -0.10%, +0.02%
Latency: 162491062 -> 162516234 (+0.02%); split: -0.05%, +0.07%
InvThroughput: 31121194 -> 31002125 (-0.38%); split: -0.40%, +0.02%
VClause: 420176 -> 420169 (-0.00%); split: -0.00%, +0.00%
SClause: 791844 -> 791762 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27523>
2024-02-15 12:38:55 +01:00
Georg Lehmann
e36235e6d5 aco: reassign split vector to SOPC
Foz-DB Navi21:
Totals from 2669 (3.42% of 78112) affected shaders:
Instrs: 3570360 -> 3562026 (-0.23%)
CodeSize: 19049784 -> 19017092 (-0.17%)
Latency: 25343555 -> 25337767 (-0.02%); split: -0.03%, +0.00%
InvThroughput: 6191344 -> 6191079 (-0.00%); split: -0.01%, +0.00%
VClause: 90803 -> 90802 (-0.00%)
SClause: 114858 -> 114842 (-0.01%); split: -0.03%, +0.01%
Copies: 269287 -> 260999 (-3.08%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27046>
2024-01-13 11:03:09 +00:00
Georg Lehmann
576afa8540 aco: don't optimize DPP across more than one block
Register write tracking doesn't work for inactive lanes, so this was unsafe.

Foz-DB Navi31:
Totals from 8 (0.01% of 78196) affected shaders:
Instrs: 11513 -> 11515 (+0.02%)
CodeSize: 61056 -> 61064 (+0.01%)

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10197
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26373>
2023-11-28 12:48:56 +00:00
Rhys Perry
0e79f76aa5 aco: add fetch_inactive field to DPP instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Rhys Perry
26fce534b5 aco: shrink DPP8_instruction
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Rhys Perry
ea633c128c aco/optimizer_postRA: don't combine DPP across exec on GFX8/9
GFX8/9 seem to use FI=0 behaviour.

fossil-db (vega10):
Totals from 1 (0.00% of 63053) affected shaders:
Instrs: 542 -> 570 (+5.17%)
CodeSize: 2928 -> 3040 (+3.83%)
Latency: 2087 -> 2118 (+1.49%)
InvThroughput: 1103 -> 1143 (+3.63%)

Affected shader is from Cyberpunk 2077 fossil.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: 23.2 <mesa-stable>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9784
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25471>
2023-09-29 18:23:21 +00:00
Rhys Perry
ac48334ecd aco/optimizer_postRA: check overwritten_subdword in is_overwritten_since()
Fixes crash for
dEQP-VK.mesh_shader.ext.in_out.with_f16.permutation_0.mesh_only and
similar tests on GFX11.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 3d29779a25 ("aco/optimizer_postRA: Distinguish overwritten untrackable and subdword.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25223>
2023-09-18 10:26:16 +00:00
Georg Lehmann
476149f90d aco: use can_use_input_modifiers helper
Foz-DB GFX1100:
Totals from 80 (0.06% of 132657) affected shaders:
CodeSize: 504500 -> 503660 (-0.17%)
Instrs: 95033 -> 94824 (-0.22%)
Latency: 629695 -> 629235 (-0.07%)
InvThroughput: 97105 -> 97008 (-0.10%)
VClause: 1779 -> 1777 (-0.11%)
Copies: 3233 -> 3236 (+0.09%); split: -0.03%, +0.12%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059>
2023-05-18 22:57:24 +00:00
Georg Lehmann
644c5e95a0 aco: use get_operand_size for dpp opt
This matters now that v_fma_mixlo_f16/v_fma_mixhi_f16 can use dpp.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059>
2023-05-18 22:57:24 +00:00
Georg Lehmann
6a53af3fc8 aco: introduce helper to swap valu operands with modifiers
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059>
2023-05-18 22:57:23 +00:00
Georg Lehmann
04976beac7 aco: don't apply dpp if the alu instr uses the operand twice
CP77 has a ton of fma(dpp(a), dpp(a), b).

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22698>
2023-05-12 13:31:16 +00:00
Georg Lehmann
151bcc1e8b aco: use VOP3+DPP
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22698>
2023-05-12 13:31:16 +00:00
Georg Lehmann
2a1e6a140d aco: also reassign p_extract_vector post ra
Foz-DB Navi21:
Totals from 1223 (0.91% of 134864) affected shaders:
CodeSize: 6923888 -> 6913516 (-0.15%)
Instrs: 1293744 -> 1291151 (-0.20%)
Latency: 16928653 -> 16925035 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 2985304 -> 2984775 (-0.02%); split: -0.02%, +0.00%
VClause: 32260 -> 32319 (+0.18%)
SClause: 54952 -> 54949 (-0.01%)
Copies: 83968 -> 81377 (-3.09%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22821>
2023-05-11 10:26:24 +00:00
Georg Lehmann
c1cf40da8a aco: Assert that operands have the same byte offset when reassigning split vectors
This can not happen because the post-RA optimizer doesn't support sub dword
writes at the moment, but everytime I look at this I wonder if there might
be a bug here.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22821>
2023-05-11 10:26:24 +00:00
Georg Lehmann
a60b9313d3 aco: swap opsel when swapping VOP2/C operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069>
2023-03-30 03:34:34 +00:00
Georg Lehmann
de4805f25f aco: use bitfield array helpers for valu modifiers
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
2023-03-07 11:53:23 +00:00
Georg Lehmann
77afe7d960 aco: treat VINTERP_INREG as VALU
It's just v_fma with fixed DPP8 and builtin s_waitcnt_expcnt, so it can mostly
be handled as a pure VALU instruction.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
2023-03-07 11:53:23 +00:00
Georg Lehmann
e1eabab6fe aco/optimizer_postRA: assume all registers are untrackable in loop headers
Register writes from the pre-header might not be correct for any but
the first loop iteration because they can be clobbered inside the loop.

Foz-DB Navi21:
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 251384 -> 251508 (+0.05%)
Instrs: 47644 -> 47664 (+0.04%)
Latency: 801801 -> 801852 (+0.01%)
InvThroughput: 177579 -> 177593 (+0.01%)
Copies: 4752 -> 4771 (+0.40%)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8376
Fixes: d3b0f78110 ("aco/optimizer_postRA: Initialize loop header with preheader information")

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21540>
2023-02-28 04:27:05 +00:00
Rhys Perry
ab3184c0a2 aco: don't apply modifiers through DPP to unsupported instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21201>
2023-02-21 14:59:38 +00:00
Daniel Schürmann
83b31b11a5 aco: Reassign dead definitions of p_split_vector to associated register
Any unused split_vector definition can always use the same register
as the operand. This avoids creating unnecessary copies.

Fossil DB stats on Rembrandt (RDNA2):
Totals from 3904 (2.89% of 134906) affected shaders:
CodeSize: 18326692 -> 18271688 (-0.30%)
Instrs: 3386632 -> 3372888 (-0.41%)
Latency: 42337481 -> 42330085 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 6566731 -> 6566424 (-0.00%); split: -0.01%, +0.00%
Copies: 224301 -> 210559 (-6.13%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
2023-01-01 15:04:07 +01:00
Timur Kristóf
75b1027722 aco: Try to reassign split vector registers post-RA.
Eliminate unnecessary copies when the operand registers of a
p_split_vector instruction are not clobbered between the p_split_vector
and the user of its definitions.

This happens when p_split_vector doesn't kill its operand and its
definitions have a shorter lifespan that the operand. It affects every
NGG culling shader among other things.

This optimization exists because it's too difficult to solve it
in RA, and should be removed after we solved this in RA.

v2 by Daniel Schürmann:
- Rearrange and simplify conditions for the new optimization
- Fix a few bugs

v3 by Daniel Schürmann:
- Check number of encoded ALU operands

Fossil DB stats on Rembrandt (RDNA2):
Totals from 64896 (48.10% of 134906) affected shaders:
CodeSize: 175693348 -> 175434944 (-0.15%)
Instrs: 33333912 -> 33269388 (-0.19%)
Latency: 183766084 -> 183763432 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 28589651 -> 28589340 (-0.00%); split: -0.00%, +0.00%
Copies: 2806550 -> 2742038 (-2.30%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
2023-01-01 15:04:07 +01:00
Timur Kristóf
3d29779a25 aco/optimizer_postRA: Distinguish overwritten untrackable and subdword.
This allows is_overwritten_since to return false when the last
writer instruction of a register can't be tracked but we know it wasn't
written in the current block.

Fossil DB stats on Rembrandt (RDNA2):
Totals from 1163 (0.86% of 134906) affected shaders:
CodeSize: 9815920 -> 9805016 (-0.11%)
Instrs: 1843688 -> 1840962 (-0.15%)
Latency: 19219153 -> 19209171 (-0.05%)
InvThroughput: 3354375 -> 3353852 (-0.02%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
2023-01-01 15:04:07 +01:00
Daniel Schürmann
d3b0f78110 aco/optimizer_postRA: Initialize loop header with preheader information
This works because of SSA and should be safer than just setting 'not_written_yet'.

No Fossil DB changes on Rembrandt (RDNA2).

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
2023-01-01 15:03:57 +01:00
Daniel Schürmann
8f4eccb138 aco: fix reset_block_regs() in postRA-optimizer
Accidentally, we picked the index of the predecessors instead of the predecessors.

Totals from 8496 (6.30% of 134913) affected shaders: (GFX10.3)
CodeSize: 64070724 -> 64022516 (-0.08%); split: -0.08%, +0.00%
Instrs: 11932750 -> 11920698 (-0.10%); split: -0.10%, +0.00%
Latency: 144040266 -> 144017062 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 29327735 -> 29326421 (-0.00%); split: -0.00%, +0.00%

Fossil DB stats on Rembrandt (RDNA2):
Totals from 4488 (3.33% of 134906) affected shaders:
CodeSize: 42759736 -> 42735392 (-0.06%); split: -0.06%, +0.00%
Instrs: 7960522 -> 7954436 (-0.08%); split: -0.08%, +0.00%
Latency: 96192647 -> 96172571 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 19313576 -> 19312575 (-0.01%); split: -0.01%, +0.00%

Fixes: 75967a4814 ('aco/optimizer_postRA: Speed up reset_block() with predecessors.')
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
2023-01-01 15:03:51 +01:00
Timur Kristóf
36bc3afb8b aco/optimizer_postRA: Delete dead instructions more efficiently.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
2022-10-24 20:14:16 +00:00
Timur Kristóf
7263a29794 aco/optimizer_postRA: Properly handle vccz/execz/scc in reset_block.
Fixes: a8dd07518c
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
2022-10-24 20:14:16 +00:00
Timur Kristóf
75967a4814 aco/optimizer_postRA: Speed up reset_block() with predecessors.
Copy the information from the first predecessor then check whether
it matches other predecessors and modify the data accordingly.

Marked for backporting to stable to make it possible to also
backport fixes based on this.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
2022-10-24 20:14:16 +00:00
Timur Kristóf
b542ab0243 aco/optimizer_postRA: Use unique_ptr + array for instruction indices.
According to perf, this roughly halves the impact of the post-RA
optimizer in ACO's compile times.

Measurement was taken using a debug optimized build using
NIR_DEBUG=novalidate RADV_DEBUG=nocache and replaying the Fossil DB
from the Doom Eternal shaders.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
2022-10-24 20:14:16 +00:00
Timur Kristóf
ed76471d08 aco/optimizer_postRA: Clarify terminology.
Change the terminology around the post-RA optimizer, primarily this
changes the use of "clobbered" to "overwritten" to avoid confusion,
and it removes some redundant states.

Proposed for backporting to stable, to make sure it is easy to
backport further fixes (if any) on top of this.

Fossil DB stats unaffected on Navi 21.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:57 +00:00
Timur Kristóf
a8dd07518c aco/optimizer_postRA: Fix logical control flow handling.
Change reset_block() so it only considers the logical
predecessors for VGPRs. Relevant for some optimizations
across loops.

This commit fixes an assertion failure which was triggered
by Zink in a piglit test.

Fossil DB stats unaffected on Navi 21.

Fixes: 2e56e23420
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18488>
2022-09-21 16:56:57 +00:00