Commit graph

91 commits

Author SHA1 Message Date
Rhys Perry
fae2a85d57 aco/gfx12: implement subgroup shader clock
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
cb81ec7a61 aco: don't count certain pseudo towards VMEM_STORE_CLAUSE_MAX_GRAB_DIST
fossil-db (navi31):
Totals from 1023 (1.29% of 79395) affected shaders:
MaxWaves: 29258 -> 29240 (-0.06%)
Instrs: 1134024 -> 1133163 (-0.08%); split: -0.10%, +0.02%
CodeSize: 5682108 -> 5678696 (-0.06%); split: -0.08%, +0.02%
VGPRs: 60248 -> 60272 (+0.04%); split: -0.08%, +0.12%
Latency: 3797510 -> 3792797 (-0.12%); split: -0.18%, +0.05%
InvThroughput: 781270 -> 781239 (-0.00%); split: -0.03%, +0.03%
VClause: 24701 -> 23976 (-2.94%); split: -3.55%, +0.61%
Copies: 75177 -> 75169 (-0.01%); split: -0.20%, +0.19%
VALU: 659939 -> 659962 (+0.00%); split: -0.02%, +0.02%
VOPD: 2040 -> 2009 (-1.52%); split: +0.29%, -1.81%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28948>
2024-05-14 12:30:54 +00:00
Rhys Perry
88e03feb27 aco: schedule LDS instructions
fossil-db (navi31):
Totals from 1823 (2.30% of 79395) affected shaders:
MaxWaves: 53845 -> 53827 (-0.03%); split: +0.02%, -0.05%
Instrs: 1736317 -> 1731200 (-0.29%); split: -0.38%, +0.09%
CodeSize: 8876760 -> 8857908 (-0.21%); split: -0.29%, +0.08%
VGPRs: 91688 -> 92276 (+0.64%); split: -0.03%, +0.67%
Latency: 11743095 -> 11698872 (-0.38%); split: -0.42%, +0.04%
InvThroughput: 2070526 -> 2067440 (-0.15%); split: -0.17%, +0.02%
VClause: 39048 -> 39058 (+0.03%); split: -0.01%, +0.03%
SClause: 35371 -> 35406 (+0.10%); split: -0.02%, +0.12%
Copies: 104335 -> 104384 (+0.05%); split: -0.21%, +0.26%
Branches: 29769 -> 29794 (+0.08%); split: -0.00%, +0.09%
VALU: 970925 -> 970974 (+0.01%); split: -0.01%, +0.02%
SALU: 146222 -> 146345 (+0.08%); split: -0.01%, +0.09%
VOPD: 1119 -> 1162 (+3.84%); split: +4.29%, -0.45%

fossil-db (navi21):
Totals from 37078 (46.70% of 79395) affected shaders:
MaxWaves: 990093 -> 990025 (-0.01%)
Instrs: 21130662 -> 21182543 (+0.25%); split: -0.01%, +0.26%
CodeSize: 110205364 -> 110415032 (+0.19%); split: -0.01%, +0.20%
VGPRs: 1407168 -> 1410768 (+0.26%)
Latency: 90024839 -> 89929196 (-0.11%); split: -0.11%, +0.01%
InvThroughput: 17170356 -> 17167412 (-0.02%); split: -0.02%, +0.00%
VClause: 392830 -> 392825 (-0.00%); split: -0.01%, +0.01%
SClause: 463150 -> 463188 (+0.01%); split: -0.00%, +0.01%
Copies: 1768433 -> 1768483 (+0.00%); split: -0.02%, +0.02%
Branches: 605989 -> 606011 (+0.00%); split: -0.00%, +0.00%
VALU: 11614810 -> 11614912 (+0.00%); split: -0.00%, +0.00%
SALU: 3794531 -> 3794655 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763>
2024-04-23 12:31:59 +00:00
Rhys Perry
0ee4fa33bc aco: schedule LDSDIR instructions
fossil-db (navi31):
Totals from 33850 (42.63% of 79395) affected shaders:
MaxWaves: 1011236 -> 1011204 (-0.00%)
Instrs: 23589117 -> 23559185 (-0.13%); split: -0.21%, +0.08%
CodeSize: 126099716 -> 125968376 (-0.10%); split: -0.17%, +0.07%
VGPRs: 1348632 -> 1356012 (+0.55%); split: -0.09%, +0.63%
Latency: 183233795 -> 180997751 (-1.22%); split: -1.33%, +0.11%
InvThroughput: 27081576 -> 27056383 (-0.09%); split: -0.15%, +0.06%
VClause: 386453 -> 386551 (+0.03%); split: -0.11%, +0.13%
SClause: 811941 -> 813023 (+0.13%); split: -0.38%, +0.52%
Copies: 1279706 -> 1280051 (+0.03%); split: -0.46%, +0.49%
Branches: 416940 -> 416938 (-0.00%); split: -0.02%, +0.02%
VALU: 13566410 -> 13567367 (+0.01%); split: -0.04%, +0.04%
SALU: 1835804 -> 1835652 (-0.01%); split: -0.02%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11013
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763>
2024-04-23 12:31:59 +00:00
Rhys Perry
0bc8a9be67 aco: make store clauses more aggressively
Apparently this significantly improves performance of a radeonsi resolve
shader.

fossil-db (navi31):
Totals from 2372 (2.99% of 79395) affected shaders:
MaxWaves: 59903 -> 59863 (-0.07%)
Instrs: 3508838 -> 3506178 (-0.08%); split: -0.10%, +0.02%
CodeSize: 18516272 -> 18505956 (-0.06%); split: -0.07%, +0.02%
VGPRs: 152708 -> 154604 (+1.24%)
Latency: 27881253 -> 27861445 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 4076649 -> 4076220 (-0.01%); split: -0.03%, +0.02%
VClause: 92696 -> 89409 (-3.55%); split: -3.55%, +0.01%
Copies: 310787 -> 311697 (+0.29%); split: -0.03%, +0.32%
VALU: 1891048 -> 1891933 (+0.05%); split: -0.01%, +0.05%
VOPD: 2534 -> 2559 (+0.99%); split: +1.07%, -0.08%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11014
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763>
2024-04-23 12:31:59 +00:00
Rhys Perry
9775318aa9 aco: don't include the clause in VMEM_CLAUSE_MAX_GRAB_DIST
By excluding the clause from this check, we only count the number of
instructions that we're actually moving the store across.

fossil-db (navi31):
Totals from 4409 (5.55% of 79395) affected shaders:
MaxWaves: 120234 -> 119738 (-0.41%)
Instrs: 3184513 -> 3184702 (+0.01%); split: -0.09%, +0.09%
CodeSize: 15942424 -> 15943276 (+0.01%); split: -0.07%, +0.07%
VGPRs: 248448 -> 255816 (+2.97%); split: -0.04%, +3.00%
Latency: 18841156 -> 18829451 (-0.06%); split: -0.08%, +0.02%
InvThroughput: 2549229 -> 2552042 (+0.11%); split: -0.02%, +0.13%
VClause: 67760 -> 64138 (-5.35%); split: -5.40%, +0.06%
SClause: 82921 -> 82922 (+0.00%)
Copies: 270026 -> 273399 (+1.25%); split: -0.14%, +1.39%
VALU: 1793374 -> 1796743 (+0.19%); split: -0.02%, +0.21%
VOPD: 798 -> 802 (+0.50%); split: +0.63%, -0.13%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28633>
2024-04-11 18:30:47 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Daniel Schürmann
1187189235 aco: unify different SALU types into single struct SALU_instruction
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction

and their corresponding methods.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Georg Lehmann
74fc2e287f aco: stop scheduling at p_logical_end
No Foz-DB changes, but this fixes some issues when the spiller inserts
scratch loads after p_logical_end for p_return.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27119>
2024-01-19 17:04:28 +00:00
Georg Lehmann
0a5d3ac8d2 aco/sched: treat p_dual_src_export_gfx11 like export
This prevents the scheduler from moving the dual source export above mrtz
export, which caused hangs.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10173

Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26317>
2023-11-21 18:11:45 +00:00
Qiang Yu
85d9646288 aco: add p_end_with_regs pseudo instruction
Used by radeonsi shader parts to pass args from one part to another.
It has variable number of operands to reserve fixed registers with
wanted value.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
2023-08-16 02:27:45 +00:00
Vitaliy Triang3l Kuzmin
2194e8bd82 aco: Add Primitive Ordered Pixel Shading scheduling rules
Implementing the acquire/release semantics of fragment shader interlock
ordered section in Vulkan, and preventing reordering of memory accesses
requiring primitive ordering out of the ordered section.

Also, the ordered section should be as short as possible, so not reordering
the instructions awaiting overlapped waves upwards, and the exit from the
ordered section downwards.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>
2023-06-26 15:58:04 +00:00
Rhys Perry
f1f01aaef5 aco/gfx11: schedule for VMEM store clauses
fossil-db (gfx1100):
Totals from 49486 (37.09% of 133428) affected shaders:
Instrs: 18376819 -> 18480712 (+0.57%); split: -0.00%, +0.57%
CodeSize: 91810836 -> 92227292 (+0.45%); split: -0.00%, +0.45%
VGPRs: 2031824 -> 2047784 (+0.79%); split: -0.02%, +0.81%
Latency: 104259318 -> 103804792 (-0.44%); split: -0.44%, +0.00%
InvThroughput: 16388760 -> 16399819 (+0.07%); split: -0.13%, +0.19%
VClause: 568844 -> 432401 (-23.99%)
Copies: 1197942 -> 1231202 (+2.78%); split: -0.08%, +2.86%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23505>
2023-06-20 18:37:34 +00:00
Eric Engestrom
6b21653ab4 aco: reformat according to its .clang-format
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23253>
2023-06-16 19:59:52 +00:00
Rhys Perry
417990b19e aco: consider position/primitive exports around memory barriers
This is needed to create barriers which ensure stores finish before
position/primitive exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624>
2023-06-07 13:19:41 +00:00
Timur Kristóf
9b6945bb65 amd: Cleanup old GS intrinsics code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
2023-05-04 19:08:59 +00:00
Rhys Perry
d0caa50dcd aco: don't move exec writes around exec writes
Not sure if this is possible, but we should avoid it anyway.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22714>
2023-04-27 14:33:53 +00:00
Rhys Perry
5e20fbd424 aco: don't move exec reads around exec writes
Fixes flickering and blocky plants in Jedi: Fallen Order.

Also fixes flickering squares in The Last of Us Part 1.

fossil-db (navi21):
Totals from 92 (0.07% of 135636) affected shaders:
Instrs: 35324 -> 35354 (+0.08%); split: -0.03%, +0.11%
CodeSize: 189568 -> 189668 (+0.05%); split: -0.03%, +0.08%
Latency: 345305 -> 346529 (+0.35%); split: -0.02%, +0.37%
InvThroughput: 78632 -> 78625 (-0.01%)
SClause: 1955 -> 1972 (+0.87%); split: -0.61%, +1.48%
Copies: 1311 -> 1304 (-0.53%); split: -0.69%, +0.15%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8883
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8878
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22696>
2023-04-26 13:16:00 +00:00
Harri Nieminen
aea48a4ff1 amd: fix typos
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22432>
2023-04-13 23:08:22 +00:00
Rhys Perry
fad1f716dd aco: fix out-of-bounds access when moving s_mem(real)time across SMEM
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8224
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21138>
2023-02-07 14:50:43 +00:00
Rhys Perry
ea8ddf5c26 aco: add storage_gds
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345>
2022-10-28 21:50:05 +00:00
Samuel Pitoiset
c481978ac2 aco: split the sendmsg enumeration into sendmsg_rtn
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267>
2022-10-25 20:23:07 +02:00
Samuel Pitoiset
6630b6e2aa aco: add support for s_sendmsg_rtn_b{32,64}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267>
2022-10-25 20:23:05 +02:00
Rhys Perry
6407d783ea aco: update sendmsg enum from LLVM
Add GFX11 enums and some new ones that apparently existed before.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17710>
2022-09-30 20:57:02 +00:00
Samuel Pitoiset
8bdcc20815 aco: add new pseudo instruction p_jump_to_epilog
The first operand of this new pseudo-instruction is a 64-bit SGPR for
the continue PC, followed by a variable list of fixed VGPRS for the
color exports which are the PS epilog inputs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
2022-07-18 18:40:02 +00:00
Rhys Perry
d2d94b62f2 aco: initialize scratch base registers on GFX9-GFX10.3
fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)

fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
97e9e42e0d aco: treat flat-like as vmem in some scheduling heuristics
fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 48754 -> 48762 (+0.02%)
CodeSize: 267092 -> 267124 (+0.01%)
Latency: 1293798 -> 1292303 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 854599 -> 853578 (-0.12%)
VClause: 1623 -> 1619 (-0.25%)
SClause: 1187 -> 1188 (+0.08%); split: -0.08%, +0.17%

fossil-db (vega10):
Totals from 1 (0.00% of 161355) affected shaders:
Latency: 18720 -> 18848 (+0.68%)
InvThroughput: 5775 -> 5776 (+0.02%)
SClause: 12 -> 11 (-8.33%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Dave Airlie
a2701bfdb8 aco: move info pointer to a copy.
This is just setup to move this to a different struct later.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342>
2022-05-11 19:07:11 +00:00
Daniel Schürmann
d5dc0c0392 aco: adjust num_waves for LDS before scheduling
Totals from 67 (0.05% of 134913) affected shaders: (GFX10.3)
VGPRs: 2024 -> 2136 (+5.53%); split: -0.40%, +5.93%
CodeSize: 162364 -> 162348 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 1882 -> 1816 (-3.51%); split: +0.11%, -3.61%
Instrs: 29176 -> 29162 (-0.05%); split: -0.09%, +0.04%
Latency: 329984 -> 327272 (-0.82%); split: -0.88%, +0.06%
InvThroughput: 54653 -> 54672 (+0.03%); split: -0.01%, +0.04%
VClause: 782 -> 761 (-2.69%); split: -2.81%, +0.13%
SClause: 833 -> 824 (-1.08%); split: -2.28%, +1.20%
Copies: 1872 -> 1873 (+0.05%); split: -0.37%, +0.43%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16039>
2022-04-29 15:39:10 +00:00
Timur Kristóf
cd0dd5d6b7 aco: Add storage class for Task Shader payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 13:20:08 +01:00
Daniel Schürmann
4e7a777093 aco: try forming clauses even if reg_pressure exceeds
This patch allows to form clauses even if the register pressure
is at the limit with the effect that VMEM instructions are less
scattered after the first clause in a Block.
It respects the previous clause size to avoid excessive moving
of VMEM instructions.
VMEM_CLAUSE_MAX_GRAB_DIST is further reduced to compensate
some of the effects.

Totals from 28922 (19.26% of 150170) affected shaders: (GFX10.3)
VGPRs: 1546568 -> 1523072 (-1.52%); split: -1.52%, +0.00%
CodeSize: 117524892 -> 117510288 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 605554 -> 611120 (+0.92%)
Instrs: 22292568 -> 22291927 (-0.00%); split: -0.10%, +0.09%
Latency: 488975399 -> 490230904 (+0.26%); split: -0.06%, +0.32%
InvThroughput: 117842300 -> 116521653 (-1.12%); split: -1.15%, +0.03%
VClause: 541550 -> 522464 (-3.52%); split: -9.73%, +6.20%
SClause: 718185 -> 718298 (+0.02%); split: -0.00%, +0.02%
Copies: 1420603 -> 1386949 (-2.37%); split: -2.64%, +0.27%
Branches: 559559 -> 559278 (-0.05%); split: -0.06%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896>
2021-09-27 14:29:16 +00:00
Daniel Schürmann
7e1faf9349 aco: make clause-forming depend on the number of moved instructions
This allows more aggressive clause-forming in presence of
larger def-use distances. To compensate for the effect,
VMEM_CLAUSE_MAX_GRAB_DIST was decreased.

Totals from 5788 (3.85% of 150170) affected shaders: (GFX10.3)
VGPRs: 483960 -> 475272 (-1.80%); split: -1.82%, +0.02%
CodeSize: 59661240 -> 59669084 (+0.01%); split: -0.01%, +0.02%
MaxWaves: 70408 -> 71450 (+1.48%); split: +1.51%, -0.03%
Instrs: 11222417 -> 11224479 (+0.02%); split: -0.01%, +0.03%
Latency: 349397104 -> 349298602 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 88584832 -> 87762262 (-0.93%); split: -0.93%, +0.00%
VClause: 168905 -> 177089 (+4.85%); split: -0.48%, +5.32%
SClause: 375795 -> 375767 (-0.01%); split: -0.01%, +0.01%
Copies: 840298 -> 840231 (-0.01%); split: -0.04%, +0.03%
Branches: 373265 -> 373278 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896>
2021-09-27 14:29:16 +00:00
Daniel Schürmann
903999c119 aco: stop scheduling if clause-forming fails
This avoids unintended reordering of VMEM instructions.
It is also highly unlikely that we find more independent
instructions before previous clause-related instructions.

Totals from 1921 (1.28% of 150170) affected shaders: (GFX10.3)
VGPRs: 103832 -> 103736 (-0.09%); split: -0.10%, +0.01%
CodeSize: 8695560 -> 8706000 (+0.12%); split: -0.03%, +0.15%
Instrs: 1643752 -> 1646349 (+0.16%); split: -0.04%, +0.20%
Latency: 26755527 -> 26614645 (-0.53%); split: -0.67%, +0.14%
InvThroughput: 7226604 -> 7204809 (-0.30%); split: -0.39%, +0.08%
VClause: 46536 -> 46201 (-0.72%); split: -0.81%, +0.09%
SClause: 47910 -> 47769 (-0.29%); split: -0.43%, +0.14%
Copies: 94647 -> 94558 (-0.09%); split: -0.26%, +0.17%
Branches: 36843 -> 36847 (+0.01%); split: -0.00%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896>
2021-09-27 14:29:16 +00:00
Rhys Perry
b23a9dd1f6 aco/scheduler: allow moving down VMEM stores to below VMEM loads
fossil-db (Vega10):
Totals from 93 (0.06% of 150305) affected shaders:
SGPRs: 4832 -> 4768 (-1.32%)
VGPRs: 4084 -> 4144 (+1.47%)
CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47%
MaxWaves: 589 -> 580 (-1.53%)
Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61%
Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10%
InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80%
VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57%
SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16%
Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97%

fossil-db (Sienna Cichlid):
Totals from 88 (0.06% of 150170) affected shaders:
VGPRs: 3840 -> 3872 (+0.83%)
CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23%
Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35%
Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61%
InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35%
VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81%
SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12%
Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211>
2021-08-23 16:48:31 +00:00
Timur Kristóf
8341af5109 radv, aco, ac/nir: Tweak position export scheduling for NGG culling.
The result is about +5-ish fps in Doom Eternal.

It turns out that the location of position exports matters more
than we thought, and it's actually better to keep them at the bottom
for culling shaders rather than schedule it up to the top.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>
2021-07-13 23:56:33 +00:00
Daniel Schürmann
1e2639026f aco: Format.
Manually adjusted some comments for more intuitive line breaks.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>
2021-07-12 21:27:31 +00:00
Daniel Schürmann
0eea0e55ad aco: add 'common/' and 'llvm/' prefix to #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Daniel Schürmann
59fdaa1985 aco: reorder and cleanup #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Rhys Perry
d64f5a3f9d aco: move VMEM instructions below descriptor loads
This is to prevent sequences like:
   a = descriptor_load()
   vmem(a)
   b = descriptor_load()
   vmem(b)
and instead create:
   a = descriptor_load()
   b = descriptor_load()
   vmem(a)
   vmem(b)

fossil-db (GFX10.3):
Totals from 114521 (78.30% of 146267) affected shaders:
VGPRs: 4540352 -> 4540216 (-0.00%); split: -0.03%, +0.02%
CodeSize: 289864228 -> 289114652 (-0.26%); split: -0.29%, +0.03%
MaxWaves: 2940234 -> 2940338 (+0.00%); split: +0.00%, -0.00%
Instrs: 55112418 -> 54919910 (-0.35%); split: -0.38%, +0.03%
Latency: 956528393 -> 954682011 (-0.19%); split: -0.24%, +0.05%
InvThroughput: 229280830 -> 229238107 (-0.02%); split: -0.04%, +0.02%
VClause: 1141832 -> 1139002 (-0.25%); split: -0.63%, +0.38%
SClause: 2357840 -> 2225008 (-5.63%); split: -6.01%, +0.38%
Copies: 3316040 -> 3331519 (+0.47%); split: -0.31%, +0.77%
Branches: 1187212 -> 1186919 (-0.02%); split: -0.03%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489>
2021-06-14 15:47:37 +00:00
Rhys Perry
bc71222cd9 aco: don't move descriptor loads below buffer loads
fossil-db (GFX10.3):
Totals from 52870 (36.15% of 146267) affected shaders:
VGPRs: 2109936 -> 2110056 (+0.01%); split: -0.01%, +0.01%
CodeSize: 134898056 -> 134812748 (-0.06%); split: -0.08%, +0.02%
MaxWaves: 1347354 -> 1347346 (-0.00%)
Instrs: 25598063 -> 25575415 (-0.09%); split: -0.11%, +0.02%
Latency: 432491613 -> 432047723 (-0.10%); split: -0.12%, +0.02%
InvThroughput: 90940977 -> 90927545 (-0.01%); split: -0.03%, +0.01%
VClause: 570039 -> 570019 (-0.00%); split: -0.05%, +0.04%
SClause: 1145076 -> 1139040 (-0.53%); split: -0.60%, +0.07%
Copies: 1513949 -> 1513102 (-0.06%); split: -0.32%, +0.26%
Branches: 524279 -> 524275 (-0.00%); split: -0.03%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489>
2021-06-14 15:47:37 +00:00
Tony Wasserka
3c390e2eb6 aco/scheduler: Move cursor handling state to dedicated interfaces
This clarifies the semantics of the index variables compared to the previous
version, which used the same variables in a slightly different way depending
on whether they were used for downwards moves or upwards ones.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10885>
2021-06-07 12:09:39 +02:00
Tony Wasserka
81761a311e aco/scheduler: Clean up register demand tracking
Refactoring total_demand and total_demand_clause to cover non-overlapping
instruction intervals makes the code easier to follow and allows the register
demand to be updated more efficiently in some cases.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10885>
2021-06-07 12:09:39 +02:00
Tony Wasserka
80ee9d3947 aco/scheduler: Verify register demand invariants in debug mode
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10644>
2021-05-13 15:27:57 +00:00
Tony Wasserka
50ba919d37 aco/scheduler: Fix register demand computation for upwards moves
The initial value needs to be taken from the instruction that is being
moved over, not the one to be moved.

Additionally the parameter of this function was removed because it was
misleading. Setting it to any value other than source_idx would cause
register_demand to be initialized incorrectly. (Instead, the maximum
demand among the covered instructions would need to be determined.)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10644>
2021-05-13 15:27:57 +00:00
Tony Wasserka
c528af1076 aco/scheduler: Fix register demand computation for downwards moves
Previously, changes in total_demand_clause were not always propagated to
total_demand. For instance, clause moves do not change the local register
demand at the end of a clause, yet they may still affect the total maximum.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 8235bc6411 ("aco: try to group together VMEM loads of the same resource")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4533
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10644>
2021-05-13 15:27:57 +00:00
Rhys Perry
dfa38fa0c7 aco: group loads from the same vertex binding into the same clause
In the future, we might have vertex attribute loads from the same binding
but with different descriptors. Since they will be loading from the same
buffer, we should continue grouping them into clauses.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
3d4c13f3b8 aco: add DeviceInfo
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
2021-02-15 13:44:22 +00:00
Daniel Schürmann
947bf0bd67 aco: don't decrease the vgpr_limit when encountering bpermute
Instead we recalculate vgpr_limit on demand, depending on
the number of needed shared VGPRs.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
2021-02-12 19:00:18 +00:00
Rhys Perry
e115b01948 aco: return references in instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry
1d245cd18b aco: use format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00