Samuel Pitoiset
107f29c39a
aco: do not reorder s_trap instructions
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32055 >
2024-11-11 15:46:36 +00:00
Daniel Schürmann
5a39cbdef6
aco: change signature of get_live_changes() and get_temp_registers()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30182 >
2024-08-14 08:11:48 +00:00
Daniel Schürmann
e5d920e0b9
aco/scheduler: enable live variables validation when ACO_DEBUG=validate-livevars is set
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30182 >
2024-08-14 08:11:48 +00:00
Daniel Schürmann
6729e81d15
aco/live_var_analysis: inline block->register_demand updates
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29962 >
2024-07-10 12:31:02 +00:00
Daniel Schürmann
6c6f382d68
aco: add RegisterDemand member to Instruction
...
Since we never need both at the same time, we can use
a union with pass_flags.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29962 >
2024-07-10 12:31:02 +00:00
Daniel Schürmann
26c58ca9de
aco/scheduler: fix register_demand validation debug code
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804 >
2024-06-26 09:42:33 +00:00
Georg Lehmann
046414e061
aco: add more anonymous namespaces
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740 >
2024-06-18 17:53:07 +00:00
Daniel Schürmann
a497d105e3
aco: move live var information into struct Program
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Daniel Schürmann
2322ab427e
aco/scheduler: remove unused register_demand parameter
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713 >
2024-06-14 14:32:35 +00:00
Rhys Perry
8e475bba61
aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
2a4424425a
aco/gfx12: fix s_wait_event immediate
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466 >
2024-06-06 14:26:52 +00:00
Rhys Perry
fae2a85d57
aco/gfx12: implement subgroup shader clock
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330 >
2024-05-28 10:52:11 +00:00
Rhys Perry
cb81ec7a61
aco: don't count certain pseudo towards VMEM_STORE_CLAUSE_MAX_GRAB_DIST
...
fossil-db (navi31):
Totals from 1023 (1.29% of 79395) affected shaders:
MaxWaves: 29258 -> 29240 (-0.06%)
Instrs: 1134024 -> 1133163 (-0.08%); split: -0.10%, +0.02%
CodeSize: 5682108 -> 5678696 (-0.06%); split: -0.08%, +0.02%
VGPRs: 60248 -> 60272 (+0.04%); split: -0.08%, +0.12%
Latency: 3797510 -> 3792797 (-0.12%); split: -0.18%, +0.05%
InvThroughput: 781270 -> 781239 (-0.00%); split: -0.03%, +0.03%
VClause: 24701 -> 23976 (-2.94%); split: -3.55%, +0.61%
Copies: 75177 -> 75169 (-0.01%); split: -0.20%, +0.19%
VALU: 659939 -> 659962 (+0.00%); split: -0.02%, +0.02%
VOPD: 2040 -> 2009 (-1.52%); split: +0.29%, -1.81%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28948 >
2024-05-14 12:30:54 +00:00
Rhys Perry
88e03feb27
aco: schedule LDS instructions
...
fossil-db (navi31):
Totals from 1823 (2.30% of 79395) affected shaders:
MaxWaves: 53845 -> 53827 (-0.03%); split: +0.02%, -0.05%
Instrs: 1736317 -> 1731200 (-0.29%); split: -0.38%, +0.09%
CodeSize: 8876760 -> 8857908 (-0.21%); split: -0.29%, +0.08%
VGPRs: 91688 -> 92276 (+0.64%); split: -0.03%, +0.67%
Latency: 11743095 -> 11698872 (-0.38%); split: -0.42%, +0.04%
InvThroughput: 2070526 -> 2067440 (-0.15%); split: -0.17%, +0.02%
VClause: 39048 -> 39058 (+0.03%); split: -0.01%, +0.03%
SClause: 35371 -> 35406 (+0.10%); split: -0.02%, +0.12%
Copies: 104335 -> 104384 (+0.05%); split: -0.21%, +0.26%
Branches: 29769 -> 29794 (+0.08%); split: -0.00%, +0.09%
VALU: 970925 -> 970974 (+0.01%); split: -0.01%, +0.02%
SALU: 146222 -> 146345 (+0.08%); split: -0.01%, +0.09%
VOPD: 1119 -> 1162 (+3.84%); split: +4.29%, -0.45%
fossil-db (navi21):
Totals from 37078 (46.70% of 79395) affected shaders:
MaxWaves: 990093 -> 990025 (-0.01%)
Instrs: 21130662 -> 21182543 (+0.25%); split: -0.01%, +0.26%
CodeSize: 110205364 -> 110415032 (+0.19%); split: -0.01%, +0.20%
VGPRs: 1407168 -> 1410768 (+0.26%)
Latency: 90024839 -> 89929196 (-0.11%); split: -0.11%, +0.01%
InvThroughput: 17170356 -> 17167412 (-0.02%); split: -0.02%, +0.00%
VClause: 392830 -> 392825 (-0.00%); split: -0.01%, +0.01%
SClause: 463150 -> 463188 (+0.01%); split: -0.00%, +0.01%
Copies: 1768433 -> 1768483 (+0.00%); split: -0.02%, +0.02%
Branches: 605989 -> 606011 (+0.00%); split: -0.00%, +0.00%
VALU: 11614810 -> 11614912 (+0.00%); split: -0.00%, +0.00%
SALU: 3794531 -> 3794655 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763 >
2024-04-23 12:31:59 +00:00
Rhys Perry
0ee4fa33bc
aco: schedule LDSDIR instructions
...
fossil-db (navi31):
Totals from 33850 (42.63% of 79395) affected shaders:
MaxWaves: 1011236 -> 1011204 (-0.00%)
Instrs: 23589117 -> 23559185 (-0.13%); split: -0.21%, +0.08%
CodeSize: 126099716 -> 125968376 (-0.10%); split: -0.17%, +0.07%
VGPRs: 1348632 -> 1356012 (+0.55%); split: -0.09%, +0.63%
Latency: 183233795 -> 180997751 (-1.22%); split: -1.33%, +0.11%
InvThroughput: 27081576 -> 27056383 (-0.09%); split: -0.15%, +0.06%
VClause: 386453 -> 386551 (+0.03%); split: -0.11%, +0.13%
SClause: 811941 -> 813023 (+0.13%); split: -0.38%, +0.52%
Copies: 1279706 -> 1280051 (+0.03%); split: -0.46%, +0.49%
Branches: 416940 -> 416938 (-0.00%); split: -0.02%, +0.02%
VALU: 13566410 -> 13567367 (+0.01%); split: -0.04%, +0.04%
SALU: 1835804 -> 1835652 (-0.01%); split: -0.02%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11013
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763 >
2024-04-23 12:31:59 +00:00
Rhys Perry
0bc8a9be67
aco: make store clauses more aggressively
...
Apparently this significantly improves performance of a radeonsi resolve
shader.
fossil-db (navi31):
Totals from 2372 (2.99% of 79395) affected shaders:
MaxWaves: 59903 -> 59863 (-0.07%)
Instrs: 3508838 -> 3506178 (-0.08%); split: -0.10%, +0.02%
CodeSize: 18516272 -> 18505956 (-0.06%); split: -0.07%, +0.02%
VGPRs: 152708 -> 154604 (+1.24%)
Latency: 27881253 -> 27861445 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 4076649 -> 4076220 (-0.01%); split: -0.03%, +0.02%
VClause: 92696 -> 89409 (-3.55%); split: -3.55%, +0.01%
Copies: 310787 -> 311697 (+0.29%); split: -0.03%, +0.32%
VALU: 1891048 -> 1891933 (+0.05%); split: -0.01%, +0.05%
VOPD: 2534 -> 2559 (+0.99%); split: +1.07%, -0.08%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11014
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763 >
2024-04-23 12:31:59 +00:00
Rhys Perry
9775318aa9
aco: don't include the clause in VMEM_CLAUSE_MAX_GRAB_DIST
...
By excluding the clause from this check, we only count the number of
instructions that we're actually moving the store across.
fossil-db (navi31):
Totals from 4409 (5.55% of 79395) affected shaders:
MaxWaves: 120234 -> 119738 (-0.41%)
Instrs: 3184513 -> 3184702 (+0.01%); split: -0.09%, +0.09%
CodeSize: 15942424 -> 15943276 (+0.01%); split: -0.07%, +0.07%
VGPRs: 248448 -> 255816 (+2.97%); split: -0.04%, +3.00%
Latency: 18841156 -> 18829451 (-0.06%); split: -0.08%, +0.02%
InvThroughput: 2549229 -> 2552042 (+0.11%); split: -0.02%, +0.13%
VClause: 67760 -> 64138 (-5.35%); split: -5.40%, +0.06%
SClause: 82921 -> 82922 (+0.00%)
Copies: 270026 -> 273399 (+1.25%); split: -0.14%, +1.39%
VALU: 1793374 -> 1796743 (+0.19%); split: -0.02%, +0.21%
VOPD: 798 -> 802 (+0.50%); split: +0.63%, -0.13%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28633 >
2024-04-11 18:30:47 +00:00
Samuel Pitoiset
7a69d78ba2
aco: use SPDX-License-Identifier
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622 >
2024-04-08 15:49:25 +00:00
Daniel Schürmann
1187189235
aco: unify different SALU types into single struct SALU_instruction
...
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction
and their corresponding methods.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370 >
2024-03-28 11:25:43 +00:00
Georg Lehmann
74fc2e287f
aco: stop scheduling at p_logical_end
...
No Foz-DB changes, but this fixes some issues when the spiller inserts
scratch loads after p_logical_end for p_return.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27119 >
2024-01-19 17:04:28 +00:00
Georg Lehmann
0a5d3ac8d2
aco/sched: treat p_dual_src_export_gfx11 like export
...
This prevents the scheduler from moving the dual source export above mrtz
export, which caused hangs.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10173
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26317 >
2023-11-21 18:11:45 +00:00
Qiang Yu
85d9646288
aco: add p_end_with_regs pseudo instruction
...
Used by radeonsi shader parts to pass args from one part to another.
It has variable number of operands to reserve fixed registers with
wanted value.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00
Vitaliy Triang3l Kuzmin
2194e8bd82
aco: Add Primitive Ordered Pixel Shading scheduling rules
...
Implementing the acquire/release semantics of fragment shader interlock
ordered section in Vulkan, and preventing reordering of memory accesses
requiring primitive ordering out of the ordered section.
Also, the ordered section should be as short as possible, so not reordering
the instructions awaiting overlapped waves upwards, and the exit from the
ordered section downwards.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:04 +00:00
Rhys Perry
f1f01aaef5
aco/gfx11: schedule for VMEM store clauses
...
fossil-db (gfx1100):
Totals from 49486 (37.09% of 133428) affected shaders:
Instrs: 18376819 -> 18480712 (+0.57%); split: -0.00%, +0.57%
CodeSize: 91810836 -> 92227292 (+0.45%); split: -0.00%, +0.45%
VGPRs: 2031824 -> 2047784 (+0.79%); split: -0.02%, +0.81%
Latency: 104259318 -> 103804792 (-0.44%); split: -0.44%, +0.00%
InvThroughput: 16388760 -> 16399819 (+0.07%); split: -0.13%, +0.19%
VClause: 568844 -> 432401 (-23.99%)
Copies: 1197942 -> 1231202 (+2.78%); split: -0.08%, +2.86%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23505 >
2023-06-20 18:37:34 +00:00
Eric Engestrom
6b21653ab4
aco: reformat according to its .clang-format
...
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23253 >
2023-06-16 19:59:52 +00:00
Rhys Perry
417990b19e
aco: consider position/primitive exports around memory barriers
...
This is needed to create barriers which ensure stores finish before
position/primitive exports.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624 >
2023-06-07 13:19:41 +00:00
Timur Kristóf
9b6945bb65
amd: Cleanup old GS intrinsics code.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690 >
2023-05-04 19:08:59 +00:00
Rhys Perry
d0caa50dcd
aco: don't move exec writes around exec writes
...
Not sure if this is possible, but we should avoid it anyway.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22714 >
2023-04-27 14:33:53 +00:00
Rhys Perry
5e20fbd424
aco: don't move exec reads around exec writes
...
Fixes flickering and blocky plants in Jedi: Fallen Order.
Also fixes flickering squares in The Last of Us Part 1.
fossil-db (navi21):
Totals from 92 (0.07% of 135636) affected shaders:
Instrs: 35324 -> 35354 (+0.08%); split: -0.03%, +0.11%
CodeSize: 189568 -> 189668 (+0.05%); split: -0.03%, +0.08%
Latency: 345305 -> 346529 (+0.35%); split: -0.02%, +0.37%
InvThroughput: 78632 -> 78625 (-0.01%)
SClause: 1955 -> 1972 (+0.87%); split: -0.61%, +1.48%
Copies: 1311 -> 1304 (-0.53%); split: -0.69%, +0.15%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8883
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8878
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22696 >
2023-04-26 13:16:00 +00:00
Harri Nieminen
aea48a4ff1
amd: fix typos
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22432 >
2023-04-13 23:08:22 +00:00
Rhys Perry
fad1f716dd
aco: fix out-of-bounds access when moving s_mem(real)time across SMEM
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8224
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21138 >
2023-02-07 14:50:43 +00:00
Rhys Perry
ea8ddf5c26
aco: add storage_gds
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345 >
2022-10-28 21:50:05 +00:00
Samuel Pitoiset
c481978ac2
aco: split the sendmsg enumeration into sendmsg_rtn
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267 >
2022-10-25 20:23:07 +02:00
Samuel Pitoiset
6630b6e2aa
aco: add support for s_sendmsg_rtn_b{32,64}
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267 >
2022-10-25 20:23:05 +02:00
Rhys Perry
6407d783ea
aco: update sendmsg enum from LLVM
...
Add GFX11 enums and some new ones that apparently existed before.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17710 >
2022-09-30 20:57:02 +00:00
Samuel Pitoiset
8bdcc20815
aco: add new pseudo instruction p_jump_to_epilog
...
The first operand of this new pseudo-instruction is a 64-bit SGPR for
the continue PC, followed by a variable list of fixed VGPRS for the
color exports which are the PS epilog inputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485 >
2022-07-18 18:40:02 +00:00
Rhys Perry
d2d94b62f2
aco: initialize scratch base registers on GFX9-GFX10.3
...
fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)
fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Rhys Perry
97e9e42e0d
aco: treat flat-like as vmem in some scheduling heuristics
...
fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 48754 -> 48762 (+0.02%)
CodeSize: 267092 -> 267124 (+0.01%)
Latency: 1293798 -> 1292303 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 854599 -> 853578 (-0.12%)
VClause: 1623 -> 1619 (-0.25%)
SClause: 1187 -> 1188 (+0.08%); split: -0.08%, +0.17%
fossil-db (vega10):
Totals from 1 (0.00% of 161355) affected shaders:
Latency: 18720 -> 18848 (+0.68%)
InvThroughput: 5775 -> 5776 (+0.02%)
SClause: 12 -> 11 (-8.33%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Dave Airlie
a2701bfdb8
aco: move info pointer to a copy.
...
This is just setup to move this to a different struct later.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Daniel Schürmann
d5dc0c0392
aco: adjust num_waves for LDS before scheduling
...
Totals from 67 (0.05% of 134913) affected shaders: (GFX10.3)
VGPRs: 2024 -> 2136 (+5.53%); split: -0.40%, +5.93%
CodeSize: 162364 -> 162348 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 1882 -> 1816 (-3.51%); split: +0.11%, -3.61%
Instrs: 29176 -> 29162 (-0.05%); split: -0.09%, +0.04%
Latency: 329984 -> 327272 (-0.82%); split: -0.88%, +0.06%
InvThroughput: 54653 -> 54672 (+0.03%); split: -0.01%, +0.04%
VClause: 782 -> 761 (-2.69%); split: -2.81%, +0.13%
SClause: 833 -> 824 (-1.08%); split: -2.28%, +1.20%
Copies: 1872 -> 1873 (+0.05%); split: -0.37%, +0.43%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16039 >
2022-04-29 15:39:10 +00:00
Timur Kristóf
cd0dd5d6b7
aco: Add storage class for Task Shader payload.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161 >
2022-02-25 13:20:08 +01:00
Daniel Schürmann
4e7a777093
aco: try forming clauses even if reg_pressure exceeds
...
This patch allows to form clauses even if the register pressure
is at the limit with the effect that VMEM instructions are less
scattered after the first clause in a Block.
It respects the previous clause size to avoid excessive moving
of VMEM instructions.
VMEM_CLAUSE_MAX_GRAB_DIST is further reduced to compensate
some of the effects.
Totals from 28922 (19.26% of 150170) affected shaders: (GFX10.3)
VGPRs: 1546568 -> 1523072 (-1.52%); split: -1.52%, +0.00%
CodeSize: 117524892 -> 117510288 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 605554 -> 611120 (+0.92%)
Instrs: 22292568 -> 22291927 (-0.00%); split: -0.10%, +0.09%
Latency: 488975399 -> 490230904 (+0.26%); split: -0.06%, +0.32%
InvThroughput: 117842300 -> 116521653 (-1.12%); split: -1.15%, +0.03%
VClause: 541550 -> 522464 (-3.52%); split: -9.73%, +6.20%
SClause: 718185 -> 718298 (+0.02%); split: -0.00%, +0.02%
Copies: 1420603 -> 1386949 (-2.37%); split: -2.64%, +0.27%
Branches: 559559 -> 559278 (-0.05%); split: -0.06%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896 >
2021-09-27 14:29:16 +00:00
Daniel Schürmann
7e1faf9349
aco: make clause-forming depend on the number of moved instructions
...
This allows more aggressive clause-forming in presence of
larger def-use distances. To compensate for the effect,
VMEM_CLAUSE_MAX_GRAB_DIST was decreased.
Totals from 5788 (3.85% of 150170) affected shaders: (GFX10.3)
VGPRs: 483960 -> 475272 (-1.80%); split: -1.82%, +0.02%
CodeSize: 59661240 -> 59669084 (+0.01%); split: -0.01%, +0.02%
MaxWaves: 70408 -> 71450 (+1.48%); split: +1.51%, -0.03%
Instrs: 11222417 -> 11224479 (+0.02%); split: -0.01%, +0.03%
Latency: 349397104 -> 349298602 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 88584832 -> 87762262 (-0.93%); split: -0.93%, +0.00%
VClause: 168905 -> 177089 (+4.85%); split: -0.48%, +5.32%
SClause: 375795 -> 375767 (-0.01%); split: -0.01%, +0.01%
Copies: 840298 -> 840231 (-0.01%); split: -0.04%, +0.03%
Branches: 373265 -> 373278 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896 >
2021-09-27 14:29:16 +00:00
Daniel Schürmann
903999c119
aco: stop scheduling if clause-forming fails
...
This avoids unintended reordering of VMEM instructions.
It is also highly unlikely that we find more independent
instructions before previous clause-related instructions.
Totals from 1921 (1.28% of 150170) affected shaders: (GFX10.3)
VGPRs: 103832 -> 103736 (-0.09%); split: -0.10%, +0.01%
CodeSize: 8695560 -> 8706000 (+0.12%); split: -0.03%, +0.15%
Instrs: 1643752 -> 1646349 (+0.16%); split: -0.04%, +0.20%
Latency: 26755527 -> 26614645 (-0.53%); split: -0.67%, +0.14%
InvThroughput: 7226604 -> 7204809 (-0.30%); split: -0.39%, +0.08%
VClause: 46536 -> 46201 (-0.72%); split: -0.81%, +0.09%
SClause: 47910 -> 47769 (-0.29%); split: -0.43%, +0.14%
Copies: 94647 -> 94558 (-0.09%); split: -0.26%, +0.17%
Branches: 36843 -> 36847 (+0.01%); split: -0.00%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896 >
2021-09-27 14:29:16 +00:00
Rhys Perry
b23a9dd1f6
aco/scheduler: allow moving down VMEM stores to below VMEM loads
...
fossil-db (Vega10):
Totals from 93 (0.06% of 150305) affected shaders:
SGPRs: 4832 -> 4768 (-1.32%)
VGPRs: 4084 -> 4144 (+1.47%)
CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47%
MaxWaves: 589 -> 580 (-1.53%)
Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61%
Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10%
InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80%
VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57%
SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16%
Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97%
fossil-db (Sienna Cichlid):
Totals from 88 (0.06% of 150170) affected shaders:
VGPRs: 3840 -> 3872 (+0.83%)
CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23%
Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35%
Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61%
InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35%
VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81%
SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12%
Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211 >
2021-08-23 16:48:31 +00:00
Timur Kristóf
8341af5109
radv, aco, ac/nir: Tweak position export scheduling for NGG culling.
...
The result is about +5-ish fps in Doom Eternal.
It turns out that the location of position exports matters more
than we thought, and it's actually better to keep them at the bottom
for culling shaders rather than schedule it up to the top.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525 >
2021-07-13 23:56:33 +00:00
Daniel Schürmann
1e2639026f
aco: Format.
...
Manually adjusted some comments for more intuitive line breaks.
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258 >
2021-07-12 21:27:31 +00:00
Daniel Schürmann
0eea0e55ad
aco: add 'common/' and 'llvm/' prefix to #includes
...
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271 >
2021-07-12 12:09:31 +00:00
Daniel Schürmann
59fdaa1985
aco: reorder and cleanup #includes
...
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271 >
2021-07-12 12:09:31 +00:00
Rhys Perry
d64f5a3f9d
aco: move VMEM instructions below descriptor loads
...
This is to prevent sequences like:
a = descriptor_load()
vmem(a)
b = descriptor_load()
vmem(b)
and instead create:
a = descriptor_load()
b = descriptor_load()
vmem(a)
vmem(b)
fossil-db (GFX10.3):
Totals from 114521 (78.30% of 146267) affected shaders:
VGPRs: 4540352 -> 4540216 (-0.00%); split: -0.03%, +0.02%
CodeSize: 289864228 -> 289114652 (-0.26%); split: -0.29%, +0.03%
MaxWaves: 2940234 -> 2940338 (+0.00%); split: +0.00%, -0.00%
Instrs: 55112418 -> 54919910 (-0.35%); split: -0.38%, +0.03%
Latency: 956528393 -> 954682011 (-0.19%); split: -0.24%, +0.05%
InvThroughput: 229280830 -> 229238107 (-0.02%); split: -0.04%, +0.02%
VClause: 1141832 -> 1139002 (-0.25%); split: -0.63%, +0.38%
SClause: 2357840 -> 2225008 (-5.63%); split: -6.01%, +0.38%
Copies: 3316040 -> 3331519 (+0.47%); split: -0.31%, +0.77%
Branches: 1187212 -> 1186919 (-0.02%); split: -0.03%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489 >
2021-06-14 15:47:37 +00:00