Daniel Schürmann
8b793f9567
aco: remove dead code for the handling of exec temporaries
...
Totals from 26026 (18.67% of 139391) affected shaders (Navi10):
PreSGPRs: 370993 -> 326177 (-12.08%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870 >
2021-02-12 22:41:31 +00:00
Daniel Schürmann
b98a4d4dd7
aco: refactor GPR limit calculation
...
This patch delays the calculation of GPR limits in order to
precisely incorporate extra registers (VCC etc.) and shared VGPRs.
Additionally, the allocation granularity is used to set the config.
This has some effect on the reported SGPR stats.
Totals (Navi10):
SGPRs: 6971787 -> 17753642 (+154.65%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921 >
2021-02-12 19:00:18 +00:00
Rhys Perry
e115b01948
aco: return references in instruction cast methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:33 +00:00
Rhys Perry
70dbcfa1c9
aco: use instruction cast methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:32 +00:00
Rhys Perry
441ead5fb3
aco: remove Format::{VOP3A,VOP3B}
...
These are really the same as Format::VOP3.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:32 +00:00
Daniel Schürmann
7dcb9a0d8c
aco/optimizer: convert extract_vector with index 0 into parallelcopies if possible
...
Totals from 273 (0.20% of 139391) affected shaders (Navi10):
VGPRs: 11600 -> 11792 (+1.66%)
CodeSize: 1389304 -> 1383152 (-0.44%); split: -0.53%, +0.08%
MaxWaves: 3848 -> 3752 (-2.49%)
Instrs: 240228 -> 239478 (-0.31%); split: -0.37%, +0.06%
Cycles: 20637708 -> 20580024 (-0.28%); split: -0.46%, +0.18%
VMEM: 39164 -> 38831 (-0.85%); split: +0.06%, -0.91%
SMEM: 21743 -> 22204 (+2.12%)
VClause: 4787 -> 4783 (-0.08%)
Copies: 39057 -> 38308 (-1.92%); split: -2.28%, +0.37%
Branches: 6556 -> 6557 (+0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260 >
2021-01-21 11:05:36 +00:00
Rhys Perry
c353895c92
aco: use non-sequential addressing
...
fossil-db (GFX10.3):
Totals from 70493 (50.57% of 139391) affected shaders:
SGPRs: 4232624 -> 4231808 (-0.02%); split: -0.09%, +0.07%
VGPRs: 2831772 -> 2764740 (-2.37%); split: -2.53%, +0.17%
CodeSize: 225584412 -> 225048740 (-0.24%); split: -0.44%, +0.21%
MaxWaves: 875319 -> 878837 (+0.40%); split: +0.44%, -0.04%
Instrs: 43157803 -> 42496421 (-1.53%); split: -1.54%, +0.01%
Cycles: 1656380132 -> 1641532056 (-0.90%); split: -0.94%, +0.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523 >
2021-01-20 16:46:54 +00:00
Rhys Perry
faf3e9a27f
aco: move VADDR to the end of the operand list
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523 >
2021-01-20 16:46:54 +00:00
Rhys Perry
4c1953a9b8
aco: add test for incorrect convert_to_SDWA() check
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8577 >
2021-01-19 15:38:56 +00:00
Rhys Perry
8301d483ff
aco/tests: don't rely on argument evaluation order
...
The argument evaluation order is implementation-defined and affects the
order the instructions are inserted.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3938
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7945 >
2021-01-13 13:04:26 +00:00
Rhys Perry
a502aa7b04
aco: form sparse load clauses
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775 >
2021-01-08 14:27:07 +00:00
Tony Wasserka
b841b4fde8
aco: Add tests for subdword register allocation
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
6a246f5c6d
aco/tests: Fix deadlock for too large test lists
...
The write() to the communication pipe shared with check_output.py would block
for large test output streams since the pipe's consumer wouldn't be launched
until the write already completed.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
a240341ec9
aco/tests: Allow specifiying the test subvariant in setup_cs
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
05ca6758cb
aco/tests: Fix GFX10_3 being printed as gfx11
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Rhys Perry
8451911156
aco: test self-intersecting copies when src=higher
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798 >
2020-12-04 14:44:48 +00:00
Rhys Perry
4eac442217
aco/ngg: fix division-by-zero in assertion
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7576 >
2020-11-25 13:41:04 +00:00
Rhys Perry
37a2c9ace6
aco: fix GS with no outputs
...
With NGG, ngg_gs_known_vtxcnt[0] would be false and ngg_gs_finale() would
assert.
With legacy GS, I don't know why the assertion was there.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7576 >
2020-11-25 13:41:04 +00:00
Samuel Pitoiset
8e961b91c3
aco: optimize v_add+v_lshlrev to v_mad_u32_u24 on GFX6-8
...
This optimizes v_add(c, v_lshlrev(a, b)) to v_mad_u32_u24(b, 1<<a, c)
if 'a' is a constant (less than or equal to 6 to avoid creating
literals) and 'b' known to be a 16-bit or a 24-bit value.
On GFX9+, this is already optimized to v_lshl_add_u32.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673 >
2020-11-23 18:34:40 +00:00
Samuel Pitoiset
d9e4504b0d
aco: optimize v_add+s_lshl to v_mad_u32_u24 on GFX6-8
...
This optimizes v_add(c, s_lshl(a, b)) to v_mad_u32_u24(a, 1<<b, c)
if 'b' is a constant (less than or equal to 6 to avoid creating
literals) and 'a' known to be a 16-bit or a 24-bit value.
On GFX9+, this is already optimized to v_lshl_add_u32.
fossils-db (Polaris10):
Totals from 1916 (1.36% of 140385) affected shaders:
SGPRs: 88322 -> 87780 (-0.61%); split: -0.66%, +0.05%
CodeSize: 7852668 -> 7851800 (-0.01%); split: -0.01%, +0.00%
Instrs: 1533965 -> 1530459 (-0.23%); split: -0.23%, +0.00%
Cycles: 57001852 -> 56983244 (-0.03%); split: -0.03%, +0.00%
VMEM: 372561 -> 371733 (-0.22%); split: +0.03%, -0.25%
SMEM: 108859 -> 103711 (-4.73%); split: +0.23%, -4.96%
VClause: 37231 -> 37204 (-0.07%)
SClause: 58116 -> 58086 (-0.05%); split: -0.06%, +0.01%
Copies: 199953 -> 199931 (-0.01%); split: -0.03%, +0.02%
Branches: 63478 -> 63477 (-0.00%)
PreSGPRs: 61818 -> 61816 (-0.00%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673 >
2020-11-23 18:34:40 +00:00
Samuel Pitoiset
05fd780012
aco/tests: extend the optimize.add_lshl tests to GFX8
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673 >
2020-11-23 18:34:40 +00:00
Rhys Perry
14186a1b84
aco/tests: add Builder::v_mul_imm() tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:32 +00:00
Tony Wasserka
a978602d1f
aco/tests: Fix -Wunused warnings in release mode
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430 >
2020-11-20 09:29:19 +00:00
Tony Wasserka
5231c788ff
aco/tests: Fix -Wshadow warnings
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430 >
2020-11-20 09:29:19 +00:00
Rhys Perry
631e18d427
aco: create v_mad_u32_u24
...
fossil-db (Navi):
Totals from 849 (0.61% of 138791) affected shaders:
SGPRs: 38528 -> 38544 (+0.04%)
VGPRs: 39860 -> 39856 (-0.01%)
CodeSize: 2701880 -> 2702016 (+0.01%)
MaxWaves: 9148 -> 9150 (+0.02%)
Instrs: 509864 -> 509821 (-0.01%); split: -0.01%, +0.00%
Cycles: 3400124 -> 3399628 (-0.01%); split: -0.02%, +0.00%
VMEM: 262757 -> 262672 (-0.03%)
SMEM: 59710 -> 59704 (-0.01%)
Copies: 44461 -> 44466 (+0.01%)
fossil-db (Polaris):
Totals from 1487 (1.06% of 140385) affected shaders:
SGPRs: 54688 -> 55840 (+2.11%)
CodeSize: 2725608 -> 2725720 (+0.00%); split: -0.01%, +0.01%
Instrs: 521394 -> 517710 (-0.71%)
Cycles: 18474108 -> 18410964 (-0.34%)
VMEM: 436992 -> 431028 (-1.36%); split: +0.06%, -1.43%
SMEM: 124503 -> 122564 (-1.56%); split: +0.45%, -2.00%
VClause: 21972 -> 22015 (+0.20%); split: -0.12%, +0.31%
SClause: 14274 -> 14287 (+0.09%)
Copies: 44407 -> 44411 (+0.01%); split: -0.02%, +0.03%
PreSGPRs: 34318 -> 34321 (+0.01%); split: -0.00%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7639 >
2020-11-19 12:36:17 +00:00
Samuel Pitoiset
0fcd379184
aco: fix combining max(-min(a, b), c) if a or b uses the neg modifier
...
No fossils-db changes.
Cc: 20.2, 20.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7657 >
2020-11-18 08:00:28 +00:00
Rhys Perry
867323379e
aco: don't use SMEM for SSBO stores
...
fossil-db (Navi):
Totals from 70 (0.05% of 138791) affected shaders:
SGPRs: 2324 -> 2097 (-9.77%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 157872 -> 154836 (-1.92%); split: -1.93%, +0.01%
MaxWaves: 1288 -> 1260 (-2.17%)
Instrs: 29730 -> 29108 (-2.09%); split: -2.13%, +0.04%
Cycles: 394944 -> 391280 (-0.93%); split: -0.94%, +0.01%
VMEM: 5288 -> 5695 (+7.70%); split: +11.97%, -4.27%
SMEM: 2680 -> 2444 (-8.81%); split: +1.34%, -10.15%
VClause: 291 -> 502 (+72.51%)
SClause: 1176 -> 918 (-21.94%)
Copies: 3549 -> 3517 (-0.90%); split: -1.80%, +0.90%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1675 -> 1491 (-10.99%)
PreVGPRs: 1101 -> 1223 (+11.08%)
Totals from 70 (0.05% of 139517) affected shaders (RAVEN):
SGPRs: 2368 -> 2121 (-10.43%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 156664 -> 153252 (-2.18%)
MaxWaves: 636 -> 622 (-2.20%)
Instrs: 29968 -> 29226 (-2.48%)
Cycles: 398284 -> 393492 (-1.20%)
VMEM: 5544 -> 5930 (+6.96%); split: +11.72%, -4.76%
SMEM: 2752 -> 2502 (-9.08%); split: +1.20%, -10.28%
VClause: 292 -> 504 (+72.60%)
SClause: 1236 -> 940 (-23.95%)
Copies: 3907 -> 3852 (-1.41%); split: -2.20%, +0.79%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1671 -> 1487 (-11.01%)
PreVGPRs: 1102 -> 1225 (+11.16%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6143 >
2020-11-16 15:52:22 +00:00
Rhys Perry
2736f97496
aco/tests: add output modifier tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605 >
2020-11-16 12:58:44 +00:00
Rhys Perry
4d727ee913
aco/tests: add some more clamp combining tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
15d08a06e2
aco/tests: expand optimize.const_comparison_ordering tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
6bf3c606be
aco/tests: initialize debug function
...
aco_log() will print the message to stderr.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
966732e8ca
aco: disallow various v_add_u32 opts if modifiers are used
...
Check for clamp, SDWA or DPP. The optimization isn't possible with SDWA
and DPP, so it would have been skipped anyway. Doing any of these with a
clamp modifier present would be incorrect.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
91ffeed88a
aco: fix combine_constant_comparison_ordering() NaN check with 16/64-bit
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
d4c821da0e
aco: don't combine precise max(min()) to med3
...
fossil-db (Navi):
Totals from 241 (0.18% of 137413) affected shaders:
CodeSize: 856280 -> 856308 (+0.00%); split: -0.00%, +0.00%
Instrs: 164220 -> 164514 (+0.18%); split: -0.00%, +0.18%
Cycles: 1031916 -> 1033092 (+0.11%); split: -0.00%, +0.11%
VMEM: 77855 -> 78514 (+0.85%); split: +0.85%, -0.01%
SMEM: 20501 -> 20593 (+0.45%); split: +0.46%, -0.01%
Copies: 9791 -> 9790 (-0.01%); split: -0.03%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Samuel Pitoiset
68488fd383
aco: optimize v_add(v_bcnt(a, 0), b) to v_bcnt(a, b)
...
The first operand of v_bcnt should always be a VGPR because if it's
a SGPR, isel selects s_bcnt1 but I added a sanity check to prevent
any problems.
fossils-db (Vega10):
Totals from 23 (0.02% of 139517) affected shaders:
CodeSize: 106828 -> 106664 (-0.15%)
Instrs: 20242 -> 20201 (-0.20%)
Cycles: 213112 -> 211352 (-0.83%)
VMEM: 3200 -> 3184 (-0.50%)
SMEM: 928 -> 927 (-0.11%)
Helps Control, Assassins Creeds Origins and Youngblood.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7568 >
2020-11-13 07:28:50 +00:00
Samuel Pitoiset
db9d13b4ff
aco: optimize v_add_u32(v_mul_lo_u16) -> v_mad_u32_u16
...
fossils-db (Vega10):
Totals from 779 (0.56% of 139517) affected shaders:
CodeSize: 1187928 -> 1187508 (-0.04%); split: -0.04%, +0.00%
Instrs: 247353 -> 244608 (-1.11%); split: -1.11%, +0.00%
Cycles: 1127472 -> 1116420 (-0.98%); split: -0.98%, +0.00%
VMEM: 139720 -> 138297 (-1.02%); split: +0.00%, -1.02%
SMEM: 51069 -> 50735 (-0.65%); split: +0.04%, -0.69%
Copies: 11548 -> 11547 (-0.01%); split: -0.03%, +0.03%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
bbdafd6ab3
aco: optimize v_mad_u32_u16 with acc=0 to v_mul_u32_u24
...
v_mad_u32_u16 will be selected by isel to keep the range analysis
information around and to combine more v_add_u32+v_mad_u32_u16
together. When it's not possible to optimize that pattern, fallback
to v_mul_u32_u24 which is VOP2 instead of VOP3.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
dfd878f2ba
aco: combine more s_add+s_lshl to s_lshl<n>_add by ignoring uses
...
Even if the s_lshl is used more that once, it can still be combined.
fossils-db (Vega10):
Totals from 771 (0.55% of 139517) affected shaders:
SGPRs: 46216 -> 46304 (+0.19%); split: -0.02%, +0.21%
VGPRs: 38488 -> 38464 (-0.06%)
SpillSGPRs: 1894 -> 1875 (-1.00%); split: -3.12%, +2.11%
CodeSize: 5681856 -> 5679844 (-0.04%); split: -0.07%, +0.03%
MaxWaves: 5320 -> 5323 (+0.06%)
Instrs: 1093960 -> 1093474 (-0.04%); split: -0.09%, +0.05%
Cycles: 47198380 -> 47258872 (+0.13%); split: -0.06%, +0.19%
VMEM: 176036 -> 176283 (+0.14%); split: +0.16%, -0.02%
SMEM: 53397 -> 53255 (-0.27%); split: +0.03%, -0.30%
VClause: 23156 -> 23152 (-0.02%); split: -0.03%, +0.01%
SClause: 35716 -> 35726 (+0.03%); split: -0.00%, +0.03%
Copies: 139395 -> 139871 (+0.34%); split: -0.04%, +0.39%
Branches: 33808 -> 33798 (-0.03%); split: -0.04%, +0.01%
PreSGPRs: 35381 -> 35331 (-0.14%); split: -0.20%, +0.06%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539 >
2020-11-12 07:36:07 +00:00
Samuel Pitoiset
64748a2be2
aco/tests: add some tests for combining s_add+s_lshl to s_lshl<n>_add
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539 >
2020-11-12 07:36:07 +00:00
Samuel Pitoiset
ec347ee9bc
aco: fix combining add/sub to b2i if a new dest needs to be allocated
...
The uses vector needs to be expanded to avoid out of bounds access
and to make sure the number of uses is initialized to 0.
This fixes combining more v_and(a, v_subbrev_co_u32).
fossilds-db (Vega10):
Totals from 4574 (3.28% of 139517) affected shaders:
SGPRs: 291625 -> 292217 (+0.20%); split: -0.01%, +0.21%
VGPRs: 276368 -> 276188 (-0.07%); split: -0.07%, +0.01%
SpillSGPRs: 455 -> 533 (+17.14%)
SpillVGPRs: 76 -> 78 (+2.63%)
CodeSize: 23327500 -> 23304152 (-0.10%); split: -0.17%, +0.07%
MaxWaves: 22044 -> 22066 (+0.10%)
Instrs: 4583064 -> 4576301 (-0.15%); split: -0.15%, +0.01%
Cycles: 47925276 -> 47871968 (-0.11%); split: -0.13%, +0.01%
VMEM: 1599363 -> 1597473 (-0.12%); split: +0.08%, -0.19%
SMEM: 331461 -> 331126 (-0.10%); split: +0.08%, -0.18%
VClause: 80639 -> 80696 (+0.07%); split: -0.02%, +0.09%
SClause: 155992 -> 155993 (+0.00%); split: -0.02%, +0.02%
Copies: 333482 -> 333318 (-0.05%); split: -0.12%, +0.07%
Branches: 70967 -> 70968 (+0.00%)
PreSGPRs: 187078 -> 187711 (+0.34%); split: -0.01%, +0.35%
PreVGPRs: 244918 -> 244785 (-0.05%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7513 >
2020-11-10 10:25:00 +01:00
Samuel Pitoiset
bae5487659
aco: optimize v_and(a, v_subbrev_co(0, 0, vcc)) -> v_cndmask(0, a, vcc)
...
fossils-db (Vega10):
Totals from 7786 (5.70% of 136546) affected shaders:
SGPRs: 517778 -> 518626 (+0.16%); split: -0.01%, +0.17%
VGPRs: 488252 -> 488084 (-0.03%); split: -0.04%, +0.01%
CodeSize: 42282068 -> 42250152 (-0.08%); split: -0.16%, +0.09%
MaxWaves: 35697 -> 35716 (+0.05%); split: +0.06%, -0.01%
Instrs: 8319309 -> 8304792 (-0.17%); split: -0.18%, +0.00%
Cycles: 88619440 -> 88489636 (-0.15%); split: -0.16%, +0.01%
VMEM: 2788278 -> 2780431 (-0.28%); split: +0.06%, -0.35%
SMEM: 570364 -> 569370 (-0.17%); split: +0.12%, -0.30%
VClause: 144906 -> 144908 (+0.00%); split: -0.05%, +0.05%
SClause: 302143 -> 302055 (-0.03%); split: -0.04%, +0.01%
Copies: 579124 -> 578779 (-0.06%); split: -0.14%, +0.08%
PreSGPRs: 327695 -> 328845 (+0.35%); split: -0.00%, +0.35%
PreVGPRs: 434280 -> 433954 (-0.08%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7438 >
2020-11-09 17:36:42 +00:00
Rhys Perry
ecdcf22d5d
aco: switch aco_print_asm to a FILE *
...
Streams are really stateful and (IMO) difficult to read for non-trivial
usage. This is also more consistent with NIR and the rest of ACO.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166 >
2020-10-28 17:32:32 +00:00
Rhys Perry
e54c111c45
aco: always use p_parallelcopy for pre-RA copies
...
Most fossil-db changes are because literals are applied earlier
(in label_instruction), so use counts are more accurate and more literals
are applied.
fossil-db (Navi):
Totals from 79551 (57.89% of 137413) affected shaders:
SGPRs: 4549610 -> 4542802 (-0.15%); split: -0.19%, +0.04%
VGPRs: 3326764 -> 3324172 (-0.08%); split: -0.10%, +0.03%
SpillSGPRs: 38886 -> 34562 (-11.12%); split: -11.14%, +0.02%
CodeSize: 240143456 -> 240001008 (-0.06%); split: -0.11%, +0.05%
MaxWaves: 1078919 -> 1079281 (+0.03%); split: +0.04%, -0.01%
Instrs: 46627073 -> 46528490 (-0.21%); split: -0.22%, +0.01%
fossil-db (Polaris):
Totals from 98463 (70.90% of 138881) affected shaders:
SGPRs: 5164689 -> 5164353 (-0.01%); split: -0.02%, +0.01%
VGPRs: 3920936 -> 3921856 (+0.02%); split: -0.00%, +0.03%
SpillSGPRs: 56298 -> 52259 (-7.17%); split: -7.22%, +0.04%
CodeSize: 258680092 -> 258692712 (+0.00%); split: -0.02%, +0.03%
MaxWaves: 620863 -> 620823 (-0.01%); split: +0.00%, -0.01%
Instrs: 50776289 -> 50757577 (-0.04%); split: -0.04%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216 >
2020-10-27 15:24:38 +00:00
Daniel Schürmann
cf083f1d02
aco: use do_pack() for self-intersecting operations.
...
This improves the code for GFX8+, but is slightly
worse for GFX6_7.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7189 >
2020-10-26 12:21:13 +00:00
Daniel Schürmann
d96f387e7a
aco: improve code sequences for 16bit packing
...
This includes using alignbyte for GFX6 and GFX7,
and 32-bit instructions for GFX8.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7189 >
2020-10-26 12:21:13 +00:00
Daniel Schürmann
40bfb08828
aco: refactor GFX6_7 subdword copy lowering
...
The new code uses alignbyte which leads
to shorter code and preserves the operand's
registers.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7189 >
2020-10-26 12:21:13 +00:00
Rhys Perry
1a652244e4
aco: implement 16-bit literals
...
We can copy any value into a 16-bit subregister with a 3 dword
v_pack_b32_f16 on GFX10 or a v_and_b32+v_or_b32 on GFX9.
Because the generated code can depend on the register assignment and to
improve constant propagation, Builder::copy creates a p_create_vector in
the case of sub-dword literals.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7111 >
2020-10-15 11:33:42 +00:00
Samuel Pitoiset
18fd6274b2
aco/tests: add disassembler tests to reproduce the add3+clamp crash
...
Like some other v_add instructions, LLVM fails to disassemble
v_add3_u32 + clamp.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6961 >
2020-10-02 14:21:33 +02:00
Rhys Perry
7bfaeaa590
aco: pass -fno-exceptions and -fno-rtti
...
We don't use exceptions or RTTI at all, so pass this flag to the compiler
to allow it to create better code.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6315 >
2020-09-15 14:36:57 +00:00
Rhys Perry
eb3c16e1f8
aco/tests: add tests for long jumps
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212 >
2020-08-26 13:26:58 +00:00