Commit graph

1085 commits

Author SHA1 Message Date
Samuel Pitoiset
be600b009a aco: add a new Operand flag to indicate that is 24-bit
To indicate that the upper 8-bits are always 0 to optimize more MADs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Samuel Pitoiset
05fd780012 aco/tests: extend the optimize.add_lshl tests to GFX8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Rhys Perry
14186a1b84 aco/tests: add Builder::v_mul_imm() tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:32 +00:00
Rhys Perry
aab507c6b0 aco: use v_mul_imm() for some nir_op_imul
Some of the optimizations v_mul_imm() does are complex and very
target-specific and not suitable to do in ACO's optimizer.

fossil-db (Vega):
Totals from 49135 (35.76% of 137413) affected shaders:
SGPRs: 2698547 -> 2696103 (-0.09%); split: -0.16%, +0.07%
VGPRs: 2301412 -> 2301600 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 51520 -> 51519 (-0.00%)
CodeSize: 168798572 -> 169164012 (+0.22%); split: -0.00%, +0.22%
MaxWaves: 306553 -> 306539 (-0.00%); split: +0.00%, -0.01%
Instrs: 33423982 -> 33506598 (+0.25%); split: -0.00%, +0.25%
Cycles: 1807800632 -> 1804101376 (-0.20%); split: -0.20%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:32 +00:00
Rhys Perry
02c5519e6c aco: try harder to not create v_mul_lo_u32
fossil-db (Vega):
Totals from 4 (0.00% of 137413) affected shaders:
CodeSize: 13708 -> 13716 (+0.06%)
Instrs: 2742 -> 2744 (+0.07%)
Cycles: 24348 -> 24236 (-0.46%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Rhys Perry
8ca23bcf39 aco: copy constant to sgpr in Builder::v_mul_imm()
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Rhys Perry
756bb29391 aco: create vgpr constant copies using v_bfrev_b32
Looks like this worked once but broke at some point.

fossil-db (Vega):
Totals from 577 (0.42% of 137413) affected shaders:
CodeSize: 3822052 -> 3818720 (-0.09%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Rhys Perry
4d93fc25f0 aco: count v_mul_lo_u32 as 16 cycles
There are more, but this is a common one.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Rhys Perry
70d665d981 aco: don't create v_mov_b32 in v_mul_imm()
We switched to p_parallelcopy for everything else.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Tony Wasserka
b4d6131c15 radv,aco: Compile with -Wshadow when available
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Tony Wasserka
a978602d1f aco/tests: Fix -Wunused warnings in release mode
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Tony Wasserka
5231c788ff aco/tests: Fix -Wshadow warnings
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Tony Wasserka
2bb8874320 aco: Fix -Wshadow warnings
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Rhys Perry
631e18d427 aco: create v_mad_u32_u24
fossil-db (Navi):
Totals from 849 (0.61% of 138791) affected shaders:
SGPRs: 38528 -> 38544 (+0.04%)
VGPRs: 39860 -> 39856 (-0.01%)
CodeSize: 2701880 -> 2702016 (+0.01%)
MaxWaves: 9148 -> 9150 (+0.02%)
Instrs: 509864 -> 509821 (-0.01%); split: -0.01%, +0.00%
Cycles: 3400124 -> 3399628 (-0.01%); split: -0.02%, +0.00%
VMEM: 262757 -> 262672 (-0.03%)
SMEM: 59710 -> 59704 (-0.01%)
Copies: 44461 -> 44466 (+0.01%)

fossil-db (Polaris):
Totals from 1487 (1.06% of 140385) affected shaders:
SGPRs: 54688 -> 55840 (+2.11%)
CodeSize: 2725608 -> 2725720 (+0.00%); split: -0.01%, +0.01%
Instrs: 521394 -> 517710 (-0.71%)
Cycles: 18474108 -> 18410964 (-0.34%)
VMEM: 436992 -> 431028 (-1.36%); split: +0.06%, -1.43%
SMEM: 124503 -> 122564 (-1.56%); split: +0.45%, -2.00%
VClause: 21972 -> 22015 (+0.20%); split: -0.12%, +0.31%
SClause: 14274 -> 14287 (+0.09%)
Copies: 44407 -> 44411 (+0.01%); split: -0.02%, +0.03%
PreSGPRs: 34318 -> 34321 (+0.01%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7639>
2020-11-19 12:36:17 +00:00
Samuel Pitoiset
0fcd379184 aco: fix combining max(-min(a, b), c) if a or b uses the neg modifier
No fossils-db changes.

Cc: 20.2, 20.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7657>
2020-11-18 08:00:28 +00:00
Rhys Perry
867323379e aco: don't use SMEM for SSBO stores
fossil-db (Navi):
Totals from 70 (0.05% of 138791) affected shaders:
SGPRs: 2324 -> 2097 (-9.77%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 157872 -> 154836 (-1.92%); split: -1.93%, +0.01%
MaxWaves: 1288 -> 1260 (-2.17%)
Instrs: 29730 -> 29108 (-2.09%); split: -2.13%, +0.04%
Cycles: 394944 -> 391280 (-0.93%); split: -0.94%, +0.01%
VMEM: 5288 -> 5695 (+7.70%); split: +11.97%, -4.27%
SMEM: 2680 -> 2444 (-8.81%); split: +1.34%, -10.15%
VClause: 291 -> 502 (+72.51%)
SClause: 1176 -> 918 (-21.94%)
Copies: 3549 -> 3517 (-0.90%); split: -1.80%, +0.90%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1675 -> 1491 (-10.99%)
PreVGPRs: 1101 -> 1223 (+11.08%)

Totals from 70 (0.05% of 139517) affected shaders (RAVEN):
SGPRs: 2368 -> 2121 (-10.43%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 156664 -> 153252 (-2.18%)
MaxWaves: 636 -> 622 (-2.20%)
Instrs: 29968 -> 29226 (-2.48%)
Cycles: 398284 -> 393492 (-1.20%)
VMEM: 5544 -> 5930 (+6.96%); split: +11.72%, -4.76%
SMEM: 2752 -> 2502 (-9.08%); split: +1.20%, -10.28%
VClause: 292 -> 504 (+72.60%)
SClause: 1236 -> 940 (-23.95%)
Copies: 3907 -> 3852 (-1.41%); split: -2.20%, +0.79%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1671 -> 1487 (-11.01%)
PreVGPRs: 1102 -> 1225 (+11.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6143>
2020-11-16 15:52:22 +00:00
Rhys Perry
2736f97496 aco/tests: add output modifier tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605>
2020-11-16 12:58:44 +00:00
Rhys Perry
0c522d3aa7 aco: fix fp16 *0.5 omod
We were testing for -0.5 instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 1210e0bd62 ("aco: create 16-bit input and output modifiers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605>
2020-11-16 12:58:44 +00:00
Rhys Perry
558daa73f9 aco: disable omod if the sign of zeros should be preserved
The RDNA ISA doc says that omod doesn't preserve -0.0 in 6.2.2. LLVM
appears to always disable omod in this situation, but clamp is unaffected.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: df645fa369 ("aco: implement VK_KHR_shader_float_controls")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605>
2020-11-16 12:58:44 +00:00
Samuel Pitoiset
2f5b3ac2f8 aco: remove v_{add,sub,subrev}_u32 on GFX8
These opcodes are never used and they always write the carry-out
according to the GCN3 ISA documentation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7569>
2020-11-16 10:56:13 +00:00
Rhys Perry
4d727ee913 aco/tests: add some more clamp combining tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
15d08a06e2 aco/tests: expand optimize.const_comparison_ordering tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
6bf3c606be aco/tests: initialize debug function
aco_log() will print the message to stderr.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
966732e8ca aco: disallow various v_add_u32 opts if modifiers are used
Check for clamp, SDWA or DPP. The optimization isn't possible with SDWA
and DPP, so it would have been skipped anyway. Doing any of these with a
clamp modifier present would be incorrect.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
91ffeed88a aco: fix combine_constant_comparison_ordering() NaN check with 16/64-bit
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Rhys Perry
d4c821da0e aco: don't combine precise max(min()) to med3
fossil-db (Navi):
Totals from 241 (0.18% of 137413) affected shaders:
CodeSize: 856280 -> 856308 (+0.00%); split: -0.00%, +0.00%
Instrs: 164220 -> 164514 (+0.18%); split: -0.00%, +0.18%
Cycles: 1031916 -> 1033092 (+0.11%); split: -0.00%, +0.11%
VMEM: 77855 -> 78514 (+0.85%); split: +0.85%, -0.01%
SMEM: 20501 -> 20593 (+0.45%); split: +0.46%, -0.01%
Copies: 9791 -> 9790 (-0.01%); split: -0.03%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045>
2020-11-13 12:34:27 +00:00
Samuel Pitoiset
68488fd383 aco: optimize v_add(v_bcnt(a, 0), b) to v_bcnt(a, b)
The first operand of v_bcnt should always be a VGPR because if it's
a SGPR, isel selects s_bcnt1 but I added a sanity check to prevent
any problems.

fossils-db (Vega10):
Totals from 23 (0.02% of 139517) affected shaders:
CodeSize: 106828 -> 106664 (-0.15%)
Instrs: 20242 -> 20201 (-0.20%)
Cycles: 213112 -> 211352 (-0.83%)
VMEM: 3200 -> 3184 (-0.50%)
SMEM: 928 -> 927 (-0.11%)

Helps Control, Assassins Creeds Origins and Youngblood.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7568>
2020-11-13 07:28:50 +00:00
Samuel Pitoiset
db9d13b4ff aco: optimize v_add_u32(v_mul_lo_u16) -> v_mad_u32_u16
fossils-db (Vega10):
Totals from 779 (0.56% of 139517) affected shaders:
CodeSize: 1187928 -> 1187508 (-0.04%); split: -0.04%, +0.00%
Instrs: 247353 -> 244608 (-1.11%); split: -1.11%, +0.00%
Cycles: 1127472 -> 1116420 (-0.98%); split: -0.98%, +0.00%
VMEM: 139720 -> 138297 (-1.02%); split: +0.00%, -1.02%
SMEM: 51069 -> 50735 (-0.65%); split: +0.04%, -0.69%
Copies: 11548 -> 11547 (-0.01%); split: -0.03%, +0.03%

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
20e48551ac aco: select v_mul_lo_u16 for 16-bit multiplications that can't overflow
Only on GFX8-9 because GFX10 doesn't zero the upper 16 bits.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
7028e9875f aco: select v_mad_u32_u16 for 16-bit multiplications on GFX9+
No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
bbdafd6ab3 aco: optimize v_mad_u32_u16 with acc=0 to v_mul_u32_u24
v_mad_u32_u16 will be selected by isel to keep the range analysis
information around and to combine more v_add_u32+v_mad_u32_u16
together. When it's not possible to optimize that pattern, fallback
to v_mul_u32_u24 which is VOP2 instead of VOP3.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
0ea763a727 aco: add a new Operand flag to indicate that is 16-bit
To indicate that the upper 16-bits are always 0 and that optimizing
v_mad_u32_u16 to v_mul_u32_u24 is valid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
bda35ae6b9 aco: introduce a generic label for labelling instructions
When one instruction doesn't fit into the existing labels, use
the generic one.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
dfd878f2ba aco: combine more s_add+s_lshl to s_lshl<n>_add by ignoring uses
Even if the s_lshl is used more that once, it can still be combined.

fossils-db (Vega10):
Totals from 771 (0.55% of 139517) affected shaders:
SGPRs: 46216 -> 46304 (+0.19%); split: -0.02%, +0.21%
VGPRs: 38488 -> 38464 (-0.06%)
SpillSGPRs: 1894 -> 1875 (-1.00%); split: -3.12%, +2.11%
CodeSize: 5681856 -> 5679844 (-0.04%); split: -0.07%, +0.03%
MaxWaves: 5320 -> 5323 (+0.06%)
Instrs: 1093960 -> 1093474 (-0.04%); split: -0.09%, +0.05%
Cycles: 47198380 -> 47258872 (+0.13%); split: -0.06%, +0.19%
VMEM: 176036 -> 176283 (+0.14%); split: +0.16%, -0.02%
SMEM: 53397 -> 53255 (-0.27%); split: +0.03%, -0.30%
VClause: 23156 -> 23152 (-0.02%); split: -0.03%, +0.01%
SClause: 35716 -> 35726 (+0.03%); split: -0.00%, +0.03%
Copies: 139395 -> 139871 (+0.34%); split: -0.04%, +0.39%
Branches: 33808 -> 33798 (-0.03%); split: -0.04%, +0.01%
PreSGPRs: 35381 -> 35331 (-0.14%); split: -0.20%, +0.06%

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539>
2020-11-12 07:36:07 +00:00
Samuel Pitoiset
64748a2be2 aco/tests: add some tests for combining s_add+s_lshl to s_lshl<n>_add
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539>
2020-11-12 07:36:07 +00:00
Samuel Pitoiset
ec347ee9bc aco: fix combining add/sub to b2i if a new dest needs to be allocated
The uses vector needs to be expanded to avoid out of bounds access
and to make sure the number of uses is initialized to 0.

This fixes combining more v_and(a, v_subbrev_co_u32).

fossilds-db (Vega10):
Totals from 4574 (3.28% of 139517) affected shaders:
SGPRs: 291625 -> 292217 (+0.20%); split: -0.01%, +0.21%
VGPRs: 276368 -> 276188 (-0.07%); split: -0.07%, +0.01%
SpillSGPRs: 455 -> 533 (+17.14%)
SpillVGPRs: 76 -> 78 (+2.63%)
CodeSize: 23327500 -> 23304152 (-0.10%); split: -0.17%, +0.07%
MaxWaves: 22044 -> 22066 (+0.10%)
Instrs: 4583064 -> 4576301 (-0.15%); split: -0.15%, +0.01%
Cycles: 47925276 -> 47871968 (-0.11%); split: -0.13%, +0.01%
VMEM: 1599363 -> 1597473 (-0.12%); split: +0.08%, -0.19%
SMEM: 331461 -> 331126 (-0.10%); split: +0.08%, -0.18%
VClause: 80639 -> 80696 (+0.07%); split: -0.02%, +0.09%
SClause: 155992 -> 155993 (+0.00%); split: -0.02%, +0.02%
Copies: 333482 -> 333318 (-0.05%); split: -0.12%, +0.07%
Branches: 70967 -> 70968 (+0.00%)
PreSGPRs: 187078 -> 187711 (+0.34%); split: -0.01%, +0.35%
PreVGPRs: 244918 -> 244785 (-0.05%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7513>
2020-11-10 10:25:00 +01:00
Rhys Perry
5b81e80fb6 aco: implement 64-bit images
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7234>
2020-11-09 18:28:59 +00:00
Samuel Pitoiset
bae5487659 aco: optimize v_and(a, v_subbrev_co(0, 0, vcc)) -> v_cndmask(0, a, vcc)
fossils-db (Vega10):
Totals from 7786 (5.70% of 136546) affected shaders:
SGPRs: 517778 -> 518626 (+0.16%); split: -0.01%, +0.17%
VGPRs: 488252 -> 488084 (-0.03%); split: -0.04%, +0.01%
CodeSize: 42282068 -> 42250152 (-0.08%); split: -0.16%, +0.09%
MaxWaves: 35697 -> 35716 (+0.05%); split: +0.06%, -0.01%
Instrs: 8319309 -> 8304792 (-0.17%); split: -0.18%, +0.00%
Cycles: 88619440 -> 88489636 (-0.15%); split: -0.16%, +0.01%
VMEM: 2788278 -> 2780431 (-0.28%); split: +0.06%, -0.35%
SMEM: 570364 -> 569370 (-0.17%); split: +0.12%, -0.30%
VClause: 144906 -> 144908 (+0.00%); split: -0.05%, +0.05%
SClause: 302143 -> 302055 (-0.03%); split: -0.04%, +0.01%
Copies: 579124 -> 578779 (-0.06%); split: -0.14%, +0.08%
PreSGPRs: 327695 -> 328845 (+0.35%); split: -0.00%, +0.35%
PreVGPRs: 434280 -> 433954 (-0.08%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7438>
2020-11-09 17:36:42 +00:00
Jason Ekstrand
21b1b91549 nir,spirv: Add support for the ShaderCallKHR scope
It's currently entirely trivial.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
2020-11-05 23:36:46 +00:00
Tony Wasserka
1a1099c54f aco: Fix format string used when raising validation errors
Validation errors mention the pretty-printed instruction including
operands with the reserved % character, which caused vasprintf to
expect more format arguments than aco provided.

Fixes: c2b1978aa4 ("aco: rework the way various compilation/validation errors are reported")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7442>
2020-11-05 17:56:18 +00:00
Tony Wasserka
456beb40b8 aco/ra: Fix counting of subdword variables in get_reg_create_vector
The loop variable "k" shadowed another variable in the outer scope, so
this loop had no actual effect.

Fixes: 52cc1f8237 ("aco: improve p_create_vector RA for sub-dword operands")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7427>
2020-11-04 12:08:49 +00:00
Rhys Perry
786828131a aco: implement 8/16-bit instructions which can be trivially widened
When nir_lower_bit_size becomes more capable, we might want to revert some
of this.

fossil-db (parallel-rdp, Navi):
Totals from 217 (31.77% of 683) affected shaders:
SGPRs: 11320 -> 10200 (-9.89%)
VGPRs: 7156 -> 7364 (+2.91%)
CodeSize: 1453948 -> 1430136 (-1.64%); split: -1.66%, +0.02%
Instrs: 258530 -> 254840 (-1.43%); split: -1.44%, +0.01%
Cycles: 37334360 -> 37247936 (-0.23%); split: -0.26%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
2020-11-04 11:50:37 +00:00
Rhys Perry
ef95ba8cdd aco: implement some 16-bit arithmetic instead of lowering
fossil-db (parallel-rdp, Navi):
Totals from 210 (30.75% of 683) affected shaders:
SGPRs: 9704 -> 10248 (+5.61%)
VGPRs: 5884 -> 5368 (-8.77%)
CodeSize: 1155564 -> 1098752 (-4.92%)
Instrs: 199927 -> 189940 (-5.00%)
Cycles: 20438392 -> 19860124 (-2.83%)

v2: use divergence analysis to determine which instructions to lower.

Co-Authored-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
2020-11-04 11:50:37 +00:00
Samuel Pitoiset
57c152af9c aco: select v_mul_{hi}_u32_u24 for 24-bit multiplications
This is based on the NIR range analysis. v_mul_u32_u24 is VOP2, while
v_mul_lo_u32 is VOP3, so that should reduce codesize.

fossils-db (Vega10):
Totals from 12590 (9.22% of 136546) affected shaders:
SGPRs: 680207 -> 677271 (-0.43%); split: -0.47%, +0.04%
VGPRs: 620840 -> 620856 (+0.00%); split: -0.02%, +0.02%
CodeSize: 37930200 -> 37774088 (-0.41%); split: -0.41%, +0.00%
Instrs: 7463550 -> 7458120 (-0.07%); split: -0.07%, +0.00%
Cycles: 133487628 -> 133427532 (-0.05%); split: -0.05%, +0.00%
VMEM: 2514729 -> 2513426 (-0.05%); split: +0.02%, -0.08%
SMEM: 1533579 -> 1532795 (-0.05%); split: +0.05%, -0.10%
VClause: 231391 -> 231389 (-0.00%); split: -0.01%, +0.00%
SClause: 255352 -> 255294 (-0.02%); split: -0.04%, +0.02%
Copies: 605821 -> 600352 (-0.90%); split: -0.92%, +0.02%
Branches: 133739 -> 133743 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 351092 -> 348048 (-0.87%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405>
2020-11-03 13:47:40 +00:00
Samuel Pitoiset
3a72021d7c aco: store NIR range analysis data to the isel context
It will be used to optimize some ALU instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405>
2020-11-03 13:47:40 +00:00
James Park
4bd18e772a amd/llvm,aco: Replace VLA with alloca
MSVC will never support VLA, so use alloca instead.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7157>
2020-11-03 07:44:02 +00:00
Samuel Pitoiset
03f260cb27 radv,aco: optimize computing the sample mask for per-sample shading
I don't know why these values were introduced for but it seems like
we can optimize this by just doing:

gl_SampleMaskIn[0] = (SampleCoverage & (1 << gl_SampleID))

AMDGPU-PRO and AMDVLK apply the same formula to compute the
sample mask when per-sample shading is enabled.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377>
2020-11-02 08:05:47 +01:00
Samuel Pitoiset
c63bcda22c radv,aco: adjust the sample mask only if per-sample shading is enabled
When per-sample shading isn't enabled, we can just load the
samplemask from the hardware which is always the coverage of
the entire pixel/fragment.

fossilds-db (VEGA10):
Totals from 131 (0.10% of 136546) affected shaders:
SGPRs: 5056 -> 5048 (-0.16%)
VGPRs: 2600 -> 2372 (-8.77%)
CodeSize: 115788 -> 112560 (-2.79%)
MaxWaves: 1266 -> 1274 (+0.63%)
Instrs: 20620 -> 20071 (-2.66%)
Cycles: 82416 -> 80220 (-2.66%)
VMEM: 51567 -> 35532 (-31.10%); split: +0.24%, -31.34%
SMEM: 8952 -> 8258 (-7.75%); split: +0.11%, -7.86%
SClause: 1223 -> 1199 (-1.96%); split: -2.62%, +0.65%
Copies: 1247 -> 1124 (-9.86%); split: -10.18%, +0.32%
PreVGPRs: 2112 -> 1981 (-6.20%)

Helps Britannia, Shadow of the Tomb Raider, Warhammer II and Control.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377>
2020-11-02 08:05:43 +01:00
James Park
6d058ac6c9 aco: Fix accidental copies, attempt two
Use auto to avoid mistyping the constness of the pair key, which
triggers implicit conversions rather than compilation errors.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7346>
2020-10-30 09:55:53 +00:00
Rhys Perry
1761379481 aco: handle SDWA in the optimizer
Apply SGPRs/modifiers when possible and try not to break when SDWA
instructions are encountered.

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>
2020-10-29 18:08:31 +00:00