Samuel Pitoiset
eaef1f2127
aco: allow to use the range analysis UB in emit_{sop2,vop2}_instruction()
...
It will allow to combine v_add+s_lshl or v_add+v_lshlrev to
v_mad_u32_u24 on GFX6-8 if operands are known to be 16-bit or 24-bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673 >
2020-11-23 18:34:40 +00:00
Samuel Pitoiset
be600b009a
aco: add a new Operand flag to indicate that is 24-bit
...
To indicate that the upper 8-bits are always 0 to optimize more MADs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673 >
2020-11-23 18:34:40 +00:00
Samuel Pitoiset
05fd780012
aco/tests: extend the optimize.add_lshl tests to GFX8
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673 >
2020-11-23 18:34:40 +00:00
Rhys Perry
14186a1b84
aco/tests: add Builder::v_mul_imm() tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:32 +00:00
Rhys Perry
aab507c6b0
aco: use v_mul_imm() for some nir_op_imul
...
Some of the optimizations v_mul_imm() does are complex and very
target-specific and not suitable to do in ACO's optimizer.
fossil-db (Vega):
Totals from 49135 (35.76% of 137413) affected shaders:
SGPRs: 2698547 -> 2696103 (-0.09%); split: -0.16%, +0.07%
VGPRs: 2301412 -> 2301600 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 51520 -> 51519 (-0.00%)
CodeSize: 168798572 -> 169164012 (+0.22%); split: -0.00%, +0.22%
MaxWaves: 306553 -> 306539 (-0.00%); split: +0.00%, -0.01%
Instrs: 33423982 -> 33506598 (+0.25%); split: -0.00%, +0.25%
Cycles: 1807800632 -> 1804101376 (-0.20%); split: -0.20%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:32 +00:00
Rhys Perry
02c5519e6c
aco: try harder to not create v_mul_lo_u32
...
fossil-db (Vega):
Totals from 4 (0.00% of 137413) affected shaders:
CodeSize: 13708 -> 13716 (+0.06%)
Instrs: 2742 -> 2744 (+0.07%)
Cycles: 24348 -> 24236 (-0.46%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:31 +00:00
Rhys Perry
8ca23bcf39
aco: copy constant to sgpr in Builder::v_mul_imm()
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:31 +00:00
Rhys Perry
756bb29391
aco: create vgpr constant copies using v_bfrev_b32
...
Looks like this worked once but broke at some point.
fossil-db (Vega):
Totals from 577 (0.42% of 137413) affected shaders:
CodeSize: 3822052 -> 3818720 (-0.09%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:31 +00:00
Rhys Perry
4d93fc25f0
aco: count v_mul_lo_u32 as 16 cycles
...
There are more, but this is a common one.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:31 +00:00
Rhys Perry
70d665d981
aco: don't create v_mov_b32 in v_mul_imm()
...
We switched to p_parallelcopy for everything else.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390 >
2020-11-20 19:50:31 +00:00
Tony Wasserka
b4d6131c15
radv,aco: Compile with -Wshadow when available
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430 >
2020-11-20 09:29:19 +00:00
Tony Wasserka
a978602d1f
aco/tests: Fix -Wunused warnings in release mode
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430 >
2020-11-20 09:29:19 +00:00
Tony Wasserka
5231c788ff
aco/tests: Fix -Wshadow warnings
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430 >
2020-11-20 09:29:19 +00:00
Tony Wasserka
2bb8874320
aco: Fix -Wshadow warnings
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430 >
2020-11-20 09:29:19 +00:00
Rhys Perry
631e18d427
aco: create v_mad_u32_u24
...
fossil-db (Navi):
Totals from 849 (0.61% of 138791) affected shaders:
SGPRs: 38528 -> 38544 (+0.04%)
VGPRs: 39860 -> 39856 (-0.01%)
CodeSize: 2701880 -> 2702016 (+0.01%)
MaxWaves: 9148 -> 9150 (+0.02%)
Instrs: 509864 -> 509821 (-0.01%); split: -0.01%, +0.00%
Cycles: 3400124 -> 3399628 (-0.01%); split: -0.02%, +0.00%
VMEM: 262757 -> 262672 (-0.03%)
SMEM: 59710 -> 59704 (-0.01%)
Copies: 44461 -> 44466 (+0.01%)
fossil-db (Polaris):
Totals from 1487 (1.06% of 140385) affected shaders:
SGPRs: 54688 -> 55840 (+2.11%)
CodeSize: 2725608 -> 2725720 (+0.00%); split: -0.01%, +0.01%
Instrs: 521394 -> 517710 (-0.71%)
Cycles: 18474108 -> 18410964 (-0.34%)
VMEM: 436992 -> 431028 (-1.36%); split: +0.06%, -1.43%
SMEM: 124503 -> 122564 (-1.56%); split: +0.45%, -2.00%
VClause: 21972 -> 22015 (+0.20%); split: -0.12%, +0.31%
SClause: 14274 -> 14287 (+0.09%)
Copies: 44407 -> 44411 (+0.01%); split: -0.02%, +0.03%
PreSGPRs: 34318 -> 34321 (+0.01%); split: -0.00%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7639 >
2020-11-19 12:36:17 +00:00
Samuel Pitoiset
0fcd379184
aco: fix combining max(-min(a, b), c) if a or b uses the neg modifier
...
No fossils-db changes.
Cc: 20.2, 20.3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7657 >
2020-11-18 08:00:28 +00:00
Rhys Perry
867323379e
aco: don't use SMEM for SSBO stores
...
fossil-db (Navi):
Totals from 70 (0.05% of 138791) affected shaders:
SGPRs: 2324 -> 2097 (-9.77%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 157872 -> 154836 (-1.92%); split: -1.93%, +0.01%
MaxWaves: 1288 -> 1260 (-2.17%)
Instrs: 29730 -> 29108 (-2.09%); split: -2.13%, +0.04%
Cycles: 394944 -> 391280 (-0.93%); split: -0.94%, +0.01%
VMEM: 5288 -> 5695 (+7.70%); split: +11.97%, -4.27%
SMEM: 2680 -> 2444 (-8.81%); split: +1.34%, -10.15%
VClause: 291 -> 502 (+72.51%)
SClause: 1176 -> 918 (-21.94%)
Copies: 3549 -> 3517 (-0.90%); split: -1.80%, +0.90%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1675 -> 1491 (-10.99%)
PreVGPRs: 1101 -> 1223 (+11.08%)
Totals from 70 (0.05% of 139517) affected shaders (RAVEN):
SGPRs: 2368 -> 2121 (-10.43%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 156664 -> 153252 (-2.18%)
MaxWaves: 636 -> 622 (-2.20%)
Instrs: 29968 -> 29226 (-2.48%)
Cycles: 398284 -> 393492 (-1.20%)
VMEM: 5544 -> 5930 (+6.96%); split: +11.72%, -4.76%
SMEM: 2752 -> 2502 (-9.08%); split: +1.20%, -10.28%
VClause: 292 -> 504 (+72.60%)
SClause: 1236 -> 940 (-23.95%)
Copies: 3907 -> 3852 (-1.41%); split: -2.20%, +0.79%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1671 -> 1487 (-11.01%)
PreVGPRs: 1102 -> 1225 (+11.16%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6143 >
2020-11-16 15:52:22 +00:00
Rhys Perry
2736f97496
aco/tests: add output modifier tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605 >
2020-11-16 12:58:44 +00:00
Rhys Perry
0c522d3aa7
aco: fix fp16 *0.5 omod
...
We were testing for -0.5 instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 1210e0bd62 ("aco: create 16-bit input and output modifiers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605 >
2020-11-16 12:58:44 +00:00
Rhys Perry
558daa73f9
aco: disable omod if the sign of zeros should be preserved
...
The RDNA ISA doc says that omod doesn't preserve -0.0 in 6.2.2. LLVM
appears to always disable omod in this situation, but clamp is unaffected.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: df645fa369 ("aco: implement VK_KHR_shader_float_controls")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7605 >
2020-11-16 12:58:44 +00:00
Samuel Pitoiset
2f5b3ac2f8
aco: remove v_{add,sub,subrev}_u32 on GFX8
...
These opcodes are never used and they always write the carry-out
according to the GCN3 ISA documentation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7569 >
2020-11-16 10:56:13 +00:00
Rhys Perry
4d727ee913
aco/tests: add some more clamp combining tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
15d08a06e2
aco/tests: expand optimize.const_comparison_ordering tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
6bf3c606be
aco/tests: initialize debug function
...
aco_log() will print the message to stderr.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
966732e8ca
aco: disallow various v_add_u32 opts if modifiers are used
...
Check for clamp, SDWA or DPP. The optimization isn't possible with SDWA
and DPP, so it would have been skipped anyway. Doing any of these with a
clamp modifier present would be incorrect.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
91ffeed88a
aco: fix combine_constant_comparison_ordering() NaN check with 16/64-bit
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Rhys Perry
d4c821da0e
aco: don't combine precise max(min()) to med3
...
fossil-db (Navi):
Totals from 241 (0.18% of 137413) affected shaders:
CodeSize: 856280 -> 856308 (+0.00%); split: -0.00%, +0.00%
Instrs: 164220 -> 164514 (+0.18%); split: -0.00%, +0.18%
Cycles: 1031916 -> 1033092 (+0.11%); split: -0.00%, +0.11%
VMEM: 77855 -> 78514 (+0.85%); split: +0.85%, -0.01%
SMEM: 20501 -> 20593 (+0.45%); split: +0.46%, -0.01%
Copies: 9791 -> 9790 (-0.01%); split: -0.03%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7045 >
2020-11-13 12:34:27 +00:00
Samuel Pitoiset
68488fd383
aco: optimize v_add(v_bcnt(a, 0), b) to v_bcnt(a, b)
...
The first operand of v_bcnt should always be a VGPR because if it's
a SGPR, isel selects s_bcnt1 but I added a sanity check to prevent
any problems.
fossils-db (Vega10):
Totals from 23 (0.02% of 139517) affected shaders:
CodeSize: 106828 -> 106664 (-0.15%)
Instrs: 20242 -> 20201 (-0.20%)
Cycles: 213112 -> 211352 (-0.83%)
VMEM: 3200 -> 3184 (-0.50%)
SMEM: 928 -> 927 (-0.11%)
Helps Control, Assassins Creeds Origins and Youngblood.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7568 >
2020-11-13 07:28:50 +00:00
Samuel Pitoiset
db9d13b4ff
aco: optimize v_add_u32(v_mul_lo_u16) -> v_mad_u32_u16
...
fossils-db (Vega10):
Totals from 779 (0.56% of 139517) affected shaders:
CodeSize: 1187928 -> 1187508 (-0.04%); split: -0.04%, +0.00%
Instrs: 247353 -> 244608 (-1.11%); split: -1.11%, +0.00%
Cycles: 1127472 -> 1116420 (-0.98%); split: -0.98%, +0.00%
VMEM: 139720 -> 138297 (-1.02%); split: +0.00%, -1.02%
SMEM: 51069 -> 50735 (-0.65%); split: +0.04%, -0.69%
Copies: 11548 -> 11547 (-0.01%); split: -0.03%, +0.03%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
20e48551ac
aco: select v_mul_lo_u16 for 16-bit multiplications that can't overflow
...
Only on GFX8-9 because GFX10 doesn't zero the upper 16 bits.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
7028e9875f
aco: select v_mad_u32_u16 for 16-bit multiplications on GFX9+
...
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
bbdafd6ab3
aco: optimize v_mad_u32_u16 with acc=0 to v_mul_u32_u24
...
v_mad_u32_u16 will be selected by isel to keep the range analysis
information around and to combine more v_add_u32+v_mad_u32_u16
together. When it's not possible to optimize that pattern, fallback
to v_mul_u32_u24 which is VOP2 instead of VOP3.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
0ea763a727
aco: add a new Operand flag to indicate that is 16-bit
...
To indicate that the upper 16-bits are always 0 and that optimizing
v_mad_u32_u16 to v_mul_u32_u24 is valid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
bda35ae6b9
aco: introduce a generic label for labelling instructions
...
When one instruction doesn't fit into the existing labels, use
the generic one.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425 >
2020-11-12 12:32:26 +00:00
Samuel Pitoiset
dfd878f2ba
aco: combine more s_add+s_lshl to s_lshl<n>_add by ignoring uses
...
Even if the s_lshl is used more that once, it can still be combined.
fossils-db (Vega10):
Totals from 771 (0.55% of 139517) affected shaders:
SGPRs: 46216 -> 46304 (+0.19%); split: -0.02%, +0.21%
VGPRs: 38488 -> 38464 (-0.06%)
SpillSGPRs: 1894 -> 1875 (-1.00%); split: -3.12%, +2.11%
CodeSize: 5681856 -> 5679844 (-0.04%); split: -0.07%, +0.03%
MaxWaves: 5320 -> 5323 (+0.06%)
Instrs: 1093960 -> 1093474 (-0.04%); split: -0.09%, +0.05%
Cycles: 47198380 -> 47258872 (+0.13%); split: -0.06%, +0.19%
VMEM: 176036 -> 176283 (+0.14%); split: +0.16%, -0.02%
SMEM: 53397 -> 53255 (-0.27%); split: +0.03%, -0.30%
VClause: 23156 -> 23152 (-0.02%); split: -0.03%, +0.01%
SClause: 35716 -> 35726 (+0.03%); split: -0.00%, +0.03%
Copies: 139395 -> 139871 (+0.34%); split: -0.04%, +0.39%
Branches: 33808 -> 33798 (-0.03%); split: -0.04%, +0.01%
PreSGPRs: 35381 -> 35331 (-0.14%); split: -0.20%, +0.06%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539 >
2020-11-12 07:36:07 +00:00
Samuel Pitoiset
64748a2be2
aco/tests: add some tests for combining s_add+s_lshl to s_lshl<n>_add
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539 >
2020-11-12 07:36:07 +00:00
Samuel Pitoiset
ec347ee9bc
aco: fix combining add/sub to b2i if a new dest needs to be allocated
...
The uses vector needs to be expanded to avoid out of bounds access
and to make sure the number of uses is initialized to 0.
This fixes combining more v_and(a, v_subbrev_co_u32).
fossilds-db (Vega10):
Totals from 4574 (3.28% of 139517) affected shaders:
SGPRs: 291625 -> 292217 (+0.20%); split: -0.01%, +0.21%
VGPRs: 276368 -> 276188 (-0.07%); split: -0.07%, +0.01%
SpillSGPRs: 455 -> 533 (+17.14%)
SpillVGPRs: 76 -> 78 (+2.63%)
CodeSize: 23327500 -> 23304152 (-0.10%); split: -0.17%, +0.07%
MaxWaves: 22044 -> 22066 (+0.10%)
Instrs: 4583064 -> 4576301 (-0.15%); split: -0.15%, +0.01%
Cycles: 47925276 -> 47871968 (-0.11%); split: -0.13%, +0.01%
VMEM: 1599363 -> 1597473 (-0.12%); split: +0.08%, -0.19%
SMEM: 331461 -> 331126 (-0.10%); split: +0.08%, -0.18%
VClause: 80639 -> 80696 (+0.07%); split: -0.02%, +0.09%
SClause: 155992 -> 155993 (+0.00%); split: -0.02%, +0.02%
Copies: 333482 -> 333318 (-0.05%); split: -0.12%, +0.07%
Branches: 70967 -> 70968 (+0.00%)
PreSGPRs: 187078 -> 187711 (+0.34%); split: -0.01%, +0.35%
PreVGPRs: 244918 -> 244785 (-0.05%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7513 >
2020-11-10 10:25:00 +01:00
Rhys Perry
5b81e80fb6
aco: implement 64-bit images
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7234 >
2020-11-09 18:28:59 +00:00
Samuel Pitoiset
bae5487659
aco: optimize v_and(a, v_subbrev_co(0, 0, vcc)) -> v_cndmask(0, a, vcc)
...
fossils-db (Vega10):
Totals from 7786 (5.70% of 136546) affected shaders:
SGPRs: 517778 -> 518626 (+0.16%); split: -0.01%, +0.17%
VGPRs: 488252 -> 488084 (-0.03%); split: -0.04%, +0.01%
CodeSize: 42282068 -> 42250152 (-0.08%); split: -0.16%, +0.09%
MaxWaves: 35697 -> 35716 (+0.05%); split: +0.06%, -0.01%
Instrs: 8319309 -> 8304792 (-0.17%); split: -0.18%, +0.00%
Cycles: 88619440 -> 88489636 (-0.15%); split: -0.16%, +0.01%
VMEM: 2788278 -> 2780431 (-0.28%); split: +0.06%, -0.35%
SMEM: 570364 -> 569370 (-0.17%); split: +0.12%, -0.30%
VClause: 144906 -> 144908 (+0.00%); split: -0.05%, +0.05%
SClause: 302143 -> 302055 (-0.03%); split: -0.04%, +0.01%
Copies: 579124 -> 578779 (-0.06%); split: -0.14%, +0.08%
PreSGPRs: 327695 -> 328845 (+0.35%); split: -0.00%, +0.35%
PreVGPRs: 434280 -> 433954 (-0.08%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7438 >
2020-11-09 17:36:42 +00:00
Jason Ekstrand
21b1b91549
nir,spirv: Add support for the ShaderCallKHR scope
...
It's currently entirely trivial.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479 >
2020-11-05 23:36:46 +00:00
Tony Wasserka
1a1099c54f
aco: Fix format string used when raising validation errors
...
Validation errors mention the pretty-printed instruction including
operands with the reserved % character, which caused vasprintf to
expect more format arguments than aco provided.
Fixes: c2b1978aa4 ("aco: rework the way various compilation/validation errors are reported")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7442 >
2020-11-05 17:56:18 +00:00
Tony Wasserka
456beb40b8
aco/ra: Fix counting of subdword variables in get_reg_create_vector
...
The loop variable "k" shadowed another variable in the outer scope, so
this loop had no actual effect.
Fixes: 52cc1f8237 ("aco: improve p_create_vector RA for sub-dword operands")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7427 >
2020-11-04 12:08:49 +00:00
Rhys Perry
786828131a
aco: implement 8/16-bit instructions which can be trivially widened
...
When nir_lower_bit_size becomes more capable, we might want to revert some
of this.
fossil-db (parallel-rdp, Navi):
Totals from 217 (31.77% of 683) affected shaders:
SGPRs: 11320 -> 10200 (-9.89%)
VGPRs: 7156 -> 7364 (+2.91%)
CodeSize: 1453948 -> 1430136 (-1.64%); split: -1.66%, +0.02%
Instrs: 258530 -> 254840 (-1.43%); split: -1.44%, +0.01%
Cycles: 37334360 -> 37247936 (-0.23%); split: -0.26%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791 >
2020-11-04 11:50:37 +00:00
Rhys Perry
ef95ba8cdd
aco: implement some 16-bit arithmetic instead of lowering
...
fossil-db (parallel-rdp, Navi):
Totals from 210 (30.75% of 683) affected shaders:
SGPRs: 9704 -> 10248 (+5.61%)
VGPRs: 5884 -> 5368 (-8.77%)
CodeSize: 1155564 -> 1098752 (-4.92%)
Instrs: 199927 -> 189940 (-5.00%)
Cycles: 20438392 -> 19860124 (-2.83%)
v2: use divergence analysis to determine which instructions to lower.
Co-Authored-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791 >
2020-11-04 11:50:37 +00:00
Samuel Pitoiset
57c152af9c
aco: select v_mul_{hi}_u32_u24 for 24-bit multiplications
...
This is based on the NIR range analysis. v_mul_u32_u24 is VOP2, while
v_mul_lo_u32 is VOP3, so that should reduce codesize.
fossils-db (Vega10):
Totals from 12590 (9.22% of 136546) affected shaders:
SGPRs: 680207 -> 677271 (-0.43%); split: -0.47%, +0.04%
VGPRs: 620840 -> 620856 (+0.00%); split: -0.02%, +0.02%
CodeSize: 37930200 -> 37774088 (-0.41%); split: -0.41%, +0.00%
Instrs: 7463550 -> 7458120 (-0.07%); split: -0.07%, +0.00%
Cycles: 133487628 -> 133427532 (-0.05%); split: -0.05%, +0.00%
VMEM: 2514729 -> 2513426 (-0.05%); split: +0.02%, -0.08%
SMEM: 1533579 -> 1532795 (-0.05%); split: +0.05%, -0.10%
VClause: 231391 -> 231389 (-0.00%); split: -0.01%, +0.00%
SClause: 255352 -> 255294 (-0.02%); split: -0.04%, +0.02%
Copies: 605821 -> 600352 (-0.90%); split: -0.92%, +0.02%
Branches: 133739 -> 133743 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 351092 -> 348048 (-0.87%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405 >
2020-11-03 13:47:40 +00:00
Samuel Pitoiset
3a72021d7c
aco: store NIR range analysis data to the isel context
...
It will be used to optimize some ALU instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405 >
2020-11-03 13:47:40 +00:00
James Park
4bd18e772a
amd/llvm,aco: Replace VLA with alloca
...
MSVC will never support VLA, so use alloca instead.
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7157 >
2020-11-03 07:44:02 +00:00
Samuel Pitoiset
03f260cb27
radv,aco: optimize computing the sample mask for per-sample shading
...
I don't know why these values were introduced for but it seems like
we can optimize this by just doing:
gl_SampleMaskIn[0] = (SampleCoverage & (1 << gl_SampleID))
AMDGPU-PRO and AMDVLK apply the same formula to compute the
sample mask when per-sample shading is enabled.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377 >
2020-11-02 08:05:47 +01:00
Samuel Pitoiset
c63bcda22c
radv,aco: adjust the sample mask only if per-sample shading is enabled
...
When per-sample shading isn't enabled, we can just load the
samplemask from the hardware which is always the coverage of
the entire pixel/fragment.
fossilds-db (VEGA10):
Totals from 131 (0.10% of 136546) affected shaders:
SGPRs: 5056 -> 5048 (-0.16%)
VGPRs: 2600 -> 2372 (-8.77%)
CodeSize: 115788 -> 112560 (-2.79%)
MaxWaves: 1266 -> 1274 (+0.63%)
Instrs: 20620 -> 20071 (-2.66%)
Cycles: 82416 -> 80220 (-2.66%)
VMEM: 51567 -> 35532 (-31.10%); split: +0.24%, -31.34%
SMEM: 8952 -> 8258 (-7.75%); split: +0.11%, -7.86%
SClause: 1223 -> 1199 (-1.96%); split: -2.62%, +0.65%
Copies: 1247 -> 1124 (-9.86%); split: -10.18%, +0.32%
PreVGPRs: 2112 -> 1981 (-6.20%)
Helps Britannia, Shadow of the Tomb Raider, Warhammer II and Control.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377 >
2020-11-02 08:05:43 +01:00
James Park
6d058ac6c9
aco: Fix accidental copies, attempt two
...
Use auto to avoid mistyping the constness of the pair key, which
triggers implicit conversions rather than compilation errors.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7346 >
2020-10-30 09:55:53 +00:00