Commit graph

307 commits

Author SHA1 Message Date
Rhys Perry
cf5fc4b973 aco: disallow SMEM offsets that are not multiples of 4
These can't be encoded on GFX6/7, and combining these additions causes
CTS failures on GFX10.3.

I think the low 2 MSBs are ignored before the addition, not after, so
load(a + 3, 0) becomes load(a, 3), which is the same as load(a, 0).

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755>
2021-12-17 22:14:36 +00:00
Rhys Perry
94603786c5 aco: fix check_vop3_operands() for f16vec2 ffma fneg combine
For v_pk_fma_f16, we should consider all three operands, not the first
two.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 15a375b4c8 ("radv,aco: don't lower some ffma instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14229>
2021-12-17 11:16:12 +00:00
Rhys Perry
165ca5088b radv,aco: implement nir_op_ffma
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
f4f5d577fc aco: swap operands if necessary to create v_madak/v_fmaak
Also rewrite the check_literal logic to be more straightforward.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
2665320c78 aco: create v_fmamk_f32/v_fmaak_f32 from nir_op_ffma
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
a487747ebd aco: use more predictable tiebreaker when forming MADs
fossil-db (GFX10.3):
Totals from 84981 (58.10% of 146267) affected shaders:
VGPRs: 3829896 -> 3820480 (-0.25%); split: -0.33%, +0.08%
CodeSize: 270860472 -> 270850132 (-0.00%); split: -0.08%, +0.08%
MaxWaves: 2035822 -> 2042516 (+0.33%); split: +0.39%, -0.06%
Instrs: 51285526 -> 51308869 (+0.05%); split: -0.03%, +0.08%
Latency: 931503706 -> 932556231 (+0.11%); split: -0.19%, +0.30%
InvThroughput: 217084232 -> 217070849 (-0.01%); split: -0.12%, +0.11%

fossil-db (GFX10):
Totals from 85520 (58.47% of 146267) affected shaders:
VGPRs: 3729132 -> 3725344 (-0.10%); split: -0.21%, +0.10%
CodeSize: 272796500 -> 272783084 (-0.00%); split: -0.09%, +0.08%
MaxWaves: 2246410 -> 2249012 (+0.12%); split: +0.17%, -0.05%
Instrs: 51643962 -> 51664865 (+0.04%); split: -0.04%, +0.08%
Latency: 932331949 -> 933274979 (+0.10%); split: -0.19%, +0.29%
InvThroughput: 214187040 -> 214130994 (-0.03%); split: -0.13%, +0.11%

fossil-db (GFX9):
Totals from 84619 (57.80% of 146401) affected shaders:
SGPRs: 5366240 -> 5366944 (+0.01%); split: -0.09%, +0.10%
VGPRs: 3765608 -> 3764972 (-0.02%); split: -0.23%, +0.22%
CodeSize: 263634732 -> 263616320 (-0.01%); split: -0.08%, +0.08%
MaxWaves: 546617 -> 547091 (+0.09%); split: +0.18%, -0.09%
Instrs: 51426195 -> 51458334 (+0.06%); split: -0.03%, +0.10%
Latency: 1164445660 -> 1161923480 (-0.22%); split: -0.46%, +0.24%
InvThroughput: 542964697 -> 542329595 (-0.12%); split: -0.26%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
65a78b2252 aco: properly update use counts if a extract is still used
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13909>
2021-11-29 18:52:12 +00:00
Samuel Pitoiset
add883bf9b aco: fix right shift of exponent 32 detected by UBSAN
src/amd/compiler/aco_optimizer.cpp:1316:17: runtime error: shift
exponent 32 is too large for 32-bit type 'unsigned int'

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13951>
2021-11-25 16:15:30 +00:00
Rhys Perry
9bc0fc89c8 aco: disable mul(cndmask(0, 1, b), a) optimization sometimes
This optimization doesn't work for SDWA or DPP multiplications and we
can't do it if denormal flushing is required because v_cndmask_b32 doesn't
do that and we can't do it if we can't assume operands are finite because
0.0 * inf is NaN, not 0.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13434>
2021-10-20 07:35:52 +00:00
Timur Kristóf
ca6ef505ff aco/optimizer: Skip SDWA on v_lshlrev when unnecessary in apply_extract.
In the following cases:
- lower 16 bits are extracted and the shift amount is 16 or more
- lower 8 bits are extracted and the shift amount is 24 or more
the undesireable upper bits are already shifted out, and therefore
there is no need to add SDWA to the v_lshlrev instruction.

Fossil DB stats on Sienna Cichlid with NGGC on:

Totals from 58239 (45.27% of 128647) affected shaders:
CodeSize: 153498624 -> 153265616 (-0.15%); split: -0.15%, +0.00%
Instrs: 29636304 -> 29578064 (-0.20%); split: -0.20%, +0.00%
Latency: 136931496 -> 136876379 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 21134367 -> 21078861 (-0.26%); split: -0.26%, +0.00%
Copies: 2777550 -> 2777548 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13121>
2021-10-12 16:27:50 +00:00
Daniel Schürmann
40a93e271c aco: clang-format
No changes, just formatting.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13087>
2021-09-28 19:48:00 +00:00
Timur Kristóf
d3e0cf3d32 aco: Omit p_extract after ds_read with matching bit size.
Fossil DB stats on Sienna Cichlid:

Totals from 135 (0.10% of 128647) affected shaders:
CodeSize: 525184 -> 523704 (-0.28%)
Instrs: 92835 -> 92684 (-0.16%)
Latency: 311528 -> 311055 (-0.15%)
InvThroughput: 86572 -> 86455 (-0.14%)
Copies: 7666 -> 7650 (-0.21%)

Fossil DB stats on Sienna Cichlid with NGGC on:

Totals from 58374 (45.38% of 128647) affected shaders:
CodeSize: 160322912 -> 159622564 (-0.44%)
Instrs: 30755822 -> 30639193 (-0.38%)
Latency: 136713768 -> 136690360 (-0.02%)
InvThroughput: 21739219 -> 21658151 (-0.37%)
Copies: 3297969 -> 3297953 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11560>
2021-09-28 17:59:27 +00:00
Timur Kristóf
f2e41eda9e aco: Add ability to optimize v_lshl + v_sub into v_mad_i32_i24.
Also change combine_add_lshl to use check_vop3_operands instead
of its own checks of the operands.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>
2021-09-20 12:39:03 +02:00
Filip Gawin
6b5e9352ef aco: cleanup assignment of unique_ptrs
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12903>
2021-09-18 11:09:24 +00:00
Daniel Schürmann
0988f7b9ba aco: remove explicit dst_preserve flag
Instead, we can rely on the fact that subdword definitions
must preserve the unused bits while dword definitions either
pad or sign-extend.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
9e3ff06c38 aco: rewrite SDWA selector
This commit introduces a new struct SubdwordSel
in order to ease and clean up the usage of SDWA
selections. This includes removing the distinction
between register-allocated and fixed SDWA selections.
Instead, SDWA selections can now also access the high
bits of subdword variables. Alignment and sizes are
validated accordingly. Size, offset and sign_extend
can be evaluated via helper methods.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Rhys Perry
33ddbd220f aco: remove DPP when applying constants/literals/sgprs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Rhys Perry
e27946ca11 aco: don't constant propagate to DPP instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Timur Kristóf
76b9dd6266 aco: Unset 16 and 24-bit flags from operands in apply_extract.
Consider the following sequence in a shader:
b = p_extract a
c = v_mad_u32_u16 b, X, 0

The optimizer applies extract, resulting in:
c = v_mad_u32_u16 a, X, 0 (correct)

Then it mistakenly turns that into:
c = v_mul_u32_u24 a, X, 0 (incorrect)

In this case, the p_extract is applied to v_mad_u32_u16 by
apply_extract. After this, we can no longer be sure that
the operands are still 16 or 24-bit, so we have to remove
this flag.

No Fossil DB changes.

Fixes: 54292e99c7
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
2021-08-30 14:05:33 +00:00
Daniel Schürmann
2eeaaabb8e aco/optimizer: combine v_pk_mul_u16 + v_pk_add_u16 -> v_pk_mad_u16
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
be16ebc5ca aco/optimizer: fuse v_mul_f64 + v_add_f64 -> v_fma_f64
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
8e27ca9953 aco/optimizer: combine v_mul_lo_u16 + v_add_u16 -> v_mad_u16
Totals from 192 (0.13% of 150170) affected shaders: (GFX10.3)
CodeSize: 1027224 -> 1019872 (-0.72%)
Instrs: 174784 -> 173863 (-0.53%)
Latency: 4235742 -> 4232177 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 1777026 -> 1775945 (-0.06%); split: -0.09%, +0.03%
Copies: 34098 -> 34099 (+0.00%); split: -0.03%, +0.03%
PreVGPRs: 4920 -> 4850 (-1.42%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
23d5865f42 aco: refactor nir_op_imul selection
Previously, the optimization to use v_mul_lo_u16 for
32bit multiplications was done in instruction_selection.
This was moved to the optimizer to ease some case distinctions.

The mixed results are due to increased use of SDWA.

Totals from 2616 (1.74% of 150170) affected shaders: (GFX10.3)
VGPRs: 143888 -> 143872 (-0.01%); split: -0.02%, +0.01%
CodeSize: 5604032 -> 5604080 (+0.00%); split: -0.01%, +0.01%
Instrs: 1086798 -> 1083915 (-0.27%); split: -0.27%, +0.01%
Latency: 8215793 -> 8213023 (-0.03%); split: -0.10%, +0.07%
InvThroughput: 20765157 -> 20773766 (+0.04%); split: -0.02%, +0.06%
VClause: 35256 -> 35260 (+0.01%); split: -0.02%, +0.03%
SClause: 29021 -> 29024 (+0.01%); split: -0.00%, +0.01%
Copies: 74163 -> 74306 (+0.19%); split: -0.05%, +0.24%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
d8eef134d8 aco: only apply extract if not used more than 4 times
Totals from 61 (0.04% of 150170) affected shaders: (GFX10.3)
CodeSize: 1087732 -> 1087380 (-0.03%); split: -0.22%, +0.18%
Instrs: 192343 -> 192205 (-0.07%); split: -0.16%, +0.09%
Latency: 7231670 -> 7148073 (-1.16%); split: -1.19%, +0.04%
InvThroughput: 3436715 -> 3394926 (-1.22%); split: -1.25%, +0.04%
VClause: 4831 -> 4833 (+0.04%)
Copies: 50130 -> 49934 (-0.39%); split: -0.67%, +0.28%
Branches: 5945 -> 5948 (+0.05%)
PreSGPRs: 3486 -> 3472 (-0.40%)
PreVGPRs: 5154 -> 5152 (-0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Timur Kristóf
abcc83e713 aco: Fix to_uniform_bool_instr when operands are not suitable.
Don't attempt to transform uniform boolean instructions when
their operands are unsuitable. This can happen eg. due to other
optimizations that combine SALU instructions which clear out
the uniform instruction labels.

Cc: mesa-stable
Fixes: 8a32f57fff
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>
2021-08-25 12:43:50 +00:00
Rhys Perry
2201f5a58c aco: remove label_extract if the extract is used by a non-VALU
If an extract is used by a non-VALU instruction, it can't be applied to
all instructions, so it's not beneficial to try to apply it.

This check isn't needed because can_apply_extract()/can_use_SDWA() should
already handle non-VALU instructions.

fossil-db (Sienna Cichlid):
Totals from 1020 (0.68% of 150170) affected shaders:
SpillSGPRs: 1577 -> 1571 (-0.38%)
CodeSize: 7863668 -> 7858336 (-0.07%); split: -0.07%, +0.00%
Instrs: 1431583 -> 1431083 (-0.03%); split: -0.04%, +0.01%
Latency: 25891250 -> 25890916 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7248683 -> 7248655 (-0.00%); split: -0.01%, +0.01%
SClause: 49072 -> 49071 (-0.00%)
Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06%
Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02%
PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01%

fossil-db (Polaris10):
Totals from 654 (0.43% of 151696) affected shaders:
CodeSize: 5814552 -> 5811568 (-0.05%); split: -0.05%, +0.00%
Instrs: 1105783 -> 1105049 (-0.07%); split: -0.07%, +0.00%
Latency: 20261458 -> 20259744 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 9011785 -> 9011749 (-0.00%); split: -0.00%, +0.00%
Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00%
PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01%
PreVGPRs: 43813 -> 43809 (-0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212>
2021-08-23 14:56:37 +01:00
Rhys Perry
2e6834d4f6 aco: combine DPP into VALU before RA
Mostly helps a bunch of Cyberpunk 2077 shaders. Catches some of the cases
that the post-RA can't optimize because of register assignment.

fossil-db (Siena Cichlid):
Totals from 25 (0.02% of 150170) affected shaders:
CodeSize: 78808 -> 75764 (-3.86%)
Instrs: 14311 -> 13547 (-5.34%)
Latency: 278697 -> 277885 (-0.29%)
InvThroughput: 63428 -> 62754 (-1.06%)
Copies: 1348 -> 1349 (+0.07%); split: -0.07%, +0.15%
PreVGPRs: 1035 -> 1011 (-2.32%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
b97cfd72af aco: handle DPP in the optimizer
There are a bunch of optimizations that are broken when DPP is involved.

fossil-db (Sienna Cichlid):
Totals from 100 (0.07% of 150170) affected shaders:
CodeSize: 325204 -> 325192 (-0.00%); split: -0.06%, +0.05%
Instrs: 62773 -> 62664 (-0.17%); split: -0.18%, +0.00%
Latency: 295348 -> 295266 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 73990 -> 73946 (-0.06%); split: -0.06%, +0.01%
Copies: 1650 -> 1609 (-2.48%); split: -2.55%, +0.06%
PreSGPRs: 3554 -> 3520 (-0.96%)

Fossil-db changes are probably because v_sub_f32_dpp(v_mul_f32) is no
longer being combined into MAD and then split back into separate
instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
1d894a8c85 aco: move a bunch of helpers into aco_ir.h/aco_ir.cpp
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
2021-08-19 18:17:33 +00:00
Rhys Perry
211d1dfd34 aco: don't create v_madmk_f32/v_madak_f32 from v_fma_legacy_f16
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5105
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12004>
2021-07-22 15:43:31 +00:00
Daniel Schürmann
9b1a296172 aco/optimizer: ensure to not erase high bits when propagating packed constants
Packed constants with non-zero values in the high half
might have been propagated as 16 bit, dropping the high half.

Cc: mesa-stable
Closes: #5070
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11954>
2021-07-20 07:48:39 +00:00
Timur Kristóf
60c5abf685 aco: Remove s_and with exec when all lanes are active.
This helps NGG GS and culling shaders.
No Fossil DB changes without NGG culling.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>
2021-07-16 14:31:54 +00:00
Tony Wasserka
66e51dc474 aco: Remove use of deprecated Operand constructors
This migration was done with libclang-based automatic tooling, which
performed these replacements:
* Operand(uint8_t) -> Operand::c8
* Operand(uint16_t) -> Operand::c16
* Operand(uint32_t, false) -> Operand::c32
* Operand(uint32_t, bool) -> Operand::c32_or_c64
* Operand(uint64_t) -> Operand::c64
* Operand(0) -> Operand::zero(num_bytes)

Casts that were previously used for constructor selection have automatically
been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)).

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Daniel Schürmann
1e2639026f aco: Format.
Manually adjusted some comments for more intuitive line breaks.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>
2021-07-12 21:27:31 +00:00
Daniel Schürmann
3f9e986d33 aco: add missing Licenses and remove Authors from files
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Daniel Schürmann
59fdaa1985 aco: reorder and cleanup #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Timur Kristóf
5713e059ea aco: Add validation for v_permlane instructions.
Previously there hasn't been any validation for these instructions,
but after shooting myself in the leg with it a few times, I decided
to add the validation now.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Rhys Perry
54292e99c7 aco: optimize 32-bit extracts and inserts using SDWA
Still need to use dst_u=preserve field to optimize packs

fossil-db (Sienna Cichlid):
Totals from 15974 (10.66% of 149839) affected shaders:
VGPRs: 1009064 -> 1008968 (-0.01%); split: -0.03%, +0.02%
SpillSGPRs: 7959 -> 7964 (+0.06%)
CodeSize: 101716436 -> 101159568 (-0.55%); split: -0.55%, +0.01%
MaxWaves: 284464 -> 284490 (+0.01%); split: +0.02%, -0.01%
Instrs: 19334216 -> 19224241 (-0.57%); split: -0.57%, +0.00%
Latency: 375465295 -> 375230478 (-0.06%); split: -0.14%, +0.08%
InvThroughput: 79006105 -> 78860705 (-0.18%); split: -0.25%, +0.07%

fossil-db (Polaris):
Totals from 11369 (7.51% of 151365) affected shaders:
SGPRs: 787920 -> 787680 (-0.03%); split: -0.04%, +0.01%
VGPRs: 681056 -> 681040 (-0.00%); split: -0.01%, +0.00%
CodeSize: 68127288 -> 67664120 (-0.68%); split: -0.69%, +0.01%
MaxWaves: 54370 -> 54371 (+0.00%)
Instrs: 13294638 -> 13214109 (-0.61%); split: -0.62%, +0.01%
Latency: 373515759 -> 373214571 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 166529524 -> 166275291 (-0.15%); split: -0.20%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
2f94353735 aco: add p_extract/p_insert
These will let us make the SDWA optimizer much simpler than if we were to
recognize combinations of shift/and/bfe.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Rhys Perry
3013670dfd aco: disallow SGPRs on DPP instructions
They can't be encoded.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10841>
2021-05-19 14:25:37 +00:00
Rhys Perry
e3c283e0bc aco: use -1.0*x and 1.0*|x| for fneg/fabs
Besides -1.0*x being 1 dword smaller than x^0x80000000, this commit also
improves generated code when the application requires that denormals are
flushed.

Future versions of DXVK will require that 32-bit denormals are flushed.

fossil-db (GFX8):
Totals from 21021 (14.22% of 147787) affected shaders:
SGPRs: 1288960 -> 1288944 (-0.00%); split: -0.01%, +0.01%
VGPRs: 792672 -> 792848 (+0.02%); split: -0.01%, +0.03%
CodeSize: 62439228 -> 62403552 (-0.06%); split: -0.11%, +0.05%
MaxWaves: 136182 -> 136181 (-0.00%); split: +0.00%, -0.00%
Instrs: 12230882 -> 12239927 (+0.07%); split: -0.01%, +0.08%

fossil-db (GFX10.3):
Totals from 20191 (13.80% of 146267) affected shaders:
VGPRs: 799992 -> 800032 (+0.01%)
CodeSize: 59763656 -> 59715484 (-0.08%); split: -0.12%, +0.03%
MaxWaves: 525378 -> 525376 (-0.00%)
Instrs: 11511082 -> 11517419 (+0.06%); split: -0.00%, +0.06%

fossil-db (GFX8, d3d float controls):
Totals from 87160 (58.98% of 147787) affected shaders:
SGPRs: 5395072 -> 5408480 (+0.25%); split: -0.06%, +0.31%
VGPRs: 3596716 -> 3581592 (-0.42%); split: -0.55%, +0.13%
CodeSize: 271347396 -> 266814460 (-1.67%); split: -1.67%, +0.00%
MaxWaves: 539669 -> 540400 (+0.14%); split: +0.15%, -0.02%
Instrs: 53395194 -> 52257505 (-2.13%); split: -2.13%, +0.00%

fossil-db (GFX10.3, d3d float controls):
Totals from 82306 (56.27% of 146267) affected shaders:
VGPRs: 3572312 -> 3558848 (-0.38%); split: -0.44%, +0.06%
CodeSize: 273494748 -> 269648968 (-1.41%); split: -1.41%, +0.00%
MaxWaves: 2007156 -> 2009950 (+0.14%); split: +0.15%, -0.01%
Instrs: 52251568 -> 51356424 (-1.71%); split: -1.71%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9079>
2021-03-24 14:02:41 +00:00
Rhys Perry
561fcfb50f aco: don't optimize min(a*1.0, ...) to min(a, ...) on GFX8
fossil-db (GFX8):
Totals from 2 (0.00% of 147787) affected shaders:
VMEM: 662 -> 642 (-3.02%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9079>
2021-03-24 14:02:41 +00:00
Daniel Schürmann
fc3606f29c aco/optimizer: set VCC hint on new v_cmp_* definitions
Totals from 11692 (7.99% of 146267) affected shaders (Navi10):
CodeSize: 97419384 -> 97352560 (-0.07%); split: -0.07%, +0.00%
Instrs: 18571138 -> 18570969 (-0.00%); split: -0.00%, +0.00%
Cycles: 1431348400 -> 1431346296 (-0.00%); split: -0.00%, +0.00%
SMEM: 696646 -> 696650 (+0.00%)
SClause: 668511 -> 668490 (-0.00%); split: -0.00%, +0.00%
Copies: 1279475 -> 1279474 (-0.00%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9531>
2021-03-18 17:15:00 +00:00
Rhys Perry
3d4c13f3b8 aco: add DeviceInfo
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
2021-02-15 13:44:22 +00:00
Rhys Perry
0f178290ca aco: don't affect isPrecise() after applying output modifiers
fossil-db (GFX10.3):
Totals from 26679 (19.14% of 139391) affected shaders:
SGPRs: 1757155 -> 1757059 (-0.01%); split: -0.05%, +0.04%
VGPRs: 1175932 -> 1173556 (-0.20%); split: -0.21%, +0.01%
CodeSize: 86203592 -> 85572480 (-0.73%); split: -0.73%, +0.00%
MaxWaves: 315513 -> 315805 (+0.09%); split: +0.10%, -0.00%
Instrs: 16297785 -> 16143745 (-0.95%); split: -0.95%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8718>
2021-01-26 22:22:58 +00:00
Rhys Perry
2f0d480c73 aco: optimize out a*1.0 if it's used as a float
fossil-db (GFX10):
Totals from 370 (0.27% of 139391) affected shaders:
CodeSize: 641436 -> 634156 (-1.13%); split: -1.14%, +0.00%
Instrs: 117668 -> 115739 (-1.64%); split: -1.64%, +0.00%

fossil-db (GFX10.3):
Totals from 370 (0.27% of 139391) affected shaders:
SGPRs: 26888 -> 26912 (+0.09%)
VGPRs: 13964 -> 13916 (-0.34%)
CodeSize: 692008 -> 679008 (-1.88%)
MaxWaves: 4779 -> 4783 (+0.08%)
Instrs: 134265 -> 132026 (-1.67%); split: -1.67%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>
2021-01-26 11:36:13 +00:00
Rhys Perry
54a09545ec aco: optimize a*0.0
fossil-db (GFX10):
Totals from 1943 (1.39% of 139391) affected shaders:
SGPRs: 99952 -> 99544 (-0.41%); split: -0.44%, +0.03%
VGPRs: 60880 -> 60272 (-1.00%); split: -1.02%, +0.02%
CodeSize: 5138488 -> 5107500 (-0.60%); split: -0.61%, +0.01%
MaxWaves: 32193 -> 32380 (+0.58%)
Instrs: 983178 -> 975684 (-0.76%); split: -0.77%, +0.01%

fossil-db (GFX10.3):
Totals from 1943 (1.39% of 139391) affected shaders:
SGPRs: 99832 -> 99648 (-0.18%); split: -0.25%, +0.06%
VGPRs: 64708 -> 63944 (-1.18%); split: -1.27%, +0.09%
CodeSize: 5196732 -> 5157632 (-0.75%); split: -0.76%, +0.00%
MaxWaves: 27478 -> 27486 (+0.03%); split: +0.06%, -0.03%
Instrs: 1007222 -> 998737 (-0.84%); split: -0.84%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>
2021-01-26 11:36:13 +00:00
Rhys Perry
0c3d8e8e2e aco: disable a*1.0 optimization if the instruction is precise
fossil-db (GFX10):
Totals from 10370 (7.44% of 139391) affected shaders:
SGPRs: 564072 -> 564016 (-0.01%); split: -0.01%, +0.00%
VGPRs: 248312 -> 248532 (+0.09%); split: -0.02%, +0.11%
CodeSize: 12866732 -> 13208904 (+2.66%); split: -0.00%, +2.66%
MaxWaves: 190198 -> 190170 (-0.01%)
Instrs: 2460818 -> 2545351 (+3.44%)

fossil-db (GFX10.3):
Totals from 10370 (7.44% of 139391) affected shaders:
SGPRs: 563904 -> 564272 (+0.07%); split: -0.16%, +0.22%
VGPRs: 289344 -> 295016 (+1.96%); split: -0.88%, +2.84%
CodeSize: 13519204 -> 14197020 (+5.01%); split: -0.00%, +5.01%
MaxWaves: 155946 -> 154566 (-0.88%)
Instrs: 2719177 -> 2806919 (+3.23%); split: -0.00%, +3.23%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>
2021-01-26 11:36:13 +00:00
Rhys Perry
e115b01948 aco: return references in instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry
1d245cd18b aco: use format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00