Commit graph

80 commits

Author SHA1 Message Date
Rhys Perry
82c265a514 aco: improve check for moving temporaries out of fixed definitions
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry
e9578e3033 aco: allow GFX9 partial writes with instructions which use opsel
Some instructions such as v_mad_f16 can do partial writes on GFX9.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry
7f511efa16 aco: create 16-bit mad/fma
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4868 -> 4552 (-6.49%)
Instrs: 956 -> 863 (-9.73%)
Cycles: 3824 -> 3452 (-9.73%)
VMEM: 504 -> 490 (-2.78%)
SMEM: 109 -> 107 (-1.83%)
VClause: 19 -> 20 (+5.26%)
Copies: 54 -> 58 (+7.41%)
PreVGPRs: 43 -> 41 (-4.65%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry
1b10764e50 aco: try to use fma instead of mad when denormals are enabled
v_mad_f32 doesn't support denormals but v_fma_f32 does.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry
1b2e1163b2 aco: fix moving sub-dword values out of a register for a fixed definition
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
2020-06-10 15:05:11 +00:00
Rhys Perry
edf863d1d2 aco: use Info::definition_size instead of definition's regclass
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
2020-06-10 15:05:11 +00:00
Rhys Perry
62ea429a99 aco: prefer 4-byte aligned definitions
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
CodeSize: 811984 -> 806224 (-0.71%)
Instrs: 155733 -> 155939 (+0.13%); split: -0.04%, +0.18%
Cycles: 1982568 -> 1984400 (+0.09%); split: -0.06%, +0.15%
VMEM: 7187 -> 7121 (-0.92%); split: +0.86%, -1.78%
SMEM: 1770 -> 1769 (-0.06%)
VClause: 1475 -> 1476 (+0.07%)
Copies: 12406 -> 12606 (+1.61%); split: -0.46%, +2.07%
Branches: 5901 -> 5900 (-0.02%); split: -0.25%, +0.24%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
2020-06-10 15:05:11 +00:00
Rhys Perry
56345b8c61 aco: allow reading/writing upper halves/bytes when possible
Use SDWA, opsel or a different opcode to achieve this.

shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
2020-06-10 15:05:11 +00:00
Daniel Schürmann
b21d2d9a9f aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7
This is needed to lower some corner cases correctly,
in case the same operand occurs multiple times:
e.g. v0 = p_create_vector(v0[0:8], v0[0:8], v0[0:8], v0[0:8])

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Daniel Schürmann
0560831593 aco: fix register assignment for p_create_vector on GFX6/7
In case, some operand was already placed in the definition space,
it could happen that it wasn't considered for live-range splits.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
75a730ced5 aco: fix register allocation for subdword instructions on GFX10
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>
2020-05-29 11:20:58 +00:00
Daniel Schürmann
51f4b22fee aco: don't allow unaligned subdword accesses on GFX6/7
There are no SDWA instructions which means that only
full registers can be accessed.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
ae390755fe aco: fix corner case in register allocation
We mark dead operands in the register file when searching for
a register for a definition. Only do so, if this space has not
yet been taken by a different definition.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
acec00eae0 aco: don't move create_vector subdword operands to unsupported register offsets
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
5201985332 aco: restrict copying of create_vector operands to GFX9+
This improves code size for Polaris and earlier due to less register swapping

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
66e3c74f9c aco: fix WQM coalescing
get_reg_specified() doesn't handle special registers like SCC.
Fixes: a5fc96b533 ('aco: coalesce parallelcopies during register allocation')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5036>
2020-05-14 16:30:19 +00:00
Rhys Perry
7ce527a4fe aco: improve phi affinities with p_split_vector
Totals from 5860 (4.59% of 127638) affected shaders:
VGPRs: 460212 -> 460216 (+0.00%)
CodeSize: 65554356 -> 65464816 (-0.14%)
Instrs: 12655972 -> 12633578 (-0.18%)
Copies: 1309994 -> 1292163 (-1.36%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
2020-05-13 13:12:08 +00:00
Rhys Perry
51e797e233 aco: consider affinities when creating v_mac_f32
Totals from 8487 (6.65% of 127638) affected shaders:
CodeSize: 62061988 -> 62058020 (-0.01%); split: -0.01%, +0.01%
Instrs: 11910757 -> 11885409 (-0.21%); split: -0.21%, +0.00%
Copies: 1065244 -> 1040945 (-2.28%); split: -2.30%, +0.02%
Branches: 349665 -> 348914 (-0.21%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
2020-05-13 13:12:08 +00:00
Rhys Perry
138eed45b5 aco: mark phi definitions as last-seen phi operands
Totals from 14340 (11.23% of 127638) affected shaders:
SGPRs: 1251648 -> 1251512 (-0.01%)
VGPRs: 994556 -> 994104 (-0.05%); split: -0.06%, +0.01%
CodeSize: 122894528 -> 121099604 (-1.46%); split: -1.49%, +0.03%
MaxWaves: 106039 -> 106103 (+0.06%); split: +0.06%, -0.00%
Instrs: 23860066 -> 23414317 (-1.87%); split: -1.90%, +0.03%
Copies: 2448228 -> 2049305 (-16.29%); split: -16.37%, +0.07%
Branches: 789381 -> 757921 (-3.99%); split: -4.62%, +0.64%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
2020-05-13 13:12:08 +00:00
Daniel Schürmann
a5fc96b533 aco: coalesce parallelcopies during register allocation
These are the result of lowering to CSSA, and should be removed if possible

Totals from affected shaders: (VEGA)
SGPRS: 544544 -> 544544 (0.00 %)
VGPRS: 418224 -> 418224 (0.00 %)
Spilled SGPRs: 141826 -> 141826 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 65853740 -> 64703380 (-1.75 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13669 -> 13669 (0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4952>
2020-05-12 15:59:31 +00:00
Rhys Perry
307aca83a2 aco: add missing adjust_max_used_regs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Rhys Perry
99ca96fbf5 aco: improve RA for uneven p_split_vector
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Rhys Perry
24116a8a56 aco: don't recurse in sub-dword get_reg_simple()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Rhys Perry
be4a34966c aco: fix neighboring register check in get_reg_simple()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Rhys Perry
fb59ed6bb9 aco: check alignment of non-subdword registers in get_reg_specified()
When splitting a v6b vector into v1 and v2b components, we should ensure
the v1 definition doesn't start at the upper half.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Rhys Perry
916cc3e231 aco: make RegisterFile::block() take a regclass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Timur Kristóf
f321dc33c8 aco: Abort when RA can't find a register.
Previously, it was just unreachable, which means it will generate
invalid shaders when it encounters a situation when it can't allocate
registers for eg. a large load.

This commit makes it slightly easier to notice such problems without
triggering a GPU hang.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
2020-04-24 17:58:57 +00:00
Daniel Schürmann
36e0d2f39b aco: coalesce v_mad's accumulator with definition's affinities
Totals from affected shaders:
Code Size: 8922676 -> 8915192 (-0.08 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d000d76f13 aco: use upper part of gap in register file if it is beneficial for striding
Totals from affected shaders:
SGPRS: 1717288 -> 1716984 (-0.02 %)
VGPRS: 1305924 -> 1304904 (-0.08 %)
Code Size: 138508892 -> 138420144 (-0.06 %) bytes
Max Waves: 115726 -> 115735 (0.01 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d666d83be2 aco: try to always find a register with stride for even sizes
Totals from affected shaders:
SGPRS: 1162400 -> 1162400 (0.00 %)
VGPRS: 947364 -> 946960 (-0.04 %)
Code Size: 98399300 -> 98399004 (-0.00 %) bytes
Max Waves: 74665 -> 74682 (0.02 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5a3c1f4f0b aco: stop get_reg_simple after reaching max_used_gpr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
2796cb4c24 aco: refactor get_reg_simple() to return early on exact matches
in the best fit algorithm

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
6792e134f3 aco: don't create vector affinities for operands which are not killed or are duplicates
Totals from affected shaders:
SGPRS: 825184 -> 825184 (0.00 %)
VGPRS: 697640 -> 697240 (-0.06 %)
Code Size: 79244104 -> 79201072 (-0.05 %) bytes
Max Waves: 42388 -> 42386 (-0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
edc2b57ac1 aco: allocate full register for subdword definitions if HW doesn't support it
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
97a870cf88 aco: move attempt to find strided register into get_reg_simple()
This simplifies code and helps some shaders

Totals from affected shaders:
Code Size: 51227172 -> 51202216 (-0.05 %) bytes
Max Waves: 19955 -> 19948 (-0.04 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
c7f97f110c aco: use DefInfo in more places to simplify RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
734f86db6b aco: create and use DefInfo struct in RA
for maintaining all information necessary to find a register.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5b2f628da3 aco: create pseudo dummy instruction in RA to be used for live-range splits
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d9f7d1d5cb aco: refactor get_reg() to also handle affinities
This simplifies definition handling and
helps a few shaders

Totals from affected shaders:
Code Size: 659540 -> 659376 (-0.02 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
7c8f4ebca9 aco: refactor get_reg() to take Temp instead of RegClass
This patch also moves get_reg_specified() and
get_reg_vec() before get_reg() to make use of it later.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Daniel Schürmann
0a9ed98178 aco: simplify operand handling in RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Rhys Perry
fbd2be3f5d aco: clear moved operands in get_reg_create_vector()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Rhys Perry
52cc1f8237 aco: improve p_create_vector RA for sub-dword operands
These's still improvements needed for sub-dword definitions, but that's
not as simple.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Daniel Schürmann
38622de2ec aco: make some reg_file helpers private and fix their uses
Fixes various subdword RA issues

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Daniel Schürmann
d22e2b3bd0 aco: RA - move all std::function objects into proper functions
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
5351fee56a aco: move all needed helper containers to ra_ctx
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
2ae27b96ef aco: change live_out variables to std::unordered_set
Improves performance of live_var_analysis for larger shaders

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
acc10a7e51 aco: change some std::map to std::unordered_map in register_allocation
This improves compile times slightly for larger shaders

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
69b6069dd2 aco: refactor try_remove_trivial_phi() in RA
Minor refactoring to avoid some pointer chasing.
This patch also changes the live_out argument to be
passed by reference to avoid an unnecessary copy.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
09850e0a94 aco: during RA only insert into renames table if a variable got renamed
This improves the speed of register allocation.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00