Commit graph

1698 commits

Author SHA1 Message Date
Daniel Schürmann
eb8ec12b23 aco/ra: Fix potential out-of-bounds array accesses.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12748>
2021-09-10 19:39:18 +00:00
Timur Kristóf
536580b139 aco: Add some useful info to the README for debugging.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12748>
2021-09-10 19:39:18 +00:00
Rhys Perry
c1e668d5d1 aco/ra: don't use ds_write_b8_d16_hi/ds_write_b16_d16_hi on GFX8
GFX8 doesn't support these opcodes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c75138ed64 ("aco/ra: refactor subdword definition info")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12721>
2021-09-06 15:10:26 +00:00
Timur Kristóf
268158a758 aco/optimize_postRA: Use iterators instead of operator[] of std::array.
Also add a few more assertions to make sure the registers are
within the bounds of the array.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12682>
2021-09-03 15:00:55 +00:00
Timur Kristóf
bb956464cb aco: Skip code paths to emit copies when there are no copies.
Found while running with libstdc++ debug mode.
Fixes the following:

Error: attempt to advance a dereferenceable (start-of-sequence)
iterator -1 steps, which falls outside its valid range.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12682>
2021-09-03 15:00:55 +00:00
Timur Kristóf
728ed892df aco: Use Builder reference in emit_copies_block.
Found while running with libstdc++ debug mode.
Fixes the following:

Error: attempt to copy-construct an iterator from a singular iterator.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12682>
2021-09-03 15:00:55 +00:00
Rhys Perry
8037b21573 aco/ra: allow v1b operands with 16-bit instructions
Instruction selection can create these.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: ec1bbfa608 ("aco/ra: refactor subdword operand stride")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
2a7fa132be aco: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
e0d232c2fc aco: implement nir_op_pack_32_4x8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
4dd420f76d radv,aco: implement iadd_sat
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>
2021-09-03 13:21:28 +00:00
Rhys Perry
54f83d718a aco/spill: add temporary operands of exec phis to next_use_distances_end
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: dfb10e4f4b ("aco/spill: don't count phis as variable access")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12702>
2021-09-03 14:01:27 +01:00
Rhys Perry
f241bd3749 aco: don't coalesce constant copies into non-power-of-two sizes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12702>
2021-09-03 14:01:27 +01:00
Daniel Schürmann
8bd7e2392b aco: preserve subdword RC when lowering p_insert/p_extract
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
73481338fe aco/print_ir: always print SDWA dst & src selections
This way, it becomes more apparent how SDWA behaves.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
0988f7b9ba aco: remove explicit dst_preserve flag
Instead, we can rely on the fact that subdword definitions
must preserve the unused bits while dword definitions either
pad or sign-extend.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
9e3ff06c38 aco: rewrite SDWA selector
This commit introduces a new struct SubdwordSel
in order to ease and clean up the usage of SDWA
selections. This includes removing the distinction
between register-allocated and fixed SDWA selections.
Instead, SDWA selections can now also access the high
bits of subdword variables. Alignment and sizes are
validated accordingly. Size, offset and sign_extend
can be evaluated via helper methods.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
cc4682ed47 aco: fix p_insert lowering with 16bit sources
The previous lowering only wrote a single byte.

Fixes: 2f94353735 ('aco: add p_extract/p_insert')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>
2021-09-02 20:39:17 +02:00
Daniel Schürmann
ef7d840a70 aco/lower_phis: optimize loop exit phis
This optimization works by ensuring that disabled lanes
are zero'd before any merge sequence.

Totals from 6075 (4.05% of 150170) affected shaders: (GFX10.3)
CodeSize: 57913908 -> 57913212 (-0.00%)
Instrs: 11055852 -> 11055678 (-0.00%)
Latency: 438705219 -> 438534652 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 125284101 -> 125251397 (-0.03%); split: -0.03%, +0.00%
Copies: 807388 -> 821035 (+1.69%); split: -0.00%, +1.69%
Branches: 391827 -> 391782 (-0.01%)
PreSGPRs: 574841 -> 574838 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
207249d2b2 aco/lower_phis: propagate constants before emitting merge code
This generalizes a previous optimization.

Totals from 521 (0.35% of 150170) affected shaders: (GFX10.3)
CodeSize: 1680348 -> 1678884 (-0.09%)
Instrs: 307994 -> 307628 (-0.12%)
Latency: 5799845 -> 5792655 (-0.12%)
InvThroughput: 994859 -> 994030 (-0.08%)
Copies: 18992 -> 18767 (-1.18%)
Branches: 10143 -> 10037 (-1.05%)
PreSGPRs: 21904 -> 21853 (-0.23%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
aed7b7d185 aco/lower_bool_phis: avoid creating trivial phis
For this purpose, get_ssa() is also refactored slightly.

Totals from 4 (0.00% of 150170) affected shaders: (GFX10.3)
CodeSize: 15504 -> 15376 (-0.83%)
Instrs: 2942 -> 2910 (-1.09%)
Latency: 292444 -> 291642 (-0.27%)
InvThroughput: 30842 -> 30770 (-0.23%)
Copies: 164 -> 150 (-8.54%)
Branches: 96 -> 82 (-14.58%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
6d8ea54654 aco: refactor lower_phis()
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Daniel Schürmann
e4c5062fb7 aco: fix init_any_pred_defined() for loop header phis
This includes setting the correct end point of the propagation and
not propagating the incoming values after the loop header.
This patch also changes the propagation to a single iteration for
loop exit phis.

No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

aco: don't propagate incoming value in init_any_pred_defined()

No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11659>
2021-09-02 16:41:52 +02:00
Timur Kristóf
9d20cf2732 aco: Fix invalid usage of std::fill with std::array.
In this case std::array doesn't behave like a regular array, therefore
it is NOT okay to index it outside the array, even though std::fill
needs us to do so.

Change the syntax to do the same thing slightly differently,
and add an assertion to make sure the registers are always within
the array's bounds.

Closes: #5289
Fixes: 0e4747d3fb
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12664>
2021-09-01 09:33:28 +00:00
Rhys Perry
33ddbd220f aco: remove DPP when applying constants/literals/sgprs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Rhys Perry
7d95f7510f aco/tests: test copy propagation with DPP instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Rhys Perry
e27946ca11 aco: don't constant propagate to DPP instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>
2021-08-31 16:58:20 +00:00
Rhys Perry
9df9fe7dfa aco: include utility in isel
For std::exchange().

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c1d11bb92c ("aco: Add loop creation helpers.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5301
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12614>
2021-08-30 14:28:00 +00:00
Timur Kristóf
76b9dd6266 aco: Unset 16 and 24-bit flags from operands in apply_extract.
Consider the following sequence in a shader:
b = p_extract a
c = v_mad_u32_u16 b, X, 0

The optimizer applies extract, resulting in:
c = v_mad_u32_u16 a, X, 0 (correct)

Then it mistakenly turns that into:
c = v_mul_u32_u24 a, X, 0 (incorrect)

In this case, the p_extract is applied to v_mad_u32_u16 by
apply_extract. After this, we can no longer be sure that
the operands are still 16 or 24-bit, so we have to remove
this flag.

No Fossil DB changes.

Fixes: 54292e99c7
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
2021-08-30 14:05:33 +00:00
Timur Kristóf
cfb0d931f2 aco: Emit zero for the derivatives of uniforms.
Observed in a shader from Resident Evil Village.
This also helps prevent emitting invalid IR.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12599>
2021-08-27 20:34:22 +00:00
Daniel Schürmann
2eeaaabb8e aco/optimizer: combine v_pk_mul_u16 + v_pk_add_u16 -> v_pk_mad_u16
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
be16ebc5ca aco/optimizer: fuse v_mul_f64 + v_add_f64 -> v_fma_f64
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
8e27ca9953 aco/optimizer: combine v_mul_lo_u16 + v_add_u16 -> v_mad_u16
Totals from 192 (0.13% of 150170) affected shaders: (GFX10.3)
CodeSize: 1027224 -> 1019872 (-0.72%)
Instrs: 174784 -> 173863 (-0.53%)
Latency: 4235742 -> 4232177 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 1777026 -> 1775945 (-0.06%); split: -0.09%, +0.03%
Copies: 34098 -> 34099 (+0.00%); split: -0.03%, +0.03%
PreVGPRs: 4920 -> 4850 (-1.42%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
23d5865f42 aco: refactor nir_op_imul selection
Previously, the optimization to use v_mul_lo_u16 for
32bit multiplications was done in instruction_selection.
This was moved to the optimizer to ease some case distinctions.

The mixed results are due to increased use of SDWA.

Totals from 2616 (1.74% of 150170) affected shaders: (GFX10.3)
VGPRs: 143888 -> 143872 (-0.01%); split: -0.02%, +0.01%
CodeSize: 5604032 -> 5604080 (+0.00%); split: -0.01%, +0.01%
Instrs: 1086798 -> 1083915 (-0.27%); split: -0.27%, +0.01%
Latency: 8215793 -> 8213023 (-0.03%); split: -0.10%, +0.07%
InvThroughput: 20765157 -> 20773766 (+0.04%); split: -0.02%, +0.06%
VClause: 35256 -> 35260 (+0.01%); split: -0.02%, +0.03%
SClause: 29021 -> 29024 (+0.01%); split: -0.00%, +0.01%
Copies: 74163 -> 74306 (+0.19%); split: -0.05%, +0.24%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Daniel Schürmann
d8eef134d8 aco: only apply extract if not used more than 4 times
Totals from 61 (0.04% of 150170) affected shaders: (GFX10.3)
CodeSize: 1087732 -> 1087380 (-0.03%); split: -0.22%, +0.18%
Instrs: 192343 -> 192205 (-0.07%); split: -0.16%, +0.09%
Latency: 7231670 -> 7148073 (-1.16%); split: -1.19%, +0.04%
InvThroughput: 3436715 -> 3394926 (-1.22%); split: -1.25%, +0.04%
VClause: 4831 -> 4833 (+0.04%)
Copies: 50130 -> 49934 (-0.39%); split: -0.67%, +0.28%
Branches: 5945 -> 5948 (+0.05%)
PreSGPRs: 3486 -> 3472 (-0.40%)
PreVGPRs: 5154 -> 5152 (-0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>
2021-08-27 19:57:59 +00:00
Timur Kristóf
589ccf3d77 aco: Consider maximum number of workgroups per CU/WGP on Navi.
No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12517>
2021-08-27 16:41:08 +00:00
Timur Kristóf
c8698199a1 aco: Consider LDS usage by PS inputs in MaxWaves calculation.
Before PS waves are launched, PS inputs are moved from PC to LDS
and the corresponding part of the PC is deallocated.
Each PS input occupies 3 * vec4 (3 * 16 = 48 bytes) of LDS space.
See Figure 10.3 in the GCN3 ISA manual.

These limit occupancy the same way as other stages' LDS usage does.
Note that PS can request additional LDS space via EXTRA_LDS_SIZE,
so that also must be taken into account here.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12517>
2021-08-27 16:41:08 +00:00
Timur Kristóf
626b125857 aco: Use workgroup size from input shader info.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12321>
2021-08-26 09:46:18 +00:00
Timur Kristóf
5b7446d74c radv, ac, aco: Use indices 0-2 of gs_vtx_offset argument array on GFX9+.
Previously, indices 0, 2, 4 were used.
This worked, but it was somewhat unintuitive.
This commit changes it to use indices 0, 1, 2 instead, which
makes the code easier to understand.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12511>
2021-08-26 05:20:15 +00:00
Daniel Schürmann
cd489e5388 aco: remove redundant s_and exec after nir_op_inot
Totals from 22585 (15.04% of 150170) affected shaders: (GFX10.3)
VGPRs: 1474048 -> 1473904 (-0.01%)
CodeSize: 155238876 -> 155187688 (-0.03%); split: -0.06%, +0.03%
MaxWaves: 385086 -> 385122 (+0.01%)
Instrs: 29297735 -> 29284442 (-0.05%); split: -0.08%, +0.04%
Latency: 675841742 -> 675764151 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 174859037 -> 174854796 (-0.00%); split: -0.01%, +0.01%
VClause: 479790 -> 479781 (-0.00%); split: -0.01%, +0.00%
SClause: 1106900 -> 1106615 (-0.03%); split: -0.03%, +0.01%
Copies: 1829037 -> 1828042 (-0.05%); split: -0.09%, +0.03%
Branches: 859971 -> 859967 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1341850 -> 1342356 (+0.04%); split: -0.01%, +0.04%
PreVGPRs: 1327322 -> 1327034 (-0.02%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>
2021-08-25 12:43:50 +00:00
Timur Kristóf
abcc83e713 aco: Fix to_uniform_bool_instr when operands are not suitable.
Don't attempt to transform uniform boolean instructions when
their operands are unsuitable. This can happen eg. due to other
optimizations that combine SALU instructions which clear out
the uniform instruction labels.

Cc: mesa-stable
Fixes: 8a32f57fff
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>
2021-08-25 12:43:50 +00:00
Rhys Perry
b23a9dd1f6 aco/scheduler: allow moving down VMEM stores to below VMEM loads
fossil-db (Vega10):
Totals from 93 (0.06% of 150305) affected shaders:
SGPRs: 4832 -> 4768 (-1.32%)
VGPRs: 4084 -> 4144 (+1.47%)
CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47%
MaxWaves: 589 -> 580 (-1.53%)
Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61%
Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10%
InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80%
VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57%
SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16%
Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97%

fossil-db (Sienna Cichlid):
Totals from 88 (0.06% of 150170) affected shaders:
VGPRs: 3840 -> 3872 (+0.83%)
CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23%
Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35%
Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61%
InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35%
VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81%
SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12%
Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211>
2021-08-23 16:48:31 +00:00
Rhys Perry
2201f5a58c aco: remove label_extract if the extract is used by a non-VALU
If an extract is used by a non-VALU instruction, it can't be applied to
all instructions, so it's not beneficial to try to apply it.

This check isn't needed because can_apply_extract()/can_use_SDWA() should
already handle non-VALU instructions.

fossil-db (Sienna Cichlid):
Totals from 1020 (0.68% of 150170) affected shaders:
SpillSGPRs: 1577 -> 1571 (-0.38%)
CodeSize: 7863668 -> 7858336 (-0.07%); split: -0.07%, +0.00%
Instrs: 1431583 -> 1431083 (-0.03%); split: -0.04%, +0.01%
Latency: 25891250 -> 25890916 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7248683 -> 7248655 (-0.00%); split: -0.01%, +0.01%
SClause: 49072 -> 49071 (-0.00%)
Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06%
Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02%
PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01%

fossil-db (Polaris10):
Totals from 654 (0.43% of 151696) affected shaders:
CodeSize: 5814552 -> 5811568 (-0.05%); split: -0.05%, +0.00%
Instrs: 1105783 -> 1105049 (-0.07%); split: -0.07%, +0.00%
Latency: 20261458 -> 20259744 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 9011785 -> 9011749 (-0.00%); split: -0.00%, +0.00%
Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00%
PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01%
PreVGPRs: 43813 -> 43809 (-0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212>
2021-08-23 14:56:37 +01:00
Daniel Schürmann
77ffdf41b1 aco: add more validation rules for SDWA operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
077776a866 aco/opcodes: remove definition_size[]
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
f6b281a1c2 aco/validate: simplify get_subdword_bytes_written()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
ec1bbfa608 aco/ra: refactor subdword operand stride
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
c75138ed64 aco/ra: refactor subdword definition info
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
e11b23f7cd aco: add instr_is_16bit() helper function
to indicate whether some instruction writes partial registers, only.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
3d6ca41e44 aco: use VOPC_SDWA on GFX9+
Totals from 5138 (3.42% of 150170) affected shaders: (GFX10.3)
VGPRs: 409520 -> 409416 (-0.03%); split: -0.03%, +0.00%
CodeSize: 43056360 -> 43035696 (-0.05%); split: -0.06%, +0.02%
MaxWaves: 69296 -> 69310 (+0.02%)
Instrs: 8161016 -> 8153365 (-0.09%); split: -0.10%, +0.01%
Latency: 109397002 -> 109756208 (+0.33%); split: -0.05%, +0.38%
InvThroughput: 23238920 -> 23310761 (+0.31%); split: -0.11%, +0.42%
VClause: 135141 -> 135100 (-0.03%); split: -0.05%, +0.02%
SClause: 349511 -> 349489 (-0.01%); split: -0.01%, +0.00%
Copies: 388107 -> 387754 (-0.09%); split: -0.48%, +0.38%
Branches: 184629 -> 184503 (-0.07%); split: -0.08%, +0.01%
PreSGPRs: 258807 -> 258839 (+0.01%)
PreVGPRs: 372561 -> 372184 (-0.10%); split: -0.10%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00
Daniel Schürmann
60e171af06 aco/print_ir: fix printing of VOPC_SDWA definitions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
2021-08-23 10:31:40 +00:00