Commit graph

173 commits

Author SHA1 Message Date
Rhys Perry
d1621834f3 aco: combine VALU and SALU into various VOP3 instructions
shader-db (Navi):
Totals from 2916 (2.28% of 127638) affected shaders:
SGPRs: 184427 -> 184283 (-0.08%); split: -0.10%, +0.02%
VGPRs: 143520 -> 143640 (+0.08%); split: -0.00%, +0.09%
CodeSize: 14913548 -> 14913288 (-0.00%); split: -0.00%, +0.00%
MaxWaves: 26034 -> 26012 (-0.08%)
Instrs: 2935435 -> 2930960 (-0.15%); split: -0.15%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
2dc550202e aco: copy-propagate p_create_vector copies of vectors
Instead of copying the operands of the other p_create_vector and labelling
the definition with label_vec, copy the operands and label it with
label_temp so that it can be copy-propagated.

This was found while removing a redundant copy in load_input_from_temps()
which removed duplicate p_create_vector instructions.

shader-db (Navi):
Totals from 139 (0.11% of 127638) affected shaders:
VGPRs: 8472 -> 7948 (-6.19%)
CodeSize: 514592 -> 512368 (-0.43%)
MaxWaves: 1089 -> 1195 (+9.73%)
Instrs: 100214 -> 99658 (-0.55%)
Cycles: 400856 -> 398632 (-0.55%)
VMEM: 15545 -> 15338 (-1.33%)
Copies: 5140 -> 4584 (-10.82%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
41ac44e1b3 aco: improve vector optimization with sub-dword vectors
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Daniel Schürmann
28d36d26c2 aco: fix p_extract_vector optimization in presence of unequally sized vector operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4506>
2020-04-13 16:35:40 +00:00
Daniel Schürmann
a39df3bfce aco: don't constant-propagate into subdword PSEUDO instructions
PSEUDO instructions are lowered using SDWA, and thus,
cannot take literals and before GFX9 cannot take constants
at all. As the in-register representation differs between
32bit and 16bit floats, we first need to ensure correct
behavior.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Rhys Perry
20a4b1461b aco: zero-initialize Temp
Fixes dEQP-VK.transform_feedback.* crashes from accesses garbage
temporaries in emit_extract_vector().

Fixes: 85521061 ("aco: prepare helper functions for subdword handling")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463>
2020-04-06 19:15:19 +00:00
Daniel Schürmann
0bb3537676 aco: don't assume split_vector(create_vector) has the same number of elements when optimizing
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Daniel Schürmann
c436743b0c aco: don't propagate SGPRs into subdword PSEUDO instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Samuel Pitoiset
c953292630 aco: always optimize v_mad to v_madak in presence of literals
v_mad and v_madak are both 64-bit instructions, so it doesn't
increase code size to always apply a 32-bit literal instead of
using v_mad and a sgpr which contains that literal.

Found with some Youngblood shaders but help some other games.

vkpipeline-db (VEGA10):
Totals from affected shaders:
SGPRS: 46168 -> 46016 (-0.33 %)
VGPRS: 45576 -> 45564 (-0.03 %)
Code Size: 5187208 -> 5179584 (-0.15 %) bytes
Max Waves: 3297 -> 3297 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>
2020-04-03 07:30:49 +00:00
Timur Kristóf
655c050119 aco: Fix combining DS additions in the optimizer.
Previously, it was calculated incorrectly for 64-bit writes and reads.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
2020-03-11 08:34:10 +00:00
Rhys Perry
2d1ba86382 aco: handle v_add_co_u32_e64 in parse_base_offset()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>
2020-03-03 18:31:06 +00:00
Rhys Perry
483d4ec57c aco: improve SCC handling in some SALU combines
Add some checks and remove some unnecessary checks.

Found by observation. No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>
2020-02-12 19:18:45 +00:00
Rhys Perry
d45e9451cf aco: disable some instruction combining if it could change an exec operand
Found by observation. No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>
2020-02-12 19:18:40 +00:00
Samuel Pitoiset
ddd767387f aco: fix creating v_madak if v_mad_f32 has two sgpr literals
Do not ignore that src1 can be a sgpr.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2435
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>
2020-02-11 07:17:31 +00:00
Timur Kristóf
4d34abd15c aco/optimizer: Don't combine uniform bool s_and to s_andn2.
Fixes: 8a32f57fff

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>
2020-02-05 22:53:45 +00:00
Daniel Schürmann
71440ba0f5 aco: reorder VMEM operands in ACO IR
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
396be00640 aco: fix combine_salu_not_bitwise() when SCC is used
Previously, we didn't use the SCC bit, and thus, we didn't care about it.
With 'aco: Transform uniform bitwise instructions to 32-bit if possible.'
that changed, so that we have to handle it.

Fixes: 8a32f57fff ('aco: Transform uniform bitwise instructions to 32-bit if possible.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
2020-01-28 18:14:02 +01:00
Rhys Perry
2dc63d39d3 aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Rhys Perry
827681f921 aco: always add sgprs to sgpr_ids when choosing literals
Even if it's a literal, we should add this to sgpr_ids.
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>
2020-01-27 14:50:37 +00:00
Timur Kristóf
8a32f57fff aco: Transform uniform bitwise instructions to 32-bit if possible.
This allows removing superfluous s_cselect instructions
that come from turning booleans into 64-bit vector condition.

v2 by Daniel Schürmann:
- Make the code massively simpler
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
- Eliminate extra moves by not always using the SCC definition
- Use s_absdiff_i32 for uniform XOR
- Skip the transformation for uncommon or invalid instructions

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>
2020-01-24 14:40:45 +00:00
Timur Kristóf
23edcf6490 aco: Make a better guess at which instructions need the VCC hint.
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>
2020-01-24 13:14:23 +00:00
Samuel Pitoiset
b8abfafe86 aco: fix constant folding of SMRD instructions on GFX6
SMRD instructions have an 8-bit dword offset on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Rhys Perry
e151398de6 aco: fix stack buffer overflow in apply_sgprs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: cef7879719 ('aco: rewrite apply_sgprs()')
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>
2020-01-20 11:13:11 +00:00
Samuel Pitoiset
a445cb35bd aco: do not combine additions of DS instructions on GFX6
The offset field doesn't work as expected on GFX6.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>
2020-01-16 14:06:06 +00:00
Timur Kristóf
dfaa3c0af6 aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.
When possible, get rid of an s_not when all it does is invert the SCC,
and its successor s_cbranch / s_cselect can be inverted instead.

Also modify some parts of instruction_selection to take advantage of
this feature.

Example:
s2: %3900,  s1: %3899:scc = s_andn2_b64 %0:exec, %406
s2: %3902 = s_cselect_b64 -1, 0, %3900:scc
s2: %407,  s1: %3903:scc = s_not_b64 %3902
s2: %3906,  s1: %3905:scc = s_and_b64 %407, %0:exec
p_cbranch_z %3905:scc
Can now be optimized to:
s2: %3900,  s1: %3899:scc = s_andn2_b64 %0:exec, %406
p_cbranch_nz %3900:scc

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
c0f82165a7 aco: Optimize out s_and with exec, when used on uniform bitwise values.
Previously all booleans needed an s_and with exec when they were turned
into a scalar condition. However, this is not needed for uniform booleans.

v2 by Daniel Schürmann:
- Make the code more readable
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
1c44129db3 aco: Don't skip combine_instruction when definitions[1] is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Timur Kristóf
d962bbd895 aco: Implement 64-bit constant propagation.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2020-01-14 21:21:06 +01:00
Rhys Perry
f978e0e516 aco: add integer min/max to can_swap_operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
92ace0bb31 aco: replace extract_vector with copies
Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done

Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
e686e4765e aco: add min(-max(), ) and max(-min(), ) optimization
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
fa8357eb70 aco: improve clamp optimization
Not sure why it checked the use count, it doesn't apply the constants.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 22908584 -> 22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions: 4275312 -> 4263869 (-0.27 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 14650824 -> 14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions: 2762802 -> 2755517 (-0.26 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
edc888ccb1 aco: fix clamp optimization
We can't do the optimization if there are neg/abs in-between.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f664cb01ec aco: improve creation of v_madmk_f32/v_madak_f32
Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 54743740 -> 54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions: 10182030 -> 10182030 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 49173564 -> 49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions: 9315135 -> 9293330 (-0.23 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
15e25da3e5 aco: take advantage of GFX10's constant bus limit and VOP3 literals
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 2397159 -> 2392494 (-0.19 %)
VGPRS: 1756036 -> 1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 110287304 -> 109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions: 21019327 -> 20533618 (-2.31 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
9c2d37308f aco: allow an extra SGPR with multiple uses to be applied to VOP3
This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
f4c2c90e1a aco: allow applying two sgprs to an instruction
We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.

This change is in a separate patch from the apply_sgprs() rewrite so that
it can be tested if the rewrite affected anything.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
7da07ca3e4 aco: follow through temporary when merging tests into constant comparisons
This can happen with v_mov_b32(s_mov_b32(literal))

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
dc6c35e1c3 aco: be more careful with literals in combine_salu_{n2,lshl_add}
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
fcf52eb42d aco: add check_vop3_operands()
This will be useful when taking advantage of GFX10 features.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
cef7879719 aco: rewrite apply_sgprs()
This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
0be7409069 aco: rewrite literal combining
Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
84b9f3786b aco: improve can_use_VOP3()
No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
3cb98ed939 aco: combine two sgprs into a VALU if they're the same
This was supposed to be done before but it wasn't done correctly and
everywhere.

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 784680 -> 786128 (0.18 %)
VGPRS: 574012 -> 573892 (-0.02 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45477088 -> 45478172 (0.00 %) bytes
Max Waves: 81294 -> 81277 (-0.02 %)
Instructions: 8657970 -> 8622483 (-0.41 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 780664 -> 782072 (0.18 %)
VGPRS: 573880 -> 573760 (-0.02 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45445244 -> 45448340 (0.01 %) bytes
Max Waves: 81178 -> 81161 (-0.02 %)
Instructions: 8649902 -> 8614918 (-0.40 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
c240c1aecf aco: apply literals to split mads
Removing the return is also needed to apply literals to mads (which can be
done on GFX10).

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 368787 -> 367555 (-0.33 %)
VGPRS: 312436 -> 312448 (0.00 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26113388 -> 26098260 (-0.06 %) bytes
Max Waves: 35982 -> 35982 (0.00 %)
Instructions: 5038670 -> 5028941 (-0.19 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 369843 -> 368659 (-0.32 %)
VGPRS: 317224 -> 317196 (-0.01 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26310540 -> 26295156 (-0.06 %) bytes
Max Waves: 36324 -> 36326 (0.01 %)
Instructions: 5073957 -> 5064164 (-0.19 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
809c8feb92 aco: check if multiplication/clamp is live when applying output modifier
It's possible that a multiplication/clamp is dead code and the single use
is from a different user.

Fixes portal rendering in Path of Exile when global illumination is
enabled.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
ef8abfa790 aco: disable add combining for ds_swizzle_b32
ds_bpermute_b32/ds_permute_b32 are fine, I think

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
69bed1c918 aco: don't DCE atomics with return values
We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
21eafe30df aco: better handle neg/abs of sgprs
isel/label_instruction currently doesn't create these but we should
probably check anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Rhys Perry
f29a5a205c aco: check usesModifiers() when identifying a neg/abs
This was fine because a literal used to mean that it didn't use modifiers,
but now VOP3 can take a literal on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00