Commit graph

2571 commits

Author SHA1 Message Date
Rhys Perry
607fb4153d aco: move call to store_output_to_temps in store_ls_or_es_output earlier
Skips get_intrinsic_io_basic_offset()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
b497b774a5 aco: remove copy in load_input_from_temps()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
2dc550202e aco: copy-propagate p_create_vector copies of vectors
Instead of copying the operands of the other p_create_vector and labelling
the definition with label_vec, copy the operands and label it with
label_temp so that it can be copy-propagated.

This was found while removing a redundant copy in load_input_from_temps()
which removed duplicate p_create_vector instructions.

shader-db (Navi):
Totals from 139 (0.11% of 127638) affected shaders:
VGPRs: 8472 -> 7948 (-6.19%)
CodeSize: 514592 -> 512368 (-0.43%)
MaxWaves: 1089 -> 1195 (+9.73%)
Instrs: 100214 -> 99658 (-0.55%)
Cycles: 400856 -> 398632 (-0.55%)
VMEM: 15545 -> 15338 (-1.33%)
Copies: 5140 -> 4584 (-10.82%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
2020-04-23 12:39:33 +00:00
Rhys Perry
e4383b5c7f aco: decrease the uses of other copy operations after splitting/removing
For copies like v[7:8] = v[8:9], what currently happens is:
- do_copy() will skip the second dword
- the uses of the second dword will be reduced to 0
- the copy operation will be removed from the map
and v8 will never be set to v9.

So just decrease the uses of other operations after splitting or removing
the current operation, so: "v8 = v9" will be split off, it's uses reduced
and then the new copy will be done in the next iteration.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4686>
2020-04-23 11:39:23 +00:00
Daniel Schürmann
36e0d2f39b aco: coalesce v_mad's accumulator with definition's affinities
Totals from affected shaders:
Code Size: 8922676 -> 8915192 (-0.08 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d000d76f13 aco: use upper part of gap in register file if it is beneficial for striding
Totals from affected shaders:
SGPRS: 1717288 -> 1716984 (-0.02 %)
VGPRS: 1305924 -> 1304904 (-0.08 %)
Code Size: 138508892 -> 138420144 (-0.06 %) bytes
Max Waves: 115726 -> 115735 (0.01 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d666d83be2 aco: try to always find a register with stride for even sizes
Totals from affected shaders:
SGPRS: 1162400 -> 1162400 (0.00 %)
VGPRS: 947364 -> 946960 (-0.04 %)
Code Size: 98399300 -> 98399004 (-0.00 %) bytes
Max Waves: 74665 -> 74682 (0.02 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5a3c1f4f0b aco: stop get_reg_simple after reaching max_used_gpr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
2796cb4c24 aco: refactor get_reg_simple() to return early on exact matches
in the best fit algorithm

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
6792e134f3 aco: don't create vector affinities for operands which are not killed or are duplicates
Totals from affected shaders:
SGPRS: 825184 -> 825184 (0.00 %)
VGPRS: 697640 -> 697240 (-0.06 %)
Code Size: 79244104 -> 79201072 (-0.05 %) bytes
Max Waves: 42388 -> 42386 (-0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
edc2b57ac1 aco: allocate full register for subdword definitions if HW doesn't support it
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
97a870cf88 aco: move attempt to find strided register into get_reg_simple()
This simplifies code and helps some shaders

Totals from affected shaders:
Code Size: 51227172 -> 51202216 (-0.05 %) bytes
Max Waves: 19955 -> 19948 (-0.04 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
c7f97f110c aco: use DefInfo in more places to simplify RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
734f86db6b aco: create and use DefInfo struct in RA
for maintaining all information necessary to find a register.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
5b2f628da3 aco: create pseudo dummy instruction in RA to be used for live-range splits
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
d9f7d1d5cb aco: refactor get_reg() to also handle affinities
This simplifies definition handling and
helps a few shaders

Totals from affected shaders:
Code Size: 659540 -> 659376 (-0.02 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:23 +00:00
Daniel Schürmann
7c8f4ebca9 aco: refactor get_reg() to take Temp instead of RegClass
This patch also moves get_reg_specified() and
get_reg_vec() before get_reg() to make use of it later.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Daniel Schürmann
0a9ed98178 aco: simplify operand handling in RA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>
2020-04-22 18:23:22 +00:00
Rhys Perry
f13049f48a aco: implement 64-bit sgpr swaps
In our pipeline-db, helps almost exclusively Detroit: Become Human.

Totals from 6726 (5.36% of 125503) affected shaders:
CodeSize: 74680952 -> 74102228 (-0.77%)
Instrs: 14551507 -> 14406001 (-1.00%)
Cycles: 1748272436 -> 1690173104 (-3.32%)
VMEM: 964671 -> 964058 (-0.06%)
Copies: 1993312 -> 1847806 (-7.30%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
2ab45f41e0 aco: implement sub-dword swaps
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
83fdb1ed3d aco: add VOP3P_instruction
The optimizer isn't yet updated to handle this, since lower_to_hw_instr
will be the only user for now.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Rhys Perry
8fc24f9a45 aco: fix copy statistic for 64-bit vgpr constant copy
The statistic is in units of instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>
2020-04-22 13:25:17 +00:00
Daniel Schürmann
c3c1f4d6bc aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel
Is simpler and helps a couple of shaders.
Totals from affected shaders: (Vega)
Code Size: 16341296 -> 16335460 (-0.04 %) bytes

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Daniel Schürmann
be0bb7e101 aco: fix 64bit fsub
Fixes: 425558bfd5 ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>
2020-04-20 15:12:50 +00:00
Daniel Schürmann
425558bfd5 aco: use v_subrev_f32 for fsub with an sgpr operand in src1
This fixes an accidentally introduced regression.

Fixes: 9be4be515f ('aco: implement 16-bit nir_op_fsub/nir_op_fadd')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633>
2020-04-19 16:16:27 +00:00
Samuel Pitoiset
c4ca9e66dd aco: fix exporting the viewport index if the fragment shader needs it
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.

Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
2020-04-17 16:23:24 +00:00
Rhys Perry
839c886b34 aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y
The ISA doc is inconsistent whether this instruction writes SCC. It does.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
2020-04-16 17:04:53 +01:00
Rhys Perry
ac74367bef aco: implement various 8/16-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>
2020-04-16 17:04:45 +01:00
Samuel Pitoiset
11faaf646d aco: fix emitting stream output with tess eval shaders
Fixes dEQP-VK.transform_feedback.simple.winding_patch_list_12.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4553>
2020-04-16 07:57:39 +00:00
Samuel Pitoiset
91aa596ca7 aco: implement nir_op_f2i8/nir_op_f2u8
I think we should really refactor the conversions path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4551>
2020-04-16 08:47:49 +02:00
Rhys Perry
c818b5c089 aco: fix 1D textureGrad() on GFX9
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f718edced ('aco: simplify gathering of MIMG address components')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550>
2020-04-15 10:45:07 +00:00
Samuel Pitoiset
08a396033b aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
v_frexp_exp_i16_f16 returns the two's complement for negative
exponents. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 for the exponent (-1 in decimal).

RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in long term but for now RA doesn't support changing non-SDWA
instructions to SDWA if useful/needed.

Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546>
2020-04-15 10:12:44 +02:00
Rhys Perry
fbd2be3f5d aco: clear moved operands in get_reg_create_vector()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Rhys Perry
52cc1f8237 aco: improve p_create_vector RA for sub-dword operands
These's still improvements needed for sub-dword definitions, but that's
not as simple.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Rhys Perry
e18711cda3 aco: fix p_extract_vector validation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Rhys Perry
41ac44e1b3 aco: improve vector optimization with sub-dword vectors
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>
2020-04-14 10:49:12 +00:00
Daniel Schürmann
28d36d26c2 aco: fix p_extract_vector optimization in presence of unequally sized vector operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4506>
2020-04-13 16:35:40 +00:00
Samuel Pitoiset
fc1068de0d aco: fix nir_op_pack_32_2x16_split if one operand is a constant
Because 16-bit constants are represented with the s1 RegClass, we
have to extract the low half.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509>
2020-04-13 11:51:17 +00:00
Samuel Pitoiset
4cfaef68d7 aco: implement 16-bit nir_op_f2i64/nir_op_f2u64
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509>
2020-04-13 11:51:17 +00:00
Samuel Pitoiset
729bdc0d70 aco: fix f2i64/f2u64 with sgprs if the exponent computation overflow
This fixes f16->{i64,u64} conversions for +0/-0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509>
2020-04-13 11:51:17 +00:00
Daniel Schürmann
38622de2ec aco: make some reg_file helpers private and fix their uses
Fixes various subdword RA issues

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Daniel Schürmann
331794495e aco: rename aco_lower_bool_phis() -> aco_lower_phis()
We also lower subdword phis, now.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Daniel Schürmann
1d41521b16 aco: lower subdword phis with SGPR operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Daniel Schürmann
a39df3bfce aco: don't constant-propagate into subdword PSEUDO instructions
PSEUDO instructions are lowered using SDWA, and thus,
cannot take literals and before GFX9 cannot take constants
at all. As the in-register representation differs between
32bit and 16bit floats, we first need to ensure correct
behavior.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Daniel Schürmann
1de18708cb aco: ensure correct bit representation of subdword constants
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Daniel Schürmann
637f45f390 aco: setup subdword regclasses for ssa_undef & load_const
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>
2020-04-10 07:19:27 +00:00
Samuel Pitoiset
67b567d0d0 aco: implement nir_op_b2f16/nir_op_i2f16/nir_op_u2f16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452>
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
3119f978e5 aco: implement 16-bit comparisons
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452>
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
ccf8e23f59 aco: implement 16-bit nir_op_fmax3/nir_op_fmin3/nir_op_fmed3
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452>
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
981ced07a5 aco: implement 16-bit nir_op_ldexp
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452>
2020-04-10 08:05:05 +02:00