Commit graph

371 commits

Author SHA1 Message Date
Rhys Perry
b088a4b113 aco: only reserve sgprs for vcc if it's used
pipeline-db (Vega):

Totals:
SGPRS: 5186302 -> 5075616 (-2.13 %)
VGPRS: 3704580 -> 3704580 (0.00 %)
Spilled SGPRs: 144859 -> 144859 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 4124 -> 4124 (0.00 %) dwords per thread
Code Size: 247315944 -> 247315944 (0.00 %) bytes
LDS: 1311 -> 1311 (0.00 %) blocks
Max Waves: 674560 -> 674562 (0.00 %)

Totals from affected shaders:
SGPRS: 536992 -> 426306 (-20.61 %)
VGPRS: 356404 -> 356404 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 8498748 -> 8498748 (0.00 %) bytes
LDS: 8 -> 8 (0.00 %) blocks
Max Waves: 113832 -> 113834 (0.00 %)

There are some small code size changes in a few RotTR shaders and a small
increase in max_waves in two Detroit: Become Human shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3906>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3906>
2020-03-05 20:18:34 +00:00
Rhys Perry
c6e0c062da aco: improve control flow handling in GFX6-9 NOP pass
Fixes Detroit: Become Human hang. Also affects World of Warships.

pipeline-db (Tahiti):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Polaris):
Totals from affected shaders:
SGPRS: 17168 -> 17168 (0.00 %)
VGPRS: 11296 -> 11296 (0.00 %)
Spilled SGPRs: 1870 -> 1870 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1472628 -> 1473292 (0.05 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 628 -> 628 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 17168 -> 17168 (0.00 %)
VGPRS: 11296 -> 11296 (0.00 %)
Spilled SGPRs: 1870 -> 1870 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1409716 -> 1410380 (0.05 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

Max Waves is lower than it should be because of a null winsys bug.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>
2020-03-05 19:37:24 +00:00
Rhys Perry
47b7f104a0 aco: consider non-hazard writes in handle_raw_hazard_internal
I think this helps GFX6 in particular because code like this is common:
s_add_i32       s4, 0x60, s3
s_mov_b32       s5, 0
s_load_dwordx4  s[4:7], s[4:5], 0x0
s_buffer_load_dword s4, s[4:7], 0xcc

pipeline-db (Tahiti):
Totals from affected shaders:
SGPRS: 1923878 -> 1923878 (0.00 %)
VGPRS: 1528964 -> 1528964 (0.00 %)
Spilled SGPRs: 476 -> 476 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 88723604 -> 88528880 (-0.22 %) bytes
LDS: 241 -> 241 (0.00 %) blocks
Max Waves: 145402 -> 145402 (0.00 %)

pipeline-db (Polaris):
Totals from affected shaders:
SGPRS: 428128 -> 428128 (0.00 %)
VGPRS: 353092 -> 353092 (0.00 %)
Spilled SGPRs: 119251 -> 119251 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 57580468 -> 57563964 (-0.03 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 11631 -> 11631 (0.00 %)

piepline-db (Vega):
Totals from affected shaders:
SGPRS: 425016 -> 425016 (0.00 %)
VGPRS: 349588 -> 349588 (0.00 %)
Spilled SGPRs: 117835 -> 117835 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 54890792 -> 54874432 (-0.03 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 54 -> 54 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>
2020-03-05 19:37:24 +00:00
Rhys Perry
38743577f8 aco: improve get_wait_states()
pipeline-db (Tahiti):
Totals from affected shaders:
SGPRS: 21208 -> 21208 (0.00 %)
VGPRS: 22388 -> 22388 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3278596 -> 3277004 (-0.05 %) bytes
LDS: 19 -> 19 (0.00 %) blocks
Max Waves: 238 -> 238 (0.00 %)

pipeline-db (Polaris):
Totals from affected shaders:
SGPRS: 64 -> 64 (0.00 %)
VGPRS: 96 -> 96 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5200 -> 5192 (-0.15 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 10 -> 10 (0.00 %)

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>
2020-03-05 19:37:24 +00:00
Rhys Perry
7f1b537304 aco: add new NOP insertion pass for GFX6-9
This new pass is more similar to the GFX10 pass and should be able to
handle control flow better.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>
2020-03-05 19:37:24 +00:00
Samuel Pitoiset
7618fe1b48 aco: fix image load/store with lod and 1D images
Make sure to add the lod value if non-null as the 2nd operand.

Fixes dEQP-VK.image.load_store_lod.with_format.1d.* on all gens
except GFX9.

Fixes: 4d49a7ac73 ("aco: handle nir_intrinsic_image_deref_{load,store} with lod")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4060>
2020-03-05 14:29:27 +01:00
Rhys Perry
2d1ba86382 aco: handle v_add_co_u32_e64 in parse_base_offset()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>
2020-03-03 18:31:06 +00:00
Rhys Perry
215df21dea aco: fix carry-out size for wave32 v_add_co_u32_e64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: e0bcefc3a0 ('aco/wave32: Use lane mask regclass for exec/vcc.')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>
2020-03-03 18:31:06 +00:00
Rhys Perry
9fea90ad51 aco: keep track of which events are used in a barrier
And properly handle unordered events so that they always wait for 0.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3774>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3774>
2020-03-03 15:38:13 +00:00
Albert Astals Cid
760fe44e8c aco: pass vars by const &
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3935>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3935>
2020-03-02 13:18:49 +00:00
Albert Astals Cid
2521c81c9e aco: Minor optimization in spill_ctx constructor
'register_demand' is passed by value and only copied once; consider moving it to avoid unnecessary copies

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3968>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3968>
2020-03-02 12:21:03 +00:00
Eric Anholt
b9773631d3 aco: Fix signed-vs-unsigned warning.
The previous instance of this comparision was 1u to avoid the warning, fix
this one too.

Fixes: dba71de5c6 ("aco: only create parallelcopy to restore exec at loop exit if needed")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3607>
2020-02-27 21:59:31 -08:00
Rhys Perry
8291d728dc aco: improve GFX9 1D ddx/ddy assertion
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2547
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3890>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3890>
2020-02-20 15:41:26 +00:00
Rhys Perry
fe5c5507bd aco: add some helpers for filling/testing register ranges
We do this a lot

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3768>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3768>
2020-02-19 12:23:50 +00:00
Rhys Perry
43497e30e2 aco: add RegisterFile
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3768>
2020-02-19 12:23:50 +00:00
Rhys Perry
483d4ec57c aco: improve SCC handling in some SALU combines
Add some checks and remove some unnecessary checks.

Found by observation. No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>
2020-02-12 19:18:45 +00:00
Rhys Perry
d45e9451cf aco: disable some instruction combining if it could change an exec operand
Found by observation. No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>
2020-02-12 19:18:40 +00:00
Samuel Pitoiset
ddd767387f aco: fix creating v_madak if v_mad_f32 has two sgpr literals
Do not ignore that src1 can be a sgpr.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2435
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>
2020-02-11 07:17:31 +00:00
Samuel Pitoiset
34fd894e42 aco: fix waiting for scalar stores before "writing back" data on GFX8-GFX9
Seems required also on GFX8-GFX9 to achieve correct behaviour. This
is an undocumented behaviour but it makes real sense to me.

pipeline-db on GFX9:
Totals from affected shaders:
SGPRS: 1018 -> 1018 (0.00 %)
VGPRS: 516 -> 516 (0.00 %)
Code Size: 40516 -> 40636 (0.30 %) bytes
Max Waves: 280 -> 280 (0.00 %)

This fixes some sort of sun flickering with Assassins Creed Origins.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2488
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3750>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3750>
2020-02-10 12:07:25 +00:00
Samuel Pitoiset
4b978cd950 aco: do not use ds_{read,write}2 on GFX6
According to LLVM, these instructions have a bounds checking bug.
LLVM only uses them on GFX7+.

This fixes broken geometry in Assassins Creed Origins.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2489
Fixes: 4a553212fa ("radv: enable ACO support for GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3746>
2020-02-07 14:17:06 +01:00
Rhys Perry
ce23911b77 aco: gfx10_wave64_bpermute reduce op to print_ir
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3683>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3683>
2020-02-06 16:43:03 +00:00
Rhys Perry
20eb1acb6f aco: fix gfx10_wave64_bpermute
Since 9254fb4fc7, the pass replaced the SCC clobber with the scalar
identity temporary. Just skip most of the temporary setup, since we don't
need it for gfx10_wave64_bpermute.

Although shuffles are disabled on GFX10, Detroit: Become Human seems to
use them anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 9254fb4fc7 ('aco: don't use a scalar
       temporary for reductions on GFX10')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3683>
2020-02-06 16:43:03 +00:00
Timur Kristóf
4d34abd15c aco/optimizer: Don't combine uniform bool s_and to s_andn2.
Fixes: 8a32f57fff

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>
2020-02-05 22:53:45 +00:00
Daniel Schürmann
3b323d6601 aco: fix image_atomic_cmp_swap
Fixes: 71440ba0f5 ('aco: reorder VMEM operands in ACO IR')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3652>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3652>
2020-01-31 16:51:46 +00:00
Samuel Pitoiset
0d14f41625 aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6
When some unused channels are skipped and that we expand vec3 loads
to vec4 loads, we have to adjust the fourth component.

While we are at it, add an assertion to make sure we don't use
MUBUF for vec3 loads on GFX6.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2450
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2442
Fixes: 6aecc316 ("aco: fix VS input loads with MUBUF on GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3641>
2020-01-31 13:48:56 +01:00
Timur Kristóf
e73f604b21 aco: Fix the meaning of is_atomic.
Previously, is_atomic really meant "is not atomic", contrary to its name.
This commit fixes it to mean what one would think it means.

Fixes: 69bed1c918

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>
2020-01-29 20:32:31 +00:00
Daniel Schürmann
6f718edced aco: simplify gathering of MIMG address components
This patch has a slight effect on pipelinedb:
Totals from affected shaders:
SGPRS: 23616 -> 21504 (-8.94 %)
VGPRS: 15088 -> 14444 (-4.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 662660 -> 664600 (0.29 %) bytes
LDS: 49 -> 49 (0.00 %) blocks
Max Waves: 3079 -> 3204 (4.06 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
901f06e9ad aco: simplify adjust_sample_index_using_fmask() & get_image_coords()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
99d032f3cd aco: fix register allocation with multiple live-range splits
This patch fixes register allocation if multiple live-range splits
occur to the same variable within one instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
71440ba0f5 aco: reorder VMEM operands in ACO IR
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Rhys Perry
db19e96c8c aco: fix exec mask consistency issues
There seems to be more, these are just the ones found in
Detroit: Become Human shaders.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
c7d0514168 aco: parallelcopy exec mask before s_wqm
It can be used later and we want any uses to not be fixed to exec, so it's
definition can't be fixed to exec because of how exec masks interact with
register demand calculation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
517fc3abc4 aco: fill reg_demand with sensible information in add_coupling_code()
process_block() will use this to determine the register demand of the
before the current instruction. Previously, it was filled with zeroes
which could result in process_block() only using the register demand
of after the current instruction.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
26d2511bcb aco: improve assertion at the end of spiller
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
5ea23ba659 aco: set exec_potentially_empty after continues/breaks in nested IFs
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
4e83e05e62 aco: error when block has no logical preds but VGPRs are live at the start
This would have caught the liveness error fixed in the previous commit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
d282a292ec aco: don't always add logical edges from continue_break blocks to headers
Otherwise, code like this will be broken:
loop {
   if (...) {
      break;
   } else {
      break;
   }
}
The continue_or_break block doesn't have any logical predecessors but it's
a logical predecessor of the header block. This liveness error breaks the
spiller in init_live_in_vars() (under "keep variables spilled on all
incoming paths") and eventually creates garbage reloads.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
dba71de5c6 aco: only create parallelcopy to restore exec at loop exit if needed
The operand isn't fixed to exec, which can mess up the spiller. This also
adds a new situation where a phi is needed.

Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion
when compiling a Detroit: Become Human shader.

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
4537b97410 aco: don't update demand in add_coupling_code() for loop headers
We don't need to update it since it won't be used later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
521525fc0a aco: don't consider loop header blocks branch blocks in add_coupling_code
Loops without continues create header blocks with only 1 predecessor.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Rhys Perry
590c26beab aco: fix target calculation when vgpr spilling introduces sgpr spilling
A shader might require vgpr spilling but not require sgpr spilling. In
that case, the spiller lowers the sgpr target by 5 which could mean sgpr
spilling is then required. Then the vgpr target has to be lowered to make
space for the linear vgprs. Previously, space wasn't make for the linear
vgprs.

Found while testing the spiller on the pipeline-db with a lowered limit

Fixes: a7ff1bb5b9
   ('aco: simplify calculation of target register pressure when spilling')

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
2020-01-29 18:02:27 +00:00
Samuel Pitoiset
6aecc316c0 aco: fix VS input loads with MUBUF on GFX6
Only MTBUF supports vec3.

Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>
2020-01-29 13:58:37 +00:00
Rhys Perry
404818dd28 aco: run p_wqm instructions in WQM
If the p_wqm ends up creating copies, these need to be in WQM. Helps (but
doesn't completely fix) artifacts in Strange Brigade. The actual issue
still exists and is harder to fix.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Rhys Perry
2d7386a2d0 aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in WQM
We want any copies to be in WQM. I don't know if this fixes any real
application, but I can create a vkrunner test than reproduces the issue.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>
2020-01-29 13:23:03 +00:00
Samuel Pitoiset
3922d95b51 aco: implement VK_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Daniel Schürmann
396be00640 aco: fix combine_salu_not_bitwise() when SCC is used
Previously, we didn't use the SCC bit, and thus, we didn't care about it.
With 'aco: Transform uniform bitwise instructions to 32-bit if possible.'
that changed, so that we have to handle it.

Fixes: 8a32f57fff ('aco: Transform uniform bitwise instructions to 32-bit if possible.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>
2020-01-28 18:14:02 +01:00
Rhys Perry
7edcf4a59d aco: fix rebase error from GS copy shader support
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8f7712666 ('aco: implement GS copy shaders')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>
2020-01-28 13:50:53 +00:00
Rhys Perry
03a0d39366 aco: use MUBUF in some situations instead of splitting vertex fetches
Fixes most of the regressions from splitting vertex fetches in an earlier
commit.

pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 562696 -> 558344 (-0.77 %)
VGPRS: 395596 -> 393752 (-0.47 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11600912 -> 11311804 (-2.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 101839 -> 102372 (0.52 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:44:52 +00:00
Rhys Perry
21d2799cee aco: value-number MUBUF instructions
We will have to do this when we start creating MUBUF instructions for
load_input because NIR might not be able to tell they are identical since
it doesn't know whether two vertex attributes have the same offset.

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:22 +00:00
Rhys Perry
d39f5519a1 aco: handle unaligned vertex fetch on GFX10
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 0 -> 0 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 0 -> 0 (0.00 %)

pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 795000 -> 802368 (0.93 %)
VGPRS: 579632 -> 581280 (0.28 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 17208408 -> 17583652 (2.18 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 145731 -> 145279 (-0.31 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:40:10 +00:00