Commit graph

14 commits

Author SHA1 Message Date
Samuel Pitoiset
9e2fde84fc aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7
According to the different ISA docs (and to LLVM), this bit seems
to only exists on GFX6-GFX7.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Rhys Perry
20d869079d aco: allow input modifiers on v_cndmask_b32
Totals from affected shaders:
SGPRS: 594099 -> 594019 (-0.01 %)
VGPRS: 441016 -> 441124 (0.02 %)
Spilled SGPRs: 101 -> 101 (0.00 %)
Spilled VGPRs: 18 -> 18 (0.00 %)
Code Size: 30266652 -> 30125256 (-0.47 %) bytes
Max Waves: 67044 -> 67057 (0.02 %)
Instructions: 5753097 -> 5726607 (-0.46 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Rhys Perry
69bed1c918 aco: don't DCE atomics with return values
We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
2020-01-13 13:26:43 +00:00
Daniel Schürmann
6a586a6006 aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0d42e4d7a0 aco: Initial GFX7 Support
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Rhys Perry
389ee819c0 aco: improve FLAT/GLOBAL scheduling
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:02 +00:00
Rhys Perry
f1381e6715 aco: fix GFX10 opcodes for some global/flat atomics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-29 17:46:01 +00:00
Timur Kristóf
d59f702e26 aco: Implement subgroup shuffle in GFX10 wave64 mode.
Previously subgroup shuffle was implemented using the bpermute
instruction, which only works accross half-waves, so by itself it's
not suitable for implementing subgroup shuffle when the shader is
running in wave64 mode.

This commit adds a trick using shared VGPRs that allows to implement
subgroup shuffle still relatively effectively in this mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
2ea9e59e8d aco: move s_andn2_b64 instructions out of the p_discard_if
And use a new p_discard_early_exit instruction. This fixes some cases
where a definition having the same register as an operand causes issues.

v2: rename instruction to p_exit_early_if
v2: modify the existing instruction instead of creating a new one
v3: merge the "i == num - 1" IFs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 16:19:02 +00:00
Rhys Perry
db2ca45102 aco: check for duplicate opcode numbers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
101f47fdd7 aco: fix opcode for s_mul_hi_i32
Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
2faaf04c62 aco: fix v_subrev_co_u32_e64 opcode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
00aa413bae aco: fix GFX9 opcode for v_xad_u32
Fixes various dEQP-VK.image.store.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Daniel Schürmann
93c8ebfa78 aco: Initial commit of independent AMD compiler
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.

ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.

Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00