Commit graph

2571 commits

Author SHA1 Message Date
Michael Schellenberger Costa
a607ea51a7 aco: Cleanup insert_before_logical_end
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-09 17:50:23 +02:00
Timur Kristóf
3a08110d43 amd: Move all amd/common code that depends on LLVM to amd/llvm.
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-08 00:44:08 +00:00
Rhys Perry
77ebb030ed aco: fix load_constant with multiple arrays
I thought I fixed this, but I guess I must have broken it again.

Fixes various dEQP-VK.draw.* tests

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-04 22:43:11 +01:00
Rhys Perry
a87b0f5141 radv/aco,aco: set lower_fmod
This simplifies ACO and allows the lowered code to be optimized (in
particular, constant folded).

Totals from affected shaders:
SGPRS: 1776 -> 1776 (0.00 %)
VGPRS: 1436 -> 1436 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 203452 -> 203564 (0.06 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 103 -> 103 (0.00 %)

At least some of the code size increase seems to be from literals being
applied to instructions as a result of constant folding.

v2: remove fmod/frem handling in init_context()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-04 14:00:46 +00:00
Daniel Schürmann
1d29895e5b aco: call nir_opt_algebraic_late() exhaustively
57559 shaders in 28980 tests
Totals:
SGPRS: 2963407 -> 2959935 (-0.12 %)
VGPRS: 2014812 -> 2016328 (0.08 %)
Spilled SGPRs: 1077 -> 1077 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 10348 -> 10348 (0.00 %) dwords per thread
Code Size: 114545436 -> 114498084 (-0.04 %) bytes
LDS: 933 -> 933 (0.00 %) blocks
Max Waves: 375997 -> 375866 (-0.03 %)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-30 09:44:10 +00:00
Mauro Rossi
c24ad565ae android: aco: fix undefined template 'std::__1::array' build errors
Fixes a few building errors similar to the following:

In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26:
In file included from external/libcxx/include/algorithm:639:
external/libcxx/include/utility:321:9:
error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>'
    _T2 second;
        ^

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-09-28 15:56:23 +02:00
Rhys Perry
1f2813e103 aco: don't remove the loop exec mask in transition_to_Exact()
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00
Rhys Perry
b711e62e61 aco: set loop_info::has_discard for demotes
We need the loop header phis for the outer exec masks. Needed for
dEQP-VK.glsl.demote.dynamic_loop_texture

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00
Rhys Perry
06ea3325c3 aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask
v2: rename pass_temp to pass_flags
v2: also CSE reductions
v3: add ds_swizzle_b32 support
v3: check gds/offset0/offset1 fields

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-26 13:19:51 +01:00
Rhys Perry
86ecf92c23 aco: don't CSE v_readlane_b32/v_readfirstlane_b32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-26 13:19:51 +01:00
Rhys Perry
3c966fd688 aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:47 +01:00
Rhys Perry
15ea1c5cff aco: store printed backend IR in binary
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:31 +01:00
Rhys Perry
6613b81327 aco,radv/aco: get dissassembly for release builds if requested
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:09 +01:00
Rhys Perry
db2ca45102 aco: check for duplicate opcode numbers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
101f47fdd7 aco: fix opcode for s_mul_hi_i32
Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
2faaf04c62 aco: fix v_subrev_co_u32_e64 opcode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
00aa413bae aco: fix GFX9 opcode for v_xad_u32
Fixes various dEQP-VK.image.store.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
b125dc4839 aco: implement 64-bit ineg
We currently lower them, but nir_opt_algebraic() can add new ones because
lower_sub=true.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-25 15:27:48 +00:00
Rhys Perry
641eac953c aco: run nir_lower_int64() before nir_lower_idiv()
nir_lower_idiv() asserts on 64-bit integers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-25 15:27:48 +00:00
Daniel Schürmann
2c050b49b3 aco: only emit waitcnt on loop continues if we there was some load or export
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-09-23 13:39:33 +02:00
Daniel Schürmann
93c8ebfa78 aco: Initial commit of independent AMD compiler
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.

ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.

Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00