Commit graph

58 commits

Author SHA1 Message Date
Timur Kristóf
3a08110d43 amd: Move all amd/common code that depends on LLVM to amd/llvm.
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-08 00:44:08 +00:00
Eric Engestrom
19d9e57f2c amd: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Marek Olšák
73aa04e40d ac: add Wave32 LLVM target machine
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Nicolai Hähnle
cb07f91489 amd/common: move ac_shader_{binary,reloc} into r600 and rename
They are no longer used by radeonsi or radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-04 10:52:26 +00:00
Nicolai Hähnle
b398230e6d amd/common: remove unused ac_compile_module_to_binary
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-04 10:52:26 +00:00
Connor Abbott
3bf8981c51 ac,radeonsi: Always mark buffer stores as inaccessiblememonly
inaccessiblememonly means that it doesn't modify memory accesible via
normal LLVM pointers. This lets LLVM's dead store elimination, memcpy
forwarding, etc. ignore functions with this attribute. We don't
represent descriptors as pointers, so this property is always true of
buffer and image stores. There are plans to represent descriptors via
pointers, but this just means that now nothing is inaccessiblememonly,
as LLVM will then understand loads/stores via its usual alias analysis.

Radeonsi was mistakenly only setting it if the driver could prove that
there were no reads, and then it was cargo-culted into ac_llvm_build
and ac_llvm_to_nir. Rip it out of everything.

statistics with nir enabled:

Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

The only difference was a manhattan31 shader.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-19 14:08:27 +02:00
Nicolai Hähnle
3c958d924a amd/common: add ac_compile_module_to_elf
A new variant of ac_compile_module_to_binary that allows us to
keep the entire ELF around.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Samuel Pitoiset
33f4e04d5a ac,radv: do not emit vec3 for raw load/store on SI
It's unsupported, only load/store format with vec3 are supported.

Fixes: 6970a9a6ca ("ac,radv: remove the vec3 restriction with LLVM 9+")"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-04 08:47:26 +02:00
Marek Olšák
486bc1e17e ac: use amdgpu-flat-work-group-size
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-03 14:32:47 -04:00
Samuel Pitoiset
d7501834cd radv: add a workaround for Monster Hunter World and LLVM 7&8
The load/store optimizer pass doesn't handle WaW hazards correctly
and this is the root cause of the reflection issue with Monster
Hunter World. AFAIK, it's the only game that are affected by this
issue.

This is fixed with LLVM r361008, but we need a workaround for older
LLVM versions unfortunately.

Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-17 11:41:19 +02:00
Samuel Pitoiset
3fbdcd942f amd: remove support for LLVM 6.0
User are encouraged to switch to LLVM 7.0 released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Marek Olšák
a4a104fc81 ac: completely remove +auto-waitcnt-before-barrier
it causes corruption on several different GPU generations.

Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-21 20:50:37 -04:00
Samuel Pitoiset
71d5b2fbf8 radv: disable the auto-waitcnt-before-barrier LLVM option
This option allows us to remove additional s_waitcnt instructions
because s_barrier internally does s_waitcnt 0.

Though, apparently there is a problem with LDS accesses that
causes rendering issues with FFXV and DXVK. Disable this
optimization for now (RadeonSI still uses it).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460
CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-15 16:21:50 +02:00
Marek Olšák
cb6b241c30 ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)
To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
finish sooner on the older CPUs. (otherwise it gets killed and we fail
the test)

Acked-by: Dave Airlie <airlied@gmail.com>
2018-08-01 15:25:18 -04:00
Tom Stellard
0866edede0 radeonsi: Add debug option to enable LLVM GlobalISel (v2)
R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than
SelectionDAG for instruction selection.

v2: mareko: move the helper to src/amd/common

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <tstellar@redhat.com>
2018-07-23 20:23:48 -04:00
Marek Olšák
9b82d128c9 ac: run LLVM optimization passes only on the final function after inlining 2018-07-19 00:58:49 -04:00
Marek Olšák
0075e5fed8 ac: add reusable helpers for direct LLVM compilation
This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked.

struct ac_compiler_passes (opaque type) contains the main pass manager.

ac_create_llvm_passes -- the result can go to thread local storage
ac_destroy_llvm_passes -- can be called by a destructor in TLS
ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary

The motivation is to do the expensive call addPassesToEmitFile once
per context or thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-04 15:48:18 -04:00
Dave Airlie
5b32b246cf ac: make some fns static
Some of the compiler functions are no longer called outside
the util file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 10:29:26 +10:00
Dave Airlie
7398913a62 ac/radv: move llvm compiler info to struct and init in one place
This ports radv to the shared code, however due to a bug in LLVM
version prior to 7, radv cannot add target info at this stage,
as it would leak one for every shader compile, however I'd prefer
to keep this llvm damage in the shared code, since it isn't the
driver at fault here. We just add a flag to denote if the driver
can support leaking the target info or not, and the common code
does the right thing depending on the llvm version.

 Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 10:29:16 +10:00
Dave Airlie
d853d3a59b ac/radeonsi: port compiler init/destroy out of radeonsi.
We want to share this code with radv in the future, so port
it out of radeonsi.

Add a return value as radv will want that to know if this
succeeds

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 10:29:03 +10:00
Dave Airlie
35c82af539 radv/radeonsi: add a check ir tm options
This doesn't do much yet, but it makes it easier to move the code
to a common shared code base.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:32:35 +10:00
Dave Airlie
0eb65b4944 radeonsi: rename si_compiler -> ac_llvm_compiler
As precursor to moving init to common code, just rename the struct
and move it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:32 +10:00
Dave Airlie
887ba45c93 ac: add target library info helpers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:29 +10:00
Dave Airlie
584ad1eda9 ac/radeonsi: refactor out pass manager init to common code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:18:01 +10:00
Dave Airlie
473be16c74 ac/radv: split the non-common init_once code from the common target code. (v2)
This just splits out the non-shared code and reuses ac_get_llvm_target in radv.

v2: rebase on Marek's patch - fixup brace position/whitespace

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:15:23 +10:00
Marek Olšák
32e413ca59 ac: move all LLVM module initialization into ac_create_module
This removes some ugly code around module initialization.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-02 14:34:39 -04:00
Marek Olšák
43f0a10051 radeonsi: add triple into si_compiler
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
2c3f3651c4 radeonsi: fix passing address32_hi to LLVM for high values
The old function treats high values as negative, which LLVM interprets as 0.
2018-03-07 13:55:49 -05:00
Samuel Pitoiset
3b8e7459f2 ac: add ac_count_scratch_private_memory()
Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:38:38 +01:00
Marek Olšák
3bf1e036e8 amd: remove support for LLVM 3.9
Only these are supported:
- LLVM 4.0
- LLVM 5.0
- LLVM 6.0
- master (7.0)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 23:47:40 +01:00
Marek Olšák
0d62370bbb ac: don't use byval LLVM qualifier in shaders
shader-db doesn't show any regression and 32-bit pointers with byval
are declared as VGPRs for some reason.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Samuel Pitoiset
2091206ad3 ac: import lp_create_builder() from gallivm
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-16 21:39:53 +01:00
Samuel Pitoiset
7239e265eb amd/common: import get_{load,store}_intr_attribs() from RadeonSI
v2: move those helpers to the header and use static inline

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
2018-01-10 19:02:23 +01:00
Marek Olšák
cde664ab81 radeonsi: use ac_create_target_machine
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:38 +01:00
Marek Olšák
81f81fdb54 radeonsi: use ac_get_llvm_processor_name
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-07 17:58:36 +01:00
Marek Olšák
4560f2b90a radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Dave Airlie
09d7c7be4f radv: enable sisched toggle in perftest flags.
RADV_PERFTEST=sisched

to enable it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:07:49 +01:00
Dave Airlie
9d9f051390 ac/radv: change api to create target machine
This just modifies the API to make it easier to add other flags
to target machine creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:05:59 +01:00
Dave Airlie
68c812f699 ac: add new helper function to add a integer target dependent function attr.
This is needed to add the max workgroup size attribute.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-05 01:37:29 +01:00
Dave Airlie
e2659176ce radeonsi/ac: move vertex export remove to common code.
This code can be shared by radv, we bump the max to
VARYING_SLOT_MAX here, but that shouldn't have too
much fallout.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:17:47 +01:00
Emil Velikov
95ab07c586 ac: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:21 +00:00
Marek Olšák
ef883fc554 gallivm,ac: add LP_FUNC_ATTR_CONVERGENT
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Marek Olšák
b5744310d4 gallivm, ac: add writeonly and inaccessiblememonly attributes
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Tobias Klausmann
6d600cf632 amd/common: Fix build with new ac_add_function_attr()
Fix usage of ac_add_function_attr() and make it known!

common/ac_nir_to_llvm.c: In function 'create_llvm_function':
common/ac_nir_to_llvm.c:265:4: error: implicit declaration of function
'ac_add_function_attr' [-Werror=implicit-function-declaration]
    ac_add_function_attr(main_function, i + 1, AC_FUNC_ATTR_BYVAL);
    ^~~~~~~~~~~~~~~~~~~~

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-01 23:53:38 +01:00
Marek Olšák
940da36a65 gallivm,ac: add function attributes at call sites instead of declarations
They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic.
We need this to force readnone or inaccessiblememonly on some amdgcn
intrinsics.

This is only used with LLVM 4.0 and later. Intrinsics only used with
LLVM <= 3.9 don't need the LEGACY flag.

gallivm and ac code is in the same patch, because splitting would be
more complicated with all the LEGACY uses all over the place.

v2: don't change the prototype of lp_add_function_attr.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)
2017-03-01 18:59:36 +01:00
Marek Olšák
408f370710 gallivm,ac: remove unused FUNC_ATTR_LAST enums
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-03-01 18:59:36 +01:00
Dave Airlie
13a28ff236 radeon/ac: move common llvm build functions to a separate file.
Suggested by Marek.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-07 05:46:35 +10:00
Dave Airlie
c9a2fc3679 radeonsi/ac: move most of emit_ddxy to shared code.
We can reuse this in radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
c5f0a56aeb radeonsi/ac: move get thread id to shared code.
radv will use this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
b3c28942c7 radeonsi/ac: move tbuffer store and buffer load to shared code.
These are all reuseable by radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00