Commit graph

4814 commits

Author SHA1 Message Date
Mauro Rossi
c24ad565ae android: aco: fix undefined template 'std::__1::array' build errors
Fixes a few building errors similar to the following:

In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26:
In file included from external/libcxx/include/algorithm:639:
external/libcxx/include/utility:321:9:
error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>'
    _T2 second;
        ^

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-09-28 15:56:23 +02:00
Rhys Perry
1f2813e103 aco: don't remove the loop exec mask in transition_to_Exact()
No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00
Rhys Perry
b711e62e61 aco: set loop_info::has_discard for demotes
We need the loop header phis for the outer exec masks. Needed for
dEQP-VK.glsl.demote.dynamic_loop_texture

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-27 10:57:03 +01:00
Timur Kristóf
c372dc762d radv: Fix L2 cache rinse programming.
According to radeonsi, GLM doesn't support WB alone, so
we have to set INV too when WB is set.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 22:18:16 +00:00
Timur Kristóf
30f0c0ea7d radv: Add debug option to dump meta shaders.
This new option can help debug shader compiler problems when
there are issues with the meta shaders.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 13:36:49 +00:00
Timur Kristóf
a4fd8ba7e3 amd/common: Introduce ac_get_fs_input_vgpr_cnt.
Add a function called ac_get_fs_input_vgpr_cnt which will return
the number of input VGPRs used by an AMD shader. Previously,
radv and radeonsi had the same code duplicated, but this commit also
allows them to share this code.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Timur Kristóf
83eebdb507 radv: Set shared VGPR count in radv_postprocess_config.
This commit allows RADV to set the shared VGPR count according to
the shader config.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 13:36:49 +00:00
Timur Kristóf
7bde4ddaf7 amd/common: Add num_shared_vgprs to ac_shader_config for GFX10.
In GFX10 wave64 mode, shared VGPRs allow the two wave halves to
share some data with each other.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Timur Kristóf
db1fddcf0f amd/common: Extract some helper functions to ac_shader_util.
This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and
ac_get_image_dim into ac_shader_util, thus enabling them to be used
by compilers other than LLVM.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Timur Kristóf
d8b46f8964 amd/common: Move ac_export_mrt_z to ac_llvm_build.
The aim of this commit is to keep ac_shader_util LLVM-free,
since we would like to use it in ACO later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-26 13:36:49 +00:00
Rhys Perry
06ea3325c3 aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask
v2: rename pass_temp to pass_flags
v2: also CSE reductions
v3: add ds_swizzle_b32 support
v3: check gds/offset0/offset1 fields

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-26 13:19:51 +01:00
Rhys Perry
86ecf92c23 aco: don't CSE v_readlane_b32/v_readfirstlane_b32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-26 13:19:51 +01:00
Rhys Perry
3c966fd688 aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:47 +01:00
Rhys Perry
ec8ced9123 radv/aco: return a correct name and description for the backend IR
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:43 +01:00
Rhys Perry
15ea1c5cff aco: store printed backend IR in binary
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:31 +01:00
Rhys Perry
6613b81327 aco,radv/aco: get dissassembly for release builds if requested
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:08:09 +01:00
Rhys Perry
0aef1a230e radv/aco: actually disable ACO when unsupported
We were setting this twice. The second time, we weren't later disabling
it if unsupported.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-26 11:04:45 +01:00
Rhys Perry
db2ca45102 aco: check for duplicate opcode numbers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
101f47fdd7 aco: fix opcode for s_mul_hi_i32
Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
2faaf04c62 aco: fix v_subrev_co_u32_e64 opcode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
00aa413bae aco: fix GFX9 opcode for v_xad_u32
Fixes various dEQP-VK.image.store.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-25 15:28:44 +00:00
Rhys Perry
b125dc4839 aco: implement 64-bit ineg
We currently lower them, but nir_opt_algebraic() can add new ones because
lower_sub=true.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-25 15:27:48 +00:00
Rhys Perry
641eac953c aco: run nir_lower_int64() before nir_lower_idiv()
nir_lower_idiv() asserts on 64-bit integers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-09-25 15:27:48 +00:00
Eric Engestrom
30f639c181 radv: fix s/load/store/ copy-paste typo
Fixes: cdc6efddf9 ("radv: implement all depth/stencil resolve modes using graphics")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-24 19:18:54 +01:00
Bas Nieuwenhuizen
780182f0a0 radv: Add workaround for hang in The Surge 2.
Released today and hangs on RADV. We don't have the root cause yet,
but this should unblock people playing the game.

No drirc because the radv debugflags are not usable from drirc and
I want this backported.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-09-24 09:51:40 +00:00
Marek Olšák
05d32850ff ac/nir: force unnormalized coordinates for RECT
This fixes VAAPI.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-23 15:34:54 -04:00
Marek Olšák
500181b2ba ac/nir: port Z compare value clamping from radeonsi
This fixes some dEQP tests.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-23 15:34:54 -04:00
Marek Olšák
9429714233 ac: stop using PCI IDs for chip identification
PCI IDs for amdgpu will be removed from Mesa.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-23 15:14:11 -04:00
Marek Olšák
48742de601 ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-23 15:14:11 -04:00
Daniel Schürmann
2c050b49b3 aco: only emit waitcnt on loop continues if we there was some load or export
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-09-23 13:39:33 +02:00
Bas Nieuwenhuizen
40087ffc5b amd: Build aco only if radv is enabled
ACO depends on C++14, but radeonsi/radv with LLVM 8,9 do not. Let us
only require it for RADV, since that is the only user.

Fixes: a70a998718 "radv/aco: Setup alternate path in RADV to support the experimental ACO compiler"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-21 14:06:41 +02:00
Daniel Schürmann
8b78cce433 radv: remove dead shared variables
LLVM does this anyway, but for ACO we need to do it in NIR.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
281262281b radv/aco: enable VK_EXT_shader_demote_to_helper_invocation
For now, this extension will only be enabled for ACO.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
e01b522a72 radv: enable clustered reductions
These work with both, LLVM and ACO.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
a70a998718 radv/aco: Setup alternate path in RADV to support the experimental ACO compiler
LLVM remains default and ACO can be enabled with RADV_PERFTEST=aco.

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Daniel Schürmann
93c8ebfa78 aco: Initial commit of independent AMD compiler
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.

ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.

Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Bas Nieuwenhuizen
4b7e7956f0 radv: Add DFSM support.
Apparently we already enabled it without having support ...

Not sure if we also need to set disable_start_of_prim when the PS
has memory writes, but this mirrors radeonsi.

Doubles fillrate in my dual_quad_bench from ~16 pixels/cycles to
~32 pixels/cycle on a Raven.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen
0fa2740059 radv: Disable dfsm by default even on Raven.
When actually implementing it, Talos on low is still 3% slower.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen
f2dffb395f radv: Only break batch on framebuffer change with dfsm.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Rhys Perry
b3f71685d9 radv: never kill a NGG GS shader
Seems to fix a hang with excessive vertex emissions when NGG is used for
GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 19:26:58 +00:00
Samuel Pitoiset
99c186fbbe radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS
No GS copy shader if a pipeline enables NGG GS.

This fixes
dEQP-VK.pipeline.executable_properties.graphics.*geometry_stage*.

Fixes: 86864eedd2 ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 21:19:28 +02:00
Marek Olšák
aae35fbd3a ac: move ac_get_num_physical_vgprs into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
0692ae34e9 ac: move ac_get_num_physical_sgprs into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
ca43006fd2 ac: move ac_get_max_wave64_per_simd into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
deab3a23f6 ac: move num_sdp_interfaces into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
2c62b461e9 ac: move PBB MAX_ALLOC_COUNT into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Samuel Pitoiset
68820007fd radv: fix loading 64-bit GS inputs
We have to load 2 32-bit integer and to cast correctly.

This fixes crashes with gs-double-interpolator.vk_shader_test.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 17:16:36 +02:00
Samuel Pitoiset
46b7512b0a radv: fix writing depth/stencil clear values to image
Use the fastest way only if both aspects are used. Oops.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111728
Fixes: 218ce34962 ("radv: add mipmap support for the clear depth/stencil values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 13:27:46 +02:00
Michel Dänzer
aed3babef7 ac: Remove DEBUG workaround
As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-17 10:24:29 +00:00
Rhys Perry
ffabcbba60 radv: always emit a position export in gs copy shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8d0337299 ('radv: add multiple streams support for the GS copy shader')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-16 19:42:30 +00:00