Commit graph

3200 commits

Author SHA1 Message Date
Samuel Pitoiset
9cf55b022d radv: add VK_KHR_shader_atomic_int64 but disable it for now
No support for 64-bit compare&swap atomic operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-17 21:59:56 +02:00
Samuel Pitoiset
d118e382dd ac/nir: add 64-bit SSBO atomic operations support
Except compare&swap which is still buggy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-17 21:59:54 +02:00
Samuel Pitoiset
78c551aca1 ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap
Use the raw version (ie. IDXEN=0) because vindex is unused.
Use the old intrinsic for compare&swap because the new one
hangs the GPU for some reasons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-17 21:59:52 +02:00
Tapani Pälli
624789e370 compiler/glsl: handle case where we have multiple users for types
Both Vulkan and OpenGL might be using glsl_types simultaneously or we
can also have multiple concurrent Vulkan instances using glsl_types.
Patch adds a one time init to track number of users and will release
types only when last user calls _glsl_type_singleton_decref().

This change fixes glsl_type memory leaks we have with anv driver.

v2: reuse hash_mutex, cleanup, apply fix also to radv driver and
    rename helper functions (Jason)

v3: move init, destroy to happen on GL context init and destroy

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-16 12:58:00 +03:00
Samuel Pitoiset
ecbe6cb805 radv: sort the shader capabilities alphabetically
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-16 09:14:22 +02:00
Samuel Pitoiset
8704bd5588 radv: enable shaderInt8 on SI and CIK
No CTS failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-16 08:22:54 +02:00
Dylan Baker
95aefc94a9 Delete autotools
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-04-15 13:44:29 -07:00
Samuel Pitoiset
bf4a0485d9 radv: set ACCESS_NON_READABLE on stores for copy/fill/clear meta shaders
The compiler will emit GLC=1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-15 21:36:53 +02:00
Bas Nieuwenhuizen
f6fdd39eab radv: Use local buffers for the global bo list.
Even if we don't use local buffers in general. Turns out that even
though the performance is not the best the kernel still does it
better than our own list.

We still have to keep the radv bo list for buffers that are shared
externally.

This improves Talos on lowest quality setting (so as CPU bound as
possible) by ~10% if the global bo list is enabled.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 20:39:38 +02:00
Bas Nieuwenhuizen
af9534b9f3 ac: Move has_local_buffers disable to radeonsi.
In radv we had a separate flag to actually use it + an env option
to experimentally use it.

The common code setting has_local_buffers to false of course broke
that experimental option.

Also the "enable on APU" did not make sense for RADV as it is still
disabled by default.

Fixes: b21a4efb55 "radv/winsys: allow local BOs on APUs"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 20:39:28 +02:00
Bas Nieuwenhuizen
a589d8c0ab radv: Add bolist RADV_PERFTEST flag.
To test global_bo_list performance.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 20:39:05 +02:00
Marek Olšák
dbab755ecf ac: fix incorrect bindless atomic code in visit_image_atomic
Coverity: CID 1444664

Fixes: d62d434fe9 ("ac/nir_to_llvm: add image bindless support")

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-04-15 12:52:02 -04:00
Rhys Perry
8671cfe2a2 nir,ac/nir: fix cube_face_coord
Seems it was missing the "/ ma + 0.5" and the order was swapped.

Fixes: a1a2a8dfda ('nir: add AMD_gcn_shader extended instructions')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 17:22:47 +01:00
Samuel Pitoiset
14f03978ed radv: enable VK_KHR_shader_float16_int8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-15 10:43:55 +02:00
Karol Herbst
14531d676b nir: make nir_const_value scalar
v2: remove & operator in a couple of memsets
    add some memsets
v3: fixup lima

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2019-04-14 22:25:56 +02:00
Karol Herbst
2a36699ed3 radv: use nir constant helpers
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-14 22:25:56 +02:00
Karol Herbst
adb2263014 amd/nir: some cleanups
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-14 22:25:56 +02:00
Marek Olšák
f4ae188d50 ac: use the common helper ac_apply_fmask_to_sample
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-12 11:35:31 -04:00
Marek Olšák
971bc10177 radeonsi: set AC_FUNC_ATTR_READNONE for image opcodes where it was missing
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-12 11:34:39 -04:00
Samuel Pitoiset
6718bb57ac ac/nir: remove some useless integer casts for ALU operations
Sources are always casted to integers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
8a6442075f ac/nir: remove useless integer cast in visit_image_load()
ac_build_image_opcode() casts if necessary and buffer images
are casted too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
ffbb62f808 ac/nir: remove useless integer cast in adjust_sample_index_using_fmask()
It's already casted if necessary in ac_build_image_opcode().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
7b5b27a685 ac/nir: remove useles LLVMGetUndef for nir_op_pack_64_2x32_split
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
fd4041987b ac: add ac_build_load_helper_invocation() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
590a4c8981 ac: add ac_build_ddxy_interp() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
4cb13e9462 ac: add ac_build_umax() and use it where possible
This changes the predicate from LessThan to Equal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:55 +02:00
Samuel Pitoiset
cf88bfa75a ac/nir: make use of ac_build_umin() where possible
This changes the predicate from LessThan to Equal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:54 +02:00
Samuel Pitoiset
15dd81913f ac/nir: make use of ac_build_imin() where possible
This changes the predicate from LessThan to Equal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:54 +02:00
Samuel Pitoiset
d7a0c0d53b ac/nir: make use of ac_build_imax() where possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 17:30:54 +02:00
Timothy Arceri
d62d434fe9 ac/nir_to_llvm: add image bindless support
With this all piglit bindless image tests pass on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Timothy Arceri
55fb93b586 ac/nir_to_llvm: make get_sampler_desc() more generic and pass it the image intrinsic
This will be required by the bindless support in the following patches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Karol Herbst
d7bbb3caf1 glsl_to_nir: handle bindless textures
v2: add support for AMD

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-12 09:02:59 +02:00
Samuel Pitoiset
09b4049be3 radv: enable VK_AMD_gpu_shader_half_float
Should be safe to enable as all instructions seem to support 16-bit.
Unfortunately, there is no CTS test.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-10 09:07:17 +02:00
Rhys Perry
fd1fc255d9 ac: add 16-bit support to ac_build_ddxy()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-10 09:05:58 +02:00
Samuel Pitoiset
bc6d486c78 ac/nir: fix nir_op_b2f16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-10 09:05:55 +02:00
Bas Nieuwenhuizen
028ce52739 radv: Add non-uniform indexing lowering.
This patch does it as late as possible so the potential extra
basic blocks don't inhibit other optimizations.

Big thanks to Jason for writing the lowering pass.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-10 02:04:13 +02:00
Timothy Arceri
e30804c602 nir/radv: remove restrictions on opt_if_loop_last_continue()
When I implemented opt_if_loop_last_continue() I had restricted
this pass from moving other if-statements inside the branch opposite
the continue. At the time it was causing a bunch of spilling in
shader-db for i965.

However Samuel Pitoiset noticed that making this pass more aggressive
significantly improved the performance of Doom on RADV. Below are
the statistics he gathered.

28717 shaders in 14931 tests
Totals:
SGPRS: 1267317 -> 1267549 (0.02 %)
VGPRS: 896876 -> 895920 (-0.11 %)
Spilled SGPRs: 24701 -> 26367 (6.74 %)
Code Size: 48379452 -> 48507880 (0.27 %) bytes
Max Waves: 241159 -> 241190 (0.01 %)

Totals from affected shaders:
SGPRS: 23584 -> 23816 (0.98 %)
VGPRS: 25908 -> 24952 (-3.69 %)
Spilled SGPRs: 503 -> 2169 (331.21 %)
Code Size: 2471392 -> 2599820 (5.20 %) bytes
Max Waves: 586 -> 617 (5.29 %)

The codesize increases is related to Wolfenstein II it seems largely
due to an increase in phis rather than the existing jumps.

This gives +10% FPS with Doom on my Vega56.

Rhys Perry also benchmarked Doom on his VEGA64:

Before: 72.53 FPS
After:  80.77 FPS

v2: disable pass on non-AMD drivers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-09 11:29:41 +10:00
Samuel Pitoiset
775191cd99 radv: fix getting the vertex strides if the bindings aren't contiguous
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110349
Fixes: a66b186beb ("radv: use typed buffer loads for vertex input fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-08 21:17:15 +02:00
Samuel Pitoiset
27b8f3ecc3 ac/nir: fix intrinsic names for atomic operations with LLVM 9+
This fixes the following LLVM error when using RADV_DEBUG=checkir:
Intrinsic name not mangled correctly for type arguments! Should be: llvm.amdgcn.buffer.atomic.add.i32
i32 (i32, <4 x i32>, i32, i32, i1)* @llvm.amdgcn.buffer.atomic.add

The cmpswap operation still uses the old intrinsic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-08 13:16:50 +02:00
Eric Engestrom
05b114e526 simplify LLVM version string printing
Figure it out once in the build system, then just use that all over the place.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-04-04 16:08:11 +00:00
Marek Olšák
b563460b49 radeonsi: enable displayable DCC on Ravens 2019-04-04 09:53:24 -04:00
Marek Olšák
1f21396431 radeonsi: add support for displayable DCC for multi-RB chips
A compute shader is used to reorder DCC data from aligned to unaligned.
2019-04-04 09:53:24 -04:00
Marek Olšák
2c09eb4122 radeonsi: add support for displayable DCC for 1 RB chips
This is the simpler codepath - just disable RB and pipe alignment for DCC.
2019-04-04 09:53:24 -04:00
Marek Olšák
e457454cb6 amd/addrlib: fix uninitialized values for Addr2ComputeDccAddrFromCoord
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-04 09:30:40 -04:00
Samuel Pitoiset
c25f63872b radv: partially enable VK_KHR_shader_float16_int8
Only 8-bit integers for now, float16 requires a bit more work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:59 +02:00
Samuel Pitoiset
d099bc5829 ac: add 8-bit and 64-bit support to ac_build_bitfield_reverse()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:57 +02:00
Samuel Pitoiset
2cecf6c5cc ac: add 8-bit support to ac_build_umsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:55 +02:00
Samuel Pitoiset
a45d9e3e8d ac: add 8-bit support to ac_find_lsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:53 +02:00
Samuel Pitoiset
89cf8ca0ae ac: add 8-bit support to ac_build_bit_count()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:52 +02:00
Samuel Pitoiset
869af0464a ac/nir: add support for nir_op_b2i8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-04-01 18:53:49 +02:00