Marek Olšák
9fb2fd0b43
compiler: add ACCESS_STREAM_CACHE_POLICY
...
radeonsi will use this.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Lionel Landwerlin
8a884a25c5
amd: prepare dropping include of p_compiler.h
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-09 22:59:43 +03:00
Pierre-Eric Pelloux-Prayer
25fff591c1
radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:06 -04:00
Pierre-Eric Pelloux-Prayer
704a6b5948
ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:03 -04:00
Marek Olšák
f818d9ae3c
radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Bas Nieuwenhuizen
2af00b1fdd
ac/nir: Use correct cast for readfirstlane and ptrs.
...
Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-06 15:48:50 +00:00
Connor Abbott
74470baebb
ac/nir: Lower large indirect variables to scratch
...
results from radeonsi NIR:
Totals from affected shaders:
SGPRS: 704 -> 464 (-34.09 %)
VGPRS: 2056 -> 672 (-67.32 %)
Spilled SGPRs: 24 -> 0 (-100.00 %)
Spilled VGPRs: 28406 -> 0 (-100.00 %)
Private memory VGPRs: 0 -> 3182 (0.00 %)
Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
Code Size: 935260 -> 40180 (-95.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 28 -> 70 (150.00 %)
Wait states: 0 -> 0 (0.00 %)
results from radv:
Totals from affected shaders:
SGPRS: 80 -> 48 (-40.00 %)
VGPRS: 204 -> 108 (-47.06 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 256 (0.00 %) dwords per thread
Code Size: 15792 -> 9504 (-39.82 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1 -> 2 (100.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-05 11:45:18 +02:00
Bas Nieuwenhuizen
72e7b7a00b
ac/nir,radv: Optimize bounds check for 64 bit CAS.
...
When the application does not ask for robust buffer access.
Only implemented the check in radv.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 21:21:55 +02:00
Bas Nieuwenhuizen
a17f2206d3
ac/nir: Implement LLVM9 64-bit buffer compare & exchange.
...
LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
extracts the ptr, does a bound check and then uses a cmpxchg LLVM
instruction.
Not ideal, but the earliest release we're going to get a proper
intrinsic is LLVM 10.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 12:26:11 +02:00
Connor Abbott
73274c9ec2
Revert "ac/nir: handle negate modifier"
...
This reverts commit bfea7e4d29 .
2019-08-02 11:14:50 +02:00
Connor Abbott
4a382d66ee
Revert "ac/nir: handle abs modifier"
...
This reverts commit d3c80733cd .
These were only appearing due to memory corruption.
2019-08-02 11:14:08 +02:00
Eric Engestrom
abc226cf41
tree-wide: replace MAYBE_UNUSED with ASSERTED
...
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Marek Olšák
033c39a660
ac/nir: fix incorrect Phis if callbacks use control flow inside control flow
2019-07-30 22:06:23 -04:00
Marek Olšák
d3c80733cd
ac/nir: handle abs modifier
2019-07-30 22:06:23 -04:00
Marek Olšák
efe2d8c5f9
ac: fix a memory leak in the error path of ac_build_type_name_for_intr
2019-07-30 22:06:23 -04:00
Marek Olšák
f6eca14f1b
ac: allow control flow statements in NIR callbacks
...
This fixes a crash when compiling geometry shaders on radeonsi.
2019-07-30 22:06:23 -04:00
Marek Olšák
bfea7e4d29
ac/nir: handle negate modifier
2019-07-30 22:06:23 -04:00
Marek Olšák
9234275320
radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced
2019-07-30 22:06:23 -04:00
Marek Olšák
27ac9a3326
ac/surface: allow linear swizzle mode automatic selection on gfx9 & 10
...
let addrlib make the decision to get the same result as PAL.
2019-07-30 22:06:23 -04:00
Marek Olšák
7708540363
amd: add support for Arcturus
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:54 -04:00
Marek Olšák
19d04191c4
radeonsi: add support for compute-only chips
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:51 -04:00
Samuel Pitoiset
4aa450193b
ac: do not crash when the buffer data format is invalid
...
This might happen when a pipeline doesn't define the vertex input
state, so the buffer data format is 0 (aka INVALID).
This fixes crashes when compiling some shaders on GFX10.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-29 13:19:32 +02:00
Rhys Perry
a9f58af454
ac/nir: fix txf_ms with an offset
...
Seems to fix some hair artifacts in Max Payne 3:
https://github.com/daniel-schuermann/mesa/issues/76
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f4e499ec79 ('radv: add initial non-conformant radv vulkan driver')
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-29 11:50:13 +01:00
Marek Olšák
6ac2146a98
ac/nir: implement nir_op_pack_{us}norm_2x16
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-23 15:03:44 -04:00
Samuel Pitoiset
6049745b13
ac/nir: do not clamp shadow reference on GFX10
...
RadeonSI only uses Z32_FLOAT_CLAMP for upgraded depth textures
on GFX10 and RADV doesn't promotes Z16 or Z24.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-22 09:02:22 +02:00
Marek Olšák
54e6900ede
radeonsi/gfx10: use 32-bit wavemasks for Wave32
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
81091a5183
ac: create the LLVM builder in ac_llvm_context_init
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
eb54b8c222
ac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
921c1d24d5
ac/rtld: add support for Wave32
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
73aa04e40d
ac: add Wave32 LLVM target machine
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
9e467d111b
ac: initial Wave32 support in LLVM build helpers
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
47dee97329
ac: use llvm.amdgcn.writelane
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
39d0c68321
ac: fix shader clock on LLVM 9
...
Probably relevant commit:
commit dd32dc3f72ec99b1794d62c74d2beb3b60468d50
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Date: Tue Jul 9 03:10:18 2019 +0000
[AMDGPU] Always use s_memtime for readcyclecounter
Differential Revision: https://reviews.llvm.org/D64369
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365431 91177308-0d34-0410-b5e6-96231b3b80d8
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Eric Engestrom
09a8a39940
util: use standard name for strchrnul()
...
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-19 22:39:38 +01:00
Dave Airlie
82a2f10529
radv/gfx10: set the pgm rsrc3/4 regs using index sh reg set
...
This is ported from AMDVLK, it's probably not requires unless
we want to use "real time queues", but it might be nice to just have
in place.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-18 10:24:26 +10:00
Samuel Pitoiset
f239e22813
radv/gfx10: enable 1D textures
...
Mirror RadeonSI. This also fixes crashes in addrlib.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-12 18:25:45 +02:00
Samuel Pitoiset
e510c5ee3b
ac: import ac_get_compute_resource_limits() from RadeonSI
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-12 17:47:11 +02:00
Marek Olšák
9d1483de3b
radeonsi/gfx10: enable 1D textures
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Connor Abbott
0c114ae3be
ac/nir: Remove now-unused interp_deref handling
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-08 14:18:52 +02:00
Connor Abbott
b3a226691d
radeonsi/nir: Use NIR barycentric intrinsics
...
This is simpler than radv, since the driver_location is already assigned
for us.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-08 14:18:46 +02:00
Connor Abbott
0cad0424e9
ac/nir: Implement barycentric intrinsics
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-08 14:18:25 +02:00
Daniel Schürmann
e41e932e57
radv: Lower input attachments in NIR.
...
v2 (Connor)
- Fix warning in release mode using MAYBE_UNUSED
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 14:14:53 +02:00
Daniel Schürmann
c65e880a65
radv: Implement nir_intrinsic_load_layer_id().
...
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-08 14:14:53 +02:00
Samuel Pitoiset
49e5136887
ac: select the GFX ring when halting waves with UMR on GFX10
...
GFX10 has two rings, so UMR want to know which one to halt.
Select the first one by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-08 09:10:57 +02:00
Samuel Pitoiset
9a01eded0c
radv/gfx10: set llvm_has_working_vgpr_indexing
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Samuel Pitoiset
c3459968cd
ac/nir: unpacked GS invocation ID on GFX10+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Samuel Pitoiset
4d7c420a94
ac: add missing formats to ac_get_tbuffer_format() for GFX10
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Marek Olšák
aa5dab27f9
ac: destroy passes in ac_destroy_llvm_compiler
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-04 15:39:04 -04:00
Marek Olšák
ea64d66fde
ac: use an LLVM fence instead of s.waitcnt when possible
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-04 15:39:03 -04:00
Marek Olšák
14450c8c41
ac: remove unused AC_WAIT_EXP
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-04 15:39:01 -04:00