Timothy Arceri
caf15ce670
ac: move build_varying_gather_values() to ac_llvm_build.h and expose
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
6fd6cb6616
ac: add basic nir -> llvm type helper
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:18 +11:00
Dave Airlie
043d14db30
ac/nir: don't write tcs outputs to LDS that aren't read back.
...
If the TCS doesn't read back the outputs, no need to store them
to LDS in the first place. (except for tess factors).
This seems to give about 50fps (3290->3330) with tessellation demo.
I haven't tested if it impacts DoW3 at all.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-27 13:50:24 +10:00
Timothy Arceri
b73ce64fb8
ac: add gs_{prim,invocation}_id to the abi
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-16 10:54:03 +11:00
Timothy Arceri
8fe6abd964
ac: add emit_vertex to the abi
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Dave Airlie
6bec8bcd79
ac/nir: add support for all intrinsics. (v2)
...
This is derived from tgsi/radeonsi code from the GLSL intrinsics.
This should pre-fix radv for the upcoming spirv patches.
v2: actually use wait_cnt, sleep deprived dad time! (Bas)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-09 01:25:59 +00:00
Dave Airlie
0084f4a422
ac/nir: for ubo load use correct num_components
...
I was hacking something stupid in doom, and hit an assert for the bitcast
following this, it definitely looks like this should be the number of 32-bit
components, not the instr level ones.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-07 14:54:19 +10:00
Timothy Arceri
6e2eb96b64
ac: remove the remaining duplicate llvm types
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
e73a467005
ac: remove usused v4f32
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
7f4966731f
ac: add v2f32 to the common code and make use of it
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
cd6cfd1095
ac: use the ac f16 llvm type
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
8f651ae062
ac: use the ac f32 llvm type
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
368654a299
ac: use the ac f64 llvm type
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
d927db0672
ac: use the common v8i32 llvm type
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
9db51b2393
ac: use the common v4i32 llvm type
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4
ac: add v3i32 to the common code and make use of it
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d
ac: add v2i32 to the common code and use it
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
c64cfa0392
ac: use the ac i64 llvm type
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
3d45acf71c
ac: remove unused i16 llvm type
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
4d4799643d
ac: use the ac ivoidt llvm type
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
209ad5c16f
ac: use the ac i8 llvm type
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
21d71189ec
ac: use the ac i1 llvm type
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
bd59a0bb8b
ac: use the ac i32 llvm type
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
439a2febc4
ac/radeonsi: add support for tex instr without a derefence
...
These are produced by nir_lower_bitmap(), adding the missing derefence
would cause other issues that need to be hacked around such as
skipping sampler lowering and uniform location assignment, so this
change seems the correct way to go.
Fixes 194 piglit crashes on radeonsi using NIR.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:51 +11:00
Dave Airlie
16cfbef44c
ac/llvm: drop pointless wrappers around umsb/imsb
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:34 +10:00
Dave Airlie
82d47b9d38
ac/llvm: consolidate find lsb function.
...
This was the same between si and ac.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:31 +10:00
Dave Airlie
de2b241111
ac/llvm: drop v4f32empty. (v2)
...
This was unused.
v2: drop args.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:22 +10:00
Dave Airlie
a76b6c2192
ac/llvm: add i1false/i1true to common code.
...
These get used in fair few places.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:18 +10:00
Dave Airlie
88b7ddbe65
ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types.
...
This just avoids having two copies of these.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:13 +10:00
Dave Airlie
f925f5b074
ac/nir: move lds declaration/load/store into shared code.
...
This was duplicated between both drivers, share here.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:11 +10:00
Matthew Nicholls
27a0b24bf2
ac/nir: generate correct instruction for atomic min/max on unsigned images
...
v2: fix silly typo
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 20:52:58 +02:00
Timothy Arceri
8ebaf8192a
ac: add support for explicit component packing
...
This is needed for RADV to support explicit component packing.
This is also required to use the new NIR component splitting /
packing passes.
V2:
- add commponent packing support for interpolate_at* intrinsics
- improve store packing support when not all varyings are scalar
as spotted by Bas the store source was incorrectly offset.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 17:02:40 +11:00
Marek Olšák
1ff9e27cbd
ac: replace ac_build_kill with ac_build_kill_if_false
...
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Eric Anholt
ba85525fce
ac: Silence a compiler warning about results[0].
...
We know that num_components will be > 0, but it doesn't.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Bas Nieuwenhuizen
a548b727a1
ac/nir: Only clamp shadow reference on radeonsi.
...
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.
I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.
This at least fixes radv dEQP-VK.texture.shadow.* on VI.
Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen
2c5b43c87f
ac/nir: Fix nir_texop_lod on GFX for 1D arrays.
...
Fixes: 1bcb953e16 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 00:27:44 +02:00
Dave Airlie
da9c3cd3ee
radv/ac/nir: only emit tess factors to storage if tes reads them
...
Otherwise we just need to write them to the tf ring.
this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen
ad727b96b6
ac/nir: Account for compact array index in GS input load from LDS.
...
Mirrors the vram path.
Fixes: d4ecc3c929 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen
24fe4e6143
ac/nir: Set larged wrokgroup size for GS on GFX9.
...
They don't take a single wave anymore and we need the barriers.
Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen
9e82f2b3ea
ac/nir: Take the max workgroup size of all provided shaders.
...
Fixes: ffaf4d608a 'radv: Enable tessellation shaders for GFX9.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:28 +02:00
Jason Ekstrand
59fb59ad54
nir: Get rid of nir_shader::stage
...
It's redundant with nir_shader::info::stage.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Bas Nieuwenhuizen
9961ae2447
ac/nir: Fix up GS input vgprs.
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen
d4ecc3c929
ac/nir: Add loading from LDS for merged GS.
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen
ec53e52742
ac/nir: Add ES output to LDS for GFX9.
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen
3e77333030
ac/nir: Add merged GS function.
...
[airlied: merged fixup + and fixed up a couple more bits].
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:14 +01:00
Dave Airlie
1dda214d9c
ac/nir: init full exec mask for merged shaders.
...
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Timothy Arceri
bebfeb7e1c
ac: move some code out of loop in store_tcs_output()
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen
640f2c458f
ac/nir: Add LS-HS input VGPR workaround.
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen
0a182e73d9
ac/nir: Compile the bodies of multiple shaders.
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen
25efef40d2
ac/nir: Don't write to the dynamic HS word on GFX9.
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:04 +02:00