Commit graph

311 commits

Author SHA1 Message Date
Dave Airlie
6bec8bcd79 ac/nir: add support for all intrinsics. (v2)
This is derived from tgsi/radeonsi code from the GLSL intrinsics.

This should pre-fix radv for the upcoming spirv patches.

v2: actually use wait_cnt, sleep deprived dad time! (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-09 01:25:59 +00:00
Dave Airlie
0084f4a422 ac/nir: for ubo load use correct num_components
I was hacking something stupid in doom, and hit an assert for the bitcast
following this, it definitely looks like this should be the number of 32-bit
components, not the instr level ones.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-07 14:54:19 +10:00
Timothy Arceri
6e2eb96b64 ac: remove the remaining duplicate llvm types
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
e73a467005 ac: remove usused v4f32
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
7f4966731f ac: add v2f32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
cd6cfd1095 ac: use the ac f16 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
8f651ae062 ac: use the ac f32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
368654a299 ac: use the ac f64 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
d927db0672 ac: use the common v8i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
9db51b2393 ac: use the common v4i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4 ac: add v3i32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d ac: add v2i32 to the common code and use it
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
c64cfa0392 ac: use the ac i64 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
3d45acf71c ac: remove unused i16 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
4d4799643d ac: use the ac ivoidt llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
209ad5c16f ac: use the ac i8 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
21d71189ec ac: use the ac i1 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
bd59a0bb8b ac: use the ac i32 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
439a2febc4 ac/radeonsi: add support for tex instr without a derefence
These are produced by nir_lower_bitmap(), adding the missing derefence
would cause other issues that need to be hacked around such as
skipping sampler lowering and uniform location assignment, so this
change seems the correct way to go.

Fixes 194 piglit crashes on radeonsi using NIR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:51 +11:00
Dave Airlie
16cfbef44c ac/llvm: drop pointless wrappers around umsb/imsb
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:34 +10:00
Dave Airlie
82d47b9d38 ac/llvm: consolidate find lsb function.
This was the same between si and ac.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:31 +10:00
Dave Airlie
de2b241111 ac/llvm: drop v4f32empty. (v2)
This was unused.

v2: drop args.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:22 +10:00
Dave Airlie
a76b6c2192 ac/llvm: add i1false/i1true to common code.
These get used in fair few places.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:18 +10:00
Dave Airlie
88b7ddbe65 ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types.
This just avoids having two copies of these.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:13 +10:00
Dave Airlie
f925f5b074 ac/nir: move lds declaration/load/store into shared code.
This was duplicated between both drivers, share here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:11 +10:00
Matthew Nicholls
27a0b24bf2 ac/nir: generate correct instruction for atomic min/max on unsigned images
v2: fix silly typo

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 20:52:58 +02:00
Timothy Arceri
8ebaf8192a ac: add support for explicit component packing
This is needed for RADV to support explicit component packing.

This is also required to use the new NIR component splitting /
packing passes.

V2:
 - add commponent packing support for interpolate_at* intrinsics
 - improve store packing support when not all varyings are scalar
   as spotted by Bas the store source was incorrectly offset.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 17:02:40 +11:00
Marek Olšák
1ff9e27cbd ac: replace ac_build_kill with ac_build_kill_if_false
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Eric Anholt
ba85525fce ac: Silence a compiler warning about results[0].
We know that num_components will be > 0, but it doesn't.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Bas Nieuwenhuizen
a548b727a1 ac/nir: Only clamp shadow reference on radeonsi.
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a differences even though depth is in [0,1], due
to strict inequalities.

I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.

This at least fixes radv dEQP-VK.texture.shadow.* on VI.

Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen
2c5b43c87f ac/nir: Fix nir_texop_lod on GFX for 1D arrays.
Fixes: 1bcb953e16 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 00:27:44 +02:00
Dave Airlie
da9c3cd3ee radv/ac/nir: only emit tess factors to storage if tes reads them
Otherwise we just need to write them to the tf ring.

this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen
ad727b96b6 ac/nir: Account for compact array index in GS input load from LDS.
Mirrors the vram path.

Fixes: d4ecc3c929 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen
24fe4e6143 ac/nir: Set larged wrokgroup size for GS on GFX9.
They don't take a single wave anymore and we need the barriers.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen
9e82f2b3ea ac/nir: Take the max workgroup size of all provided shaders.
Fixes: ffaf4d608a 'radv: Enable tessellation shaders for  GFX9.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:28 +02:00
Jason Ekstrand
59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Bas Nieuwenhuizen
9961ae2447 ac/nir: Fix up GS input vgprs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen
d4ecc3c929 ac/nir: Add loading from LDS for merged GS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen
ec53e52742 ac/nir: Add ES output to LDS for GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen
3e77333030 ac/nir: Add merged GS function.
[airlied: merged fixup + and fixed up a couple more bits].

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:14 +01:00
Dave Airlie
1dda214d9c ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Timothy Arceri
bebfeb7e1c ac: move some code out of loop in store_tcs_output()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen
640f2c458f ac/nir: Add LS-HS input VGPR workaround.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen
0a182e73d9 ac/nir: Compile the bodies of multiple shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen
25efef40d2 ac/nir: Don't write to the dynamic HS word on GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:04 +02:00
Bas Nieuwenhuizen
d8bd693d03 ac/nir: Add function creation for merged LS+HS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:00 +02:00
Bas Nieuwenhuizen
0cdc8b26f8 ac/nir: Make scan_shader_output_decl less dependent on the context.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:56 +02:00
Bas Nieuwenhuizen
a996ed1f9b ac/nir: Change interface to allow multiple source shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:47 +02:00
Bas Nieuwenhuizen
872b21487c ac/nir: Add HS calling convention.
Needed for GFX9 merged shaders.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:42 +02:00
Marek Olšák
854593b8eb ac: clean up ac_build_indexed_load function interfaces
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00