Commit graph

521 commits

Author SHA1 Message Date
Timothy Arceri
cd6cfd1095 ac: use the ac f16 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
8f651ae062 ac: use the ac f32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
368654a299 ac: use the ac f64 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
d927db0672 ac: use the common v8i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
9db51b2393 ac: use the common v4i32 llvm type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4 ac: add v3i32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d ac: add v2i32 to the common code and use it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
c64cfa0392 ac: use the ac i64 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
3d45acf71c ac: remove unused i16 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
4d4799643d ac: use the ac ivoidt llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
209ad5c16f ac: use the ac i8 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
21d71189ec ac: use the ac i1 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
bd59a0bb8b ac: use the ac i32 llvm type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
439a2febc4 ac/radeonsi: add support for tex instr without a dereference
These are produced by nir_lower_bitmap(). Adding the missing dereference
would cause other issues that need to be hacked around, such as
skipping sampler lowering and uniform location assignment, so this
change seems the correct way to go.

Fixes 194 piglit crashes on radeonsi using NIR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:19:51 +11:00
Marek Olšák
529cdce799 radeonsi: remove 'Authors:' comments
It's inaccurate. Instead, see the copyright and use "git log" and
"git blame" to know the authorship.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-02 18:19:03 +01:00
Dave Airlie
16cfbef44c ac/llvm: drop pointless wrappers around umsb/imsb
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:34 +10:00
Dave Airlie
82d47b9d38 ac/llvm: consolidate find lsb function.
This was the same between si and ac.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:31 +10:00
Dave Airlie
de2b241111 ac/llvm: drop v4f32empty. (v2)
This was unused.

v2: drop args.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:22 +10:00
Dave Airlie
a76b6c2192 ac/llvm: add i1false/i1true to common code.
These get used in a fair few places.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:18 +10:00
Dave Airlie
88b7ddbe65 ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types.
This just avoids having two copies of these.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:13 +10:00
Dave Airlie
f925f5b074 ac/nir: move lds declaration/load/store into shared code.
This was duplicated between both drivers, share here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:11 +10:00
Matthew Nicholls
27a0b24bf2 ac/nir: generate correct instruction for atomic min/max on unsigned images
v2: fix silly typo

Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 20:52:58 +02:00
Timothy Arceri
8ebaf8192a ac: add support for explicit component packing
This is needed for RADV to support explicit component packing.

This is also required to use the new NIR component splitting /
packing passes.

V2:
 - add component packing support for interpolate_at* intrinsics
 - improve store packing support when not all varyings are scalar;
   as spotted by Bas, the store source was incorrectly offset.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 17:02:40 +11:00
Marek Olšák
2a414c3961 radeonsi: postponed KILL isn't postponed anymore, but maintains WQM
This restores performance for the drirc workaround, i.e.
KILL_IF does:
   visible = src0 >= 0;
   kill_flag &= visible; // accumulate kills
   amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only

And all helper pixels are killed at the end of the shader:
   amdgcn_kill(kill_flag);

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák
478afbe525 ac: use llvm.amdgcn.kill with LLVM 6.0
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák
1ff9e27cbd ac: replace ac_build_kill with ac_build_kill_if_false
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Eric Anholt
ba85525fce ac: Silence a compiler warning about results[0].
We know that num_components will be > 0, but the compiler doesn't.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Eric Anholt
34c04c734f ac: Fix a compiler warning for possibly undefined "name"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Nicolai Hähnle
f9ccfda9bc amd/common/gfx9: workaround DCC corruption more conservatively
Fixes KHR-GL45.texture_swizzle.smoke and others on Vega.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-23 18:10:20 +02:00
Bas Nieuwenhuizen
a548b727a1 ac/nir: Only clamp shadow reference on radeonsi.
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a difference even though depth is in [0,1], due
to strict inequalities.

I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.

This at least fixes radv dEQP-VK.texture.shadow.* on VI.

Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen
2c5b43c87f ac/nir: Fix nir_texop_lod on GFX for 1D arrays.
Fixes: 1bcb953e16 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 00:27:44 +02:00
Dave Airlie
da9c3cd3ee radv/ac/nir: only emit tess factors to storage if tes reads them
Otherwise we just need to write them to the tf ring.

This seems to improve the tessellation demo on Bonaire:
~2190->~2230 fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen
ad727b96b6 ac/nir: Account for compact array index in GS input load from LDS.
Mirrors the vram path.

Fixes: d4ecc3c929 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen
24fe4e6143 ac/nir: Set larger workgroup size for GS on GFX9.
They don't take a single wave anymore and we need the barriers.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen
9e82f2b3ea ac/nir: Take the max workgroup size of all provided shaders.
Fixes: ffaf4d608a 'radv: Enable tessellation shaders for GFX9.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:28 +02:00
Andres Rodriguez
92724338ba radv: Expose VK_EXT_global_priority
Expose the extension string as supported

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Jason Ekstrand
59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Bas Nieuwenhuizen
9961ae2447 ac/nir: Fix up GS input vgprs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen
d4ecc3c929 ac/nir: Add loading from LDS for merged GS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen
ec53e52742 ac/nir: Add ES output to LDS for GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen
3e77333030 ac/nir: Add merged GS function.
[airlied: merged fixup + and fixed up a couple more bits].

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:14 +01:00
Dave Airlie
1dda214d9c ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Timothy Arceri
bebfeb7e1c ac: move some code out of loop in store_tcs_output()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen
ce03c119ce radv: Add code to compile merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:23 +02:00
Bas Nieuwenhuizen
640f2c458f ac/nir: Add LS-HS input VGPR workaround.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen
0a182e73d9 ac/nir: Compile the bodies of multiple shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen
56d8af1ec5 ac/nir: Expand user SGPR descriptions a bit.
To prevent VS/TCS collisions in merged shaders.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:07 +02:00
Bas Nieuwenhuizen
25efef40d2 ac/nir: Don't write to the dynamic HS word on GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:04 +02:00
Bas Nieuwenhuizen
d8bd693d03 ac/nir: Add function creation for merged LS+HS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:00 +02:00
Bas Nieuwenhuizen
0cdc8b26f8 ac/nir: Make scan_shader_output_decl less dependent on the context.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:56 +02:00