Commit graph

98 commits

Author SHA1 Message Date
Samuel Pitoiset
14292310d9 ac/nir: implement nir_intrinsic_shader_clock with device scope
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
2020-05-24 20:37:58 +02:00
Samuel Pitoiset
b034f6cf2a ac/nir: fix shader clock with subgroup scope
The compiler should emit s_memtime instead of s_memrealtime for
the subgroup scope. I don't know why this LLVM 9 checks was for
but LLVM 8 also has this amdgcn intrinsic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
2020-05-24 20:37:54 +02:00
Michel Dänzer
2a6811f0f9 Revert "ac,radeonsi: fix compilations issues with LLVM 11"
This reverts commit 42b1696ef6.

The corresponding LLVM changes were reverted.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5087>
2020-05-19 07:19:35 +00:00
Marek Olšák
2361e8e722 ac/nir: honor ACCESS_STREAM_CACHE_POLICY for L1 and L0 caches too
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4935>
2020-05-15 22:12:35 +00:00
Samuel Pitoiset
0d63a1a84d ac/llvm: add support for texturing with clamped LOD
This is a requirement for the shaderResourceMinLod feature which
allows to clamp LOD. This uses all image_sample_*_cl variants.

All dEQP-VK.glsl.texture_functions.texture*clamp.* pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>
2020-05-14 10:05:44 +00:00
Pierre-Eric Pelloux-Prayer
ae4379d81e ac/nir: export some undef as zero
NIR already optimizes undef usage.
If undef reaches llvm, it's probably because of a broken shader.

In this situation, rather than letting llvm use the undef values
to do more optimization and probably produce incorrect results,
we replace undef values by 0.

"undef" values that are directly used in exports are kept as undef,
because this allows llvm to optimize them away.

This is only enabled for radeonsi.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2689
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607>
2020-05-05 12:26:26 +02:00
Pierre-Eric Pelloux-Prayer
64662dd5ba radeonsi: add workaround for issue 2647
For unknown reasons pixel shaders in KSP game get executed with
infinite interpolation coefficients and this causes an infinite
loop in the shader.

This commit adds a hacky workaround that kills pixel shaders if
invalid interp coeffs are detected and enables it for KSP.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2174
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2647
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4700>
2020-05-05 09:41:14 +00:00
Marek Olšák
b97cc41aa2 Revert "ac: reassociate FP expressions for inexact instructions for radeonsi"
This reverts commit cf2f3c2753.

It breaks shadows in Unigine Superposition.

Fixes: cf2f3c2753

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4837>
2020-05-04 11:51:37 -04:00
Pierre-Eric Pelloux-Prayer
7e7bb38bd8 radeonsi: fix export count
Fixes: 17acff01a0 ("radeonsi: skip vs output optimizations for some outputs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2877
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4871>
2020-05-04 15:11:09 +02:00
Samuel Pitoiset
aa94213781 ac/llvm: fix nir_texop_texture_samples with NULL descriptors
With VK_EXT_robustness2, descriptors can be NULL and the number of
samples returned by nir_texop_texture_samples should be 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4775>
2020-04-29 07:29:54 +00:00
Samuel Pitoiset
42b1696ef6 ac,radeonsi: fix compilations issues with LLVM 11
Latest LLVM replaced LLVMVectorTypeKind.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2826
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4755>
2020-04-27 17:13:36 +00:00
Marek Olšák
cf2f3c2753 ac: reassociate FP expressions for inexact instructions for radeonsi
Totals:
SGPRS: 2591784 -> 2590696 (-0.04 %)
VGPRS: 1666888 -> 1666736 (-0.01 %)
Spilled SGPRs: 4131 -> 4107 (-0.58 %)
Spilled VGPRs: 38 -> 38 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
Code Size: 52715468 -> 52693584 (-0.04 %) bytes
LDS: 92 -> 92 (0.00 %) blocks
Max Waves: 479897 -> 479892 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Marek Olšák
4b9370cb0f ac: generate FMA for inexact instructions for radeonsi
NIR mostly does this already.

Totals:
SGPRS: 2588520 -> 2591784 (0.13 %)
VGPRS: 1666984 -> 1666888 (-0.01 %)
Spilled SGPRs: 4074 -> 4131 (1.40 %)
Spilled VGPRs: 38 -> 38 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
Code Size: 52726872 -> 52715468 (-0.02 %) bytes
LDS: 92 -> 92 (0.00 %) blocks
Max Waves: 479872 -> 479897 (0.01 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Marek Olšák
f2c2a28073 ac: update and document fast math flags used by radeonsi
This should have no effect, because we never use FP division, but
it's safer for the future.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Marek Olšák
3bb65c0670 ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
2020-04-27 11:20:16 +00:00
Marek Olšák
b4fd8f1919 ac,radeonsi: simplify checking for Navi1x chips
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698>
2020-04-24 10:38:54 +00:00
Pierre-Eric Pelloux-Prayer
17acff01a0 radeonsi: skip vs output optimizations for some outputs
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555 ("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>
2020-04-20 08:45:16 +02:00
Samuel Pitoiset
9f005f1f85 radv: enable lowering of GS intrinsics for the LLVM backend
This replaces emit_vertex with:
   if (vertex_count < max_vertices) {
      emit_vertex_with_counter vertex_count ...
      vertex_count += 1
   }

Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.

pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)

No pipeline-db changes on other generations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
2020-04-08 08:24:05 +02:00
Samuel Pitoiset
3cd5450df5 ac/nir: split 16-bit SSBO stores on GFX6
Due to possible alignment issues, make sure to split stores of
16-bit vectors.

Doom Eternal requires storageBuffer16BitAccess.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
2020-04-03 08:01:28 +00:00
Samuel Pitoiset
55fdcc03de ac/nir: split 16-bit load/store to global memory on GFX6
Due to possible alignment issues, make sure to split loads/stores
of 16-bit vectors.

Doom Eternal requires storageBuffer16BitAccess.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
2020-04-03 08:01:28 +00:00
Samuel Pitoiset
c6bf1597d1 ac/nir: split 8-bit SSBO stores on GFX6
Due to possible alignment issues, make sure to split stores of
8-bit vectors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
2020-04-03 08:01:28 +00:00
Samuel Pitoiset
433f3380eb ac/nir: split 8-bit load/store to global memory on GFX6
Due to possible alignment issues, make sure to split loads/stores
of 8-bit vectors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339>
2020-04-03 08:01:28 +00:00
Eric Engestrom
79af30768d meson: inline inc_common
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00
Samuel Pitoiset
ba2ec1f369 ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.

No pipeline-db changes with VEGA10/LLVM 9.

pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 6672 -> 6672 (0.00 %)
VGPRS: 6652 -> 6652 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 561780 -> 561692 (-0.02 %) bytes
Max Waves: 1043 -> 1043 (0.00 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 84608 -> 83768 (-0.99 %)
VGPRS: 106768 -> 106636 (-0.12 %)
Spilled SGPRs: 1625 -> 1713 (5.42 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 10850936 -> 10726712 (-1.14 %) bytes
Max Waves: 3152 -> 3180 (0.89 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
2020-03-27 08:05:43 +01:00
Samuel Pitoiset
d548384fc6 ac/nir: use llvm.amdgcn.rsq for nir_op_frsq
Instead of emitting 1.0 / sqrt(x) which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.

pipeline-db (VEGA10/LLVM 9):
Totals from affected shaders:
SGPRS: 16872 -> 16864 (-0.05 %)
VGPRS: 15320 -> 15464 (0.94 %)
Spilled SGPRs: 2021 -> 2133 (5.54 %)
Code Size: 1915464 -> 1917476 (0.11 %) bytes
Max Waves: 641 -> 639 (-0.31 %)

pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 43936 -> 44120 (0.42 %)
VGPRS: 41776 -> 41972 (0.47 %)
Spilled SGPRs: 875 -> 875 (0.00 %)
Code Size: 4468164 -> 4468120 (-0.00 %) bytes
Max Waves: 2412 -> 2414 (0.08 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 60096 -> 60096 (0.00 %)
VGPRS: 63552 -> 63648 (0.15 %)
Spilled SGPRs: 6135 -> 6117 (-0.29 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 6252996 -> 6249772 (-0.05 %) bytes
Max Waves: 2324 -> 2337 (0.56 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
2020-03-27 07:45:47 +01:00
Samuel Pitoiset
66426ce119 ac/nir: use llvm.amdgcn.rcp for nir_op_frcp
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.

pipeline-db (VEG10/LLVM 9):
Totals from affected shaders:
SGPRS: 50384 -> 50312 (-0.14 %)
VGPRS: 42572 -> 42696 (0.29 %)
Spilled SGPRs: 1372 -> 1372 (0.00 %)
Code Size: 5692040 -> 5691428 (-0.01 %) bytes
Max Waves: 3954 -> 3951 (-0.08 %)

pipeline-db (VEG10/LLVM 10):
Totals from affected shaders:
SGPRS: 78512 -> 78464 (-0.06 %)
VGPRS: 62408 -> 62484 (0.12 %)
Spilled SGPRs: 1502 -> 1502 (0.00 %)
Code Size: 8106188 -> 8103372 (-0.03 %) bytes
Max Waves: 7759 -> 7753 (-0.08 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 112760 -> 113232 (0.42 %)
VGPRS: 111132 -> 110568 (-0.51 %)
Spilled SGPRs: 5870 -> 5940 (1.19 %)
Spilled VGPRs: 650 -> 652 (0.31 %)
Code Size: 11887232 -> 11561744 (-2.74 %) bytes
Max Waves: 8964 -> 9015 (0.57 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
2020-03-27 07:45:43 +01:00
Pierre-Eric Pelloux-Prayer
5533c41541 ac: fix ac_build_is_helper_invocation when postponed_kill is null
If there was no demote() in the shader, ac_build_is_helper_invocation
behaves exactly the same as ac_build_load_helper_invocation, i.e.
the helper lanes are the same as they were at the beginning of the shader.

Fixes: de57ea2a3d ("amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>
2020-03-25 08:19:38 +01:00
Samuel Pitoiset
7ac8bb33cd radv/llvm: fix subgroup shuffle for chips without bpermute
bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead
we have to use readlane with a waterfall loop to defeat the LLVM
backend.

This fixes DOOM Eternal which requires subgroup shuffle.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
2020-03-23 14:19:03 +00:00
Marek Olšák
303842b2db ac: fix fast division
This stopped working with LLVM 11 and might occasionally have been broken
on older LLVM, because the metadata was set on the mul, not on the rcp.

Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
2020-03-21 22:34:17 +00:00
Bas Nieuwenhuizen
8e4e2cedcf amd/llvm: Fix divergent descriptor regressions with radeonsi.
piglit/bin/arb_bindless_texture-limit -auto -fbo:
  Needed to deal with non-NULL dynamic_index without deref in tex instructions.

piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
  Need to deal with non-deref images in enter_waterfall_imae.

Fixes: b83c9aca4a "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
2020-03-17 22:53:16 +01:00
Marek Olšák
8dc5e174c7 ac: don't set old denormals flags with LLVM >= 11
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
2020-03-17 20:47:48 +00:00
Marek Olšák
63a5051ea6 ac: set new LLVM denormal flags
See: https://reviews.llvm.org/D71358

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
2020-03-17 20:47:48 +00:00
Bas Nieuwenhuizen
b83c9aca4a amd/llvm: Fix divergent descriptor indexing. (v3)
There are multiple LLVM passes that very much move the
intrinsic using the descriptor outside of the loop, defeating
the entire point of creating the loop.

Defeat the optimizer by  splitting the break into a separate
if-statement and putting an optimization barrier on the bool
in between.

v2: Move from a callback based system to begin/end loop.
    This does not make it significantly less intrusive but
    is a bit nicer with all the extra struct and callback
    stubs.
v3: Deal with non-divergent values in divergent path.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2160
Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109>
2020-03-12 16:12:02 +00:00
Samuel Pitoiset
cc320ef9af ac/llvm: add missing optimization barrier for 64-bit readlanes
Otherwise, LLVM optimizes it but it's actually incorrect.

Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>
2020-03-12 08:46:42 +01:00
Marek Olšák
fc65df5651 ac: add a bug workaround for the 100% NGG culling case
Fixes: 8db00a51f8 - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
2020-03-09 16:08:11 -04:00
Daniel Schürmann
61fb17e8d7 amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvm
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
2020-03-09 12:29:32 +00:00
Daniel Schürmann
de57ea2a3d amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation
The current implementation uses a temporary helper variable
to ensure correct behavior until LLVM provides an intrinsic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
2020-03-09 12:29:32 +00:00
Pierre-Eric Pelloux-Prayer
771f16cf61 radeonsi: remove AMD_DEBUG=sisched option
sisched is not maintained anymore in LLVM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4059>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4059>
2020-03-06 11:35:12 +01:00
Samuel Pitoiset
9e5d2a73c5 ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens
The hardware doesn't flush denorms, exactly like fmin/fmax, so
we have to do it manually. This doesn't fix anything known.

Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
2020-02-27 08:04:33 +01:00
Samuel Pitoiset
30ac733680 ac/llvm: fix 16-bit fmed3 on GFX8 and older gens
16-bit med3 is only supported on GFX9+.

Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f16.*.

Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
2020-02-27 08:04:30 +01:00
Samuel Pitoiset
50b8c25274 ac/llvm: fix 64-bit fmed3
Lower 64-bit fmed3 because LLVM doesn't expose an intrinsic.

Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f64.*.

Fixes: d6a07732c9 ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
2020-02-27 08:04:28 +01:00
Samuel Pitoiset
6f4c300919 ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
2020-01-29 09:49:50 +00:00
Samuel Pitoiset
a31bcf2be6 ac/llvm: fix missing casts in ac_build_readlane()
Because ac_build_optimization_barrier() overwrites the original
src_type, we have to keep track of it before emitting that barrier.
Otherwise, wrong conversions are expected for pointers or small
bitsizes.

By doing this, we no longer need to do the cast dance in
ac_build_readlane_no_opt_barrier(), it was just necessary for
ac_build_optimization_barrier().

This fixes a bunch of crashes with subgroups related tests when
RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash
with The Surge 2.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395
Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>
2020-01-24 07:40:07 +01:00
Samuel Pitoiset
9e477d79b7 ac/nir: add support for nir_texop_fragment_{mask}_fetch
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
2020-01-23 10:48:02 +00:00
Marek Olšák
4e4b2d13f0 ac: add helper ac_build_triangle_strip_indices_to_triangle
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
0f45d4dc2b ac: add ac_build_readlane without optimization barrier
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
77393cf39b ac: add prefix bitcount functions
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
eeb4a11c11 ac/cull: don't read Position.Z if it's not needed for culling
It could be NULL.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:20 -05:00
Jason Ekstrand
d3737002ee nir/lower_atomics_to_ssbo: Also lower barriers
This is more correct for a pass which is supposed to completely lower
away atomic counters.  It also lets us stop supporting atomic counter
barriers in most of the drivers.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
e40b11bbcb nir: Rename nir_intrinsic_barrier to control_barrier
This is a more explicit name now that we don't want it to be doing any
memory barrier stuff for us.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00