Commit graph

1110 commits

Author SHA1 Message Date
Samuel Pitoiset
ff11c9dcc7 ac: add f16_0 and f16_1 constants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:05 +01:00
Rhys Perry
3cc72a88d8 ac/nir: implement 8-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:25 +01:00
Rhys Perry
c73f8b6576 ac/nir: add 8-bit types to glsl_base_to_llvm_type
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:22 +01:00
Rhys Perry
9c5067acf1 ac/nir: implement 8-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:20 +01:00
Samuel Pitoiset
b235d77e18 ac: add ac_build_tbuffer_store_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:18 +01:00
Rhys Perry
b12e074b89 ac/nir: implement 8-bit push constant, ssbo and ubo loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:16 +01:00
Samuel Pitoiset
104dbc64a5 ac: add ac_build_tbuffer_load_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:14 +01:00
Samuel Pitoiset
6e632eb24b ac: add various int8 definitions
Original patch by Rhys Perry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:10 +01:00
Samuel Pitoiset
72e366b4c2 ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()
New buffer intrinsics have a separate soffset parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:19 +01:00
Samuel Pitoiset
9d960c17a8 ac: use new LLVM 8 intrinsic when storing 16-bit values
vindex is always 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:14 +01:00
Samuel Pitoiset
2a9d331898 ac: add ac_build_{struct,raw}_tbuffer_store() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:12 +01:00
Samuel Pitoiset
30c2aca67f ac: use new LLVM 8 intrinsics in ac_build_buffer_load()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:08 +01:00
Samuel Pitoiset
da46dbb1be ac/nir: use ac_build_buffer_store_dword() for SSBO store operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:06 +01:00
Samuel Pitoiset
6b573c00c9 ac/nir: use ac_build_buffer_load() for SSBO load operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:19:02 +01:00
Samuel Pitoiset
29132af234 ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations
Use the raw version (ie. IDXEN=0) because vindex is unused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:56 +01:00
Samuel Pitoiset
b39844457f ac/nir: remove one useless check in visit_store_ssbo()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:54 +01:00
Samuel Pitoiset
a2073f49f1 ac: add ac_build_buffer_store_format() helper
Similar to ac_build_buffer_load_format().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:50 +01:00
Samuel Pitoiset
4debe49d44 ac/nir: set attrib flags for SSBO and image store operations
For consistency regarding other store operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:37 +01:00
Samuel Pitoiset
1b553dd47f ac: make use of ac_get_store_intr_attribs() where possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 22:18:35 +01:00
Samuel Pitoiset
f4f0e3a395 ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract
Noticed with a Doom shader.

29077 shaders in 15096 tests
Totals:
SGPRS: 1282125 -> 1282133 (0.00 %)
VGPRS: 908716 -> 908616 (-0.01 %)
Spilled SGPRs: 24811 -> 24779 (-0.13 %)
Code Size: 49048176 -> 48936488 (-0.23 %) bytes
Max Waves: 244232 -> 244226 (-0.00 %)

Totals from affected shaders:
SGPRS: 229584 -> 229592 (0.00 %)
VGPRS: 163268 -> 163168 (-0.06 %)
Spilled SGPRs: 8682 -> 8650 (-0.37 %)
Code Size: 12819572 -> 12707884 (-0.87 %) bytes
Max Waves: 24398 -> 24392 (-0.02 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 09:06:35 +01:00
Timothy Arceri
010570c8e3 ac/nir_to_llvm: add assert to emit_bcsel()
nir to llvm assumes we have already split vectors to scalars via
nir_lower_alu_to_scalar().

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-18 09:39:04 +11:00
Samuel Pitoiset
cbf022cb31 ac: use the raw tbuffer version for 16-bit SSBO loads
vindex is always 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 14:16:14 +01:00
Samuel Pitoiset
045fae0f73 ac: add ac_build_{struct,raw}_tbuffer_load() helpers
The struct version sets IDXEN=1, while the raw version sets IDXEN=0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 14:15:05 +01:00
Samuel Pitoiset
489dac0d21 ac: rework typed buffers loads for LLVM 7
Be more generic, this will be used by an upcoming series.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 13:31:06 +01:00
Rhys Perry
0f025bbccc ac/nir: fix 16-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-03-12 15:51:52 +01:00
Timothy Arceri
54522d0506 nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()
Replace done using:
find ./src -type f -exec sed -i -- \
's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Timothy Arceri
8294295dbd glsl: rename record_location_offset() -> struct_location_offset()
Replace done using:
find ./src -type f -exec sed -i -- \
's/record_location_offset(/struct_location_offset(/g' {} \;

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 13:10:02 +11:00
Bas Nieuwenhuizen
a1fdd4a4a7 radv: Fix float16 interpolation set up.
float16 types can have non-flat interpolation so set up the HW
correctly for that.

Fixes: 62024fa775 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-22 17:06:55 +01:00
Bas Nieuwenhuizen
1ef2855692 radv: Handle clip+cull distances more generally as compact arrays.
Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

That MR keeps the clip and cull arrays split.

So we have to handle
 - compact arrays with location_frac != 0
 - VARYING_SLOT_CLIP_DIST1

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-20 22:49:52 +00:00
Kenneth Graunke
ba7519ca36 radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow
ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
leaves it undefined).  Performing fpow lowering in NIR would break this
behavior, preventing us from using prog_to_nir.

According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>,
which presumably does a zero-wins multiply.

Lowering in NIR results in a non-legacy multiply, where:

   pow(0, 0) = 2^(log2(0) * 0)
             = 2^(-INF * 0)
             = 2^(-NaN)
             = -NaN

which isn't the desired result.

This reverts:
- commit d6b7539206
  (ac/nir: remove emission of nir_op_fpow)
- commit 22430224fe
  (radeonsi/nir: enable lowering of fpow)

and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
after enabling prog_to_nir in st/mesa later in this series.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-02-19 15:56:19 -08:00
Rhys Perry
238730daef ac/nir: implement half-float nir_op_ldexp
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:46 +00:00
Rhys Perry
6971e8d342 ac/nir: implement half-float nir_op_frsq
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:41 +00:00
Rhys Perry
2038aec22a ac/nir: implement half-float nir_op_frcp
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:35 +00:00
Rhys Perry
4261edc067 ac/nir: make ac_build_fdiv support 16-bit floats
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:29 +00:00
Rhys Perry
6790b3a8db ac/nir: make ac_build_isign work on all bit sizes
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:04:20 +00:00
Rhys Perry
bbbfdef683 ac/nir: make ac_build_clamp work on all bit sizes
v2: don't use ac_get_zerof() and ac_get_onef()
v3: rename "intr" to "name"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:58 +00:00
Rhys Perry
7e5004e30a ac/nir: fix 64-bit nir_op_f2f16_rtz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:44 +00:00
Rhys Perry
c4ea20c0a0 ac/nir: implement 8-bit nir_load_const_instr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-19 11:03:33 +00:00
Samuel Pitoiset
2cf5433b99 ac: use new LLVM 8 intrinsic when loading 16-bit values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:20 +01:00
Samuel Pitoiset
f0223143a8 ac: add ac_build_llvm8_tbuffer_load() helper
It uses the new LLVM intrinsics.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:17 +01:00
Samuel Pitoiset
2154fac6f3 ac: make use of ac_build_expand_to_vec4() in visit_image_store()
And make ac_build_expand() a static function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-14 09:09:48 +01:00
Samuel Pitoiset
bd1186572f radv: add support for push constants inlining when possible
This removes some scalar loads from shaders, but it increases
the number of SET_SH_REG packets. This is currently basic but
it could be improved if needed. Inlining dynamic offsets might
also help.

Original idea from Dave Airlie.

29077 shaders in 15096 tests
Totals:
SGPRS: 1321325 -> 1357101 (2.71 %)
VGPRS: 936000 -> 932576 (-0.37 %)
Spilled SGPRs: 24804 -> 24791 (-0.05 %)
Code Size: 49827960 -> 49642232 (-0.37 %) bytes
Max Waves: 242007 -> 242700 (0.29 %)

Totals from affected shaders:
SGPRS: 290989 -> 326765 (12.29 %)
VGPRS: 244680 -> 241256 (-1.40 %)
Spilled SGPRs: 1442 -> 1429 (-0.90 %)
Code Size: 8126688 -> 7940960 (-2.29 %) bytes
Max Waves: 80952 -> 81645 (0.86 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:54 +01:00
Bas Nieuwenhuizen
8a15950211 amd/common: Implement global memory accesses.
Needed for VK_EXT_buffer_device_address.

The pointers are implmemented as i8*, since I could not figure
out how to emulate setting struct offsets in LLVM based on the
SPIR-V offsets (and more weird stuff like row major matrices).

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:11 +01:00
Bas Nieuwenhuizen
5703ecf651 amd/common: Do not use 32-bit loads for shared memory.
We use a straight glsl->llvm type conversion so types should already be right.

Also even though the writemasks were changed we we not actually doing 32-bit
things, so this fails miserably.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:06 +01:00
Bas Nieuwenhuizen
8d1718590b amd/common: handle nir_deref_cast for shared memory from integers.
Can happen e.g. after a phi.

Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:36:02 +01:00
Bas Nieuwenhuizen
830fd0efc1 amd/common: Handle nir_deref_type_ptr_as_array for shared memory.
Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:58 +01:00
Bas Nieuwenhuizen
dbdb44d575 amd/common: Fix stores to derefs with unknown variable.
Fixes: a2b5cc3c39 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:54 +01:00
Bas Nieuwenhuizen
3c24fc64c7 amd/common: Use correct writemask for shared memory stores.
The check was for 1 bit being set, which is clearly not what we want.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:49 +01:00
Bas Nieuwenhuizen
58c8dadd32 amd/common: Implement ptr->int casts in ac_to_integer.
For the implicit casts inherent in nir.

This should probably have been done for shared memory for
VK_KHR_variable_pointers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:40 +01:00
Bas Nieuwenhuizen
e00d9a9a72 amd/common: Add gep helper for pointer increment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-06 22:35:36 +01:00