Timur Kristóf
eafc1e7365
aco: Use 24-bit multiplication in TCS I/O
...
The TCS inputs and outputs must always fit into the LDS,
which implies that their addresses also always fit 24 bits.
On AMD GPUs, 24-bit multiplication is much faster than 32-bit
multiplication, so we can take the opportunity to use that
for TCS I/O instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536 >
2020-04-24 17:58:57 +00:00
Timur Kristóf
25775d346c
aco: Only store TCS outputs to VMEM when they are read by TES.
...
Totals from affected shaders (GFX10):
Code Size: 10832 -> 10736 (-0.89 %) bytes
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536 >
2020-04-24 17:58:57 +00:00
Rhys Perry
0d9fe0405f
aco: improve code for 32-bit isign
...
No shader-db changes on Navi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667 >
2020-04-23 12:39:33 +00:00
Rhys Perry
607fb4153d
aco: move call to store_output_to_temps in store_ls_or_es_output earlier
...
Skips get_intrinsic_io_basic_offset()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667 >
2020-04-23 12:39:33 +00:00
Rhys Perry
b497b774a5
aco: remove copy in load_input_from_temps()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667 >
2020-04-23 12:39:33 +00:00
Daniel Schürmann
c3c1f4d6bc
aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel
...
Is simpler and helps a couple of shaders.
Totals from affected shaders: (Vega)
Code Size: 16341296 -> 16335460 (-0.04 %) bytes
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642 >
2020-04-20 15:12:50 +00:00
Daniel Schürmann
be0bb7e101
aco: fix 64bit fsub
...
Fixes: 425558bfd5 ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642 >
2020-04-20 15:12:50 +00:00
Daniel Schürmann
425558bfd5
aco: use v_subrev_f32 for fsub with an sgpr operand in src1
...
This fixes an accidentally introduced regression.
Fixes: 9be4be515f ('aco: implement 16-bit nir_op_fsub/nir_op_fadd')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633 >
2020-04-19 16:16:27 +00:00
Samuel Pitoiset
c4ca9e66dd
aco: fix exporting the viewport index if the fragment shader needs it
...
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.
Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564 >
2020-04-17 16:23:24 +00:00
Rhys Perry
839c886b34
aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y
...
The ISA doc is inconsistent whether this instruction writes SCC. It does.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552 >
2020-04-16 17:04:53 +01:00
Rhys Perry
ac74367bef
aco: implement various 8/16-bit conversions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552 >
2020-04-16 17:04:45 +01:00
Samuel Pitoiset
11faaf646d
aco: fix emitting stream output with tess eval shaders
...
Fixes dEQP-VK.transform_feedback.simple.winding_patch_list_12.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4553 >
2020-04-16 07:57:39 +00:00
Samuel Pitoiset
91aa596ca7
aco: implement nir_op_f2i8/nir_op_f2u8
...
I think we should really refactor the conversions path.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4551 >
2020-04-16 08:47:49 +02:00
Rhys Perry
c818b5c089
aco: fix 1D textureGrad() on GFX9
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f718edced ('aco: simplify gathering of MIMG address components')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550 >
2020-04-15 10:45:07 +00:00
Samuel Pitoiset
08a396033b
aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
...
v_frexp_exp_i16_f16 returns the two's complement for negative
exponents. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 for the exponent (-1 in decimal).
RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in long term but for now RA doesn't support changing non-SDWA
instructions to SDWA if useful/needed.
Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546 >
2020-04-15 10:12:44 +02:00
Samuel Pitoiset
fc1068de0d
aco: fix nir_op_pack_32_2x16_split if one operand is a constant
...
Because 16-bit constants are represented with the s1 RegClass, we
have to extract the low half.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509 >
2020-04-13 11:51:17 +00:00
Samuel Pitoiset
4cfaef68d7
aco: implement 16-bit nir_op_f2i64/nir_op_f2u64
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509 >
2020-04-13 11:51:17 +00:00
Samuel Pitoiset
729bdc0d70
aco: fix f2i64/f2u64 with sgprs if the exponent computation overflow
...
This fixes f16->{i64,u64} conversions for +0/-0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509 >
2020-04-13 11:51:17 +00:00
Daniel Schürmann
1de18708cb
aco: ensure correct bit representation of subdword constants
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492 >
2020-04-10 07:19:27 +00:00
Samuel Pitoiset
67b567d0d0
aco: implement nir_op_b2f16/nir_op_i2f16/nir_op_u2f16
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
3119f978e5
aco: implement 16-bit comparisons
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
ccf8e23f59
aco: implement 16-bit nir_op_fmax3/nir_op_fmin3/nir_op_fmed3
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
981ced07a5
aco: implement 16-bit nir_op_ldexp
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
55537ed9d3
aco: implement 16-bit nir_op_f2i32/nir_op_f2u32
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
68339ff7a7
aco: implement 16-bit nir_op_bcsel
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
0646562a17
aco: implement 16-bit nir_op_fsign
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
6793ae1c5e
aco: implement 16-bit nir_op_fsat
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
0ecca65d11
aco: implement 16-bit nir_op_fmul
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
b0c60999bc
aco: implement 16-bit nir_op_fcos/nir_op_fsin
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
9be4be515f
aco: implement 16-bit nir_op_fsub/nir_op_fadd
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
b0b637ca17
aco: implement 16-bit nir_op_fabs/nir_op_fneg
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
acc5912786
aco: implement 16-bit nir_op_fmax/nir_op_fmin
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
66d5bfb09a
aco: implement 16-bit nir_op_ffloor/nir_op_fceil
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
c097c9f20c
aco: implement 16-bit nir_op_fsqrt/nir_op_frcp/nir_op_frsq
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
26ed9fb79e
aco: implement 16-bit nir_op_ftrunc/nir_op_fround_even
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
ee96181ad9
aco: implement 16-bit nir_op_fexp2/nir_op_flog2
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
b8486041df
aco: implement 16-bit nir_op_ffract
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:04 +02:00
Samuel Pitoiset
a8b45d7034
aco: implement 16-bit nir_op_frexp_sig/nir_op_frexp_exp
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:04 +02:00
Timur Kristóf
64225c4f96
aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.
...
It is recommended to do this as quickly as possible.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576 >
2020-04-07 11:29:35 +00:00
Timur Kristóf
c633edad72
aco/ngg: Implement NGG VS and TES.
...
When NGG is used, vertex and tess eval shaders are executed on the
hardware NGG geometry stage. There is a series of steps they
must perform:
* Request GS space using GS_ALLOC_REQ
* Export the primitive
* Finally, export the normal VS outputs
In this commit, two modes are implemented:
* "late" which matches what the RADV LLVM backend currently does
* "early" which is an optimized version as seen in radeonsi
Vulkan doesn't allow the shader to write the edge flags, so we can
currently always use the "early" mode.
Exporting the primitive ID is also supported by having the GS threads
write that into LDS and reading them from LDS in the ES threads.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576 >
2020-04-07 11:29:35 +00:00
Timur Kristóf
d345bfe195
aco: Extract merged_wave_info_to_mask to its own function.
...
Currently we only use this at the beginning of merged shader parts,
but we are going to need to use it with some NGG code as well.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576 >
2020-04-07 11:29:35 +00:00
Timur Kristóf
b9cbdb6a45
aco: Extract uniform if handling to separate functions.
...
Currently we only use this for uniform ifs that come from NIR,
but we are going to need to use it with some NGG parts as well.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576 >
2020-04-07 11:29:35 +00:00
Rhys Perry
20a4b1461b
aco: zero-initialize Temp
...
Fixes dEQP-VK.transform_feedback.* crashes from accesses garbage
temporaries in emit_extract_vector().
Fixes: 85521061 ("aco: prepare helper functions for subdword handling")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463 >
2020-04-06 19:15:19 +00:00
Daniel Schürmann
1d293096d0
aco: use MUBUF to load subdword SSBO
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
8cfddc9199
aco: implement 8bit/16bit store_ssbo
...
Currently without alignment check, so that
we can only use the _byte and _short versions
and multi-component stores are split.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
3df0a41c75
aco: implement 8bit/16bit load_buffer
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
c70d014455
aco: implement storagePushConstant8 & storagePushConstant16
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
5718347c2b
aco: implement vec2/3/4 with subdword operands
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
85521061d6
aco: prepare helper functions for subdword handling
...
- get_alu_src()
- emit_extract_vector()
- emit_split_vector()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
fe08f0ccf9
aco: add byte_align_scalar() & trim_subdword_vector() helper functions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00