Rhys Perry
c7bd69b3ae
aco: refactor visit_store_ssbo() to use new helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
f75c830433
aco: refactor store_vmem_mubuf() to use new helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
98b4cc7110
aco: refactor store_lds() to use new helpers
...
It should also work correctly for 8/16-bit stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
562353e1f1
aco: add helpers for splitting stores
...
split_store_data() splits a vector and p_as_uniforms it if needed.
scan_write_mask()/advance_write_mask() are similar to
u_bit_scan_consecutive_range(), but makes it easier to only clear part of
the range and will also give ranges for zero'd bits.
split_buffer_store() is a helper for splitting VMEM/SMEM stores.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
211a9f2057
aco: use emit_load helper for VMEM/SMEM loads
...
Also implements 8/16-bit loads for scratch/global.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
57e6886f98
aco: refactor load_lds to use new helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
542733dbbf
aco: add emit_load helper
...
This helper is used for recombining split loads, passing the result to
p_as_uniform, aligning the offset down and shifting it right if needed and
handling large constant offsets.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Rhys Perry
69b92db131
aco: be more careful about using SMEM for load_global
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639 >
2020-04-24 18:52:54 +00:00
Timur Kristóf
62ff2ff808
aco: Move s_setprio to correct place after the gs_alloc_req.
...
Previously the setprio was inside the branch, so it would only reset
the priority on the first wave, but not the others.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536 >
2020-04-24 17:58:57 +00:00
Timur Kristóf
277f37d036
aco: Use 24-bit multiplication for NGG wave id and thread id.
...
Both of them should always fit 24 bits anyway.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536 >
2020-04-24 17:58:57 +00:00
Timur Kristóf
eafc1e7365
aco: Use 24-bit multiplication in TCS I/O
...
The TCS inputs and outputs must always fit into the LDS,
which implies that their addresses also always fit 24 bits.
On AMD GPUs, 24-bit multiplication is much faster than 32-bit
multiplication, so we can take the opportunity to use that
for TCS I/O instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536 >
2020-04-24 17:58:57 +00:00
Timur Kristóf
25775d346c
aco: Only store TCS outputs to VMEM when they are read by TES.
...
Totals from affected shaders (GFX10):
Code Size: 10832 -> 10736 (-0.89 %) bytes
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536 >
2020-04-24 17:58:57 +00:00
Rhys Perry
0d9fe0405f
aco: improve code for 32-bit isign
...
No shader-db changes on Navi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667 >
2020-04-23 12:39:33 +00:00
Rhys Perry
607fb4153d
aco: move call to store_output_to_temps in store_ls_or_es_output earlier
...
Skips get_intrinsic_io_basic_offset()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667 >
2020-04-23 12:39:33 +00:00
Rhys Perry
b497b774a5
aco: remove copy in load_input_from_temps()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667 >
2020-04-23 12:39:33 +00:00
Daniel Schürmann
c3c1f4d6bc
aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel
...
Is simpler and helps a couple of shaders.
Totals from affected shaders: (Vega)
Code Size: 16341296 -> 16335460 (-0.04 %) bytes
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642 >
2020-04-20 15:12:50 +00:00
Daniel Schürmann
be0bb7e101
aco: fix 64bit fsub
...
Fixes: 425558bfd5 ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642 >
2020-04-20 15:12:50 +00:00
Daniel Schürmann
425558bfd5
aco: use v_subrev_f32 for fsub with an sgpr operand in src1
...
This fixes an accidentally introduced regression.
Fixes: 9be4be515f ('aco: implement 16-bit nir_op_fsub/nir_op_fadd')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633 >
2020-04-19 16:16:27 +00:00
Samuel Pitoiset
c4ca9e66dd
aco: fix exporting the viewport index if the fragment shader needs it
...
It's like the layer, it has to be exported via the pos and also
as a varying if the fragment shader reads it.
Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_*
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564 >
2020-04-17 16:23:24 +00:00
Rhys Perry
839c886b34
aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y
...
The ISA doc is inconsistent whether this instruction writes SCC. It does.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552 >
2020-04-16 17:04:53 +01:00
Rhys Perry
ac74367bef
aco: implement various 8/16-bit conversions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552 >
2020-04-16 17:04:45 +01:00
Samuel Pitoiset
11faaf646d
aco: fix emitting stream output with tess eval shaders
...
Fixes dEQP-VK.transform_feedback.simple.winding_patch_list_12.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4553 >
2020-04-16 07:57:39 +00:00
Samuel Pitoiset
91aa596ca7
aco: implement nir_op_f2i8/nir_op_f2u8
...
I think we should really refactor the conversions path.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4551 >
2020-04-16 08:47:49 +02:00
Rhys Perry
c818b5c089
aco: fix 1D textureGrad() on GFX9
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 6f718edced ('aco: simplify gathering of MIMG address components')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550 >
2020-04-15 10:45:07 +00:00
Samuel Pitoiset
08a396033b
aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
...
v_frexp_exp_i16_f16 returns the two's complement for negative
exponents. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 for the exponent (-1 in decimal).
RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in long term but for now RA doesn't support changing non-SDWA
instructions to SDWA if useful/needed.
Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546 >
2020-04-15 10:12:44 +02:00
Samuel Pitoiset
fc1068de0d
aco: fix nir_op_pack_32_2x16_split if one operand is a constant
...
Because 16-bit constants are represented with the s1 RegClass, we
have to extract the low half.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509 >
2020-04-13 11:51:17 +00:00
Samuel Pitoiset
4cfaef68d7
aco: implement 16-bit nir_op_f2i64/nir_op_f2u64
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509 >
2020-04-13 11:51:17 +00:00
Samuel Pitoiset
729bdc0d70
aco: fix f2i64/f2u64 with sgprs if the exponent computation overflow
...
This fixes f16->{i64,u64} conversions for +0/-0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509 >
2020-04-13 11:51:17 +00:00
Daniel Schürmann
1de18708cb
aco: ensure correct bit representation of subdword constants
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492 >
2020-04-10 07:19:27 +00:00
Samuel Pitoiset
67b567d0d0
aco: implement nir_op_b2f16/nir_op_i2f16/nir_op_u2f16
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
3119f978e5
aco: implement 16-bit comparisons
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
ccf8e23f59
aco: implement 16-bit nir_op_fmax3/nir_op_fmin3/nir_op_fmed3
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
981ced07a5
aco: implement 16-bit nir_op_ldexp
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
55537ed9d3
aco: implement 16-bit nir_op_f2i32/nir_op_f2u32
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
68339ff7a7
aco: implement 16-bit nir_op_bcsel
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
0646562a17
aco: implement 16-bit nir_op_fsign
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
6793ae1c5e
aco: implement 16-bit nir_op_fsat
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
0ecca65d11
aco: implement 16-bit nir_op_fmul
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
b0c60999bc
aco: implement 16-bit nir_op_fcos/nir_op_fsin
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
9be4be515f
aco: implement 16-bit nir_op_fsub/nir_op_fadd
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
b0b637ca17
aco: implement 16-bit nir_op_fabs/nir_op_fneg
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
acc5912786
aco: implement 16-bit nir_op_fmax/nir_op_fmin
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
66d5bfb09a
aco: implement 16-bit nir_op_ffloor/nir_op_fceil
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
c097c9f20c
aco: implement 16-bit nir_op_fsqrt/nir_op_frcp/nir_op_frsq
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
26ed9fb79e
aco: implement 16-bit nir_op_ftrunc/nir_op_fround_even
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
ee96181ad9
aco: implement 16-bit nir_op_fexp2/nir_op_flog2
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:05 +02:00
Samuel Pitoiset
b8486041df
aco: implement 16-bit nir_op_ffract
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:04 +02:00
Samuel Pitoiset
a8b45d7034
aco: implement 16-bit nir_op_frexp_sig/nir_op_frexp_exp
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4452 >
2020-04-10 08:05:04 +02:00
Timur Kristóf
64225c4f96
aco/ngg: Run GS_ALLOC_REQ on priority 3 for NGG VS and TES.
...
It is recommended to do this as quickly as possible.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576 >
2020-04-07 11:29:35 +00:00
Timur Kristóf
c633edad72
aco/ngg: Implement NGG VS and TES.
...
When NGG is used, vertex and tess eval shaders are executed on the
hardware NGG geometry stage. There is a series of steps they
must perform:
* Request GS space using GS_ALLOC_REQ
* Export the primitive
* Finally, export the normal VS outputs
In this commit, two modes are implemented:
* "late" which matches what the RADV LLVM backend currently does
* "early" which is an optimized version as seen in radeonsi
Vulkan doesn't allow the shader to write the edge flags, so we can
currently always use the "early" mode.
Exporting the primitive ID is also supported by having the GS threads
write that into LDS and reading them from LDS in the ES threads.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3576 >
2020-04-07 11:29:35 +00:00