Commit graph

289 commits

Author SHA1 Message Date
Rhys Perry
1b6a319c15 aco: add and set precise flag
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry
a8f800a836 aco: use p_as_uniform in emit_vop1_instruction
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry
b6d9e45f47 aco: improve code for f2{i,u}{8,16}
Use sub-dword definitions so that the RA can use SDWA

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Daniel Schürmann
1f98d8c804 aco: fix shared subdword loads
Shared subdword loads don't need byte alignment as they are split
into multiple loads if necessary.

Fixes: 5cde4989d3 ('aco: remove unnecessary split- and create_vector instructions for subdword loads')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5441>
2020-06-12 13:56:12 +00:00
Samuel Pitoiset
7b44f549b3 aco: implement radv_enable_mrt_output_nan_fixup workaround
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>
2020-06-12 14:43:57 +02:00
Rhys Perry
56345b8c61 aco: allow reading/writing upper halves/bytes when possible
Use SDWA, opsel or a different opcode to achieve this.

shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
2020-06-10 15:05:11 +00:00
Rhys Perry
98060ba0f0 aco: p_extract_vector in 64-bit u2f16/i2f16
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
2020-06-10 15:05:11 +00:00
Daniel Schürmann
5cde4989d3 aco: remove unnecessary split- and create_vector instructions for subdword loads
This helps GFX6/7 by removing unnecessary shuffle code.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
5446e3cf2e aco: fix alignment of vectors with 4 elements
I think this case was just missing.

This fixes a bunch of 16-bit storage related CTS failures like
dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
c7bd0f8cd5 aco: implement 8-bit/16-bit conversions on GFX6-GFX7
Use v_bfe to implement small bitsize conversions because the
compiler probably optimizes this better.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
6391f9ab4c aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>
2020-06-05 16:04:06 +02:00
Samuel Pitoiset
a521c67d22 aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>
2020-06-03 19:48:40 +02:00
Timur Kristóf
045c9ffa7d aco: Implement subgroup shuffle on GFX6-7.
GFX6 and GFX7 don't have the ds_bpermute (or permute) instruction,
but we would like to support subgroup shuffle on these old GPUs.

So we introduce a new pseudio instruction which will be lowered
to an "unrolled loop" that emulates bpermute on GFX6 and GFX7
using readlane instructions, while also respecting the exec mask
thanks to v_cmpx.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>
2020-06-02 21:12:12 +00:00
Timur Kristóf
14a5021aff aco/gfx10: Refactor of GFX10 wave64 bpermute.
The emulated GFX10 wave64 bpermute no longer needs a linear_vgpr,
so we don't consider it a reduction anymore. Additionally, the
code is slightly reorganized in preparation for the GFX6 emulated
bpermute.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>
2020-06-02 21:12:12 +00:00
Rhys Perry
01ce7887bf aco: fix 64-bit shared_atomic_exchange
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
2020-05-28 10:34:03 +00:00
Samuel Pitoiset
94570e87bd aco: add support for bias/lod with texture gather
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5147>
2020-05-25 08:51:10 +02:00
Samuel Pitoiset
cecd4aad46 aco: implement nir_intrinsic_shader_clock with device scope
Use s_memrealtime instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
2020-05-24 20:37:52 +02:00
Samuel Pitoiset
b3c87c52ea aco: implement 8-bit/16-bit nir_intrinsic_quad_*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
dfa62d97a0 aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
f03e56eaf0 aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
86e2b03e3f aco: implement 8-bit/16-bit reductions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
1647e098e9 aco: implement 16-bit interp
For 16-bit bank LDS (ie. Kabini/Stoney) we need a slightly different
path. It's completely untested though because I don't have these
chips but according to vkpipeline-db the generated assembly seems fine.

Note that 16-bit I/O is currently only exposed on GFX9+ for both
compiler backends.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
3fba5bb9cc aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
7ffd394605 aco: implement 8-bit/16-bit mov's with p_create_vector
ACO doesn't lower 8-bit/16-bit mov's in NIR.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2997
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
860b4d16f4 aco: allow to load/store 16-bit values in VMEM for tess and geom
We only have to adjust some assertions to allow storing/loading
16-bit values.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
9bd3b67163 aco: convert 16-bit values before exporting MRTs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
462a5fe6f4 aco: store 16-bit temporary outputs as v2b
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
aaf5706aa3 aco: add support for texturing with clamped LOD
This is a requirement for the shaderResourceMinLod feature which
allows to clamp LOD. This uses all image_sample_*_cl variants.

All dEQP-VK.glsl.texture_functions.texture*clamp.* pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>
2020-05-14 10:05:44 +00:00
Samuel Pitoiset
47a769143b aco: remove useless check for nir_tex_src_bias
I think only nir_texop_txb can have a bias operand anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>
2020-05-14 10:05:44 +00:00
Jason Ekstrand
ca2d53f451 nir: Make "divergent" a property of an SSA value
v2: fix usage in ACO (by Daniel Schürmann)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Samuel Pitoiset
3fba0a7a6f aco: fix 64-bit trunc with negative exponents on GFX6
v_frexp_exp returns the exponent as an unsigned value.

Also, v_ashr returns either 0 or -1 depending on the sign of the
source operand, but what we want is only the sign bit.

Fixes a bunch of recent dEQP-VK.glsl.builtin.precision_double.* tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4921>
2020-05-11 08:31:23 +02:00
Samuel Pitoiset
90d9f9a37e aco: remove unecessary p_split_vector with v2b reg class
Should be fine now that RA take full registers for v2b if it's
not an SDWA instruction.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4879>
2020-05-05 08:50:10 +02:00
Timur Kristóf
fdbb296853 aco: Remember VS/TCS output driver locations.
Instead of relying on calling shader_io_get_unique_index repeatedly,
remember the which output driver location corresponds to which
varying slot.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>
2020-04-29 11:51:04 +00:00
Timur Kristóf
ab07c4ea70 aco: Use context variables instead of calculating TCS inputs/outputs.
VS needs the number of TCS inputs, and TES needs the number of TCS
outputs.

It is error-prone to repeat those calculations in both instruction
selection and setup. Just set them in one place instead.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>
2020-04-29 11:51:04 +00:00
Rhys Perry
9392ddab43 aco: consider blocks unreachable if they are in the logical cfg
unreachable was true if the last block is unreachable in the linear cfg,
but it should also be true if it is unreachable in the logical cfg.

Fixes dEQP-VK.graphicsfuzz.for-with-ifs-and-return

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 8d8c864beb
    ('aco: improve check for unreachable loop continue blocks')

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4764>
2020-04-29 11:07:09 +00:00
Samuel Pitoiset
60cc065c7d aco: fix adjusting the sample index with FMASK if value is negative
The SPIR-V spec doesn't say explicitly that the sample index
must be an unsigned integer.

This fixes crashes with some new VK_EXT_robustness2 tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4775>
2020-04-29 07:29:54 +00:00
Samuel Pitoiset
a112ec4c11 aco: fix nir_texop_texture_samples with NULL descriptors
With VK_EXT_robustness2, descriptors can be NULL and the number of
samples returned by nir_texop_texture_samples should be 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4775>
2020-04-29 07:29:54 +00:00
Rhys Perry
3ee3ad561a aco: fix vgpr nir_op_vecn with sgpr operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>
2020-04-28 23:16:55 +00:00
Rhys Perry
bcd9467d5c aco: improve sub-dword emit_split_vector() with sgprs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
a3dc1441f0 aco: clobber scc in s_bfe_u32 in get_alu_src()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
7db7206631 aco: allow 8/16-bit shared loads
These should work now

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
48b7beb7b0 aco: add and use get_buffer_store_op() helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
936b70c8cf aco: refactor visit_store_scratch() to use new helpers
Should support 8/16-bit stores now

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
18817041f7 aco: refactor visit_store_global() to use new helpers
Should support 8/16-bit stores now

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
c7bd69b3ae aco: refactor visit_store_ssbo() to use new helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
f75c830433 aco: refactor store_vmem_mubuf() to use new helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
98b4cc7110 aco: refactor store_lds() to use new helpers
It should also work correctly for 8/16-bit stores

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
562353e1f1 aco: add helpers for splitting stores
split_store_data() splits a vector and p_as_uniforms it if needed.

scan_write_mask()/advance_write_mask() are similar to
u_bit_scan_consecutive_range(), but makes it easier to only clear part of
the range and will also give ranges for zero'd bits.

split_buffer_store() is a helper for splitting VMEM/SMEM stores.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
211a9f2057 aco: use emit_load helper for VMEM/SMEM loads
Also implements 8/16-bit loads for scratch/global.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00
Rhys Perry
57e6886f98 aco: refactor load_lds to use new helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
2020-04-24 18:52:54 +00:00