Commit graph

875 commits

Author SHA1 Message Date
Samuel Pitoiset
75a730ced5 aco: fix register allocation for subdword instructions on GFX10
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5148>
2020-05-29 11:20:58 +00:00
Rhys Perry
01ce7887bf aco: fix 64-bit shared_atomic_exchange
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
2020-05-28 10:34:03 +00:00
Rhys Perry
1f2fd9c62e aco: don't reorder barriers in the scheduler
Unless we're reordering it around a barrier of the same type

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
2020-05-28 10:34:03 +00:00
Rhys Perry
e1900ee2c7 aco: preserve more fields when combining additions into SMEM
Totals from 11 (0.01% of 127638) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
2020-05-28 10:34:03 +00:00
Rhys Perry
95d5c1b8a1 aco: check instruction format before waiting for a previous SMEM store
Totals from 7 (0.01% of 127638) affected shaders:
CodeSize: 40336 -> 40320 (-0.04%)
Instrs: 7807 -> 7803 (-0.05%)
Cycles: 118588 -> 118344 (-0.21%); split: -0.23%, +0.02%
SMEM: 331 -> 339 (+2.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 1749953ea3 ('aco/gfx10: Wait for pending SMEM stores before loads')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4880>
2020-05-28 10:34:03 +00:00
Rhys Perry
5ccc7c277c aco: consider SDWA during value numbering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 23ac24f5b1
   ('aco: add missing conversion operations for small bitsizes')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5164>
2020-05-28 09:55:58 +00:00
Rhys Perry
8aa98cebc1 aco: fix interaction with 3f branch workaround and p_constaddr
The offset was incorrect if we inserted a nop before the p_constaddr.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5164>
2020-05-28 09:55:58 +00:00
Samuel Pitoiset
94570e87bd aco: add support for bias/lod with texture gather
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5147>
2020-05-25 08:51:10 +02:00
Samuel Pitoiset
cecd4aad46 aco: implement nir_intrinsic_shader_clock with device scope
Use s_memrealtime instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
2020-05-24 20:37:52 +02:00
Samuel Pitoiset
2d4493ee11 aco: sign-extend the input and identity for 8-bit subgroup operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
c76595aec2 aco: use a temporary SGPR for 8-bit/16-bit literal reduction identities
Otherwise, the compiler overwrites s0 which contains the exec mask.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
b3c87c52ea aco: implement 8-bit/16-bit nir_intrinsic_quad_*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
dfa62d97a0 aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation}
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
f03e56eaf0 aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
af7e2c6133 aco: validate 8-bit/16-bit VGPR operands for readfirstlane/readlane/writelane
I would expect it to just work as intended and other solutions,
like v_and_b32 to make sure the upper bits are 0, might have some
overhead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
86e2b03e3f aco: implement 8-bit/16-bit reductions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Samuel Pitoiset
cc79945b21 aco: declare 8-bit/16-bit reduce operations
The 8-bit float variants are only for consistency but are unused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4494>
2020-05-21 15:06:48 +00:00
Rhys Perry
40ed7fcc0b aco: fix typo in insert_waitcnt's kill()
No shader-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3004
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5126>
2020-05-21 13:01:41 +00:00
Daniel Schürmann
51f4b22fee aco: don't allow unaligned subdword accesses on GFX6/7
There are no SDWA instructions which means that only
full registers can be accessed.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
ae390755fe aco: fix corner case in register allocation
We mark dead operands in the register file when searching for
a register for a definition. Only do so, if this space has not
yet been taken by a different definition.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
acec00eae0 aco: don't move create_vector subdword operands to unsupported register offsets
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Daniel Schürmann
5201985332 aco: restrict copying of create_vector operands to GFX9+
This improves code size for Polaris and earlier due to less register swapping

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5070>
2020-05-21 12:07:40 +00:00
Samuel Pitoiset
1ad9a8a884 aco: fix missing break in label_instruction()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5129>
2020-05-21 09:00:02 +02:00
Samuel Pitoiset
cc1a1da8ab aco: fix off-by-one error with 16-bit MTBUF opcodes on GFX10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
1647e098e9 aco: implement 16-bit interp
For 16-bit bank LDS (ie. Kabini/Stoney) we need a slightly different
path. It's completely untested though because I don't have these
chips but according to vkpipeline-db the generated assembly seems fine.

Note that 16-bit I/O is currently only exposed on GFX9+ for both
compiler backends.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
bbbb4057e6 aco: emit v_interp_*_f16 instructions as VOP3 instead of VINTRP
This adds a separate emission path in the assembly for the 16-bit
interp instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
34f2c4dc6a aco: validate v_interp_*_f16 as VOP3 instructions instead of VINTRP
16-bit interp instructions are considered VINTRP by the compiler
but they are emitted as VOP3 by the assembler.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
3fba5bb9cc aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
7ffd394605 aco: implement 8-bit/16-bit mov's with p_create_vector
ACO doesn't lower 8-bit/16-bit mov's in NIR.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2997
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
860b4d16f4 aco: allow to load/store 16-bit values in VMEM for tess and geom
We only have to adjust some assertions to allow storing/loading
16-bit values.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
9bd3b67163 aco: convert 16-bit values before exporting MRTs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Samuel Pitoiset
462a5fe6f4 aco: store 16-bit temporary outputs as v2b
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4966>
2020-05-19 17:05:05 +00:00
Rhys Perry
bcb0038c83 aco: fix disassembly with LLVM 11
SymbolInfoTy was modified in LLVM 11. It is also in MCDisassembler.h now
and we don't have to duplicate it anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5060>
2020-05-19 14:18:26 +01:00
Rhys Perry
cdfede7336 aco: split operations that use a swap's definition
Instead of relying it's read being entirely within the swap's definition.

No shader-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4950>
2020-05-14 18:36:33 +00:00
Daniel Schürmann
66e3c74f9c aco: fix WQM coalescing
get_reg_specified() doesn't handle special registers like SCC.
Fixes: a5fc96b533 ('aco: coalesce parallelcopies during register allocation')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5036>
2020-05-14 16:30:19 +00:00
Samuel Pitoiset
aaf5706aa3 aco: add support for texturing with clamped LOD
This is a requirement for the shaderResourceMinLod feature which
allows to clamp LOD. This uses all image_sample_*_cl variants.

All dEQP-VK.glsl.texture_functions.texture*clamp.* pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>
2020-05-14 10:05:44 +00:00
Samuel Pitoiset
47a769143b aco: remove useless check for nir_tex_src_bias
I think only nir_texop_txb can have a bias operand anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989>
2020-05-14 10:05:44 +00:00
Jason Ekstrand
ca2d53f451 nir: Make "divergent" a property of an SSA value
v2: fix usage in ACO (by Daniel Schürmann)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Rhys Perry
7ce527a4fe aco: improve phi affinities with p_split_vector
Totals from 5860 (4.59% of 127638) affected shaders:
VGPRs: 460212 -> 460216 (+0.00%)
CodeSize: 65554356 -> 65464816 (-0.14%)
Instrs: 12655972 -> 12633578 (-0.18%)
Copies: 1309994 -> 1292163 (-1.36%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
2020-05-13 13:12:08 +00:00
Rhys Perry
51e797e233 aco: consider affinities when creating v_mac_f32
Totals from 8487 (6.65% of 127638) affected shaders:
CodeSize: 62061988 -> 62058020 (-0.01%); split: -0.01%, +0.01%
Instrs: 11910757 -> 11885409 (-0.21%); split: -0.21%, +0.00%
Copies: 1065244 -> 1040945 (-2.28%); split: -2.30%, +0.02%
Branches: 349665 -> 348914 (-0.21%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
2020-05-13 13:12:08 +00:00
Rhys Perry
138eed45b5 aco: mark phi definitions as last-seen phi operands
Totals from 14340 (11.23% of 127638) affected shaders:
SGPRs: 1251648 -> 1251512 (-0.01%)
VGPRs: 994556 -> 994104 (-0.05%); split: -0.06%, +0.01%
CodeSize: 122894528 -> 121099604 (-1.46%); split: -1.49%, +0.03%
MaxWaves: 106039 -> 106103 (+0.06%); split: +0.06%, -0.00%
Instrs: 23860066 -> 23414317 (-1.87%); split: -1.90%, +0.03%
Copies: 2448228 -> 2049305 (-16.29%); split: -16.37%, +0.07%
Branches: 789381 -> 757921 (-3.99%); split: -4.62%, +0.64%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
2020-05-13 13:12:08 +00:00
Rhys Perry
c1c0cf7a66 aco: fix consecutively written vgprs from vmem instructions
If one VMEM instruction uses a sampler and the other doesn't, we can't do
this optimization.

Totals from 47 (0.04% of 127638) affected shaders:
CodeSize: 271744 -> 271656 (-0.03%); split: -0.04%, +0.01%
Instrs: 52783 -> 52761 (-0.04%); split: -0.05%, +0.01%
Cycles: 5547040 -> 5546952 (-0.00%); split: -0.00%, +0.00%
VMEM: 10022 -> 9887 (-1.35%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
2020-05-13 12:26:42 +00:00
Rhys Perry
0c7bed72f7 aco: simplify consecutive ordered vmem/lds writes optimization
This was unnecessary and messed with statistics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
2020-05-13 12:26:42 +00:00
Samuel Pitoiset
4235624b6a aco: optimize add/sub(a, cndmask(b, 0, 1, cond)) -> addc/subbrev_co(0, a, b)
v2: outline into a separate function and also optimize additions (by Daniel Schürmann)

Totals from affected shaders: (VEGA)
SGPRS: 938888 -> 941496 (0.28 %)
VGPRS: 832068 -> 831532 (-0.06 %)
Spilled SGPRs: 618 -> 618 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3696 -> 3696 (0.00 %) dwords per thread
Code Size: 72893900 -> 72558928 (-0.46 %) bytes
LDS: 18201 -> 18201 (0.00 %) blocks
Max Waves: 64256 -> 64268 (0.02 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4419>
2020-05-12 16:15:17 +00:00
Daniel Schürmann
a5fc96b533 aco: coalesce parallelcopies during register allocation
These are the result of lowering to CSSA, and should be removed if possible

Totals from affected shaders: (VEGA)
SGPRS: 544544 -> 544544 (0.00 %)
VGPRS: 418224 -> 418224 (0.00 %)
Spilled SGPRs: 141826 -> 141826 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 65853740 -> 64703380 (-1.75 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13669 -> 13669 (0.00 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4952>
2020-05-12 15:59:31 +00:00
Samuel Pitoiset
266978f7ca aco: prevent invalid loads/stores vectorization if robustness is enabled
Only UBO, SSBO, global and push constants accesses should matter.

This fixes a bunch of new robustness2 failures. Note that RADV/LLVM
isn't affected because it relies on LLVM for loads/stores
vectorization and LLVM doesn't vectorize in this situation as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>
2020-05-11 07:25:16 +00:00
Samuel Pitoiset
04718a9cd6 nir: do not vectorize load/store if offset can overflow and robustness enabled
This prevents vectorization for loads/stores that can overflow if
the low offset is negative and the range greater or equal than 0.

The caller can pass the list of variable modes that matter for
robust access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>
2020-05-11 07:25:15 +00:00
Samuel Pitoiset
3fba0a7a6f aco: fix 64-bit trunc with negative exponents on GFX6
v_frexp_exp returns the exponent as an unsigned value.

Also, v_ashr returns either 0 or -1 depending on the sign of the
source operand, but what we want is only the sign bit.

Fixes a bunch of recent dEQP-VK.glsl.builtin.precision_double.* tests.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4921>
2020-05-11 08:31:23 +02:00
Daniel Schürmann
37e89e3027 aco: either copy-propagate or inline create_vector operands
Don't do both at the same time as it breaks DCE

Fixes: 2dc550202e ('aco: copy-propagate p_create_vector copies of vectors')
Fixes: dEQP-VK.glsl.builtin.precision_double.ldexp.compute.scalar on GFX6-GFX7

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4922>
2020-05-07 20:40:41 +00:00
Samuel Pitoiset
90d9f9a37e aco: remove unecessary p_split_vector with v2b reg class
Should be fine now that RA take full registers for v2b if it's
not an SDWA instruction.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4879>
2020-05-05 08:50:10 +02:00