nir/algebraic: Optimize some extract forms resulting from 8-bit lowering

This eliminates some spurious size-converting moves.  For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag
(before, then after):

SIMD8 shader: 56 instructions. 1 loops. 4444 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends

v2: Condition two of the patterns on !options->lower_extract_byte.
Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
Ian Romanick 2021-01-26 19:51:57 -08:00 committed by Marge Bot
parent f9665040f1
commit a147717a93

@@ -1253,6 +1253,15 @@ optimizations.extend([
(('ishr', 'a@64', 56), ('extract_i8', a, 7), '!options->lower_extract_byte'),
(('iand', 0xff, a), ('extract_u8', a, 0), '!options->lower_extract_byte'),
# Common pattern in many Vulkan CTS tests that read 8-bit integers from a
# storage buffer.
(('u2u8', ('extract_u16', a, 1)), ('u2u8', ('extract_u8', a, 2)), '!options->lower_extract_byte'),
(('u2u8', ('ushr', a, 8)), ('u2u8', ('extract_u8', a, 1)), '!options->lower_extract_byte'),
# Common pattern after lowering 8-bit integers to 16-bit.
(('i2i16', ('u2u8', ('extract_u8', a, b))), ('i2i16', ('extract_i8', a, b))),
(('u2u16', ('u2u8', ('extract_u8', a, b))), ('u2u16', ('extract_u8', a, b))),
(('ubfe', a, 0, 8), ('extract_u8', a, 0), '!options->lower_extract_byte'),
(('ubfe', a, 8, 8), ('extract_u8', a, 1), '!options->lower_extract_byte'),
(('ubfe', a, 16, 8), ('extract_u8', a, 2), '!options->lower_extract_byte'),
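
The four new search/replace pairs can be sanity-checked with a small bit-level model. The sketch below is not Mesa code: the helper functions are hypothetical stand-ins for the NIR opcodes, and it assumes a 32-bit source value `a`. It only illustrates why each replacement yields the same bits as the expression it replaces.

```python
# Illustrative bit-level models of the NIR opcodes used in the new patterns.
# These helpers are hypothetical stand-ins, not part of Mesa, and assume that
# the source value `a` is 32 bits wide.

def mask(bits):
    return (1 << bits) - 1

def sign_extend(x, from_bits, to_bits):
    """Sign-extend the low `from_bits` of x; return a `to_bits` bit pattern."""
    x &= mask(from_bits)
    if x >> (from_bits - 1):
        x -= 1 << from_bits
    return x & mask(to_bits)

def extract_u8(a, b):
    """Byte b of a, zero-extended to 32 bits."""
    return (a >> (8 * b)) & 0xFF

def extract_i8(a, b):
    """Byte b of a, sign-extended to 32 bits."""
    return sign_extend(extract_u8(a, b), 8, 32)

def extract_u16(a, w):
    """16-bit word w of a, zero-extended to 32 bits."""
    return (a >> (16 * w)) & 0xFFFF

def ushr(a, shift):
    """Logical right shift of a 32-bit value."""
    return (a & mask(32)) >> shift

def u2u8(x):
    """Unsigned conversion to 8 bits (truncation from a wider source)."""
    return x & 0xFF

def patterns_hold(a):
    """Check each (search, replace) pair from the patch for one value of a."""
    ok = u2u8(extract_u16(a, 1)) == u2u8(extract_u8(a, 2))
    ok = ok and u2u8(ushr(a, 8)) == u2u8(extract_u8(a, 1))
    for b in range(4):
        byte = u2u8(extract_u8(a, b))
        # i2i16 of an 8-bit value sign-extends; of a 32-bit value it truncates.
        ok = ok and sign_extend(byte, 8, 16) == (extract_i8(a, b) & 0xFFFF)
        # u2u16 of an 8-bit value zero-extends; of a 32-bit value it truncates.
        ok = ok and byte == (extract_u8(a, b) & 0xFFFF)
    return ok

# A few boundary-heavy sample values; any 32-bit input should behave the same.
SAMPLES = [0x00000000, 0xFFFFFFFF, 0x12345680, 0x80FF7F01, 0x7F80FF00]

if __name__ == "__main__":
    for value in SAMPLES:
        assert patterns_hold(value), hex(value)
```

In this model the `i2i16` pair holds only because the replacement switches from `extract_u8` to `extract_i8`: the sign extension that `i2i16` performed on the 8-bit intermediate is done by the extract itself once the `u2u8` is dropped.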