mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-05-05 20:28:04 +02:00
nir/opt_algebraic: optimize patterns hit with OpenCL
These patterns were all found in the AGX quads tessellator, a medium-sized OpenCL kernel. LLVM generates a lot of garbage around booleans which we need to chew through. There's nothing AGX or really OpenCL specific here, though, so some of this could help graphics shaders too.

Together, their effect is significant for that kernel's instruction count and occupancy:

before: 2966 inst, 2310 alu, 2310 fscib, 1216 ic, 23148 bytes, 239 regs, 384 threads
after: 2848 inst, 2246 alu, 2246 fscib, 1000 ic, 22260 bytes, 231 regs, 448 threads

No significant changes on GL shaderdb (a single godot shader regressed 1 instruction, 1344->1345).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
parent fc0545e6a7
commit 33299354e0
1 changed files with 10 additions and 0 deletions
@@ -1956,6 +1956,16 @@ optimizations.extend([
(('u2u32', ('iadd(is_used_once)', 'a@64', b)),
 ('iadd', ('u2u32', a), ('u2u32', b))),

# Redundant trip through 8-bit
(('i2i16', ('u2u8', ('iand', 'a@16', 1))), ('iand', 'a@16', 1)),
(('u2u16', ('u2u8', ('iand', 'a@16', 1))), ('iand', 'a@16', 1)),

# Reduce 16-bit integers to 1-bit booleans, hit with OpenCL. In turn, this
# lets iand(b2i1(...), 1) get simplified. Backends can usually fuse iand/inot
# so this should be no worse when it isn't strictly better.
(('bcsel', a, 0, ('b2i16', 'b@1')), ('b2i16', ('iand', ('inot', a), b))),
(('bcsel', a, ('b2i16', 'b@1'), ('b2i16', 'c@1')), ('b2i16', ('bcsel', a, b, c))),

# Lowered pack followed by lowered unpack, for the high bits
(('u2u32', ('ushr', ('ior', ('ishl', a, 32), ('u2u64', b)), 32)), ('u2u32', a)),
(('u2u16', ('ushr', ('ior', ('ishl', a, 16), ('u2u32', b)), 16)), ('u2u16', a)),
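For readers who want to convince themselves that the rewrite rules visible in this hunk preserve values, the following is a minimal sketch that brute-forces them in plain Python. It is not Mesa code: the helpers `mask`, `u2u`, `i2i`, `b2i` and `bcsel` are hypothetical stand-ins that assume the usual meaning of the NIR opcodes (unsigned conversions zero-extend or truncate, `i2i` sign-extends, `bcsel` selects on a 1-bit condition), and the `check_*` function names are made up for illustration.

```python
import random

def mask(bits):
    return (1 << bits) - 1

def u2u(x, src_bits, dst_bits):
    # Assumed model of u2uN: zero-extend or truncate an unsigned value.
    return x & mask(min(src_bits, dst_bits))

def i2i(x, src_bits, dst_bits):
    # Assumed model of i2iN: sign-extend from src_bits, wrap to dst_bits.
    x &= mask(src_bits)
    if x >> (src_bits - 1):
        x -= 1 << src_bits
    return x & mask(dst_bits)

def b2i(b):
    # Assumed model of b2iN: boolean to integer 0 or 1.
    return 1 if b else 0

def bcsel(cond, x, y):
    return x if cond else y

def check_8bit_trip():
    # i2i16(u2u8(a & 1)) and u2u16(u2u8(a & 1)) both equal a & 1,
    # checked exhaustively over all 16-bit inputs.
    for a in range(1 << 16):
        bit = a & 1
        via8 = u2u(bit, 16, 8)
        assert i2i(via8, 8, 16) == bit
        assert u2u(via8, 8, 16) == bit
    return True

def check_bool_reduction():
    # bcsel(a, 0, b2i16(b))         == b2i16(iand(inot(a), b))
    # bcsel(a, b2i16(b), b2i16(c))  == b2i16(bcsel(a, b, c))
    for a in (False, True):
        for b in (False, True):
            assert bcsel(a, 0, b2i(b)) == b2i((not a) and b)
            for c in (False, True):
                assert bcsel(a, b2i(b), b2i(c)) == b2i(bcsel(a, b, c))
    return True

def check_pack_unpack(half_bits, trials=10000, seed=0):
    # u2uH(ushr(ior(ishl(a, H), u2uF(b)), H)) == u2uH(a), with F = 2 * H.
    rng = random.Random(seed)
    full = 2 * half_bits
    for _ in range(trials):
        a = rng.getrandbits(full)
        b = rng.getrandbits(half_bits)
        packed = ((a << half_bits) & mask(full)) | u2u(b, half_bits, full)
        assert u2u(packed >> half_bits, full, half_bits) == u2u(a, full, half_bits)
    return True

def check_narrowed_add(trials=10000, seed=0):
    # u2u32(iadd(a@64, b)) == iadd(u2u32(a), u2u32(b)), with wrapping adds.
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.getrandbits(64)
        b = rng.getrandbits(64)
        lhs = u2u((a + b) & mask(64), 64, 32)
        rhs = (u2u(a, 64, 32) + u2u(b, 64, 32)) & mask(32)
        assert lhs == rhs
    return True
```

The boolean and 8-bit rules are small enough to check exhaustively; the pack/unpack and narrowed-add rules are only sampled at random, so this is a sanity check rather than a proof.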