nir/opt_algebraic: optimize more boolean bcsel with constants

Foz-DB Navi21:
Totals from 667 (0.84% of 79377) affected shaders:
Instrs: 3890980 -> 3886878 (-0.11%); split: -0.11%, +0.00%
CodeSize: 21088576 -> 21065848 (-0.11%); split: -0.11%, +0.00%
SpillSGPRs: 458 -> 446 (-2.62%); split: -3.49%, +0.87%
Latency: 26160728 -> 26162856 (+0.01%); split: -0.02%, +0.02%
InvThroughput: 6999254 -> 7000593 (+0.02%); split: -0.01%, +0.03%
VClause: 103745 -> 103743 (-0.00%)
SClause: 93113 -> 93109 (-0.00%)
Copies: 344097 -> 344794 (+0.20%); split: -0.05%, +0.25%
Branches: 134546 -> 134764 (+0.16%); split: -0.01%, +0.17%
PreSGPRs: 40677 -> 40298 (-0.93%); split: -0.93%, +0.00%
PreVGPRs: 40185 -> 40190 (+0.01%)
VALU: 2584477 -> 2584468 (-0.00%); split: -0.00%, +0.00%
SALU: 573587 -> 569353 (-0.74%); split: -0.75%, +0.01%
SMEM: 124794 -> 124790 (-0.00%)

v2 (idr): Remove a pattern that is made redundant by this commit
combined with the previous commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33498>
Georg Lehmann 2025-02-10 19:06:04 +01:00 committed by Marge Bot
parent 9785fa460c
commit f9722e35be

@@ -871,6 +871,8 @@ optimizations.extend([
 (('bcsel', a, a, b), ('ior', a, b)),
 (('bcsel', a, b, False), ('iand', a, b)),
 (('bcsel', a, b, a), ('iand', a, b)),
+(('bcsel', a, b, True), ('ior', ('inot', a), b)),
+(('bcsel', a, False, b), ('iand', ('inot', a), b)),
 (('~fmin', a, a), a),
 (('~fmax', a, a), a),
 (('imin', a, a), a),
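
Review note: the boolean bcsel rewrites in the hunk above can be checked exhaustively, since each has only two 1-bit inputs. The following is a minimal sketch, not part of the commit. It assumes every operand is a 1-bit boolean and models `bcsel`, `ior`, `iand` and `inot` with plain Python booleans; the helper names `PATTERNS` and `check_patterns` are hypothetical.

```python
from itertools import product


def bcsel(cond, then_val, else_val):
    """Model of bcsel on 1-bit booleans: then_val if cond is set, else else_val."""
    return then_val if cond else else_val


# (description, expression before the rewrite, expression after the rewrite)
PATTERNS = [
    ("bcsel(a, a, b) -> ior(a, b)",
     lambda a, b: bcsel(a, a, b), lambda a, b: a or b),
    ("bcsel(a, b, False) -> iand(a, b)",
     lambda a, b: bcsel(a, b, False), lambda a, b: a and b),
    ("bcsel(a, b, a) -> iand(a, b)",
     lambda a, b: bcsel(a, b, a), lambda a, b: a and b),
    ("bcsel(a, b, True) -> ior(inot(a), b)",
     lambda a, b: bcsel(a, b, True), lambda a, b: (not a) or b),
    ("bcsel(a, False, b) -> iand(inot(a), b)",
     lambda a, b: bcsel(a, False, b), lambda a, b: (not a) and b),
]


def check_patterns():
    """Return every (description, a, b) where the two sides disagree."""
    failures = []
    for description, before, after in PATTERNS:
        for a, b in product([False, True], repeat=2):
            if before(a, b) != after(a, b):
                failures.append((description, a, b))
    return failures


if __name__ == "__main__":
    print(check_patterns())
```

An empty failure list means each replacement matches its source expression on all four input combinations. This says nothing about instruction counts or backend behavior, which is what the Foz-DB statistics in the commit message measure.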
@@ -2093,11 +2095,6 @@ optimizations.extend([
 (('i2i16', ('u2u8', ('iand', 'a@16', 1))), ('iand', 'a@16', 1)),
 (('u2u16', ('u2u8', ('iand', 'a@16', 1))), ('iand', 'a@16', 1)),
-# Reduce 16-bit integers to 1-bit booleans, hit with OpenCL. In turn, this
-# lets iand(b2i1(...), 1) get simplified. Backends can usually fuse iand/inot
-# so this should be no worse when it isn't strictly better.
-(('bcsel', a, 0, ('b2i16', 'b@1')), ('b2i16', ('iand', ('inot', a), b))),
 # Lowered pack followed by lowered unpack, for the high bits
 (('u2u32', ('ushr', ('ior', ('ishl', a, 32), ('u2u64', 'b@8')), 32)), ('u2u32', a)),
 (('u2u32', ('ushr', ('ior', ('ishl', a, 32), ('u2u64', 'b@16')), 32)), ('u2u32', a)),
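
Review note: the `bcsel`/`b2i16` pattern in this hunk appears to be the one the v2 note calls redundant. The sketch below shows one plausible way it becomes redundant. It is an assumption, not something the commit states: if an earlier rule moves `b2i16` outside the select, the remaining `bcsel(a, False, b)` is covered by the 1-bit pattern `bcsel(a, False, b) -> iand(inot(a), b)`. The intermediate step and the name `removed_pattern_agrees` are hypothetical.

```python
from itertools import product


def bcsel(cond, then_val, else_val):
    """Model of bcsel: then_val if cond is set, else else_val."""
    return then_val if cond else else_val


def b2i16(flag):
    """Model of b2i16: map a 1-bit boolean to the 16-bit integer 0 or 1."""
    return 1 if flag else 0


def removed_pattern_agrees():
    """Compare the removed rewrite with the assumed two-step route."""
    for a, b in product([False, True], repeat=2):
        original = bcsel(a, 0, b2i16(b))
        # Replacement produced by the removed 16-bit pattern.
        removed_replacement = b2i16((not a) and b)
        # Assumed intermediate form: b2i16 moved outside the select, leaving
        # a 1-bit bcsel that the boolean pattern can rewrite.
        via_boolean_pattern = b2i16(bcsel(a, False, b))
        if not (original == removed_replacement == via_boolean_pattern):
            return False
    return True


if __name__ == "__main__":
    print(removed_pattern_agrees())
```

All three forms agree for every input, which is consistent with the v2 note. Whether the previous commit performs exactly this intermediate step should be confirmed against that commit.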