nir/algebraic: Undistribute fsat from fmax

To be helpful, the thing inside the fsat has to be used with and without
the fsat. Otherwise it just moves a saturate destination modifier
around. To not be harmful, the fsat has to only be used by the bcsel.

All Broadwell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20174475 -> 20174449 (<.01%)
instructions in affected programs: 3913 -> 3887 (-0.66%)
helped: 13 / HURT: 0

total cycles in shared programs: 866844832 -> 866844719 (<.01%)
cycles in affected programs: 46037 -> 45924 (-0.25%)
helped: 10 / HURT: 1

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 161491468 -> 161491372 (-0.0%)
helped: 31 / HURT: 8

Cycles in all programs: 10933090736 -> 10933024716 (-0.0%)
helped: 32 / HURT: 18

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22169>
This commit is contained in:
Ian Romanick 2023-03-23 16:01:13 -07:00 committed by Marge Bot
parent 782de1932c
commit 71e5530c07

View file

@ -833,6 +833,7 @@ optimizations.extend([
(('umin', ('umax', ('umin', ('umax', a, b), c), b), c), ('umin', ('umax', a, b), c)),
# Both the left and right patterns are "b" when isnan(a), so this is exact.
(('fmax', ('fsat', a), '#b(is_zero_to_one)'), ('fsat', ('fmax', a, b))),
(('fmax', ('fsat(is_used_once)', a), ('fsat(is_used_once)', b)), ('fsat', ('fmax', a, b))),
# The left pattern is 0.0 when isnan(a) (because fmin(fsat(NaN), b) ->
# fmin(0.0, b)) while the right one is "b", so this optimization is inexact.
(('~fmin', ('fsat', a), '#b(is_zero_to_one)'), ('fsat', ('fmin', a, b))),