nir/algebraic: Remove some optimizations of comparisons with fsat

When most of these patterns were created, we believed, incorrectly, that
fsat(NaN) was NaN.  We have since realized that fsat(NaN) is zero.
An earlier version of this commit changed the patterns to use
is_a_number.  That didn't help any shaders, so it's easier to just drop
the optimizations.
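
To make the NaN problem concrete, here is a minimal Python sketch (not
Mesa code).  It assumes fsat(NaN) is 0.0 as stated above, uses plain
Python comparisons as stand-ins for flt/fge/feq/fneu, and picks b = 0.5
as an arbitrary constant in (0, 1).  The helper name nan_mismatches is
made up for illustration.

```python
import math

NAN = float("nan")

def fsat(x):
    # Saturate to [0, 1]; NaN becomes 0.0, as the commit message states.
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)

def nan_mismatches(b=0.5):
    """Names of the rewrites whose result changes when a is NaN."""
    a = NAN
    s = fsat(a)
    # name: (result with fsat, result after the rewrite drops fsat)
    cases = {
        "flt(fsat(a), b)":   (s < b,    a < b),
        "flt(b, fsat(a))":   (b < s,    b < a),
        "fge(fsat(a), b)":   (s >= b,   a >= b),
        "fge(b, fsat(a))":   (b >= s,   b >= a),
        "feq(fsat(a), b)":   (s == b,   a == b),
        "fneu(fsat(a), b)":  (s != b,   a != b),
        "fge(fsat(a), 1.0)": (s >= 1.0, a >= 1.0),
        "flt(fsat(a), 1.0)": (s < 1.0,  a < 1.0),
        "fge(0.0, fsat(a))": (0.0 >= s, 0.0 >= a),
        "flt(0.0, fsat(a))": (0.0 < s,  0.0 < a),
    }
    return [name for name, (before, after) in cases.items() if before != after]
```

Under these assumptions the four rewrites reported are the same four
that carry the `~` (inexact) prefix in the diff.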

This commit crossed paths with 4c3ad4d065 ("nir/algebraic: mark more
optimization with fsat(NaN) as inexact") and bc123c396a
("nir/algebraic: mark some optimizations with fsat(NaN) as inexact").
Given that these don't impact very many shaders, it seems safer to just
remove them.

As discussed in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716, I tried
modifying these patterns to use !(b cmp a).  Unfortunately, on Intel
GPUs, the results were much worse than just removing the patterns
altogether.
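
As a rough illustration of the negated form, the sketch below takes one
possible reading of it for a single pattern: flt(fsat(a), b) rewritten
as the logical NOT of the reversed comparison (a >= b).  That reading,
the helper names, and b = 0.5 are assumptions made here, not taken from
the merge request.  It is plain Python, not Mesa code, and assumes
fsat(NaN) is 0.0.

```python
import math

def fsat(x):
    # Saturate to [0, 1]; NaN becomes 0.0.
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)

def original(a, b):
    # The source pattern: flt(fsat(a), b) with 0 < b < 1.
    return fsat(a) < b

def stripped(a, b):
    # The removed rewrite: flt(a, b).  Differs from original() for NaN.
    return a < b

def negated(a, b):
    # Hypothetical alternative: !(a >= b).  NaN >= b is false, so this is true.
    return not (a >= b)

SAMPLES = [float("nan"), -1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0]
```

For every sample, negated(a, 0.5) agrees with original(a, 0.5), while
stripped(NaN, 0.5) does not.  The extra NOT is not free, which fits the
worse results reported above.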

Some other related patterns will be addressed in later commits.

There are still a number of patterns that use the identity fsat(1-X) ==
1 - fsat(X).  If X is NaN, the former is zero while the latter is 1.0.
I haven't evaluated these patterns yet.  If they need changes, that
belongs in a separate commit anyway.
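
A small Python sketch of that identity, again assuming fsat(NaN) is 0.0
(the function names are illustrative, not Mesa code):

```python
import math

def fsat(x):
    # Saturate to [0, 1]; NaN becomes 0.0.
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)

def sat_of_one_minus(x):
    # fsat(1 - X): 1 - NaN is NaN, and fsat(NaN) is 0.0.
    return fsat(1.0 - x)

def one_minus_sat(x):
    # 1 - fsat(X): fsat(NaN) is 0.0, so the result is 1.0.
    return 1.0 - fsat(x)
```

The two forms agree for ordinary inputs such as -1.0, 0.25 and 2.0, and
differ only when X is NaN.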

v2: Replace arrow `=>` with `->` in comments because the `=>` looks a
lot like `<=` comparison.  Suggested by Rhys.

Fixes: 92b75c126b ("nir/algebraic: Replace checks that a value is between (or not) [0, 1]")
Fixes: a7f0c57673 ("nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1)")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

All Intel hardware had similar results. (Ice Lake shown)
total instructions in shared programs: 20029060 -> 20029670 (<.01%)
instructions in affected programs: 69236 -> 69846 (0.88%)
helped: 0
HURT: 263
HURT stats (abs)   min: 1 max: 20 x̄: 2.32 x̃: 1
HURT stats (rel)   min: 0.30% max: 11.11% x̄: 1.35% x̃: 0.98%
95% mean confidence interval for instructions value: 1.86 2.78
95% mean confidence interval for instructions %-change: 1.18% 1.52%
Instructions are HURT.

total cycles in shared programs: 979821278 -> 979834425 (<.01%)
cycles in affected programs: 1476848 -> 1489995 (0.89%)
helped: 49
HURT: 204
helped stats (abs) min: 1 max: 812 x̄: 102.31 x̃: 20
helped stats (rel) min: 0.01% max: 21.43% x̄: 2.23% x̃: 0.52%
HURT stats (abs)   min: 2 max: 2600 x̄: 89.02 x̃: 16
HURT stats (rel)   min: 0.04% max: 27.27% x̄: 1.49% x̃: 0.72%
95% mean confidence interval for cycles value: 13.18 90.75
95% mean confidence interval for cycles %-change: 0.29% 1.25%
Cycles are HURT.

No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
(cherry picked from commit d69ba58644)
Ian Romanick 2020-08-05 10:38:52 -07:00 committed by Eric Engestrom
parent f60f062d80
commit 65d5737fda
2 changed files with 19 additions and 30 deletions

@ -1570,7 +1570,7 @@
"description": "nir/algebraic: Remove some optimizations of comparisons with fsat",
"nominated": true,
"nomination_type": 1,
"resolution": 0,
"resolution": 1,
"main_sha": null,
"because_sha": "92b75c126bb238cdbe784b930a9916f3737c018a"
},

@ -381,24 +381,22 @@ optimizations.extend([
(('fneu', ('fneg', a), -1.0), ('fneu', 1.0, a)),
(('feq', -1.0, ('fneg', a)), ('feq', a, 1.0)),
# flt(fsat(a), b > 0 && b < 1) is inexact if a is NaN (fsat(NaN) is 0)
# because it returns True while flt(a, b) always returns False.
(('~flt', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('flt', a, b)),
# b < fsat(NaN) -> b < 0 -> false, and b < Nan -> false.
(('flt', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('flt', b, a)),
# fsat(NaN) >= b -> 0 >= b -> false, and NaN >= b -> false.
(('fge', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fge', a, b)),
# fge(b > 0 && b < 1, fsat(a)) is inexact if a is NaN (fsat(NaN) is 0)
# because it returns True while fge(b, a) always returns False.
(('~fge', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('fge', b, a)),
# b == fsat(NaN) -> b == 0 -> false, and b == NaN -> false.
(('feq', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('feq', a, b)),
# b != fsat(NaN) -> b != 0 -> true, and b != NaN -> true.
(('fneu', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fneu', a, b)),
# fsat(NaN) >= 1 -> 0 >= 1 -> false, and NaN >= 1 -> false.
(('fge', ('fsat(is_used_once)', a), 1.0), ('fge', a, 1.0)),
# flt(fsat(a), 1.0) is inexact because it returns True if a is NaN
# (fsat(NaN) is 0), while flt(a, 1.0) always returns FALSE.
(('~flt', ('fsat(is_used_once)', a), 1.0), ('flt', a, 1.0)),
# fge(0.0, fsat(a)) is inexact because it returns True if a is NaN
# (fsat(NaN) is 0), while fge(0.0, a) always returns FALSE.
(('~fge', 0.0, ('fsat(is_used_once)', a)), ('fge', 0.0, a)),
# 0 < fsat(NaN) -> 0 < 0 -> false, and 0 < NaN -> false.
(('flt', 0.0, ('fsat(is_used_once)', a)), ('flt', 0.0, a)),
# 0.0 >= b2f(a)
@ -505,15 +503,16 @@ optimizations.extend([
(('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
# (a >= 0.0) && (a <= 1.0) -> fsat(a) == a
#
# This should be NaN safe.
#
# NaN >= 0 && 1 >= NaN -> false && false -> false
#
# vs.
#
# NaN == fsat(NaN) -> NaN == 0 -> false
(('iand', ('fge', a, 0.0), ('fge', 1.0, a)), ('feq', a, ('fsat', a)), '!options->lower_fsat'),
# (a < 0.0) || (a > 1.0)
# !(!(a < 0.0) && !(a > 1.0))
# !((a >= 0.0) && (a <= 1.0))
# !(a == fsat(a))
# a != fsat(a)
(('ior', ('flt', a, 0.0), ('flt', 1.0, a)), ('fneu', a, ('fsat', a)), '!options->lower_fsat'),
# Note: fmin(-a, -b) == -fmax(a, b)
(('fmax', ('b2f(is_used_once)', 'a@1'), ('b2f', 'b@1')), ('b2f', ('ior', a, b))),
(('fmax', ('fneg(is_used_once)', ('b2f(is_used_once)', 'a@1')), ('fneg', ('b2f', 'b@1'))), ('fneg', ('b2f', ('iand', a, b)))),
@ -2237,22 +2236,12 @@ late_optimizations = [
# new patterns like these. The patterns that compare with zero are removed
# because they are unlikely to be created in by anything in
# late_optimizations.
# flt(fsat(a), b > 0 && b < 1) is inexact if a is NaN (fsat(NaN) is 0)
# because it returns True while flt(a, b) always returns False.
(('~flt', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('flt', a, b)),
(('flt', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('flt', b, a)),
(('fge', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fge', a, b)),
# fge(b > 0 && b < 1, fsat(a)) is inexact if a is NaN (fsat(NaN) is 0)
# because it returns True while fge(b, a) always returns False.
(('~fge', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('fge', b, a)),
(('feq', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('feq', a, b)),
(('fneu', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fneu', a, b)),
(('fge', ('fsat(is_used_once)', a), 1.0), ('fge', a, 1.0)),
# flt(fsat(a), 1.0) is inexact because it returns True if a is NaN
# (fsat(NaN) is 0), while flt(a, 1.0) always returns FALSE.
(('~flt', ('fsat(is_used_once)', a), 1.0), ('flt', a, 1.0)),
(('~fge', ('fmin(is_used_once)', ('fadd(is_used_once)', a, b), ('fadd', c, d)), 0.0), ('iand', ('fge', a, ('fneg', b)), ('fge', c, ('fneg', d)))),