nir/algebraic: Remove some optimizations of comparisons with fsat

When most of these patterns were created, we believed, incorrectly, that fsat(NaN) was NaN. We have since realized that fsat(NaN) is zero.

Originally, this changed the patterns to use is_a_number. This didn't help any shaders, so it's easier to just drop the optimizations.

This commit crossed paths with 4c3ad4d065 ("nir/algebraic: mark more optimization with fsat(NaN) as inexact") and bc123c396a ("nir/algebraic: mark some optimizations with fsat(NaN) as inexact"). Given that these don't impact very many shaders, it seems safer to just remove them.

As discussed in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716, I tried modifying these patterns to use !(b cmp a). Unfortunately, on Intel GPUs, the results were much worse than just removing the patterns altogether.

Some other related patterns will be addressed in later commits.

There are still a number of patterns that use the identity fsat(1-X) == 1 - fsat(X). If X is NaN, the former is zero while the latter is 1.0. I haven't evaluated these patterns yet. If changes are needed in these patterns, it should be a separate commit anyway.

v2: Replace arrow `=>` with `->` in comments because the `=>` looks a lot like `<=` comparison. Suggested by Rhys.

Fixes: 92b75c126b ("nir/algebraic: Replace checks that a value is between (or not) [0, 1]")
Fixes: a7f0c57673 ("nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1)")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

All Intel hardware had similar results. (Ice Lake shown)
total instructions in shared programs: 20029060 -> 20029670 (<.01%)
instructions in affected programs: 69236 -> 69846 (0.88%)
helped: 0
HURT: 263
HURT stats (abs)   min: 1 max: 20 x̄: 2.32 x̃: 1
HURT stats (rel)   min: 0.30% max: 11.11% x̄: 1.35% x̃: 0.98%
95% mean confidence interval for instructions value: 1.86 2.78
95% mean confidence interval for instructions %-change: 1.18% 1.52%
Instructions are HURT.

total cycles in shared programs: 979821278 -> 979834425 (<.01%)
cycles in affected programs: 1476848 -> 1489995 (0.89%)
helped: 49
HURT: 204
helped stats (abs) min: 1 max: 812 x̄: 102.31 x̃: 20
helped stats (rel) min: 0.01% max: 21.43% x̄: 2.23% x̃: 0.52%
HURT stats (abs)   min: 2 max: 2600 x̄: 89.02 x̃: 16
HURT stats (rel)   min: 0.04% max: 27.27% x̄: 1.49% x̃: 0.72%
95% mean confidence interval for cycles value: 13.18 90.75
95% mean confidence interval for cycles %-change: 0.29% 1.25%
Cycles are HURT.

No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>
(cherry picked from commit d69ba58644)
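The NaN reasoning in this message can be checked with a small stand-alone model. The sketch below is not part of the commit and does not use any NIR code: `fsat`, `PATTERNS`, and `nan_mismatches` are hypothetical names introduced here, and `fsat` is only a Python model of the behavior the message states (clamp to [0, 1], with NaN mapped to zero). Each entry pairs an expression with its proposed rewrite, and both are evaluated with a = NaN and a constant b = 0.5 taken from (0, 1).

```python
import math

NAN = float("nan")


def fsat(x):
    """Model of fsat as described in the commit message: clamp to [0, 1], NaN -> 0."""
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)


# name -> (expression before the rewrite, expression after the rewrite).
# Both are functions of (a, b); b stands for the constant in (0, 1).
PATTERNS = {
    # Rewrites that are kept (or unaffected) by the diff.
    "flt(b, fsat(a)) -> flt(b, a)": (lambda a, b: b < fsat(a), lambda a, b: b < a),
    "fge(fsat(a), b) -> fge(a, b)": (lambda a, b: fsat(a) >= b, lambda a, b: a >= b),
    "feq(fsat(a), b) -> feq(a, b)": (lambda a, b: fsat(a) == b, lambda a, b: a == b),
    "fneu(fsat(a), b) -> fneu(a, b)": (lambda a, b: fsat(a) != b, lambda a, b: a != b),
    "a >= 0 && 1 >= a -> a == fsat(a)": (
        lambda a, b: a >= 0.0 and 1.0 >= a,
        lambda a, b: a == fsat(a),
    ),
    # Rewrites that the diff deletes.
    "flt(fsat(a), b) -> flt(a, b)": (lambda a, b: fsat(a) < b, lambda a, b: a < b),
    "fge(b, fsat(a)) -> fge(b, a)": (lambda a, b: b >= fsat(a), lambda a, b: b >= a),
    "a < 0 || 1 < a -> a != fsat(a)": (
        lambda a, b: a < 0.0 or 1.0 < a,
        lambda a, b: a != fsat(a),
    ),
}


def nan_mismatches(patterns, b=0.5):
    """Return the names of rewrites whose result changes when a is NaN."""
    return [
        name
        for name, (before, after) in patterns.items()
        if before(NAN, b) != after(NAN, b)
    ]


if __name__ == "__main__":
    for name in nan_mismatches(PATTERNS):
        print("not NaN safe:", name)
    # The identity mentioned in the message: fsat(1 - X) vs. 1 - fsat(X) at X = NaN.
    print(fsat(1.0 - NAN), 1.0 - fsat(NAN))
```

Under this model, only the three deleted rewrites change their result for a = NaN, and the last line prints 0.0 and 1.0 for the two sides of the fsat(1-X) identity. The model relies on Python following IEEE 754 comparison rules for NaN; it says nothing about instruction counts or about how any particular hardware implements saturate.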
2026-05-06 07:18:17 +02:00 · 2020-08-05 10:38:52 -07:00 · 2020-08-05 10:38:52 -07:00 · 65d5737fda
commit 65d5737fda
parent f60f062d80
2 changed files with 19 additions and 30 deletions
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -1570,7 +1570,7 @@
        "description": "nir/algebraic: Remove some optimizations of comparisons with fsat",
        "nominated": true,
        "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
        "main_sha": null,
        "because_sha": "92b75c126bb238cdbe784b930a9916f3737c018a"
    },
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -381,24 +381,22 @@ optimizations.extend([
   (('fneu', ('fneg', a), -1.0), ('fneu', 1.0, a)),
   (('feq', -1.0, ('fneg', a)), ('feq', a, 1.0)),

-   # flt(fsat(a), b > 0 && b < 1) is inexact if a is NaN (fsat(NaN) is 0)
-   # because it returns True while flt(a, b) always returns False.
-   (('~flt', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('flt', a, b)),
+   # b < fsat(NaN) -> b < 0 -> false, and b < Nan -> false.
   (('flt', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('flt', b, a)),
+
+   # fsat(NaN) >= b -> 0 >= b -> false, and NaN >= b -> false.
   (('fge', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fge', a, b)),
-   # fge(b > 0 && b < 1, fsat(a)) is inexact if a is NaN (fsat(NaN) is 0)
-   # because it returns True while fge(b, a) always returns False.
-   (('~fge', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('fge', b, a)),
+
+   # b == fsat(NaN) -> b == 0 -> false, and b == NaN -> false.
   (('feq', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('feq', a, b)),
+
+   # b != fsat(NaN) -> b != 0 -> true, and b != NaN -> true.
   (('fneu', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fneu', a, b)),

+   # fsat(NaN) >= 1 -> 0 >= 1 -> false, and NaN >= 1 -> false.
   (('fge', ('fsat(is_used_once)', a), 1.0), ('fge', a, 1.0)),
-   # flt(fsat(a), 1.0) is inexact because it returns True if a is NaN
-   # (fsat(NaN) is 0), while flt(a, 1.0) always returns FALSE.
-   (('~flt', ('fsat(is_used_once)', a), 1.0), ('flt', a, 1.0)),
-   # fge(0.0, fsat(a)) is inexact because it returns True if a is NaN
-   # (fsat(NaN) is 0), while fge(0.0, a) always returns FALSE.
-   (('~fge', 0.0, ('fsat(is_used_once)', a)), ('fge', 0.0, a)),
+
+   # 0 < fsat(NaN) -> 0 < 0 -> false, and 0 < NaN -> false.
   (('flt', 0.0, ('fsat(is_used_once)', a)), ('flt', 0.0, a)),

   # 0.0 >= b2f(a)
@@ -505,15 +503,16 @@ optimizations.extend([
   (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),

   # (a >= 0.0) && (a <= 1.0) -> fsat(a) == a
+   #
+   # This should be NaN safe.
+   #
+   # NaN >= 0 && 1 >= NaN -> false && false -> false
+   #
+   # vs.
+   #
+   # NaN == fsat(NaN) -> NaN == 0 -> false
   (('iand', ('fge', a, 0.0), ('fge', 1.0, a)), ('feq', a, ('fsat', a)), '!options->lower_fsat'),

-   # (a < 0.0) || (a > 1.0)
-   # !(!(a < 0.0) && !(a > 1.0))
-   # !((a >= 0.0) && (a <= 1.0))
-   # !(a == fsat(a))
-   # a != fsat(a)
-   (('ior', ('flt', a, 0.0), ('flt', 1.0, a)), ('fneu', a, ('fsat', a)), '!options->lower_fsat'),
-
   # Note: fmin(-a, -b) == -fmax(a, b)
   (('fmax',                        ('b2f(is_used_once)', 'a@1'),           ('b2f', 'b@1')),           ('b2f', ('ior', a, b))),
   (('fmax', ('fneg(is_used_once)', ('b2f(is_used_once)', 'a@1')), ('fneg', ('b2f', 'b@1'))), ('fneg', ('b2f', ('iand', a, b)))),
@@ -2237,22 +2236,12 @@ late_optimizations = [
   # new patterns like these.  The patterns that compare with zero are removed
   # because they are unlikely to be created in by anything in
   # late_optimizations.
-
-   # flt(fsat(a), b > 0 && b < 1) is inexact if a is NaN (fsat(NaN) is 0)
-   # because it returns True while flt(a, b) always returns False.
-   (('~flt', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('flt', a, b)),
   (('flt', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('flt', b, a)),
   (('fge', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fge', a, b)),
-   # fge(b > 0 && b < 1, fsat(a)) is inexact if a is NaN (fsat(NaN) is 0)
-   # because it returns True while fge(b, a) always returns False.
-   (('~fge', '#b(is_gt_0_and_lt_1)', ('fsat(is_used_once)', a)), ('fge', b, a)),
   (('feq', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('feq', a, b)),
   (('fneu', ('fsat(is_used_once)', a), '#b(is_gt_0_and_lt_1)'), ('fneu', a, b)),

   (('fge', ('fsat(is_used_once)', a), 1.0), ('fge', a, 1.0)),
-   # flt(fsat(a), 1.0) is inexact because it returns True if a is NaN
-   # (fsat(NaN) is 0), while flt(a, 1.0) always returns FALSE.
-   (('~flt', ('fsat(is_used_once)', a), 1.0), ('flt', a, 1.0)),

   (('~fge', ('fmin(is_used_once)', ('fadd(is_used_once)', a, b), ('fadd', c, d)), 0.0), ('iand', ('fge', a, ('fneg', b)), ('fge', c, ('fneg', d)))),