brw/cse: Don't eliminate instructions that write flags

With other changes in my tree, I observed this code from
dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second
cmp.z removed.

    undef(8) %69:UD
    cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F
    mov(1) v58+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0
    cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d
    ...
    undef(8) %72:UD
    cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F
    mov(1) v63+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0

This was also fixed by running dead-code elimination before CSE. That
seems more like avoiding the problem than fixing it, though.

I believe this affects shader-db results because leaving the second
CMP in the shader can give more opportunities for cmod propagation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 234c45c929 ("intel/brw: Write a new global CSE pass that works on defs")

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total cycles in shared programs: 922097690 -> 922260862 (0.02%)
cycles in affected programs: 3178926 -> 3342098 (5.13%)
helped: 130
HURT: 88
helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16
helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18%
HURT stats (abs)   min: 4 max: 11992 x̄: 2292.55 x̃: 47
HURT stats (rel)   min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61%
95% mean confidence interval for cycles value: 320.36 1176.63
95% mean confidence interval for cycles %-change: 1.59% 5.73%
Cycles are HURT.

LOST:   2
GAINED: 1

fossil-db:

Lunar Lake, Meteor Lake, Tiger Lake had similar results. (Lunar Lake shown)
Totals:
Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00%
Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00%
Max live registers: 48013385 -> 48013343 (-0.00%)

Totals from 507 (0.09% of 551441) affected shaders:
Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01%
Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86%
Max live registers: 94413 -> 94371 (-0.04%)

DG2
Totals:
Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00%
Fill count: 150673 -> 150631 (-0.03%)
Max live registers: 31871520 -> 31871476 (-0.00%)

Totals from 506 (0.08% of 633197) affected shaders:
Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01%
Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74%
Fill count: 2779 -> 2737 (-1.51%)
Max live registers: 51383 -> 51339 (-0.09%)

Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00%
Fill count: 106610 -> 106614 (+0.00%)
Max live registers: 32195006 -> 32194966 (-0.00%)

Totals from 411 (0.06% of 637268) affected shaders:
Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01%
Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02%
Fill count: 2865 -> 2869 (+0.14%)
Max live registers: 42883 -> 42843 (-0.09%)

(cherry picked from commit 9aba731d03)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
This commit is contained in:
Ian Romanick 2024-11-06 12:07:15 -08:00 committed by Dylan Baker
parent 1e792b0933
commit 7994534fe9
2 changed files with 14 additions and 1 deletions

View file

@ -1274,7 +1274,7 @@
"description": "brw/cse: Don't eliminate instructions that write flags",
"nominated": true,
"nomination_type": 2,
"resolution": 0,
"resolution": 1,
"main_sha": null,
"because_sha": "234c45c929e0341f1d0c2a51a587c4ce7e4bba52",
"notes": null

View file

@ -475,6 +475,19 @@ brw_fs_opt_cse_defs(fs_visitor &s)
assert(ops_must_match);
}
/* Some later instruction could depend on the flags written by
* this instruction. It can only be removed if the previous
* instruction that write the flags is identical.
*/
if (inst->flags_written(devinfo)) {
bool ignored;
if (last_flag_write == NULL ||
!instructions_match(last_flag_write, inst, &ignored)) {
continue;
}
}
progress = true;
need_remaps = true;
remap_table[inst->dst.nr] =