brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL
outside of control flow. None of our optimization passes generate
additional cases of that instruction, so once it's gone, we shouldn't
ever have to run the pass again. Moving it out of the loop should
save a bit of CPU time.
While we're at it, also clean adjacent BROADCAST instructions that
consume the result of our FIND_LIVE_CHANNEL. Without this, we have
to perform copy propagation to get the MOV 0 immediate into the
BROADCAST, then algebraic to turn it into a MOV, which enables more
copy propagation...not to mention CSE gets involved. Since this
FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is
really common, and it's trivial to clean up, we can do that. This
lets the initial copy prop in the loop see MOV instead of BROADCAST.
Zero impact on fossil-db, but less work in the optimization loop.
Together with the previous patches, this cuts compile time in
Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
We don't necessarily want to split up MOVs for 64-bit addresses into
2x 32-bit MOVs right away, as this makes things like copy propagating
the whole address around harder. We should do this late, once, while
still doing other algebraic optimizations earlier.
fossil-db results for Alchemist show tiny improvements:
Totals:
Instrs: 161310502 -> 161310436 (-0.00%); split: -0.00%, +0.00%
Cycles: 14370605606 -> 14370605159 (-0.00%); split: -0.00%, +0.00%
Totals from 33 (0.01% of 652298) affected shaders:
Instrs: 15053 -> 14987 (-0.44%); split: -0.64%, +0.20%
Cycles: 196947 -> 196500 (-0.23%); split: -0.25%, +0.02%
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
This copy propagation can create MADs with immediates in src1, which
need to be cleaned up by constant combining (which puts them back in
VGRFs).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27876>