intel/brw: Emit better code for read_invocation(x, constant)

For something as basic as read_invocation(x, 0), we were emitting:

   mov(8) vgrf67:D, 0d
   find_live_channel(8) vgrf236:UD, NoMask
   broadcast(8) vgrf237:D, vgrf67:D, vgrf236+0.0<0>:UD NoMask
   broadcast(8) vgrf235+0.0:W, vgrf197+0.0:W, vgrf237+0.0<0>:D NoMask
   mov(8) vgrf234+0.0:W, vgrf235+0.0<0>:W

This is way overcomplicated - if the invocation is a constant, we can
simply emit a single MOV which reads the desired channel index.  Not
only that, but it's difficult to clean up:

1. If this expression appears multiple times, CSE will find all the
   redundant emit_uniformize(invocation) and get rid of the duplicate
   (find_live_channel+broadcast) on future instructions.
2. Copy propagation will put the 0d directly in the first broadcast.
3. Dead code elimination will get rid of the vgrf67 temp holding 0.
4. Algebraic will replace the first broadcast(x, 0) with a MOV.
5. Copy propagation will put the 0d directly in the second broadcast.
6. Dead code elimination will get rid of the vgrf237 temp.
7. Algebraic will replace the second broadcast(x, 0) with a MOV.
8. Copy propagation will finally combine the two MOVs.

That's at least 7-8 optimization passes and several loops through the
same passes just to clean up something we can do trivially.
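
As a rough illustration of why the long sequence collapses to one MOV,
here is a toy C++ model of a SIMD8 register. It is only a sketch: Reg,
sample_value, find_live_channel, broadcast, read_invocation_generic and
read_invocation_constant are hypothetical names invented for this
example, not the backend's API, and the model ignores register types
and regions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy model: a SIMD8 register is eight 32-bit channels.
constexpr std::size_t kWidth = 8;
using Reg = std::array<std::int32_t, kWidth>;

// Sample per-channel data used for illustration only.
inline Reg sample_value() { return Reg{10, 11, 12, 13, 14, 15, 16, 17}; }

// Toy find_live_channel: index of the first enabled channel.
// Assumes at least one bit of exec_mask is set.
inline unsigned find_live_channel(std::uint8_t exec_mask) {
    unsigned i = 0;
    while (((exec_mask >> i) & 1u) == 0) ++i;
    return i;
}

// Toy broadcast: copy channel idx of src into every channel.
inline Reg broadcast(const Reg &src, unsigned idx) {
    Reg out;
    out.fill(src[idx % kWidth]);
    return out;
}

// Shape of the old code: put the constant in a register, uniformize
// it through find_live_channel + broadcast, then broadcast the value.
// Assumes 0 <= constant < kWidth.
inline Reg read_invocation_generic(const Reg &value, std::int32_t constant,
                                   std::uint8_t exec_mask) {
    Reg index;
    index.fill(constant);                               // mov tmp, constant
    const unsigned live = find_live_channel(exec_mask); // find_live_channel
    const Reg uniform = broadcast(index, live);         // first broadcast
    return broadcast(value, static_cast<unsigned>(uniform[0])); // second
}

// Shape of the new code: one move that reads the selected channel.
inline Reg read_invocation_constant(const Reg &value, unsigned constant) {
    Reg out;
    out.fill(value[constant & (kWidth - 1)]);
    return out;
}
```

In this model the index register holds the same constant in every
channel, so whichever channel find_live_channel picks, the first
broadcast yields the constant again and both functions return the same
register.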

Cuts 25% of the optimizer steps in pipeline 22200210259a2c9c
of fossil-db/google-meet-clvk/BgBlur.1f58fdf742c27594.1 (31 to 23).

Changes compilation time of the google-meet-clvk/Relight pipeline by
-2.87717% +/- 0.509162% (n=150).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28097>
Kenneth Graunke 2024-03-10 01:00:50 -08:00 committed by Marge Bot
parent e87881f616
commit 97aec40111

@@ -7209,7 +7209,13 @@ fs_nir_emit_intrinsic(nir_to_brw_state &ntb,
    case nir_intrinsic_read_invocation: {
       const fs_reg value = get_nir_src(ntb, instr->src[0]);
-      const fs_reg invocation = get_nir_src(ntb, instr->src[1]);
+      const fs_reg invocation = get_nir_src_imm(ntb, instr->src[1]);
+      if (invocation.file == IMM) {
+         unsigned i = invocation.ud & (bld.dispatch_width() - 1);
+         bld.MOV(retype(dest, value.type), component(value, i));
+         break;
+      }
       fs_reg tmp = bld.vgrf(value.type);
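
The immediate path masks the constant with dispatch_width - 1 before
using it as a channel index. A minimal sketch of that arithmetic,
assuming the dispatch width is a power of two (such as 8, 16 or 32);
wrap_channel is a hypothetical helper written for this example, not a
function in the compiler:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the expression
//   invocation.ud & (bld.dispatch_width() - 1)
// Assumes dispatch_width is a power of two, in which case the mask
// is equivalent to invocation % dispatch_width.
constexpr std::uint32_t wrap_channel(std::uint32_t invocation,
                                     std::uint32_t dispatch_width) {
    return invocation & (dispatch_width - 1);
}

// In-range indices are unchanged; out-of-range ones wrap around.
static_assert(wrap_channel(0, 8) == 0, "channel 0 stays channel 0");
static_assert(wrap_channel(7, 8) == 7, "last SIMD8 channel is kept");
static_assert(wrap_channel(9, 8) == 1, "index 9 wraps to channel 1");
```

Under that assumption the masked index always names a channel that
exists at the current dispatch width, so the component selection cannot
read past the end of the value.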