nir/flrp: Lower flrp(a, b, #c) differently

This doesn't help on Intel GPUs now because we always take the
"always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".

Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick 2018-08-18 16:53:55 -07:00
parent ae02622d8f
commit c995d1ca3a
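
For readers following the diff, here is a minimal standalone sketch of the two lowerings that the added comment compares. It is not the NIR builder code from this commit. The function names flrp_strict and flrp_imprecise are hypothetical, and the instruction counts in the comments are an assumption about how a compiler would fold a constant t.

```c
/*
 * Illustrative sketch only: plain C stand-ins for two ways of lowering
 * flrp(x, y, t).  The names are hypothetical and do not exist in Mesa.
 */

/* Strict form: x(1 - t) + yt.
 *
 * When t is a compile-time constant, (1 - t) can be folded to a constant
 * too, leaving mul + mul + add (or mul + fma).  The two multiplies do not
 * depend on each other, so they can be scheduled in either order.
 */
static float flrp_strict(float x, float y, float t)
{
   const float one_minus_t = 1.0f - t;

   return x * one_minus_t + y * t;
}

/* Imprecise form: x + t(y - x).
 *
 * This is sub + mul + add (or sub + fma), but each step depends on the
 * previous one, and the result at t = 1 is not guaranteed to be exactly y.
 */
static float flrp_imprecise(float x, float y, float t)
{
   return x + t * (y - x);
}
```

For finite inputs, flrp_strict(x, y, 0.0f) returns x and flrp_strict(x, y, 1.0f) returns y, while flrp_imprecise can lose low-order bits of y at t = 1 when x and y differ greatly in magnitude.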


@@ -555,6 +555,23 @@ convert_flrp_instruction(nir_builder *bld,
}
}
/*
 * - If t is constant:
 *
 *        x(1 - t) + yt
 *
 *   The cost is three instructions without FMA or two instructions with
 *   FMA. This is the same cost as the imprecise lowering, but it gives
 *   the instruction scheduler a little more freedom.
 *
 *   There is no need to handle t = 0.5 specially. nir_opt_algebraic
 *   already has optimizations to convert 0.5x + 0.5y to 0.5(x + y).
 */
if (alu->src[2].src.ssa->parent_instr->type == nir_instr_type_load_const) {
   replace_with_strict(bld, dead_flrp, alu);
   return;
}
/*
 * - Otherwise
 *