intel/elk: Don't propagate saturate to an instruction that writes flags

There are two problems.

1. This is not NaN safe. 'add.le.sat dst F, Inf F, -Inf F' has a
   different result than 'add dst F, Inf F, -Inf F; cmp.le null, dst F, 0F'.

2. Ignoring the first problem, this only produces the desired flags
   for LE and G. All other cases can produce the wrong result.
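
Both problems can be reproduced with a small host-side model. This is only a sketch, not driver code: it assumes that saturate clamps to [0, 1] and maps NaN to 0, and that the conditional modifier on a saturating instruction is evaluated against the saturated result, which is what the two points above imply. The names Cond, saturate, compare_to_zero, flags_agree and check_commit_claims are hypothetical helpers invented for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-ins for the hardware conditional modifiers.
enum class Cond { Z, NZ, G, GE, L, LE };

// Assumed model of saturate: NaN becomes 0, everything else clamps to [0, 1].
static float saturate(float x)
{
   if (std::isnan(x))
      return 0.0f;
   return std::fmin(std::fmax(x, 0.0f), 1.0f);
}

// Flag value produced by comparing a result against zero.
static bool compare_to_zero(Cond c, float x)
{
   switch (c) {
   case Cond::Z:  return x == 0.0f;
   case Cond::NZ: return x != 0.0f;
   case Cond::G:  return x >  0.0f;
   case Cond::GE: return x >= 0.0f;
   case Cond::L:  return x <  0.0f;
   case Cond::LE: return x <= 0.0f;
   }
   return false;
}

// True if the flag computed from the unsaturated value (the separate cmp)
// matches the flag computed from the saturated value (the fused form)
// for every sample.
static bool flags_agree(Cond c, const std::vector<float> &samples)
{
   for (float x : samples) {
      if (compare_to_zero(c, x) != compare_to_zero(c, saturate(x)))
         return false;
   }
   return true;
}

// Checks the two claims from the commit message under the model above.
static void check_commit_claims()
{
   const std::vector<float> finite = {-2.0f, -0.5f, 0.0f, 0.5f, 1.0f, 2.0f};

   // Problem 2: on finite inputs only LE and G survive saturation.
   assert(flags_agree(Cond::LE, finite));
   assert(flags_agree(Cond::G, finite));
   assert(!flags_agree(Cond::Z, finite));
   assert(!flags_agree(Cond::NZ, finite));
   assert(!flags_agree(Cond::GE, finite));
   assert(!flags_agree(Cond::L, finite));

   // Problem 1: Inf + -Inf is NaN, so "NaN <= 0" is false, but after
   // saturation the comparison becomes "0 <= 0", which is true.
   const float inf = std::numeric_limits<float>::infinity();
   const float sum = inf + (-inf);
   assert(std::isnan(sum));
   assert(!compare_to_zero(Cond::LE, sum));
   assert(compare_to_zero(Cond::LE, saturate(sum)));
}
```

Calling check_commit_claims() from a test harness exercises both cases. A negative input such as -2.0 is enough to break Z, NZ, GE and L, because it saturates to exactly 0.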

shader-db:

All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18282314 -> 18282316 (<.01%)
instructions in affected programs: 78 -> 80 (2.56%)
helped: 0
HURT: 2

total cycles in shared programs: 952924234 -> 952924252 (<.01%)
cycles in affected programs: 584 -> 602 (3.08%)
helped: 0
HURT: 2

Fixes: e6022281f2 ("intel/elk: Rename files to use elk prefix")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29774>
Ian Romanick 2024-06-25 13:30:53 -07:00
parent 3d8fea0e09
commit 9125b7c1b4

@@ -45,7 +45,8 @@ using namespace elk;
  */
 static bool
-opt_saturate_propagation_local(const fs_live_variables &live, elk_bblock_t *block)
+opt_saturate_propagation_local(const intel_device_info *devinfo,
+                               const fs_live_variables &live, elk_bblock_t *block)
 {
    bool progress = false;
    int ip = block->end_ip + 1;
@@ -74,6 +75,16 @@ opt_saturate_propagation_local(const fs_live_variables &live, elk_bblock_t *bloc
      !scan_inst->can_change_types()))
    break;
+/* min and max pseudo ops modify the flags on Gfx4 and Gfx5, but
+ * it's not based on the result of the operation. This is the one
+ * case where it is always safe to propagate a saturate to an
+ * instruction that writes the flags.
+ */
+if (scan_inst->flags_written(devinfo) != 0 &&
+    scan_inst->opcode != ELK_OPCODE_SEL) {
+   break;
+}
 if (scan_inst->saturate) {
    inst->saturate = false;
    progress = true;
@@ -156,7 +167,7 @@ elk_fs_visitor::opt_saturate_propagation()
    bool progress = false;
    foreach_block (block, cfg) {
-      progress = opt_saturate_propagation_local(live, block) || progress;
+      progress = opt_saturate_propagation_local(devinfo, live, block) || progress;
    }
    /* Live intervals are still valid. */