aco: improve parse_delay_alu

Use gpr_map to determine how many cycles each dependency of the
s_delay_alu needs. With this information, the pass can avoid emitting
further s_delay_alu instructions.

fossil-db (gfx1100):
Totals from 13097 (9.73% of 134574) affected shaders:
Instrs: 30711894 -> 30702692 (-0.03%)
CodeSize: 153462500 -> 153425692 (-0.02%)
Latency: 372758612 -> 372741922 (-0.00%)
InvThroughput: 50164111 -> 50160717 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20512>
Rhys Perry 2023-01-03 18:14:16 +00:00 committed by Marge Bot
parent bbad550f3d
commit c8357136d4


@@ -432,6 +432,18 @@ parse_delay_alu(wait_ctx& ctx, alu_delay_info& delay, Instruction* instr)
      else if (wait >= alu_delay_wait::SALU_CYCLE_1)
         delay.salu_cycles = imm[i] - (uint32_t)alu_delay_wait::SALU_CYCLE_1 + 1;
   }
   for (std::pair<const PhysReg, wait_entry>& e : ctx.gpr_map) {
      wait_entry& entry = e.second;
      if (delay.valu_instrs <= entry.delay.valu_instrs)
         delay.valu_cycles = std::max(delay.valu_cycles, entry.delay.valu_cycles);
      if (delay.trans_instrs <= entry.delay.trans_instrs)
         delay.trans_cycles = std::max(delay.trans_cycles, entry.delay.trans_cycles);
      if (delay.salu_cycles <= entry.delay.salu_cycles)
         delay.salu_cycles = std::max(delay.salu_cycles, entry.delay.salu_cycles);
   }
   return true;
}
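
The loop added above can be tried in isolation with the rough sketch below.
It is not ACO code: DelayInfo, GprMap, merge_cycles_from_map and example are
hypothetical names invented for illustration, the map is keyed by a plain
register index instead of PhysReg, and the field values are made up. It only
mirrors the three comparisons in the hunk: whenever the parsed count is less
than or equal to an entry's count, the parsed cycle count is raised to at
least that entry's cycle count.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>

// Hypothetical, simplified stand-in for the delay fields used in the hunk.
struct DelayInfo {
   int valu_instrs;
   int valu_cycles;
   int trans_instrs;
   int trans_cycles;
   int salu_cycles;
};

// Hypothetical stand-in for gpr_map: register index -> tracked delay.
using GprMap = std::map<unsigned, DelayInfo>;

// Same comparisons as the loop in the patch, on the simplified types.
void merge_cycles_from_map(DelayInfo& delay, const GprMap& gpr_map)
{
   for (const std::pair<const unsigned, DelayInfo>& e : gpr_map) {
      const DelayInfo& entry = e.second;
      if (delay.valu_instrs <= entry.valu_instrs)
         delay.valu_cycles = std::max(delay.valu_cycles, entry.valu_cycles);
      if (delay.trans_instrs <= entry.trans_instrs)
         delay.trans_cycles = std::max(delay.trans_cycles, entry.trans_cycles);
      if (delay.salu_cycles <= entry.salu_cycles)
         delay.salu_cycles = std::max(delay.salu_cycles, entry.salu_cycles);
   }
}

// Made-up input: a parsed delay with instruction counts only, and two entries.
DelayInfo example()
{
   DelayInfo delay = {2, 0, 4, 0, 0};
   GprMap gpr_map;
   gpr_map[0] = DelayInfo{3, 4, 1, 9, 0};
   gpr_map[1] = DelayInfo{1, 7, 5, 2, 3};

   merge_cycles_from_map(delay, gpr_map);

   // Register 0 passes the VALU check (2 <= 3), register 1 does not (2 > 1).
   assert(delay.valu_cycles == 4);
   // Only register 1 passes the TRANS check (4 <= 5).
   assert(delay.trans_cycles == 2);
   // Both entries pass the SALU check; the larger cycle count wins.
   assert(delay.salu_cycles == 3);
   return delay;
}
```

Note that the SALU comparison reads and writes the same field, so in this
sketch the result of that branch can depend on iteration order; std::map
keeps the order deterministic here.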