aco/sched_ilp: improve scheduling with VMEM/DS->VALU WaW

This improves scheduling when one side of a divergent branch writes to a
VGPR using VMEM/DS and the other side writes to it using VALU. At the merge
block, the scheduler now properly considers that the VGPR was written by a
VMEM/DS instruction.

fossil-db (navi31):
Totals from 1224 (1.53% of 79825) affected shaders:
Instrs: 5264815 -> 5267604 (+0.05%); split: -0.00%, +0.06%
CodeSize: 27406404 -> 27422132 (+0.06%); split: -0.00%, +0.06%
Latency: 48325204 -> 48293975 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8923880 -> 8919191 (-0.05%); split: -0.07%, +0.02%

fossil-db (navi21):
Totals from 1267 (1.59% of 79825) affected shaders:
Instrs: 4628583 -> 4629190 (+0.01%); split: -0.00%, +0.01%
CodeSize: 24974672 -> 24977188 (+0.01%); split: -0.00%, +0.01%
Latency: 45080476 -> 44998120 (-0.18%); split: -0.20%, +0.02%
InvThroughput: 12288202 -> 12269634 (-0.15%); split: -0.16%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
Rhys Perry 2025-11-03 14:07:28 +00:00 committed by Marge Bot
parent 88b6b6db17
commit 69bc4efa37

@@ -573,8 +573,12 @@ remove_entry(SchedILPContext& ctx, const Instruction* const instr, const uint32_
    if (ctx.regs[reg].has_direct_dependency && ctx.regs[reg].direct_dependency == idx) {
       ctx.regs[reg].has_direct_dependency = false;
       if (!ctx.is_vopd) {
+         /* Do MAX2() so that the latency from both predecessors of a merge block are considered. */
+         if (BITSET_TEST(ctx.reg_has_latency, reg))
+            ctx.regs[reg].latency = MAX2(ctx.regs[reg].latency, latency);
+         else
+            ctx.regs[reg].latency = latency;
          BITSET_SET(ctx.reg_has_latency, reg);
-         ctx.regs[reg].latency = latency;
       }
    }
 }
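
The change in this hunk can be illustrated in isolation. The following is a minimal standalone sketch, not ACO code: `LatencyTracker`, `record_write`, `kNumRegs` and the register count are hypothetical names and values chosen for illustration, and `std::bitset` and `std::max` stand in for Mesa's `BITSET_*` and `MAX2` macros. It shows only the rule that a register which already carries a latency keeps the larger of the old and new values instead of being overwritten.

```cpp
#include <algorithm>
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical register count for the sketch; not taken from ACO.
constexpr std::size_t kNumRegs = 512;

// Hypothetical stand-in for the per-register latency state.
struct LatencyTracker {
   std::array<int32_t, kNumRegs> latency{};
   std::bitset<kNumRegs> has_latency;

   // Record the latency of a write to `reg`. If a latency was already
   // recorded for this register (e.g. by a write in the other predecessor
   // of a merge block), keep the maximum rather than overwriting it.
   void record_write(std::size_t reg, int32_t new_latency)
   {
      if (has_latency.test(reg))
         latency[reg] = std::max(latency[reg], new_latency);
      else
         latency[reg] = new_latency;
      has_latency.set(reg);
   }
};
```

With this rule, recording a long latency followed by a short one for the same register leaves the long latency in place, whereas the removed unconditional assignment would have kept only the value written last.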