aco: only apply DPP with 3 or fewer uses

Creating many new DPP instructions increases code size and decreases throughput.
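
The code-size argument can be illustrated with a toy model. This is a sketch under assumptions that are not taken from the commit: every user of the v_mov_b32 is a 4-byte VALU instruction, any DPP-encoded instruction takes 8 bytes, and all names below are hypothetical. Only the limit of 3 uses comes from the patch.

```cpp
#include <cstdint>

// Hypothetical toy model, not ACO code.
// Assumption: a plain VALU user is 4 bytes, any DPP-encoded instruction is 8 bytes.
constexpr uint32_t kPlainBytes = 4;
constexpr uint32_t kDppBytes = 8;

// Mirrors the limit added by the commit: temporaries with more than 3 uses are skipped.
constexpr uint32_t kMaxDppUses = 3;

constexpr bool should_apply_dpp(uint32_t uses)
{
   return uses <= kMaxDppUses;
}

// Bytes if the DPP v_mov_b32 is kept and its users stay plain.
constexpr uint32_t bytes_keep_mov(uint32_t uses)
{
   return kDppBytes + uses * kPlainBytes;
}

// Bytes if DPP is folded into every user and the v_mov_b32 is removed.
constexpr uint32_t bytes_fold_dpp(uint32_t uses)
{
   return uses * kDppBytes;
}
```

In this model folding saves 4 bytes for one use, breaks even at two uses, and costs 4 more bytes for each use beyond that. The model covers code size only, so it does not by itself derive the threshold of 3 chosen by the commit.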

Foz-DB Navi48:
Totals from 2196 (2.67% of 82179) affected shaders:
MaxWaves: 59930 -> 59960 (+0.05%); split: +0.08%, -0.03%
Instrs: 3718514 -> 3718298 (-0.01%); split: -0.08%, +0.07%
CodeSize: 20593544 -> 20507660 (-0.42%); split: -0.43%, +0.02%
VGPRs: 135924 -> 135744 (-0.13%); split: -0.17%, +0.04%
Latency: 33174704 -> 33163001 (-0.04%); split: -0.07%, +0.04%
InvThroughput: 6500723 -> 6491382 (-0.14%); split: -0.15%, +0.01%
VClause: 72348 -> 72343 (-0.01%); split: -0.06%, +0.05%
SClause: 83160 -> 83165 (+0.01%); split: -0.03%, +0.04%
Copies: 286592 -> 285575 (-0.35%); split: -0.45%, +0.09%
Branches: 99970 -> 99971 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 103280 -> 103279 (-0.00%)
PreVGPRs: 95590 -> 95440 (-0.16%); split: -0.30%, +0.14%
VALU: 1931369 -> 1931725 (+0.02%); split: -0.08%, +0.09%
SALU: 637663 -> 636780 (-0.14%); split: -0.15%, +0.01%
VOPD: 65236 -> 65589 (+0.54%); split: +0.91%, -0.37%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
Georg Lehmann 2026-01-25 18:43:20 +01:00 committed by Marge Bot
parent bb6a3e2891
commit 1c1bd9d090
2 changed files with 7 additions and 0 deletions

@@ -4979,6 +4979,9 @@ select_instruction(opt_ctx& ctx, aco_ptr<Instruction>& instr)
for (unsigned i = 0; i < input_info.operands.size(); i++) {
if (!input_info.operands[i].op.isTemp())
continue;
/* Applying DPP with many uses is unlikely to be profitable. */
if (ctx.uses[input_info.operands[i].op.tempId()] > 3)
continue;
Instruction* parent = ctx.info[input_info.operands[i].op.tempId()].parent_instr;
if (!parent->isDPP() || parent->opcode != aco_opcode::v_mov_b32 ||

@@ -612,6 +612,10 @@ try_combine_dpp(pr_opt_ctx& ctx, aco_ptr<Instruction>& instr)
if (mov->opcode != aco_opcode::v_mov_b32 || !mov->isDPP())
continue;
/* Applying DPP with many uses is unlikely to be profitable. */
if (ctx.uses[mov->definitions[0].tempId()] > 3)
continue;
/* If we aren't going to remove the v_mov_b32, we have to ensure that it doesn't overwrite
* its own operand before we use it.
*/