From e0e7bfadeef2779ea00a456c03ca8eeec76dd794 Mon Sep 17 00:00:00 2001
From: Rhys Perry
Date: Thu, 22 Aug 2024 15:37:20 +0100
Subject: [PATCH] aco: ignore exec and literals when mitigating VALUMaskWriteHazard
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

LLVM ignores exec and literals don't seem to work in some cases.

fossil-db (navi31):
Totals from 2676 (3.37% of 79395) affected shaders:
Instrs: 10638979 -> 10646019 (+0.07%); split: -0.00%, +0.07%
CodeSize: 55929640 -> 55959416 (+0.05%); split: -0.00%, +0.06%
Latency: 107707408 -> 107712893 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 18119843 -> 18120442 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry
Reviewed-by: Daniel Schürmann
Backport-to: 24.1
Backport-to: 24.2
Part-of:
(cherry picked from commit ee648326d9a70883063a1b8ff69948d75370be38)
---
 .pick_status.json                    | 2 +-
 src/amd/compiler/README-ISA.md       | 3 ++-
 src/amd/compiler/aco_insert_NOPs.cpp | 3 ++-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index 3be4cc257bb..a5c692592b1 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -1094,7 +1094,7 @@
         "description": "aco: ignore exec and literals when mitigating VALUMaskWriteHazard",
         "nominated": false,
         "nomination_type": 3,
-        "resolution": 4,
+        "resolution": 1,
         "main_sha": null,
         "because_sha": null,
         "notes": null
diff --git a/src/amd/compiler/README-ISA.md b/src/amd/compiler/README-ISA.md
index 62f38dd849f..3bd7dfeb873 100644
--- a/src/amd/compiler/README-ISA.md
+++ b/src/amd/compiler/README-ISA.md
@@ -375,4 +375,5 @@ Triggered by:
 SALU writing then reading a SGPR that was previously used as a lane mask for a VALU.
 
 Mitigated by:
-A VALU instruction reading a SGPR or with literal, or a sa_sdst=0 wait: `s_waitcnt_depctr 0xfffe`
+A VALU instruction reading a non-exec SGPR before the SALU write, or a sa_sdst=0 wait:
+`s_waitcnt_depctr 0xfffe`
diff --git a/src/amd/compiler/aco_insert_NOPs.cpp b/src/amd/compiler/aco_insert_NOPs.cpp
index 08856fd0359..c4d84edde44 100644
--- a/src/amd/compiler/aco_insert_NOPs.cpp
+++ b/src/amd/compiler/aco_insert_NOPs.cpp
@@ -1498,7 +1498,8 @@ handle_instruction_gfx11(State& state, NOP_ctx_gfx11& ctx, aco_ptr&
 
       if (state.program->wave_size == 64) {
          for (Operand& op : instr->operands) {
-            if (op.isLiteral() || (!op.isConstant() && op.physReg().reg() < 128))
+            /* This should ignore exec reads */
+            if (!op.isConstant() && op.physReg().reg() < 126)
                ctx.sgpr_read_by_valu_as_lanemask.reset();
          }
          switch (instr->opcode) {
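
Note on the operand check: the following is a minimal, standalone sketch of how the condition in `handle_instruction_gfx11` changes, not the ACO implementation. `SketchOperand`, `mitigates_before` and `mitigates_after` are hypothetical names introduced only for illustration. It assumes that register indices 126 and 127 are exec_lo/exec_hi (as the added comment in the patch suggests), and that a literal also counts as a constant operand.

```cpp
#include <cstdint>

// Hypothetical stand-in for aco::Operand; it models only the three queries
// that the patched condition uses.
struct SketchOperand {
   bool is_constant; // assumed true for inline constants and for literals
   bool is_literal;  // assumed to imply is_constant
   uint16_t reg;     // register index; assumed 126/127 = exec_lo/exec_hi
};

// Condition before the patch: a literal, or any non-constant operand with a
// register index below 128 (which includes exec), counted as a mitigating read.
bool mitigates_before(const SketchOperand& op)
{
   return op.is_literal || (!op.is_constant && op.reg < 128);
}

// Condition after the patch: only non-constant operands below index 126
// count, so literals and exec reads no longer clear the tracked lane masks.
bool mitigates_after(const SketchOperand& op)
{
   return !op.is_constant && op.reg < 126;
}
```

Under these assumptions, the two predicates differ only for literals and for exec_lo/exec_hi reads; every other operand is treated the same before and after the change.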