From 6e4db7e726c4eb8ba026666af1069ecbe1a07ceb Mon Sep 17 00:00:00 2001
From: Samuel Pitoiset
Date: Wed, 25 Mar 2020 18:17:38 +0100
Subject: [PATCH] ac/nir: use llvm.amdgcn.rsq for nir_op_frsq
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Instead of emitting 1.0 / sqrt(x) which includes a slow division
that LLVM doesn't always optimize even if the metadata is correctly
set.

pipeline-db (VEGA10/LLVM 9):
Totals from affected shaders:
SGPRS: 16872 -> 16864 (-0.05 %)
VGPRS: 15320 -> 15464 (0.94 %)
Spilled SGPRs: 2021 -> 2133 (5.54 %)
Code Size: 1915464 -> 1917476 (0.11 %) bytes
Max Waves: 641 -> 639 (-0.31 %)

pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 43936 -> 44120 (0.42 %)
VGPRS: 41776 -> 41972 (0.47 %)
Spilled SGPRs: 875 -> 875 (0.00 %)
Code Size: 4468164 -> 4468120 (-0.00 %) bytes
Max Waves: 2412 -> 2414 (0.08 %)

pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 60096 -> 60096 (0.00 %)
VGPRS: 63552 -> 63648 (0.15 %)
Spilled SGPRs: 6135 -> 6117 (-0.29 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 6252996 -> 6249772 (-0.05 %) bytes
Max Waves: 2324 -> 2337 (0.56 %)

LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.

Cc: 20.0
Signed-off-by: Samuel Pitoiset
Reviewed-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
Part-of:
(cherry picked from commit d548384fc686f4e9cc9e6551f9a582cc740f3233)
---
 .pick_status.json             | 2 +-
 src/amd/llvm/ac_nir_to_llvm.c | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index dc03de49ae9..3186759be9f 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -616,7 +616,7 @@
         "description": "ac/nir: use llvm.amdgcn.rsq for nir_op_frsq",
         "nominated": true,
         "nomination_type": 0,
-        "resolution": 0,
+        "resolution": 1,
         "master_sha": null,
         "because_sha": null
     },
diff --git a/src/amd/llvm/ac_nir_to_llvm.c b/src/amd/llvm/ac_nir_to_llvm.c
index f64399b5f77..c609384948f 100644
--- a/src/amd/llvm/ac_nir_to_llvm.c
+++ b/src/amd/llvm/ac_nir_to_llvm.c
@@ -834,9 +834,8 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
 		                              ac_to_float_type(&ctx->ac, def_type), src[0]);
 		break;
 	case nir_op_frsq:
-		result = emit_intrin_1f_param(&ctx->ac, "llvm.sqrt",
-		                              ac_to_float_type(&ctx->ac, def_type), src[0]);
-		result = ac_build_fdiv(&ctx->ac, LLVMConstReal(LLVMTypeOf(result), 1.0), result);
+		result = emit_intrin_1f_param(&ctx->ac, "llvm.amdgcn.rsq",
+		                              ac_to_float_type(&ctx->ac, def_type), src[0]);
 		break;
 	case nir_op_frexp_exp:
 		src[0] = ac_to_float(&ctx->ac, src[0]);
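
Illustration only, not part of the patch: a minimal toy model of how the nir_op_frsq hunk changes the emitted operations. The type and helper names (op_log, emit_op, log_has_op, lower_frsq_before, lower_frsq_after) are hypothetical and are not Mesa or LLVM APIs; the operation names are the ones that appear in the diff, with "fdiv" standing in for the division built by ac_build_fdiv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical log of emitted operations; not a Mesa or LLVM structure. */
enum { MAX_OPS = 8 };

struct op_log {
    const char *ops[MAX_OPS];
    size_t count;
};

/* Record one emitted operation by name. */
static void emit_op(struct op_log *log, const char *name)
{
    assert(log->count < MAX_OPS);
    log->ops[log->count++] = name;
}

/* Return 1 if an operation with this name was recorded, else 0. */
static int log_has_op(const struct op_log *log, const char *name)
{
    for (size_t i = 0; i < log->count; i++) {
        if (strcmp(log->ops[i], name) == 0)
            return 1;
    }
    return 0;
}

/* Before the patch: llvm.sqrt followed by a division of 1.0 by the result. */
static void lower_frsq_before(struct op_log *log)
{
    emit_op(log, "llvm.sqrt");
    emit_op(log, "fdiv");
}

/* After the patch: a single llvm.amdgcn.rsq intrinsic call. */
static void lower_frsq_after(struct op_log *log)
{
    emit_op(log, "llvm.amdgcn.rsq");
}
```

Running both helpers on zero-initialized logs records two operations for the old lowering and one for the new, with no "fdiv" left for LLVM to optimize. The model says nothing about register pressure or code size, which is what the pipeline-db totals in the commit message measure.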