ac/llvm: use mul24 intrinsics

With the current code, LLVM ended up generating v_mad_u64_u32 instructions in clpeak. With this change we get nice v_mad_u32_s24 ones instead, and a 4x performance increase in the int24 benchmark.

Suggested-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34630>
2025-12-24 13:10:10 +01:00 · 2025-04-21 13:38:01 +02:00 · 2025-04-21 13:38:01 +02:00 · e3edc6029b
commit e3edc6029b
parent 3c267535ae
1 changed file with 5 additions and 1 deletion
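
The commit message relies on what a 24-bit multiply computes, so a small reference model may help when reading the diff. The sketch below is an assumption-laden illustration, not Mesa or LLVM code: it assumes that the signed intrinsic sign-extends the low 24 bits of each operand, that the unsigned one zero-extends them, and that both return the low 32 bits of the product. The helper names (sext24, mul_i24_model, mul_u24_model, check_mul24_model) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 24 bits of x to 32 bits. */
static int32_t sext24(uint32_t x)
{
   x &= 0xffffffu;
   return (int32_t)((x ^ 0x800000u) - 0x800000u);
}

/* Assumed model of a signed 24-bit multiply: only the low 24 bits of
 * each operand are used, and the low 32 bits of the product are kept. */
static uint32_t mul_i24_model(uint32_t a, uint32_t b)
{
   return (uint32_t)((int64_t)sext24(a) * sext24(b));
}

/* Assumed model of an unsigned 24-bit multiply. */
static uint32_t mul_u24_model(uint32_t a, uint32_t b)
{
   return (uint32_t)((uint64_t)(a & 0xffffffu) * (b & 0xffffffu));
}

/* When both operands fit in 24 bits, the model agrees with a plain
 * wrapping 32-bit multiply; otherwise the upper bits are ignored. */
static void check_mul24_model(void)
{
   uint32_t a = 1234567u, b = 89u;
   assert(mul_u24_model(a, b) == a * b);
   assert(mul_i24_model((uint32_t)-2, 7u) == (uint32_t)-2 * 7u);
   assert(mul_u24_model(0x1000001u, 5u) != 0x1000001u * 5u);
}
```

Under these assumptions, a full 32-bit multiply and a 24-bit multiply give the same result only when the operands already fit in 24 bits. This is presumably why the diff keeps nir_op_imul on the regular multiply path and moves only the *_relaxed opcodes to the intrinsics.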
--- a/src/amd/llvm/ac_nir_to_llvm.c
+++ b/src/amd/llvm/ac_nir_to_llvm.c
@@ -612,9 +612,13 @@ static bool visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
      else
         result = LLVMBuildSub(ctx->ac.builder, src[0], src[1], "");
      break;
-   case nir_op_imul:
   case nir_op_imul24_relaxed:
+      result = ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.mul.i24", ctx->ac.i32, src, 2, 0);
+      break;
   case nir_op_umul24_relaxed:
+      result = ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.mul.u24", ctx->ac.i32, src, 2, 0);
+      break;
+   case nir_op_imul:
      if (instr->no_unsigned_wrap)
         result = LLVMBuildNUWMul(ctx->ac.builder, src[0], src[1], "");
      else if (instr->no_signed_wrap)