nir/opt_load_store_vectorize: match amul like imul

for AGX, we preserve amul all the way until fusing address modes in order to be able to fuse effectively. so the load/store vectorizer wouldn't vectorize before fusing, because it only pattern matches imul.

however, after fusing we get fused intrinsics, which are tricky to teach the vectorizer about as their semantics are pretty subtle. so we can't vectorize after fusing, either.

the easiest solution is to teach the vectorizer about amul, which can always be replaced by imul for our pattern matches. this fixes certain cases of vectorization in OpenCL kernels on asahi.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32398>
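
The idea in the message can be modeled outside of NIR. The sketch below is a toy, not Mesa code: `toy_op`, `toy_alu`, `toy_effective_op` and `toy_parse_alu` are hypothetical stand-ins invented for illustration, and the real `parse_alu` works on `nir_scalar` values and chases ALU sources, which this model skips. It only shows why normalizing the opcode before the comparison lets an existing imul pattern also accept amul.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for NIR opcodes; not the real nir_op enum. */
enum toy_op { TOY_OP_IADD, TOY_OP_IMUL, TOY_OP_AMUL };

/* Toy ALU instruction: an opcode and one constant source. */
struct toy_alu {
   enum toy_op op;
   uint64_t constant;
};

/* Report amul as imul, so callers only need to match the more
 * general opcode. Every other opcode is returned unchanged. */
static enum toy_op
toy_effective_op(const struct toy_alu *alu)
{
   return alu->op == TOY_OP_AMUL ? TOY_OP_IMUL : alu->op;
}

/* If "alu" effectively has opcode "op", store its constant in "c"
 * and return true. Otherwise leave "c" untouched and return false. */
static bool
toy_parse_alu(const struct toy_alu *alu, enum toy_op op, uint64_t *c)
{
   if (toy_effective_op(alu) != op)
      return false;

   *c = alu->constant;
   return true;
}
```

In this model, a caller that asks for `TOY_OP_IMUL` matches both multiply opcodes, so no call site has to learn about amul. The normalization lives in one helper rather than in every pattern.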
2025-12-24 15:20:10 +01:00 · 2024-11-27 08:57:20 -05:00 · 2024-11-27 08:57:20 -05:00 · 0d77e91ca3
commit 0d77e91ca3
parent 77d4ed0a01
1 changed file with 15 additions and 1 deletion
--- a/src/compiler/nir/nir_opt_load_store_vectorize.c
+++ b/src/compiler/nir/nir_opt_load_store_vectorize.c
@@ -258,13 +258,27 @@ get_write_mask(const nir_intrinsic_instr *intrin)
   return nir_component_mask(intrin->src[info->value_src].ssa->num_components);
 }

+static nir_op
+get_effective_alu_op(nir_scalar scalar)
+{
+   nir_op op = nir_scalar_alu_op(scalar);
+
+   /* amul can always be replaced by imul and we pattern match on the more
+    * general opcode, so return imul for amul.
+    */
+   if (op == nir_op_amul)
+      return nir_op_imul;
+   else
+      return op;
+}
+
 /* If "def" is from an alu instruction with the opcode "op" and one of it's
 * sources is a constant, update "def" to be the non-constant source, fill "c"
 * with the constant and return true. */
 static bool
 parse_alu(nir_scalar *def, nir_op op, uint64_t *c)
 {
-   if (!nir_scalar_is_alu(*def) || nir_scalar_alu_op(*def) != op)
+   if (!nir_scalar_is_alu(*def) || get_effective_alu_op(*def) != op)
      return false;

   nir_scalar src0 = nir_scalar_chase_alu_src(*def, 0);