pan/mdg: Replicate swizzles for scalar sources

This works around an issue packing 32-bit scalar swizzles zero-extended to 64-bit, seen with the umul_high implementation. I spent a while trying to figure out the root cause (and even rewrote a big chunk of the disassembler) but am still a bit lost. Nevertheless, this is a safe workaround with no performance impact (and it avoids relying on NIR undefined behaviour to implement GPU undefined behaviour), so let's do this for now to fix umul_high.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17860>
2026-01-01 18:20:10 +01:00 · 2022-08-24 12:20:04 -04:00 · 2022-08-24 12:20:04 -04:00 · 7b78e05ba8
commit 7b78e05ba8
parent e951d6362c
1 changed files with 19 additions and 2 deletions
--- a/src/panfrost/midgard/midgard_compile.c
+++ b/src/panfrost/midgard/midgard_compile.c
@@ -661,10 +661,27 @@ mir_copy_src(midgard_instruction *ins, nir_alu_instr *instr, unsigned i, unsigne
        ins->src[to] = nir_src_index(NULL, &src.src);
        ins->src_types[to] = nir_op_infos[instr->op].input_types[i] | bits;

+        /* Figure out which component we should fill unused channels with. This
+         * doesn't matter too much in the non-broadcast case, but it makes
+         * sure that scalar sources are packed with replicated swizzles,
+         * which works around issues seen with the combination of source
+         * expansion and destination shrinking.
+         */
+        unsigned replicate_c = 0;
+        if (bcast_count) {
+                replicate_c = bcast_count - 1;
+        } else {
+                for (unsigned c = 0; c < NIR_MAX_VEC_COMPONENTS; ++c) {
+                        if (nir_alu_instr_channel_used(instr, i, c))
+                                replicate_c = c;
+                }
+        }
+
        for (unsigned c = 0; c < NIR_MAX_VEC_COMPONENTS; ++c) {
                ins->swizzle[to][c] = src.swizzle[
-                        (!bcast_count || c < bcast_count) ? c :
-                                (bcast_count - 1)];
+                        ((!bcast_count || c < bcast_count) &&
+                          nir_alu_instr_channel_used(instr, i, c)) ?
+                        c : replicate_c];
        }
 }
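
For readers who want to try the new swizzle selection outside the compiler, here is a minimal standalone sketch of the logic in the hunk above. It is not the Mesa code: `MAX_COMPONENTS` stands in for NIR_MAX_VEC_COMPONENTS (assumed to be 16 here), `channel_used()` is a hypothetical replacement for nir_alu_instr_channel_used() that reads a plain bitmask, and `build_swizzle()` is an illustrative name that does not exist in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for NIR_MAX_VEC_COMPONENTS (assumed value, for illustration). */
#define MAX_COMPONENTS 16

/* Hypothetical stand-in for nir_alu_instr_channel_used(): bit c of
 * used_mask is set when the instruction reads channel c of the source. */
static bool
channel_used(uint32_t used_mask, unsigned c)
{
        return (used_mask >> c) & 1u;
}

/* Illustrative helper mirroring the loop in mir_copy_src(): channels that
 * are read keep their own swizzle entry, and every other channel repeats
 * the entry of replicate_c instead of passing through an arbitrary one. */
static void
build_swizzle(const uint8_t src_swizzle[MAX_COMPONENTS], uint32_t used_mask,
              unsigned bcast_count, uint8_t out[MAX_COMPONENTS])
{
        unsigned replicate_c = 0;

        if (bcast_count) {
                /* Broadcast case: repeat the last broadcast component. */
                replicate_c = bcast_count - 1;
        } else {
                /* Otherwise: repeat the last channel that is actually read. */
                for (unsigned c = 0; c < MAX_COMPONENTS; ++c) {
                        if (channel_used(used_mask, c))
                                replicate_c = c;
                }
        }

        for (unsigned c = 0; c < MAX_COMPONENTS; ++c) {
                bool keep = (!bcast_count || c < bcast_count) &&
                            channel_used(used_mask, c);

                out[c] = src_swizzle[keep ? c : replicate_c];
        }
}
```

With a scalar source that reads only channel 0 (used_mask = 0x1, bcast_count = 0), every entry of `out` ends up equal to `src_swizzle[0]`, which is the replicated swizzle the commit message describes.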