brw/reg_allocate: Require SIMD32 for destination / source interference on Xe2

No platforms other than Lunar Lake were affected in shader-db or
fossil-db for obvious reasons.

shader-db:

Lunar Lake
total instructions in shared programs: 17070074 -> 17069908 (<.01%)
instructions in affected programs: 151939 -> 151773 (-0.11%)
helped: 61 / HURT: 60

total cycles in shared programs: 891338314 -> 880188516 (-1.25%)
cycles in affected programs: 550482120 -> 539332322 (-2.03%)
helped: 8053 / HURT: 7183

total spills in shared programs: 3294 -> 3278 (-0.49%)
spills in affected programs: 138 -> 122 (-11.59%)
helped: 8 / HURT: 0

total fills in shared programs: 1653 -> 1632 (-1.27%)
fills in affected programs: 212 -> 191 (-9.91%)
helped: 8 / HURT: 0

LOST:   96
GAINED: 70

fossil-db:

Lunar Lake
Totals:
Instrs: 208555066 -> 208509387 (-0.02%); split: -0.03%, +0.00%
Cycle count: 31487691872 -> 31318442816 (-0.54%); split: -0.88%, +0.34%
Spill count: 508701 -> 504809 (-0.77%); split: -0.86%, +0.10%
Fill count: 612583 -> 607047 (-0.90%); split: -1.03%, +0.13%
Scratch Memory Size: 35311616 -> 35037184 (-0.78%); split: -0.81%, +0.04%

Totals from 214417 (30.33% of 706852) affected shaders:
Instrs: 123732970 -> 123687291 (-0.04%); split: -0.04%, +0.01%
Cycle count: 27410928904 -> 27241679848 (-0.62%); split: -1.01%, +0.39%
Spill count: 452458 -> 448566 (-0.86%); split: -0.97%, +0.11%
Fill count: 550991 -> 545455 (-1.00%); split: -1.15%, +0.14%
Scratch Memory Size: 31138816 -> 30864384 (-0.88%); split: -0.92%, +0.04%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903>
2026-05-08 06:58:05 +02:00 · 2025-07-01 16:21:03 -07:00 · 2025-07-01 16:21:03 -07:00 · 4e05de7c3d
commit 4e05de7c3d
parent e9ae997ffc
1 changed files with 22 additions and 9 deletions
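
The hazard described in the code comment of the hunk that follows, and the effect of the new size threshold, can be illustrated with a small standalone model. This is only a sketch and not Mesa code: `kRegSize`, `RegFile`, `and16_split`, `and16_ideal`, `shows_hazard`, and `needs_interference` are hypothetical names, and it assumes `REG_SIZE` is 32 bytes and that `reg_unit(devinfo)` evaluates to 2 on Xe2 and 1 on earlier platforms.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each register holds eight 32-bit lanes (32 bytes).
constexpr int kLanes = 8;
constexpr unsigned kRegSize = 32;  // assumed to match REG_SIZE in the hunk
using Reg = std::array<uint32_t, kLanes>;
using RegFile = std::vector<Reg>;

// "and(16) dst src0 src1" run as two SIMD8 halves, one after the other,
// so the second half observes whatever the first half wrote.
void and16_split(RegFile &g, int dst, int src0, int src1) {
    for (int half = 0; half < 2; ++half)
        for (int lane = 0; lane < kLanes; ++lane)
            g[dst + half][lane] = g[src0 + half][lane] & g[src1 + half][lane];
}

// Reference semantics: read every source lane first, then write.
void and16_ideal(RegFile &g, int dst, int src0, int src1) {
    Reg out[2];
    for (int half = 0; half < 2; ++half)
        for (int lane = 0; lane < kLanes; ++lane)
            out[half][lane] = g[src0 + half][lane] & g[src1 + half][lane];
    for (int half = 0; half < 2; ++half)
        g[dst + half] = out[half];
}

// Register contents chosen so a clobbered source changes the result.
RegFile make_regs() {
    RegFile g(32, Reg{});
    g[13].fill(0xFFFFFFFFu);
    g[14].fill(0xFFFFFFFFu);
    g[16].fill(0x0000FFFFu);
    g[17].fill(0xFFFF0000u);
    return g;
}

// True when split execution disagrees with the reference semantics
// for the given destination and first-source register numbers.
bool shows_hazard(int dst, int src0) {
    RegFile a = make_regs(), b = make_regs();
    and16_split(a, dst, src0, 13);
    and16_ideal(b, dst, src0, 13);
    return a != b;
}

// Mirrors the shape of the patched condition: the destination must span
// more than one physical register before interference is added.
bool needs_interference(unsigned exec_size, unsigned type_bytes,
                        unsigned reg_unit) {
    return exec_size * type_bytes > reg_unit * kRegSize;
}
```

Under these assumptions, `shows_hazard(17, 16)` is true (the `g17`/`g16` case from the comment) while `shows_hazard(16, 16)` is false, and `needs_interference(16, 4, 2)` is false while `needs_interference(32, 4, 2)` is true, which is consistent with the commit title's "Require SIMD32" on Xe2.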
--- a/src/intel/compiler/brw_reg_allocate.cpp
+++ b/src/intel/compiler/brw_reg_allocate.cpp
@@ -536,16 +536,29 @@ brw_reg_alloc::setup_inst_interference(const brw_inst *inst)
   }

   /* A compressed instruction is actually two instructions executed
-    * simultaneously.  On most platforms, it ok to have the source and
-    * destination registers be the same.  In this case, each instruction
-    * over-writes its own source and there's no problem.  The real problem
-    * here is if the source and destination registers are off by one.  Then
-    * you can end up in a scenario where the first instruction over-writes the
-    * source of the second instruction.  Since the compiler doesn't know about
-    * this level of granularity, we simply make the source and destination
-    * interfere.
+    * simultaneously. If the source and destination registers are the same,
+    * each instruction overwrites its own source, and there's no problem. The
+    * real problem here is if the source and destination registers are off by
+    * one. Then you can end up in a scenario where the first instruction
+    * overwrites the source of the second instruction. Consider this
+    * instruction:
+    *
+    *    and(16)         g17<1>UD        g16<1,1,0>UD    g13<1,1,0>UD
+    *
+    * The EU processes this as
+    *
+    *    and(8)          g17<1>UD        g16<1,1,0>UD    g13<1,1,0>UD
+    *    and(8)          g18<1>UD        g17<1,1,0>UD    g14<1,1,0>UD
+    *
+    * The first SIMD8 part of the instruction overwrites the source used in
+    * the second SIMD8 part. Since there's no way to tell the register
+    * allocator "the destination register number can be src, but it can't be
+    * src+1," simply make the source and destination interfere.
+    *
+    * Theoretically, the register_coalesce passes should have done the dest ==
+    * src merging.
    */
-   if (inst->dst.component_size(inst->exec_size) > REG_SIZE &&
+   if (inst->dst.component_size(inst->exec_size) > (reg_unit(devinfo) * REG_SIZE) &&
       inst->dst.file == VGRF) {
      for (int i = 0; i < inst->sources; ++i) {
         if (inst->src[i].file == VGRF) {