jay/lower_scoreboard: use .src annotations

This is less heavy-handed, avoiding unnecessary stalls after SENDs in a
bunch of common cases. The stats (SIMD32) are:

Totals:
CodeSize: 70345392 -> 71674272 (+1.89%)

Totals from 1774 (67.02% of 2647) affected shaders:
CodeSize: 67359248 -> 68688128 (+1.97%)

What's happening here is that we now insert extra SYNC.nop instructions in a
bunch of cases for the .src sync preceding the eventual .dst sync. However,
setting aside the i-cache impact for a moment, this shows the optimization
doing exactly what it should: deferring dst syncs and inserting cheaper src
syncs first. So this should be a win in practice despite the negative
code-size stat.

The most hurt shaders pool up SYNC.nop's at the end of blocks, due to
local-only SWSB and the lack of a SYNC.allwr optimization. The latter is added
later in this MR; a fix for the former is planned.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41398>
Alyssa Rosenzweig 2026-05-06 09:47:36 -04:00 committed by Marge Bot
parent 130e724d5e
commit 0885ed10f5

@@ -69,10 +69,13 @@ lower_send_local(jay_function *func, jay_block *block)
 		struct gpr_range dst = def_to_gpr(func, I, I->dst);
 		u_foreach_bit(sbid, busy) {
-			if (BITSET_TEST_COUNT(tokens[sbid].reading, dst.base, dst.width) ||
-			    BITSET_TEST_COUNT(tokens[sbid].writing, dst.base, dst.width)) {
+			if (BITSET_TEST_COUNT(tokens[sbid].writing, dst.base, dst.width)) {
 				jay_SYNC_nop(&b, tgl_swsb_sbid(TGL_SBID_DST, sbid));
 				busy &= ~BITFIELD_BIT(sbid);
+			} else if (BITSET_TEST_COUNT(tokens[sbid].reading, dst.base,
+			                             dst.width)) {
+				jay_SYNC_nop(&b, tgl_swsb_sbid(TGL_SBID_SRC, sbid));
+				BITSET_ZERO(tokens[sbid].reading);
 			}
 		}
 	}