jay/lower_scoreboard: run RegDist globally

poking around, it seems branches stall the pipelines so we don't need to do any
dataflow analysis, but we do need to fall through for correctness. just keep
going across block boundaries. this isn't optimal yet but it reduces a pile of
A@1's already.

Totals from 1389 (52.47% of 2647) affected shaders:
CodeSize: 56385376 -> 56325776 (-0.11%); split: -0.13%, +0.03%

this also fixes issues where the first instruction of a block is a SEND that has
an unmet register dependency, since the old code was fundamentally broken. oops.
lol. fixes
dEQP-VK.compute.pipeline.workgroup_memory_explicit_layout.zero.uint8_t_array_to_uint_array_1
among many others.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41510>
2026-05-28 14:18:13 +02:00 · 2026-05-07 10:28:12 -04:00 · 2026-05-07 10:28:12 -04:00 · a7b8395c15
commit a7b8395c15
parent 52224bb597
1 changed file with 1 addition and 7 deletions
--- a/src/intel/compiler/jay/jay_lower_scoreboard.c
+++ b/src/intel/compiler/jay/jay_lower_scoreboard.c
@@ -389,12 +389,6 @@ lower_regdist_local(jay_function *func,

      last_sync = NULL;
   }
-
-   /* Sync on block boundaries. */
-   jay_inst *first = jay_first_inst(block);
-   if (block != jay_first_block(func) && first && first->op != JAY_OPCODE_SEND) {
-      first->dep = tgl_swsb_regdist(1);
-   }
 }

 /*
@@ -426,6 +420,7 @@ jay_lower_scoreboard(jay_shader *shader)
   u32_per_pipe *access = malloc(sizeof(*access) * nr_keys);

   jay_foreach_function(shader, f) {
+      memset(access, 0, sizeof(*access) * nr_keys);
      struct swsb_state state = { .access = access };

      jay_foreach_block(f, block) {
@@ -433,7 +428,6 @@ jay_lower_scoreboard(jay_shader *shader)
      }

      jay_foreach_block(f, block) {
-         memset(access, 0, sizeof(*access) * nr_keys);
         lower_regdist_local(f, block, &state);
      }
   }
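
Why moving the memset matters can be seen in a toy model. The sketch below is not the jay code: `toy_inst`, `toy_regdist`, and `NR_REGS` are hypothetical names, each instruction has a single source and destination, and real RegDist details (separate pipes, the maximum encodable distance, SEND handling) are left out. It only shows that clearing the access table at every block start forgets writers from the previous block, while clearing it once per function keeps tracking them across the fallthrough.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_REGS 8 /* hypothetical register file size for the toy model */

/* Hypothetical toy instruction: one destination and one source register. */
struct toy_inst {
   int dst;
   int src;
   bool block_start; /* true if this instruction begins a new block */
};

/*
 * For each instruction, record the distance (in instructions) back to the
 * most recent tracked writer of its source, or 0 if no writer is tracked.
 *
 * last_write[r] holds (index of the last writer of r) + 1, so 0 means "none".
 * With reset_per_block set, the table is cleared at every block start, which
 * mimics the old per-block memset. Otherwise it is cleared once up front.
 */
static void
toy_regdist(const struct toy_inst *insts, int n, bool reset_per_block,
            int *dist_out)
{
   int last_write[NR_REGS];
   memset(last_write, 0, sizeof(last_write));

   for (int i = 0; i < n; ++i) {
      assert(insts[i].src >= 0 && insts[i].src < NR_REGS);
      assert(insts[i].dst >= 0 && insts[i].dst < NR_REGS);

      if (reset_per_block && insts[i].block_start)
         memset(last_write, 0, sizeof(last_write));

      int w = last_write[insts[i].src];
      dist_out[i] = w ? (i + 1 - w) : 0;

      /* This instruction is now the most recent writer of its destination. */
      last_write[insts[i].dst] = i + 1;
   }
}
```

For a four-instruction sequence where the second block starts at index 2 and reads a register written at index 1, the per-function variant reports a distance of 1 for that instruction, while the per-block variant reports no dependency at all. The removed code papered over that gap with a fixed `tgl_swsb_regdist(1)` on the first instruction of each block, except when that instruction was a SEND.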