jay/lower_scoreboard: run RegDist globally

poking around, it seems branches stall the pipelines, so we don't need any
dataflow analysis, but we do need to carry dependencies across fallthroughs for
correctness. so just keep tracking across block boundaries instead of resetting
per block. this isn't optimal yet but it already gets rid of a pile of A@1's.
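
A toy model of the idea, for illustration only. This is not the jay pass: every name below (`toy_inst`, `toy_regdist`, `toy_demo`) is hypothetical, it assumes a single in-order pipe and a maximum distance of 7, and it ignores SENDs and per-pipe tracking entirely. It compares resetting the last-writer table at each block (and putting a distance-1 sync on the block's first instruction) with simply continuing across blocks in layout order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NUM_REGS 8
#define TOY_MAX_DIST 7   /* assumed maximum encodable distance */
#define TOY_NO_REG   (-1)

/* Hypothetical toy instruction: at most one destination and one source. */
typedef struct {
   int dst;          /* register written, or TOY_NO_REG */
   int src;          /* register read, or TOY_NO_REG */
   bool block_start; /* first instruction of a basic block */
} toy_inst;

/*
 * Fill dist[i] with the distance annotation for instruction i (0 = none).
 * global=false: forget all writers at each block start and sync at distance 1.
 * global=true:  keep tracking across block boundaries in layout order.
 */
void
toy_regdist(const toy_inst *insts, int n, bool global, int *dist)
{
   int last_write[TOY_NUM_REGS]; /* index + 1 of the last writer, 0 = none */
   memset(last_write, 0, sizeof(last_write));

   for (int i = 0; i < n; ++i) {
      const toy_inst *I = &insts[i];
      assert(I->dst >= TOY_NO_REG && I->dst < TOY_NUM_REGS);
      assert(I->src >= TOY_NO_REG && I->src < TOY_NUM_REGS);
      dist[i] = 0;

      if (!global && I->block_start && i > 0) {
         /* Per-block variant: drop the history, conservatively sync. */
         memset(last_write, 0, sizeof(last_write));
         dist[i] = 1;
      }

      if (I->src != TOY_NO_REG && last_write[I->src]) {
         int d = (i + 1) - last_write[I->src];
         if (d <= TOY_MAX_DIST)
            dist[i] = d;
      }

      if (I->dst != TOY_NO_REG)
         last_write[I->dst] = i + 1;
   }
}

/* Runs a fixed 6-instruction, 3-block program; returns the number of
 * distance-1 annotations and writes the per-instruction distances to dist. */
int
toy_demo(bool global, int dist[6])
{
   static const toy_inst prog[6] = {
      { .dst = 0, .src = TOY_NO_REG, .block_start = true  }, /* block A */
      { .dst = 1, .src = TOY_NO_REG, .block_start = false },
      { .dst = 2, .src = TOY_NO_REG, .block_start = false },
      { .dst = TOY_NO_REG, .src = 0, .block_start = true  }, /* block B */
      { .dst = TOY_NO_REG, .src = 3, .block_start = false },
      { .dst = 4, .src = TOY_NO_REG, .block_start = true  }, /* block C */
   };

   toy_regdist(prog, 6, global, dist);

   int count = 0;
   for (int i = 0; i < 6; ++i)
      count += (dist[i] == 1);
   return count;
}
```

In this toy program the per-block variant emits two distance-1 syncs (one per block entry), while the global variant emits none and instead gives the read of r0 in block B its real distance of 3.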

Totals from 1389 (52.47% of 2647) affected shaders:
CodeSize: 56385376 -> 56325776 (-0.11%); split: -0.13%, +0.03%

--

this also fixes the case where the first instruction of a block is a SEND with
an unmet register dependency: the old code skipped the block-boundary sync for
SENDs, so it was fundamentally broken there. oops. fixes
dEQP-VK.compute.pipeline.workgroup_memory_explicit_layout.zero.uint8_t_array_to_uint_array_1
among many others.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41510>
Alyssa Rosenzweig 2026-05-07 10:28:12 -04:00 committed by Marge Bot
parent 52224bb597
commit a7b8395c15


@@ -389,12 +389,6 @@ lower_regdist_local(jay_function *func,
       last_sync = NULL;
    }
-   /* Sync on block boundaries. */
-   jay_inst *first = jay_first_inst(block);
-   if (block != jay_first_block(func) && first && first->op != JAY_OPCODE_SEND) {
-      first->dep = tgl_swsb_regdist(1);
-   }
 }
 /*
@@ -426,6 +420,7 @@ jay_lower_scoreboard(jay_shader *shader)
    u32_per_pipe *access = malloc(sizeof(*access) * nr_keys);
    jay_foreach_function(shader, f) {
+      memset(access, 0, sizeof(*access) * nr_keys);
       struct swsb_state state = { .access = access };
       jay_foreach_block(f, block) {
@@ -433,7 +428,6 @@ jay_lower_scoreboard(jay_shader *shader)
       }
       jay_foreach_block(f, block) {
-         memset(access, 0, sizeof(*access) * nr_keys);
          lower_regdist_local(f, block, &state);
       }
    }