ir3/ra: make main shader reg select independent of preamble

ir3_ra allocates registers in a round-robin fashion to avoid false
dependencies. In order to do this, it keeps track of a "file start"
register for each register file and will search starting from there for
available registers.
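
As a rough illustration of the round-robin search described above, here is a
minimal sketch. It is not the actual ir3 code: `toy_file`, `toy_alloc`,
`toy_free` and `FILE_SIZE` are hypothetical names made up for this example, and
every value is assumed to occupy exactly one register.

```c
#include <assert.h>
#include <stdbool.h>

#define FILE_SIZE 8 /* hypothetical, tiny register file */

/* Simplified stand-in for a register file; not the real struct ra_file. */
struct toy_file {
   bool used[FILE_SIZE];
   unsigned start; /* "file start": where the next search begins */
};

/* Search for a free register beginning at file->start and wrapping around.
 * Returns the register index, or -1 if every register is in use.
 */
static int
toy_alloc(struct toy_file *file)
{
   for (unsigned i = 0; i < FILE_SIZE; i++) {
      unsigned reg = (file->start + i) % FILE_SIZE;
      if (!file->used[reg]) {
         file->used[reg] = true;
         /* Advance the start so the next search begins after this register. */
         file->start = (reg + 1) % FILE_SIZE;
         return (int)reg;
      }
   }
   return -1;
}

static void
toy_free(struct toy_file *file, unsigned reg)
{
   assert(reg < FILE_SIZE);
   file->used[reg] = false;
}
```

Because the search resumes after the most recently allocated register, a
register that was just freed is not handed out again right away, which is what
avoids the false dependency.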

This file start is initialized at the beginning of RA and kept across
blocks, including across the preamble. This means that a change that
only affects the preamble may cause changes in how registers are
allocated in the main shader. This may result in more or fewer copies,
and more or fewer false dependencies, which changes the behavior of
postsched.

Changes in the preamble affecting the main shader make it more
difficult to analyze shader-db results, as I often find myself chasing
down a regression that is just caused by RA/postsched "bad luck" in a
main shader that didn't actually change. Prevent this by resetting the
file start at the beginning of the main shader.
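
The effect of the reset can be sketched with a toy model. This is only an
illustration under simplifying assumptions, not the ir3 implementation:
`toy_ra`, `toy_ra_alloc`, `first_main_reg` and `NUM_REGS` are hypothetical
names, each value takes one register, preamble temporaries die immediately, and
values that are live through the preamble are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 16 /* hypothetical register file size */

struct toy_ra {
   bool used[NUM_REGS];
   unsigned start; /* round-robin search start */
};

/* Round-robin allocation: search from ra->start, wrapping around. */
static int
toy_ra_alloc(struct toy_ra *ra)
{
   for (unsigned i = 0; i < NUM_REGS; i++) {
      unsigned reg = (ra->start + i) % NUM_REGS;
      if (!ra->used[reg]) {
         ra->used[reg] = true;
         ra->start = (reg + 1) % NUM_REGS;
         return (int)reg;
      }
   }
   return -1;
}

/* Allocate and immediately release `preamble_defs` registers as a pretend
 * preamble, then return the first register handed to the main shader.
 */
static int
first_main_reg(unsigned preamble_defs, bool reset_start)
{
   struct toy_ra ra = {0};

   for (unsigned i = 0; i < preamble_defs; i++) {
      int reg = toy_ra_alloc(&ra);
      assert(reg >= 0);
      ra.used[reg] = false; /* the temporary is dead right away */
   }

   /* Models resetting the file start in the first block after the preamble. */
   if (reset_start)
      ra.start = 0;

   return toy_ra_alloc(&ra);
}
```

In this model, `first_main_reg(3, false)` and `first_main_reg(5, false)` return
different registers even though the "main shader" is identical, while both
calls return the same register once `reset_start` is true.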

Totals:
Instrs: 364710030 -> 364631384 (-0.02%); split: -0.19%, +0.17%
CodeSize: 926766046 -> 926671488 (-0.01%); split: -0.10%, +0.09%
NOPs: 47703035 -> 47653319 (-0.10%); split: -1.05%, +0.94%
MOVs: 17072354 -> 17075112 (+0.02%); split: -1.28%, +1.29%
COVs: 4098062 -> 4096784 (-0.03%); split: -0.04%, +0.01%
Full: 15164359 -> 15112404 (-0.34%); split: -0.34%, +0.00%
(ss): 7818796 -> 7819147 (+0.00%); split: -1.10%, +1.11%
(sy): 3985674 -> 3983435 (-0.06%); split: -0.72%, +0.67%
(ss)-stall: 26535279 -> 26525929 (-0.04%); split: -1.36%, +1.32%
(sy)-stall: 111983489 -> 111716382 (-0.24%); split: -1.26%, +1.02%
Last helper: 116734916 -> 116595531 (-0.12%); split: -0.62%, +0.50%
Cat0: 53338794 -> 53289450 (-0.09%); split: -0.94%, +0.85%
Cat1: 22352349 -> 22328303 (-0.11%); split: -1.28%, +1.17%
Cat2: 155348173 -> 155348012 (-0.00%); split: -0.00%, +0.00%
Cat7: 9314194 -> 9309099 (-0.05%); split: -0.88%, +0.82%

Totals from 224302 (16.59% of 1352016) affected shaders:
Instrs: 148838101 -> 148759455 (-0.05%); split: -0.47%, +0.42%
CodeSize: 404838970 -> 404744412 (-0.02%); split: -0.22%, +0.20%
NOPs: 26261983 -> 26212267 (-0.19%); split: -1.90%, +1.71%
MOVs: 8372715 -> 8375473 (+0.03%); split: -2.60%, +2.63%
COVs: 2061488 -> 2060210 (-0.06%); split: -0.09%, +0.02%
Full: 3420300 -> 3368345 (-1.52%); split: -1.52%, +0.00%
(ss): 3848423 -> 3848774 (+0.01%); split: -2.24%, +2.25%
(sy): 2021040 -> 2018801 (-0.11%); split: -1.43%, +1.32%
(ss)-stall: 13554064 -> 13544714 (-0.07%); split: -2.65%, +2.59%
(sy)-stall: 59778475 -> 59511368 (-0.45%); split: -2.36%, +1.91%
Last helper: 52847662 -> 52708277 (-0.26%); split: -1.38%, +1.12%
Cat0: 29270336 -> 29220992 (-0.17%); split: -1.72%, +1.55%
Cat1: 10820261 -> 10796215 (-0.22%); split: -2.63%, +2.41%
Cat2: 57289060 -> 57288899 (-0.00%); split: -0.00%, +0.00%
Cat7: 5686726 -> 5681631 (-0.09%); split: -1.43%, +1.34%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37003>
Job Noorman 2025-08-26 11:02:11 +02:00 committed by Marge Bot
parent bb14ea5c19
commit 9d4ba885bb

@@ -2272,6 +2272,52 @@ handle_block(struct ra_ctx *ctx, struct ir3_block *block)
   ra_file_init(&ctx->half);
   ra_file_init(&ctx->shared);
   if (block == ir3_after_preamble(block->shader) &&
       block != ir3_start_block(block->shader)) {
      /* Reset the file start in the first block after the preamble to make the
       * main shader independent of the preamble. Without this, the allocated
       * registers in the main shader will depend on how many registers were
       * used in the preamble. This in turn may cause more or less copies being
       * generated or postsched behaving differently due to a difference in
       * false dependencies. This is undesirable when analyzing compiler changes
       * that should only affect the preamble as they may also change main
       * shader stats, generating noise in the shader-db output.
       */
      ctx->full.start = 0;
      ctx->half.start = 0;
      ctx->shared.start = 0;
      /* However, make sure the file start accounts for defs that are
       * live-through the preamble (inputs and tex prefetches). If not, this
       * could introduce unwanted false dependencies.
       */
      foreach_instr (input, &ir3_start_block(block->shader)->instr_list) {
         if (input->opc != OPC_META_INPUT &&
             input->opc != OPC_META_TEX_PREFETCH) {
            break;
         }
         struct ir3_register *dst = input->dsts[0];
         assert(dst->num != INVALID_REG);
         physreg_t dst_start = ra_reg_get_physreg(dst);
         physreg_t dst_end;
         if (dst->merge_set) {
            /* Take the whole merge set into account to prevent its range being
             * allocated for defs not part of the merge set.
             */
            assert(dst_start >= dst->merge_set_offset);
            dst_end = dst_start - dst->merge_set_offset + dst->merge_set->size;
         } else {
            dst_end = dst_start + reg_size(dst);
         }
         struct ra_file *file = ra_get_file(ctx, dst);
         file->start = MAX2(file->start, dst_end);
      }
   }
   /* Handle live-ins, phis, and input meta-instructions. These all appear
    * live at the beginning of the block, and interfere with each other
    * therefore need to be allocated "in parallel". This means that we