ir3: merge rpt groups after postsched

It often happens that postsched puts rpt groups back into an order that
allows them to be merged into a repeated instruction. Breaking up these
rpt groups before postsched loses this opportunity.

To fix this, change ir3_cleanup_rpt to not take the ip into account and
call ir3_merge_rpt after postsched.
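
As a rough illustration of the idea (not the actual ir3 code), the sketch
below models an rpt group as a linked list and merges group members that end
up next to each other in the block. Every name in it (toy_instr, toy_can_rpt,
toy_merge, toy_demo) is hypothetical, and the adjacency test is an assumption
of the sketch. The only parts taken from this change are the group size limit
and the fact that no ip comparison is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ir3 instruction; this is not the
 * real struct ir3_instruction. */
struct toy_instr {
   int opc;                    /* opcode */
   int repeat;                 /* extra repeats folded into this instruction */
   bool removed;               /* folded into an earlier instruction */
   struct toy_instr *next;     /* next instruction in block order */
   struct toy_instr *rpt_next; /* next member of the same rpt group */
};

/* Only a size limit and an opcode match; instruction positions (ip) are not
 * compared. */
static bool
toy_can_rpt(const struct toy_instr *instr, const struct toy_instr *rpt,
            unsigned rpt_n)
{
   if (rpt_n >= 4)
      return false;
   return rpt->opc == instr->opc;
}

/* Merge instr with the group members that follow it, stopping at the first
 * one that cannot be merged. Requiring the member to be the next live
 * instruction in the block is an assumption of this sketch. */
static bool
toy_merge(struct toy_instr *instr)
{
   bool progress = false;
   unsigned rpt_n = 1;

   for (struct toy_instr *rpt = instr->rpt_next; rpt; rpt = rpt->rpt_next) {
      /* Find the next instruction in the block that is still live. */
      struct toy_instr *succ = instr->next;
      while (succ && succ->removed)
         succ = succ->next;

      if (succ != rpt || !toy_can_rpt(instr, rpt, rpt_n))
         break;

      rpt->removed = true;
      instr->repeat++;
      rpt_n++;
      progress = true;
   }

   /* The size limit above caps a group at four members in this sketch. */
   assert(instr->repeat <= 3);
   return progress;
}

/* Build a three-member rpt group a -> b -> c and return how many repeats
 * were folded into a. With interleaved set, an unrelated instruction x sits
 * between a and b in block order. */
int
toy_demo(bool interleaved)
{
   struct toy_instr c = {.opc = 1};
   struct toy_instr b = {.opc = 1, .next = &c, .rpt_next = &c};
   struct toy_instr x = {.opc = 2, .next = &b};
   struct toy_instr a = {.opc = 1, .next = interleaved ? &x : &b,
                         .rpt_next = &b};

   toy_merge(&a);
   return a.repeat;
}
```

In this toy model, toy_demo(true) returns 0 because x separates a from the
rest of its group, while toy_demo(false) returns 2. The second case stands in
for a block where scheduling has put the group members back next to each
other, which is only useful if the group was not broken up beforehand.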

Totals from 129238 (73.31% of 176279) affected shaders:
MaxWaves: 1834226 -> 1834248 (+0.00%); split: +0.00%, -0.00%
Instrs: 46484782 -> 46382869 (-0.22%); split: -0.69%, +0.48%
CodeSize: 95513914 -> 93871848 (-1.72%); split: -2.24%, +0.52%
NOPs: 8018516 -> 7939362 (-0.99%); split: -3.28%, +2.30%
MOVs: 1391770 -> 1408039 (+1.17%); split: -4.39%, +5.56%
COVs: 776518 -> 776182 (-0.04%); split: -0.06%, +0.02%
Full: 1473903 -> 1489694 (+1.07%); split: -0.76%, +1.83%
(ss): 1143180 -> 1146977 (+0.33%); split: -3.07%, +3.40%
(sy): 552487 -> 562122 (+1.74%); split: -1.83%, +3.57%
(ss)-stall: 4292082 -> 4259946 (-0.75%); split: -3.95%, +3.20%
(sy)-stall: 16573976 -> 17151457 (+3.48%); split: -2.41%, +5.89%
STPs: 16131 -> 16157 (+0.16%); split: -0.10%, +0.26%
LDPs: 19583 -> 19634 (+0.26%); split: -0.02%, +0.28%
Preamble Instrs: 9889595 -> 9887178 (-0.02%); split: -0.23%, +0.21%
Early Preamble: 103194 -> 103646 (+0.44%); split: +0.51%, -0.07%
Cat0: 8850422 -> 8769964 (-0.91%); split: -3.00%, +2.09%
Cat1: 2212326 -> 2226425 (+0.64%); split: -2.90%, +3.54%
Cat2: 17452525 -> 17448724 (-0.02%); split: -0.02%, +0.00%
Cat6: 501182 -> 501263 (+0.02%); split: -0.00%, +0.02%
Cat7: 1293844 -> 1262010 (-2.46%); split: -4.17%, +1.71%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38576>
Job Noorman 2025-11-21 14:11:56 +01:00 committed by Marge Bot
parent e8dbed2be4
commit 435db6fabe
2 changed files with 8 additions and 5 deletions


@@ -5945,8 +5945,8 @@ ir3_compile_shader_nir(struct ir3_compiler *compiler,
    }
    IR3_PASS(ir, ir3_remove_noop_subreg_moves);
-   IR3_PASS(ir, ir3_merge_rpt, so);
    IR3_PASS(ir, ir3_postsched, so);
+   IR3_PASS(ir, ir3_merge_rpt, so);
    IR3_PASS(ir, ir3_legalize_relative);
    IR3_PASS(ir, ir3_lower_subgroups);


@@ -110,8 +110,6 @@ can_rpt(struct ir3_instruction *instr, struct ir3_instruction *rpt,
 {
    if (rpt_n >= 4)
       return false;
-   if (rpt->ip != instr->ip + rpt_n)
-      return false;
    if (rpt->opc != instr->opc)
       return false;
    if (!ir3_supports_rpt(instr->block->shader->compiler, instr->opc))
@@ -261,14 +259,19 @@ try_merge(struct ir3_instruction *instr, struct ir3_instruction *rpt,
 static bool
 merge_instr(struct ir3_instruction *instr)
 {
-   if (!ir3_instr_is_first_rpt(instr))
+   if (!ir3_instr_is_rpt(instr))
       return false;
    bool progress = false;
    unsigned rpt_n = 1;
-   foreach_instr_rpt_excl_safe (rpt, instr) {
+   /* Try to merge instr with the instructions after it in its rpt group. If
+    * there are still instructions before it, they cannot be merged (because
+    * they come after instr in the block) and will be handled later.
+    */
+   for (struct ir3_instruction *rpt = instr->rpt_next; rpt;
+        rpt = rpt->rpt_next) {
       /* When rpt cannot be merged, stop immediately. We will try to merge rpt
        * with the following instructions (if any) once we encounter it in
        * ir3_combine_rpt.