agx: Use agx_nir_opt_preamble

Now that everything is in place, we can actually take advantage of preambles.
This wins us a crude form of UBO pushing (accounting for most of the win here),
as well as its intended purpose of optimizing uniform-on-uniform arithmetic.

shader-db results are excellent. The shader that's regressed for instruction
count is a fragment shader that solely consists of `gl_FragColor = uniform`,
which goes from a vectorized UBO load to four scalar moves. That's more
instructions (and more bytes) but presumably faster, since ALU should be much
cheaper than load/store.

total instructions in shared programs: 6502 -> 5764 (-11.35%)
instructions in affected programs: 5136 -> 4398 (-14.37%)
helped: 60
HURT: 1
helped stats (abs) min: 2.0 max: 47.0 x̄: 12.33 x̃: 8
helped stats (rel) min: 0.84% max: 34.48% x̄: 18.69% x̃: 21.05%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%
95% mean confidence interval for instructions value: -14.69 -9.51
95% mean confidence interval for instructions %-change: -20.49% -15.20%
Instructions are helped.

total bytes in shared programs: 42186 -> 38310 (-9.19%)
bytes in affected programs: 33182 -> 29306 (-11.68%)
helped: 60
HURT: 1
helped stats (abs) min: 10.0 max: 272.0 x̄: 64.83 x̃: 50
helped stats (rel) min: 0.72% max: 30.00% x̄: 15.16% x̃: 16.67%
HURT stats (abs)   min: 14.0 max: 14.0 x̄: 14.00 x̃: 14
HURT stats (rel)   min: 31.82% max: 31.82% x̄: 31.82% x̃: 31.82%
95% mean confidence interval for bytes value: -77.73 -49.35
95% mean confidence interval for bytes %-change: -16.66% -12.11%
Bytes are helped.

total halfregs in shared programs: 2370 -> 1639 (-30.84%)
halfregs in affected programs: 1804 -> 1073 (-40.52%)
helped: 60
HURT: 0
helped stats (abs) min: 1.0 max: 40.0 x̄: 12.18 x̃: 8
helped stats (rel) min: 3.85% max: 72.73% x̄: 41.37% x̃: 36.17%
95% mean confidence interval for halfregs value: -14.77 -9.60
95% mean confidence interval for halfregs %-change: -46.00% -36.75%
Halfregs are helped.

Total CPU time (seconds): 2.71 -> 2.80 (3.32%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>
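As a rough illustration of what the commit message means by UBO pushing and hoisting uniform-on-uniform arithmetic, here is a toy C model. It is only a sketch: `struct toy_gpu`, `toy_load_ubo`, `toy_preamble` and the other names are hypothetical and do not exist in Mesa, and the real pass operates on NIR rather than on C functions. The idea it shows is that a value depending only on uniforms is identical for every invocation, so it can be computed once up front and cached where the main shader reads it cheaply.

```c
#include <assert.h>
#include <stddef.h>

#define UBO_WORDS 4
#define PUSHED_WORDS 1

/* Toy model only: these types and names are hypothetical, not Mesa's. */
struct toy_gpu {
   float ubo[UBO_WORDS];       /* uniform buffer living in memory */
   float pushed[PUSHED_WORDS]; /* values cached by the preamble */
   unsigned ubo_loads;         /* number of simulated memory loads */
};

/* Simulated UBO load: the costly operation we want off the hot path. */
static float
toy_load_ubo(struct toy_gpu *gpu, size_t index)
{
   assert(index < UBO_WORDS);
   gpu->ubo_loads++;
   return gpu->ubo[index];
}

/* Before: every invocation reloads both uniforms and redoes the multiply. */
static float
toy_shader_naive(struct toy_gpu *gpu, float varying)
{
   float scale = toy_load_ubo(gpu, 0) * toy_load_ubo(gpu, 1);
   return varying * scale;
}

/* Preamble: runs once up front and caches the uniform-only result. */
static void
toy_preamble(struct toy_gpu *gpu)
{
   gpu->pushed[0] = toy_load_ubo(gpu, 0) * toy_load_ubo(gpu, 1);
}

/* After: the main shader only reads the cached value, with no loads. */
static float
toy_shader_with_preamble(const struct toy_gpu *gpu, float varying)
{
   return varying * gpu->pushed[0];
}
```

In this model, running `toy_shader_naive` for N invocations performs 2N simulated loads, while calling `toy_preamble` once followed by N calls to `toy_shader_with_preamble` performs 2 loads in total and yields the same results.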
Date: 2022-10-22 14:17:54 -04:00
commit 3570e94bcc
parent 5e8b0289c3
1 changed file with 3 additions and 2 deletions
--- a/src/asahi/compiler/agx_compile.c
+++ b/src/asahi/compiler/agx_compile.c
@@ -1646,7 +1646,7 @@ agx_lower_aligned_offsets(struct nir_builder *b,
 }

 static void
-agx_optimize_nir(nir_shader *nir)
+agx_optimize_nir(nir_shader *nir, unsigned *preamble_size)
 {
   bool progress;

@@ -1687,6 +1687,7 @@ agx_optimize_nir(nir_shader *nir)
      NIR_PASS(progress, nir, nir_opt_loop_unroll);
   } while (progress);

+   NIR_PASS_V(nir, agx_nir_opt_preamble, preamble_size);
   NIR_PASS_V(nir, nir_opt_algebraic_late);
   NIR_PASS_V(nir, nir_opt_constant_folding);
   NIR_PASS_V(nir, nir_copy_prop);
@@ -1935,7 +1936,7 @@ agx_compile_shader_nir(nir_shader *nir,
   NIR_PASS_V(nir, agx_lower_resinfo);
   NIR_PASS_V(nir, nir_legalize_16bit_sampler_srcs, tex_constraints);

-   agx_optimize_nir(nir);
+   agx_optimize_nir(nir, &out->push_count);

   /* Implement conditional discard with real control flow like Metal */
   NIR_PASS_V(nir, nir_lower_discard_if, (nir_lower_discard_if_to_cf |