From 3735ac616573fa2d2fe80d33b8b37583547b0792 Mon Sep 17 00:00:00 2001
From: Emma Anholt
Date: Wed, 23 Jul 2025 21:40:41 -0700
Subject: [PATCH] tu: Use nir_opt_reassociate.

I've elected to go with the more aggressive CSE heuristic here in
addition to scalar math, which shaves another 1% on instruction count in
exchange for a small hit to max waves. With either CSE or scalar, we
take a notable hit to spilling (STP/LDP) on Aztec Ruins, Civ 6, Fallout
4, and Monster Hunter World, and with CSE those get worse.

Totals (A750):
MaxWaves: 6803894 -> 6795012 (-0.13%); split: +0.20%, -0.33%
Instrs: 154246248 -> 151508232 (-1.78%); split: -1.92%, +0.15%
CodeSize: 324303600 -> 322969162 (-0.41%); split: -0.84%, +0.43%
NOPs: 24723513 -> 24536554 (-0.76%); split: -3.04%, +2.29%
MOVs: 4729771 -> 4711212 (-0.39%); split: -3.75%, +3.36%
COVs: 1762268 -> 1762432 (+0.01%); split: -0.05%, +0.06%
Full: 4679471 -> 4688316 (+0.19%); split: -0.46%, +0.65%
(ss): 3443963 -> 3450363 (+0.19%); split: -2.33%, +2.51%
(sy): 1811290 -> 1811142 (-0.01%); split: -1.15%, +1.15%
(ss)-stall: 12438303 -> 12597798 (+1.28%); split: -3.17%, +4.45%
(sy)-stall: 47647687 -> 47720784 (+0.15%); split: -1.56%, +1.71%
STPs: 35424 -> 35846 (+1.19%); split: -0.26%, +1.45%
LDPs: 28110 -> 28643 (+1.90%); split: -0.45%, +2.34%
Preamble Instrs: 38170428 -> 39461432 (+3.38%); split: -0.33%, +3.71%
Early Preamble: 355599 -> 355772 (+0.05%); split: +0.16%, -0.11%
Subgroup size: 41463040 -> 41355072 (-0.26%); split: +0.17%, -0.43%
Cat0: 27282700 -> 27094195 (-0.69%); split: -2.77%, +2.08%
Cat1: 6609687 -> 6589640 (-0.30%); split: -2.84%, +2.54%
Cat2: 75455473 -> 72725047 (-3.62%); split: -3.77%, +0.15%
Cat3: 32359423 -> 32526926 (+0.52%); split: -0.35%, +0.87%
Cat4: 4691910 -> 4694398 (+0.05%); split: -0.00%, +0.05%
Cat5: 3316443 -> 3316276 (-0.01%); split: -0.01%, +0.00%
Cat6: 1031600 -> 1032185 (+0.06%); split: -0.03%, +0.09%
Cat7: 3499012 -> 3529565 (+0.87%); split: -2.02%, +2.89%

Part-of:
---
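
Note on the transform: the comment added in ir3_nir.c describes grouping
constants together within chains of associative ALU ops. As a rough
illustration of why that can reduce instruction count, here is a minimal
C sketch. It is not the NIR pass and makes no claim about how the pass
is implemented; chain_before, chain_after, and check_reassociation are
hypothetical names, and unsigned 32-bit adds are assumed because they
wrap and are therefore exactly associative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "before" shape: two constants are separated by a
 * variable, so a chain of three adds is evaluated at run time. */
static uint32_t chain_before(uint32_t x, uint32_t y)
{
   return ((x + 3u) + y) + 5u;
}

/* Hypothetical "after" shape: the constants are grouped, so 3 + 5 can
 * be folded at compile time and only two run-time adds remain. */
static uint32_t chain_after(uint32_t x, uint32_t y)
{
   return (x + y) + (3u + 5u);
}

/* Both shapes must agree, including when intermediate sums wrap. */
static void check_reassociation(void)
{
   assert(chain_before(1u, 2u) == chain_after(1u, 2u));
   assert(chain_before(0xFFFFFFFFu, 7u) == chain_after(0xFFFFFFFFu, 7u));
}
```

Calling check_reassociation() from any test harness exercises both
shapes. The same regrouping is not exact for floating point in general,
which this sketch deliberately avoids.
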
 src/freedreno/ci/traces-freedreno.yml | 48 +++++++++++++--------------
 src/freedreno/ir3/ir3_nir.c           |  7 ++++
 2 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/src/freedreno/ci/traces-freedreno.yml b/src/freedreno/ci/traces-freedreno.yml
index e356bc128c7..502f1cd0ab3 100644
--- a/src/freedreno/ci/traces-freedreno.yml
+++ b/src/freedreno/ci/traces-freedreno.yml
@@ -26,7 +26,7 @@ traces:
     freedreno-a530:
       checksum: e3370ce93f56703e0c827a36dca2256d
     freedreno-a618:
-      checksum: eaedce6c69165b08a63a57b9c9901230
+      checksum: 6fd0222ebd706f792aa05b1fc8d41ab7
     zink-a618:
       checksum: 45bdbb33bf87ed114bd548248be13408
       label: [skip, broken]
@@ -54,7 +54,7 @@ traces:
     freedreno-a530:
       checksum: b0a10ed261fdfeba76de4de5c2bd0aae
     freedreno-a618:
-      checksum: 747fa9a4e47bbb37c24c3376a3f8255b
+      checksum: bce9a4f5f4c70cec484712254b7f915c
     zink-a618:
       label: [skip, slow]
       checksum: ade41e6fe932552914c678155149babb
@@ -77,7 +77,7 @@ traces:
     freedreno-a530:
       label: [unsupported]
     freedreno-a618:
-      checksum: 17c6a6dd333514b125cc18282ce24ba8
+      checksum: be6f39a5e970efdb72d6bfc6e144c6da
     zink-a618:
       label: [skip, flakes]
       checksum: 96f7f231042f892c7d11c91defd7ecc1
@@ -103,7 +103,7 @@ traces:
       checksum: 1ae49af7017ae2a08fbb1caf377ada91
       label: [skip, slow]
     freedreno-a618:
-      checksum: 47016a34553e5a28e2e1e0b92d11c92f
+      checksum: e44a7227c8d080e16ed373496fe3b6cb
     zink-a618:
       label: [crash, skip]
       checksum: 5cd30bb46cbabc0d77cc4aacbcd7c0c2
@@ -126,7 +126,7 @@ traces:
       label: [skip, slow]
       text: 2 minute runtime on db820c
     freedreno-a618:
-      checksum: e0b4cb968d2653a568f0ea5eeee4d39a
+      checksum: eeb2e6bcd89f6b3e455691a58eeab733
     zink-a618:
       label: [skip, timeout]
 
@@ -153,9 +153,9 @@ traces:
       checksum: 2a53e6086588f4675ae3dcda9f26603b
       label: [skip, slow]
     freedreno-a618:
-      checksum: 730692659fbb9eefa44d6b1a2df2fa8e
+      checksum: f463be29a333b1866da407eb323f867f
     zink-a618:
-      checksum: 552e62fabd05ebfbb6d7bdd574b4e1c7
+      checksum: 2a31ee1a56b755dcf5975b31eacbec32
 
   behdad-glyphy/glyphy-v2.trace:
     freedreno-a306:
@@ -164,10 +164,10 @@ traces:
       checksum: 3a37faf7ec62d48dada63f157f30d876
     freedreno-a618:
       label: [no-perf]
-      checksum: d25edb433abfcde517b626b3071906ff
+      checksum: 42b070ccc03915125a1d4b98c2bcffd7
     zink-a618:
       label: [no-perf]
-      checksum: d25edb433abfcde517b626b3071906ff
+      checksum: 42b070ccc03915125a1d4b98c2bcffd7
 
   glxgears/glxgears-2-v2.trace:
     freedreno-a306:
@@ -190,9 +190,9 @@ traces:
     freedreno-a530:
       checksum: 88188447495b819e5814368486deb0a0
     freedreno-a618:
-      checksum: eb810bd258c06f873a2d9718c5209c6d
+      checksum: 24c35a9fe9ee5f5ccfec23f4cd54e974
     zink-a618:
-      checksum: eefcef0b1167c1140c298f3908c31195
+      checksum: cb943dd61361e72cdedf2e9db7b38c3e
 
   # gimark requires an environment var to work around a bug, disable for now
   # gputest/gimark-v2.trace:
@@ -242,9 +242,9 @@ traces:
       checksum: ef9cec3c226477e908d4bb2ffe9e8eb9
       text: Looks fine, but totally different shape from the rendering on i965.
     freedreno-a618:
-      checksum: e4da2cf366cb68833569105d37aaa50d
+      checksum: 63c5cc2289771f34645ac859874f92de
     zink-a618:
-      checksum: e4da2cf366cb68833569105d37aaa50d
+      checksum: 63c5cc2289771f34645ac859874f92de
 
   gputest/plot3d-v2.trace:
     freedreno-a306:
@@ -263,9 +263,9 @@ traces:
       label: [unsupported]
       text: Requires GL4 for tess.
     freedreno-a618:
-      checksum: 92312303aa8279214f0a300a625efa87
+      checksum: 9a3a1ba0bf326aee13797f634afd052f
     zink-a618:
-      checksum: 92312303aa8279214f0a300a625efa87
+      checksum: 9a3a1ba0bf326aee13797f634afd052f
 
   gputest/triangle-v2.trace:
     freedreno-a306:
@@ -283,9 +283,9 @@ traces:
     freedreno-a530:
       checksum: aab5c853e383e1cda56663d65f6925ad
     freedreno-a618:
-      checksum: 83fd7bce0fc1e1f30bd143b7d30ca890
+      checksum: 2e4f7b19a865d2345e952b0721fc5362
     zink-a618:
-      checksum: 5263f9d22462a6f48f5ca9e91d146f06
+      checksum: 873a94da6be13419f76867ae6e339443
 
   humus/CelShading-v2.trace:
     freedreno-a306:
@@ -338,9 +338,9 @@ traces:
     freedreno-a530:
       checksum: 0fb847eb10e74da0483a17e782f2a22a
     freedreno-a618:
-      checksum: 5f1a655e62eab99d53dab88b634afed3
+      checksum: 9099f7d4c082d741f1c52f5f9a4211ab
     zink-a618:
-      checksum: 5f1a655e62eab99d53dab88b634afed3
+      checksum: 9099f7d4c082d741f1c52f5f9a4211ab
 
   humus/VolumetricFogging2-v2.trace:
     freedreno-a306:
@@ -379,10 +379,10 @@ traces:
     freedreno-a530:
       label: [skip]
     freedreno-a618:
-      checksum: dd05d3e98eb93c0e520c1359de18e9fb
+      checksum: ecbcc9d262908cea8993df2ed40a26e7
     zink-a618:
       label: [no-perf]
-      checksum: dd05d3e98eb93c0e520c1359de18e9fb
+      checksum: ecbcc9d262908cea8993df2ed40a26e7
 
   pathfinder/canvas_moire-v2.trace:
     freedreno-a306:
@@ -487,7 +487,7 @@ traces:
       text: text is prone to occasional misrendering, particularly in the lower left
       checksum: ae37867b1a9a94d2be9ff6c7e2009813
     zink-a618:
-      checksum: ae37867b1a9a94d2be9ff6c7e2009813
+      checksum: 32cf337ffebb83536321a55091a38c54
 
   unvanquished/unvanquished-ultra.trace:
     freedreno-a306:
@@ -495,9 +495,9 @@ traces:
     freedreno-a530:
       label: [unsupported]
     freedreno-a618:
-      checksum: a71d1ad391162acef60cbb2804d0cf64
+      checksum: 0bd5cd4835c8105353f02877de1ae906
     zink-a618:
-      checksum: b487c2784d458dff4a12f65e5cc46ac1
+      checksum: d2ea0b27d3f850dd3ab899c83c14098e
 
   warzone2100/warzone2100-default.trace:
     freedreno-a306:
diff --git a/src/freedreno/ir3/ir3_nir.c b/src/freedreno/ir3/ir3_nir.c
index 73c90f87a06..905eff1a69a 100644
--- a/src/freedreno/ir3/ir3_nir.c
+++ b/src/freedreno/ir3/ir3_nir.c
@@ -1175,6 +1175,13 @@ ir3_nir_lower_variant(struct ir3_shader_variant *so,
       ir3_setup_const_state(s, so, ir3_const_state_mut(so));
    }
 
+   /* reassociate constants and scalar operations together in groups of
+    * associative ALU ops. Do this before preamble to give more chances to
+    * hoist to preamble.
+    */
+   progress |= OPT(s, nir_opt_reassociate_loop,
+                   nir_reassociate_scalar_math | nir_reassociate_cse_heuristic);
+
    /* Cleanup code leftover from lowering passes before opt_preamble */
    if (progress) {
       ir3_optimize_loop(so->compiler, options, s);