turnip: Enable lowering of mediump temps/CS shared to 16-bit.

In Aztec Ruins, we end up storing some big shared-mem arrays as 16-bit,
cutting shared mem size in half across many shaders while also reducing
conversions.

gfxbench vk-5-normal perf +0.364983% +/- 0.189764% (n=4).

fossil-db:

Totals from 448 (2.99% of 14988) affected shaders:
MaxWaves: 6154 -> 6390 (+3.83%); split: +3.96%, -0.13%
Instrs: 174554 -> 165045 (-5.45%); split: -6.45%, +1.01%
CodeSize: 364224 -> 345558 (-5.12%); split: -6.03%, +0.90%
NOPs: 48224 -> 48024 (-0.41%); split: -3.33%, +2.91%
MOVs: 6985 -> 6104 (-12.61%); split: -19.11%, +6.50%
Full: 4577 -> 4101 (-10.40%); split: -11.08%, +0.68%
(ss): 3428 -> 3335 (-2.71%); split: -4.17%, +1.46%
(sy): 1250 -> 1205 (-3.60%); split: -4.72%, +1.12%
(ss)-stall: 14695 -> 14528 (-1.14%); split: -2.25%, +1.12%
(sy)-stall: 19565 -> 17998 (-8.01%); split: -9.55%, +1.54%
STPs: 1086 -> 870 (-19.89%)
LDPs: 162 -> 108 (-33.33%)
Cat0: 51400 -> 51120 (-0.54%); split: -3.31%, +2.76%
Cat1: 16861 -> 14688 (-12.89%); split: -18.18%, +5.30%
Cat2: 71161 -> 68454 (-3.80%); split: -4.52%, +0.72%
Cat3: 29572 -> 25306 (-14.43%); split: -14.49%, +0.06%
Cat4: 3128 -> 3131 (+0.10%)
Cat5: 1502 -> 1506 (+0.27%)
Cat6: 840 -> 750 (-10.71%)

aztec ruins is a big winner with the ldp/stp reductions.  summoners_war
racks up an astounding 41% reduction in instructions and +15% max_waves.
Most affected apps show a minor win in instrs, with
fallout_shelter_online, and aztec ruins on ANGLE taking minor hits.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18259>
2025-12-26 17:10:11 +01:00 · 2022-08-25 14:34:20 -07:00 · 2022-08-25 14:34:20 -07:00 · 08548650bd
commit 08548650bd
parent e1588cdf9e
1 changed files with 13 additions and 0 deletions
--- a/src/freedreno/vulkan/tu_shader.c
+++ b/src/freedreno/vulkan/tu_shader.c
@@ -104,9 +104,22 @@ tu_spirv_to_nir(struct tu_device *dev,
   NIR_PASS_V(nir, nir_lower_sysvals_to_varyings, &sysvals_to_varyings);

   NIR_PASS_V(nir, nir_lower_global_vars_to_local);
+
+   /* Older glslang missing bf6efd0316d8 ("SPV: Fix #2293: keep relaxed
+    * precision on arg passed to relaxed param") will pass function args through
+    * a highp temporary, so we need the nir_opt_find_array_copies() and a copy
+    * prop before we lower mediump vars, or you'll be unable to optimize out
+    * array copies after lowering.  We do this before splitting copies, since
+    * that works against nir_opt_find_array_copies().
+    * */
+   NIR_PASS_V(nir, nir_opt_find_array_copies);
+   NIR_PASS_V(nir, nir_opt_copy_prop_vars);
+   NIR_PASS_V(nir, nir_opt_dce);
+
   NIR_PASS_V(nir, nir_split_var_copies);
   NIR_PASS_V(nir, nir_lower_var_copies);

+   NIR_PASS_V(nir, nir_lower_mediump_vars, nir_var_function_temp | nir_var_shader_temp | nir_var_mem_shared);
   NIR_PASS_V(nir, nir_opt_copy_prop_vars);
   NIR_PASS_V(nir, nir_opt_combine_stores, nir_var_all);