mesa/src/freedreno
Timothy Arceri d75a36a9ee glsl: remove do_copy_propagation_elements() optimisation pass
Since 13b859de, do_copy_propagation_elements() has had a flaw where
the time it takes to complete grows exponentially as the number of
nested loops increases. It can also hurt rather than help versus
just letting NIR optimise the code. So if the NIR linker is enabled we
let it handle it instead.

shader-db results Iris (BDW):

total instructions in shared programs: 11177181 -> 11199739 (0.20%)
instructions in affected programs: 119424 -> 141982 (18.89%)
helped: 109
HURT: 65
total cycles in shared programs: 368946819 -> 372277173 (0.90%)
cycles in affected programs: 116539428 -> 119869782 (2.86%)

total spills in shared programs: 3983 -> 8785 (120.56%)
spills in affected programs: 2072 -> 6874 (231.76%)
helped: 0
HURT: 6

total fills in shared programs: 2016 -> 6068 (200.99%)
fills in affected programs: 230 -> 4282 (1761.74%)
helped: 0
HURT: 6

LOST:   85
GAINED: 77
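
For readers checking the figures above, the percentages are consistent with a plain relative change, (after - before) / before. The sketch below is only an illustration of that arithmetic; the helper names `percent_change` and `format_change` are hypothetical and are not taken from shader-db.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    if before == 0:
        # A zero baseline has no meaningful relative change.
        raise ValueError("percent change is undefined for a zero baseline")
    return (after - before) / before * 100.0


def format_change(before: int, after: int) -> str:
    """Render a pair of counts in the 'before -> after (x.xx%)' style used above."""
    return f"{before} -> {after} ({percent_change(before, after):.2f}%)"


if __name__ == "__main__":
    # Values copied from the Iris (BDW) results above.
    print("total spills:", format_change(3983, 8785))
    print("affected instructions:", format_change(119424, 141982))
```

Note that this sketch always prints two decimal places, so very small changes would appear as 0.00% rather than in the `<.01%` form seen in some of the reports.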

freedreno results:

total instructions in shared programs: 11011122 -> 11011620 (<.01%)
instructions in affected programs: 939829 -> 940327 (0.05%)
total full in shared programs: 762725 -> 762674 (<.01%)
full in affected programs: 1096 -> 1045 (-4.65%)
total constlen in shared programs: 1772092 -> 1771596 (-0.03%)
constlen in affected programs: 2780 -> 2284 (-17.84%)
total stp in shared programs: 4040 -> 4058 (0.45%)
stp in affected programs: 3656 -> 3674 (0.49%)
total ldp in shared programs: 2160 -> 2178 (0.83%)
ldp in affected programs: 1748 -> 1766 (1.03%)
stp HURT:   shaders/robclark-shaders/gfxbench5/gl_5_high_off/13.shader_test CL: 1231 -> 1234 (0.24%)
stp HURT:   shaders/robclark-shaders/gfxbench5/gl_5_normal_off/13.shader_test CL: 1231 -> 1234 (0.24%)
stp HURT:   shaders/robclark-shaders/gfxbench5/gl_5_high_off/15.shader_test CL: 453 -> 456 (0.66%)
stp HURT:   shaders/robclark-shaders/gfxbench5/gl_5_normal_off/15.shader_test CL: 453 -> 456 (0.66%)
stp HURT:   shaders/robclark-shaders/gfxbench5/gl_5_high_off/17.shader_test CL: 144 -> 147 (2.08%)
stp HURT:   shaders/robclark-shaders/gfxbench5/gl_5_normal_off/17.shader_test CL: 144 -> 147 (2.08%)

However, those stp counts are misleading: gfxbench gl-5-normal actually
gets its scratch (ldp/stp) stored as 16 bits instead of 32 thanks to
better NIR copy prop, and the result is a 2.64398% +/- 0.0991923% perf
improvement!

i915 results:

total instructions in shared programs: 510528 -> 510489 (<.01%)
instructions in affected programs: 3303 -> 3264 (-1.18%)
total tex_indirect in shared programs: 16708 -> 16717 (0.05%)
tex_indirect in affected programs: 134 -> 143 (6.72%)
total temps in shared programs: 30181 -> 30169 (-0.04%)
temps in affected programs: 1268 -> 1256 (-0.95%)
LOST:   0
GAINED: 1

i915 highlights:
instructions HURT:   shaders/closed/steam/legend-of-grimrock/47.shader_test FS: 141 -> 144 (2.13%)
instructions HURT:   shaders/closed/steam/steamworld-dig/22.shader_test FS: 84 -> 108 (28.57%)
temps HURT:   shaders/closed/steam/left-4-dead-2/medium/3682.shader_test FS: 7 -> 13 (85.71%)

r300 results:

total instructions in shared programs: 1340439 -> 1340845 (0.03%)
instructions in affected programs: 32354 -> 32760 (1.25%)
total temps in shared programs: 179394 -> 179329 (-0.04%)
temps in affected programs: 1505 -> 1440 (-4.32%)
total consts in shared programs: 1177742 -> 1177885 (0.01%)
consts in affected programs: 1107 -> 1250 (12.92%)
total lits in shared programs: 24992 -> 25019 (0.11%)
lits in affected programs: 138 -> 165 (19.57%)
instructions HURT:   shaders/closed/steam/legend-of-grimrock/26.shader_test FS: 47 -> 52 (10.64%)
instructions HURT:   shaders/closed/steam/sanctum-2/6072.shader_test FS: 43 -> 48 (11.63%)
instructions HURT:   shaders/closed/steam/champions-of-regnum/2378.shader_test VS: 35 -> 40 (14.29%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13288>
2023-03-01 16:09:25 +00:00
.gitlab-ci freedreno: Add A2xx REG_A2XX_RBBM_PM_OVERRIDE2 bitfields 2023-02-24 14:48:27 +00:00
afuc meson: do not use source_root() when possible 2022-11-22 06:11:07 +00:00
ci glsl: remove do_copy_propagation_elements() optimisation pass 2023-03-01 16:09:25 +00:00
common freedreno: Add seqno helper 2023-02-16 19:57:13 +00:00
computerator freedreno/drm: Return fence from submit flush 2022-12-17 19:14:12 +00:00
decode freedreno/crashdec: Disable GALLIUM_DUMP_CPU 2023-02-23 20:02:26 +00:00
drm tu+meson: Re-work KMD selection 2023-02-25 17:02:34 +00:00
drm-shim freedreno/drm-shim: add a660 2022-07-22 02:11:14 +00:00
ds freedreno/pps: Fix a signed/unsigned complaint. 2023-01-18 05:04:46 +00:00
fdl turnip: avoid FMT6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 for event blits 2023-02-16 01:35:50 +00:00
ir2 freedreno/ir2: Re-indent 2021-04-17 15:38:56 +00:00
ir3 nir: change 16bit image dest folding option to per type 2023-02-27 09:55:34 +00:00
isa ir3: Add cat7 sleep instruction 2023-02-21 19:59:14 +00:00
perfcntrs freedreno/drm: Return fence from submit flush 2022-12-17 19:14:12 +00:00
registers freedreno: Add A2xx REG_A2XX_RBBM_PM_OVERRIDE2 bitfields 2023-02-24 14:48:27 +00:00
rnn freedreno: Move the headergen2 test to be meson unit tests. 2021-10-01 23:16:04 +00:00
vulkan vulkan/wsi: switch to using an options struct for last param 2023-02-27 13:21:21 +00:00
.clang-format freedreno: Add some options to .clang-format 2021-07-12 20:57:21 +00:00
.dir-locals.el freedreno: Update editorconfig and emacs settings for freedreno reformat. 2021-05-10 23:16:00 +00:00
.editorconfig freedreno: Update editorconfig and emacs settings for freedreno reformat. 2021-05-10 23:16:00 +00:00
meson.build meson: do not use source_root() when possible 2022-11-22 06:11:07 +00:00