panvk: Optimize in the preprocess hook

NIR is pretty good at optimizing UBO, SSBO, and shared memory access,
but only if we run the optimizations before we lower it all.  The same
goes for I/O.  By doing all our lowering in panvk before we ever run
the optimization loop, we risk hampering it significantly.

Ignoring loop changes (several get unrolled now), fossil-db on Sascha
Willems demos and a few others looks like:

    Instrs: 189054 -> 187802 (-0.66%); split: -0.67%, +0.01%
    CodeSize: 1756160 -> 1747072 (-0.52%); split: -0.52%, +0.01%
    Estimated normalized CVT cycles: 771.367106999997 -> 766.0311719999971 (-0.69%); split: -1.05%, +0.36%
    Estimated normalized SFU cycles: 1407.21875 -> 1406.9375 (-0.02%); split: -0.03%, +0.01%
    Estimated normalized Load/Store cycles: 17477.0 -> 16917.0 (-3.20%)
    Maximum number of threads: 1257 -> 1213 (-3.50%); split: +0.08%, -3.58%
    Number of hardware loops: 283 -> 278 (-1.77%)

    Totals from 186 (19.81% of 939) affected shaders:
    Instrs: 102588 -> 101336 (-1.22%); split: -1.23%, +0.01%
    CodeSize: 834432 -> 825344 (-1.09%); split: -1.10%, +0.02%
    Estimated normalized CVT cycles: 463.226562 -> 457.890627 (-1.15%); split: -1.74%, +0.59%
    Estimated normalized SFU cycles: 1021.84375 -> 1021.5625 (-0.03%); split: -0.05%, +0.02%
    Estimated normalized Load/Store cycles: 8425.0 -> 7865.0 (-6.65%)
    Maximum number of threads: 334 -> 290 (-13.17%); split: +0.30%, -13.47%
    Number of hardware loops: 63 -> 58 (-7.94%)

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
Faith Ekstrand 2025-11-07 22:03:48 -05:00 committed by Marge Bot
parent 1a9c7f8c8a
commit 51a68ecc87

@@ -452,6 +452,8 @@ panvk_preprocess_nir(struct vk_physical_device *vk_pdev,
    if (nir->info.stage == MESA_SHADER_FRAGMENT)
       NIR_PASS(_, nir, nir_lower_wpos_center);
+   pan_shader_optimize(nir, pdev->kmod.props.gpu_id);
    NIR_PASS(_, nir, nir_split_var_copies);
    NIR_PASS(_, nir, nir_lower_var_copies);