From 51a68ecc878e12f240f58c3326f6f17cea08fa06 Mon Sep 17 00:00:00 2001
From: Faith Ekstrand
Date: Fri, 7 Nov 2025 22:03:48 -0500
Subject: [PATCH] panvk: Optimize in the preprocess hook

NIR is actually pretty good at optimizing UBO, SSBO, and shared memory
access, but in order to do so, we have to run the optimizations before
we lower it all.  Same for I/O.  By doing all our lowering in panvk
before we ever run the optimization loop, we risk hampering it
significantly.

Ignoring loop changes (several get unrolled now), fossil-db on Sascha
Willems demos and a few others looks like:

Instrs: 189054 -> 187802 (-0.66%); split: -0.67%, +0.01%
CodeSize: 1756160 -> 1747072 (-0.52%); split: -0.52%, +0.01%
Estimated normalized CVT cycles: 771.367106999997 -> 766.0311719999971 (-0.69%); split: -1.05%, +0.36%
Estimated normalized SFU cycles: 1407.21875 -> 1406.9375 (-0.02%); split: -0.03%, +0.01%
Estimated normalized Load/Store cycles: 17477.0 -> 16917.0 (-3.20%)
Maximum number of threads: 1257 -> 1213 (-3.50%); split: +0.08%, -3.58%
Number of hardware loops: 283 -> 278 (-1.77%)

Totals from 186 (19.81% of 939) affected shaders:
Instrs: 102588 -> 101336 (-1.22%); split: -1.23%, +0.01%
CodeSize: 834432 -> 825344 (-1.09%); split: -1.10%, +0.02%
Estimated normalized CVT cycles: 463.226562 -> 457.890627 (-1.15%); split: -1.74%, +0.59%
Estimated normalized SFU cycles: 1021.84375 -> 1021.5625 (-0.03%); split: -0.05%, +0.02%
Estimated normalized Load/Store cycles: 8425.0 -> 7865.0 (-6.65%)
Maximum number of threads: 334 -> 290 (-13.17%); split: +0.30%, -13.47%
Number of hardware loops: 63 -> 58 (-7.94%)

Reviewed-by: Lars-Ivar Hesselberg Simonsen
Reviewed-by: Christoph Pillmayer
Part-of:
---
 src/panfrost/vulkan/panvk_vX_shader.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/panfrost/vulkan/panvk_vX_shader.c b/src/panfrost/vulkan/panvk_vX_shader.c
index c7f440e96eb..ae72b5d2634 100644
--- a/src/panfrost/vulkan/panvk_vX_shader.c
+++ b/src/panfrost/vulkan/panvk_vX_shader.c
@@ -452,6 +452,8 @@ panvk_preprocess_nir(struct vk_physical_device *vk_pdev,
    if (nir->info.stage == MESA_SHADER_FRAGMENT)
       NIR_PASS(_, nir, nir_lower_wpos_center);
 
+   pan_shader_optimize(nir, pdev->kmod.props.gpu_id);
+
    NIR_PASS(_, nir, nir_split_var_copies);
    NIR_PASS(_, nir, nir_lower_var_copies);