From 49d923078fe34d01e0fd0462d13f32405f20a01d Mon Sep 17 00:00:00 2001
From: Rhys Perry
Date: Thu, 11 Dec 2025 09:58:29 +0000
Subject: [PATCH] ac/nir: fix calculation of aligned_new_size
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This should consider nir_round_up_components().

fossil-db (gfx1201):
Totals from 90 (0.11% of 79839) affected shaders:
MaxWaves: 1829 -> 1901 (+3.94%)
Instrs: 410780 -> 411825 (+0.25%); split: -0.02%, +0.27%
CodeSize: 2227956 -> 2234464 (+0.29%); split: -0.02%, +0.31%
VGPRs: 6952 -> 6760 (-2.76%); split: -3.11%, +0.35%
Latency: 3071765 -> 3073960 (+0.07%); split: -0.00%, +0.07%
InvThroughput: 766201 -> 767322 (+0.15%); split: -0.00%, +0.15%
VClause: 7887 -> 7898 (+0.14%); split: -0.08%, +0.22%
Copies: 48189 -> 48324 (+0.28%); split: -0.05%, +0.33%
PreVGPRs: 6605 -> 6595 (-0.15%); split: -0.18%, +0.03%
VALU: 237272 -> 238147 (+0.37%); split: -0.01%, +0.37%
SALU: 48987 -> 49003 (+0.03%)
VMEM: 15542 -> 15560 (+0.12%)
VOPD: 188 -> 200 (+6.38%)

fossil-db (navi31):
Totals from 89 (0.11% of 79825) affected shaders:
MaxWaves: 1811 -> 1883 (+3.98%)
Instrs: 403695 -> 404691 (+0.25%); split: -0.01%, +0.26%
CodeSize: 2150612 -> 2154860 (+0.20%); split: -0.03%, +0.23%
VGPRs: 6892 -> 6676 (-3.13%)
Latency: 3306107 -> 3310010 (+0.12%); split: -0.01%, +0.13%
InvThroughput: 813092 -> 814382 (+0.16%); split: -0.00%, +0.16%
VClause: 7999 -> 8010 (+0.14%); split: -0.06%, +0.20%
Copies: 50089 -> 50210 (+0.24%); split: -0.05%, +0.29%
PreVGPRs: 6596 -> 6586 (-0.15%); split: -0.18%, +0.03%
VALU: 239617 -> 240392 (+0.32%); split: -0.01%, +0.33%
SALU: 45349 -> 45363 (+0.03%)
VMEM: 15762 -> 15780 (+0.11%)
VOPD: 258 -> 262 (+1.55%)

fossil-db (navi21):
Totals from 89 (0.11% of 79825) affected shaders:
Instrs: 345634 -> 346426 (+0.23%); split: -0.00%, +0.23%
CodeSize: 1895616 -> 1900156 (+0.24%); split: -0.00%, +0.24%
Latency: 3043334 -> 3046859 (+0.12%); split: -0.01%, +0.13%
InvThroughput: 928236 -> 929626 (+0.15%); split: -0.01%, +0.16%
VClause: 7894 -> 7905 (+0.14%); split: -0.06%, +0.20%
Copies: 48694 -> 48785 (+0.19%); split: -0.03%, +0.22%
PreVGPRs: 6580 -> 6570 (-0.15%); split: -0.18%, +0.03%
VALU: 228323 -> 229072 (+0.33%); split: -0.01%, +0.33%
SALU: 47202 -> 47216 (+0.03%)
VMEM: 16546 -> 16564 (+0.11%)

Signed-off-by: Rhys Perry
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458
Backport-to: 25.3
Reviewed-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
Part-of:
---
 src/amd/common/nir/ac_nir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/nir/ac_nir.c b/src/amd/common/nir/ac_nir.c
index d7e020b2230..58ba0cc3eb0 100644
--- a/src/amd/common/nir/ac_nir.c
+++ b/src/amd/common/nir/ac_nir.c
@@ -551,8 +551,9 @@ ac_nir_mem_vectorize_callback(unsigned align_mul, unsigned align_offset, unsigne
 
    /* Align the size to what the hw supports. */
    unsigned unaligned_new_size = num_components * bit_size;
-   unsigned aligned_new_size = align_load_store_size(config->gfx_level, unaligned_new_size,
-                                                     uses_smem, is_shared);
+   unsigned aligned_new_size = nir_round_up_components(num_components) * bit_size;
+   aligned_new_size = align_load_store_size(config->gfx_level, aligned_new_size,
+                                            uses_smem, is_shared);
 
    if (uses_smem) {
       /* Maximize SMEM vectorization except for LLVM, which suffers from SGPR and VGPR spilling.