ac/nir: fix check for increasing size of non-descriptor loads

In the previous version, "end" could have been zero, which would have
allowed an increase of "mul" bytes when the size should not be increased at all.

For example:
- align_offset=4
- mul=4
- unaligned_new_size=96
- aligned_new_size=128
This would have loaded a dword which was not loaded previously.
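
The example above can be checked with a small standalone sketch of the two
conditions. This is only an illustration, not the Mesa code itself: the
function names and align_pot() (a stand-in for Mesa's align()) are
hypothetical, sizes are assumed to be in bits and offsets in bytes (as the
"/ 8u" in the check suggests), and "mul" is assumed to be a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Mesa's align(): round v up to a multiple of a.
 * Assumes a is a power of two. */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Check as written before this commit. Returns true if the size increase
 * would be accepted. When the load already ends on a "mul" boundary,
 * end wraps to 0 and (mul - end) becomes mul instead of 0. */
static bool old_check_allows(uint32_t align_offset, uint32_t mul,
                             uint32_t unaligned_new_size,
                             uint32_t aligned_new_size)
{
   uint32_t end = (align_offset + unaligned_new_size / 8u) & (mul - 1);
   return !((aligned_new_size - unaligned_new_size) / 8u > (mul - end));
}

/* Check as written after this commit: the allowed padding is the distance
 * from the end of the load to the next "mul" boundary, which is 0 when the
 * load already ends on a boundary. */
static bool new_check_allows(uint32_t align_offset, uint32_t mul,
                             uint32_t unaligned_new_size,
                             uint32_t aligned_new_size)
{
   uint32_t end = align_offset + unaligned_new_size / 8u;
   return !((aligned_new_size - unaligned_new_size) / 8u >
            (align_pot(end, mul) - end));
}
```

With align_offset=4, mul=4, unaligned_new_size=96 and aligned_new_size=128,
the load ends at byte 16, so old_check_allows() accepts 4 bytes of padding
while new_check_allows() rejects it.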

fossil-db (gfx1201):
Totals from 115 (0.14% of 79839) affected shaders:
Instrs: 286697 -> 287097 (+0.14%); split: -0.16%, +0.30%
CodeSize: 1477728 -> 1481256 (+0.24%); split: -0.13%, +0.37%
SpillSGPRs: 1662 -> 1658 (-0.24%); split: -0.42%, +0.18%
Latency: 2288612 -> 2290248 (+0.07%); split: -0.04%, +0.11%
InvThroughput: 467307 -> 467602 (+0.06%); split: -0.03%, +0.10%
VClause: 3689 -> 3691 (+0.05%)
SClause: 5052 -> 5064 (+0.24%); split: -0.20%, +0.44%
Copies: 34837 -> 35103 (+0.76%); split: -0.80%, +1.56%
Branches: 7402 -> 7401 (-0.01%)
PreSGPRs: 9147 -> 9143 (-0.04%); split: -0.44%, +0.39%
VALU: 159333 -> 159372 (+0.02%); split: -0.01%, +0.04%
SALU: 52047 -> 52276 (+0.44%); split: -0.55%, +0.99%
SMEM: 9556 -> 9697 (+1.48%)

fossil-db (navi31):
Totals from 238 (0.30% of 79825) affected shaders:
Instrs: 484480 -> 485105 (+0.13%); split: -0.05%, +0.17%
CodeSize: 2514012 -> 2517928 (+0.16%); split: -0.06%, +0.22%
SpillSGPRs: 1064 -> 1059 (-0.47%)
Latency: 3941121 -> 3944670 (+0.09%); split: -0.04%, +0.13%
InvThroughput: 897483 -> 898090 (+0.07%); split: -0.04%, +0.11%
VClause: 7101 -> 7098 (-0.04%)
SClause: 9036 -> 9052 (+0.18%); split: -0.44%, +0.62%
Copies: 42790 -> 43096 (+0.72%); split: -0.30%, +1.01%
PreSGPRs: 14357 -> 14342 (-0.10%); split: -0.37%, +0.26%
VALU: 298325 -> 298347 (+0.01%); split: -0.01%, +0.02%
SALU: 57288 -> 57577 (+0.50%); split: -0.20%, +0.70%
SMEM: 18768 -> 18967 (+1.06%); split: -0.01%, +1.07%

fossil-db (navi21):
Totals from 239 (0.30% of 79825) affected shaders:
Instrs: 444783 -> 445177 (+0.09%); split: -0.07%, +0.15%
CodeSize: 2371776 -> 2373136 (+0.06%); split: -0.13%, +0.19%
Latency: 4226478 -> 4219221 (-0.17%); split: -0.24%, +0.07%
InvThroughput: 1430962 -> 1428445 (-0.18%); split: -0.23%, +0.06%
SClause: 9357 -> 9398 (+0.44%); split: -0.20%, +0.64%
Copies: 42742 -> 42927 (+0.43%); split: -0.53%, +0.96%
Branches: 12975 -> 12970 (-0.04%); split: -0.05%, +0.02%
PreSGPRs: 14368 -> 14312 (-0.39%); split: -0.47%, +0.08%
VALU: 306642 -> 306720 (+0.03%); split: -0.02%, +0.05%
SALU: 63702 -> 63790 (+0.14%); split: -0.31%, +0.45%
SMEM: 20030 -> 20231 (+1.00%); split: -0.00%, +1.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458
Backport-to: 25.3
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>
(cherry picked from commit b5cf3b1628)
Rhys Perry 2025-12-11 09:59:16 +00:00 committed by Dylan Baker
parent e902c28669
commit 366a2272d3
2 changed files with 3 additions and 3 deletions


@@ -4,7 +4,7 @@
 "description": "ac/nir: fix check for increasing size of non-descriptor loads",
 "nominated": true,
 "nomination_type": 4,
-"resolution": 0,
+"resolution": 1,
 "main_sha": null,
 "because_sha": null,
 "notes": null


@@ -573,8 +573,8 @@ ac_nir_mem_vectorize_callback(unsigned align_mul, unsigned align_offset, unsigne
 low->intrinsic == nir_intrinsic_load_global ? NIR_ALIGN_MUL_MAX : 4;
 uint32_t page_size = 4096;
 uint32_t mul = MIN3(align_mul, page_size, resource_align);
-unsigned end = (align_offset + unaligned_new_size / 8u) & (mul - 1);
-if ((aligned_new_size - unaligned_new_size) / 8u > (mul - end))
+unsigned end = (align_offset + unaligned_new_size / 8u);
+if ((aligned_new_size - unaligned_new_size) / 8u > (align(end, mul) - end))
 return false;
 }