mesa/src
Rhys Perry ce85e8219c ac/nir: fix check for increasing size of non-descriptor loads
In the previous version, "end" could have been zero, which would have
allowed an increase of "mul" bytes, when it should not be increased at all.

For example:
- align_offset=4
- mul=4
- unaligned_new_size=96
- aligned_new_size=128
This would have loaded a dword which was not loaded previously.
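
The arithmetic in this example can be sketched as below. This is only an illustration, not the code in ac/nir: the helper names (naive_slack_bytes, fixed_slack_bytes, may_grow) are hypothetical, and it assumes that the sizes are in bits, that align_offset and mul are in bytes, and that "end" is the end offset of the original load modulo "mul".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not Mesa's code: bytes assumed to be safe to load past
 * the end of the original load, as a naive check might compute them.
 * Assumes sizes in bits, align_offset and mul in bytes, mul a power of two. */
static unsigned naive_slack_bytes(unsigned align_offset, unsigned mul,
                                  unsigned unaligned_bits)
{
   assert(mul != 0 && (mul & (mul - 1u)) == 0);
   unsigned end = (align_offset + unaligned_bits / 8u) & (mul - 1u);
   return mul - end; /* end == 0 yields a full "mul" bytes of slack */
}

/* Hypothetical corrected variant: a load that already ends on a "mul"
 * boundary (end == 0) has no slack at all. */
static unsigned fixed_slack_bytes(unsigned align_offset, unsigned mul,
                                  unsigned unaligned_bits)
{
   assert(mul != 0 && (mul & (mul - 1u)) == 0);
   unsigned end = (align_offset + unaligned_bits / 8u) & (mul - 1u);
   return end ? mul - end : 0u;
}

/* Growing the load is allowed only if the extra bytes fit in the slack. */
static bool may_grow(unsigned slack_bytes, unsigned unaligned_bits,
                     unsigned aligned_bits)
{
   return (aligned_bits - unaligned_bits) / 8u <= slack_bytes;
}
```

With align_offset=4, mul=4 and unaligned_new_size=96, "end" is (4 + 12) & 3 = 0. The naive variant then reports 4 bytes of slack and accepts the 4 extra bytes needed to reach 128 bits, while the corrected variant reports 0 and rejects the increase.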

fossil-db (gfx1201):
Totals from 115 (0.14% of 79839) affected shaders:
Instrs: 286697 -> 287097 (+0.14%); split: -0.16%, +0.30%
CodeSize: 1477728 -> 1481256 (+0.24%); split: -0.13%, +0.37%
SpillSGPRs: 1662 -> 1658 (-0.24%); split: -0.42%, +0.18%
Latency: 2288612 -> 2290248 (+0.07%); split: -0.04%, +0.11%
InvThroughput: 467307 -> 467602 (+0.06%); split: -0.03%, +0.10%
VClause: 3689 -> 3691 (+0.05%)
SClause: 5052 -> 5064 (+0.24%); split: -0.20%, +0.44%
Copies: 34837 -> 35103 (+0.76%); split: -0.80%, +1.56%
Branches: 7402 -> 7401 (-0.01%)
PreSGPRs: 9147 -> 9143 (-0.04%); split: -0.44%, +0.39%
VALU: 159333 -> 159372 (+0.02%); split: -0.01%, +0.04%
SALU: 52047 -> 52276 (+0.44%); split: -0.55%, +0.99%
SMEM: 9556 -> 9697 (+1.48%)

fossil-db (navi31):
Totals from 238 (0.30% of 79825) affected shaders:
Instrs: 484480 -> 485105 (+0.13%); split: -0.05%, +0.17%
CodeSize: 2514012 -> 2517928 (+0.16%); split: -0.06%, +0.22%
SpillSGPRs: 1064 -> 1059 (-0.47%)
Latency: 3941121 -> 3944670 (+0.09%); split: -0.04%, +0.13%
InvThroughput: 897483 -> 898090 (+0.07%); split: -0.04%, +0.11%
VClause: 7101 -> 7098 (-0.04%)
SClause: 9036 -> 9052 (+0.18%); split: -0.44%, +0.62%
Copies: 42790 -> 43096 (+0.72%); split: -0.30%, +1.01%
PreSGPRs: 14357 -> 14342 (-0.10%); split: -0.37%, +0.26%
VALU: 298325 -> 298347 (+0.01%); split: -0.01%, +0.02%
SALU: 57288 -> 57577 (+0.50%); split: -0.20%, +0.70%
SMEM: 18768 -> 18967 (+1.06%); split: -0.01%, +1.07%

fossil-db (navi21):
Totals from 239 (0.30% of 79825) affected shaders:
Instrs: 444783 -> 445177 (+0.09%); split: -0.07%, +0.15%
CodeSize: 2371776 -> 2373136 (+0.06%); split: -0.13%, +0.19%
Latency: 4226478 -> 4219221 (-0.17%); split: -0.24%, +0.07%
InvThroughput: 1430962 -> 1428445 (-0.18%); split: -0.23%, +0.06%
SClause: 9357 -> 9398 (+0.44%); split: -0.20%, +0.64%
Copies: 42742 -> 42927 (+0.43%); split: -0.53%, +0.96%
Branches: 12975 -> 12970 (-0.04%); split: -0.05%, +0.02%
PreSGPRs: 14368 -> 14312 (-0.39%); split: -0.47%, +0.08%
VALU: 306642 -> 306720 (+0.03%); split: -0.02%, +0.05%
SALU: 63702 -> 63790 (+0.14%); split: -0.31%, +0.45%
SMEM: 20030 -> 20231 (+1.00%); split: -0.00%, +1.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458
Backport-to: 25.3
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>
(cherry picked from commit b5cf3b1628)
2025-12-15 11:23:34 -08:00
amd ac/nir: fix check for increasing size of non-descriptor loads 2025-12-15 11:23:34 -08:00
android_stub
asahi hk: Report the correct plane count in VkDrmFormatModifierProperties2?EXT 2025-11-04 10:16:42 -08:00
broadcom ci: use $CI_TRON_JOB_PRIORITY tag on all ci-tron jobs 2025-12-01 09:16:33 -08:00
c11 c11/threads: fix build on c23 2025-11-13 08:10:20 -08:00
compiler nir: Ignore ray query ranges that don't start with rq_initialize 2025-12-15 11:23:34 -08:00
drm-shim drm-shim: fix with asan 2025-09-03 11:47:00 +00:00
egl egl/x11: Fix memory leak when querying translated coord. 2025-12-15 11:23:34 -08:00
etnaviv ci: use $CI_TRON_JOB_PRIORITY tag on all ci-tron jobs 2025-12-01 09:16:33 -08:00
freedreno ir3: Fix condition for using uniform predicates 2025-12-15 11:23:34 -08:00
gallium asahi: Set prefer_real_buffer_in_constbuf0 2025-12-15 11:23:34 -08:00
gbm egl,glx: allow OpenGL with old libx11, but disable glthread if it's unsafe 2025-08-21 02:05:26 +00:00
getopt
gfxstream gfxstream: fix logspam in TLS helper function 2025-12-04 09:15:31 -08:00
glx glx: provide glx.pc 2025-10-14 20:53:10 +00:00
gtest
imagination pvr: enable samplerMirrorClampToEdge feature 2025-12-03 11:23:49 -08:00
imgui imgui: Silence build warnings for imgui 2025-09-16 06:16:19 +00:00
intel anv/video: fix VP9 chroma subsampling format detection 2025-12-15 11:23:34 -08:00
loader loader: Wrap nouveau_zink_predicate with HAVE_LIBDRM 2025-11-21 14:22:57 -08:00
mesa gallium: Make upload_cb0 return a releasebuf 2025-12-15 11:23:34 -08:00
microsoft dozen: return INCOMPATIBLE_DRIVER on instance create failure 2025-12-01 09:16:32 -08:00
nouveau nak/cmat: free the type mapping hash table. 2025-12-08 09:25:37 -08:00
panfrost panfrost/ci: Fix GitLab rules after YAML split 2025-12-15 11:23:34 -08:00
tool clang-format: Update the .clang-format files to conformance clang-format json-schema 2025-09-09 07:04:55 +00:00
util util/driconf/asahi: Override GL renderer for web browsers 2025-12-08 09:25:38 -08:00
virtio venus: fix racy semaphore feedback counter update 2025-12-03 15:02:48 -08:00
vulkan wsi/metal: Fix blit_imate_to_image's pool selection for cmd buffer alloc 2025-12-05 08:11:47 -08:00
x11 meson: add missing x11 dependency on libloader_x11 2025-08-08 21:45:59 +00:00
.clang-format clang-format: Move ForEachMacros into src/.clang-format for freedreno 2025-09-09 07:04:55 +00:00
meson.build Revert "meson: use vcs_tag() instead of custom script" 2025-10-06 23:06:11 +00:00