mesa/src/amd
Georg Lehmann 64cae5c48d aco: form mixed MTBUF/MUBUF clauses
This should be one clause (all of the instructions load from the same vertex buffer)

s_clause 0x2                                                ; bfa10002
tbuffer_load_format_xyzw v[8:11], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:36 ; e9c32024 80010805
tbuffer_load_format_xyzw v[12:15], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:16 ; e9c32010 80010c05
tbuffer_load_format_xyzw v[16:19], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:12 ; e9c3200c 80011005
s_clause 0x2                                                ; bfa10002
buffer_load_dwordx3 v[20:22], v5, s[4:7], 0 idxen           ; e03c2000 80011405
buffer_load_dwordx3 v[23:25], v5, s[4:7], 0 idxen offset:20 ; e03c2014 80011705
buffer_load_dwordx4 v[28:31], v5, s[4:7], 0 idxen offset:48 ; e0382030 80011c05
tbuffer_load_format_xy v[0:1], v5, s[4:7], 0 format:[BUF_FMT_8_8_UNORM] idxen offset:32 ; e8712020 80010005

Foz-DB Navi21:
Totals from 5624 (7.08% of 79395) affected shaders:
MaxWaves: 149894 -> 149898 (+0.00%)
Instrs: 3032697 -> 3034853 (+0.07%); split: -0.05%, +0.12%
CodeSize: 15907852 -> 15915752 (+0.05%); split: -0.05%, +0.10%
VGPRs: 216248 -> 216144 (-0.05%)
Latency: 10955137 -> 11008760 (+0.49%); split: -0.22%, +0.70%
InvThroughput: 2032857 -> 2033916 (+0.05%); split: -0.03%, +0.08%
VClause: 50120 -> 41778 (-16.64%); split: -16.66%, +0.02%
SClause: 62034 -> 62004 (-0.05%); split: -0.33%, +0.29%
Copies: 253836 -> 254505 (+0.26%); split: -0.17%, +0.43%
VALU: 1621606 -> 1622274 (+0.04%); split: -0.03%, +0.07%
SALU: 653251 -> 653252 (+0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34379>
2025-04-08 09:22:04 +00:00
..
addrlib amd/addrlib: remove the DCC page fault workaround 2025-04-01 03:23:22 -04:00
ci glsl: fix regression in ubo cloning 2025-04-06 19:43:47 +10:00
common ac/surface: remove 64K_2D modifier with 64B max compressed blocks for gfx12 2025-04-07 19:44:22 +00:00
compiler aco: form mixed MTBUF/MUBUF clauses 2025-04-08 09:22:04 +00:00
drm-shim amd/drm-shim: add gfx1201 2025-03-10 11:21:36 +00:00
gmlib amd/gmlib: add gmlib for radeonsi 2025-02-27 03:15:16 +00:00
llvm ac/llvm: support mul24_relaxed 2025-03-27 06:24:16 +00:00
registers amd: Rename GFX1103_R1/R2 to PHOENIX/2 2024-11-20 02:14:40 +00:00
vpelib amd/vpelib: More parameters to the segmentation process and introduce validation hook 2025-03-06 02:11:53 +00:00
vulkan radv: add clip rects state bit for emitting discard rectangles 2025-04-08 08:42:17 +00:00
meson.build amd/gmlib: add gmlib for radeonsi 2025-02-27 03:15:16 +00:00