Commit graph

4275 commits

Author SHA1 Message Date
Daniel Schürmann
97f095f6e0 aco/lower_branches: Add try_rotate_latch_block() optimization
This optimization looks for unconditional back-edges and aims
to rotate the loop in a way that the final block is emitted
before the loop header, essentially turning

BB1:
  if ()
    goto BB3;
BB2:
  <loop body>
  goto BB1;
BB3:
  ...

into

  goto BB1;
BB2:
  <loop body>
BB1:
  if(!cond)
    goto BB2;
BB3:
  ...

Totals from 4969 (5.89% of 84383) affected shaders: (Navi48)

Instrs: 15253038 -> 15254019 (+0.01%); split: -0.00%, +0.01%
CodeSize: 81225300 -> 81227696 (+0.00%); split: -0.02%, +0.02%
Latency: 320796283 -> 320693480 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 51395922 -> 51376156 (-0.04%); split: -0.04%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ade5e300ab aco/insert_delay_alu: handle loop latch block before loop body
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
102aca9843 aco/assembler: emit block_kind_loop_latch before the loop header
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
da1594f8bb aco: introduce notion of block_kind_loop_latch
A block annotated with block_kind_loop_latch denotes a block
the re-entry point for a loop back-edge. It is emitted after
the loop preheader and (potentially) before the loop header.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
9887ce6709 aco/print_asm: Sort block markers by block offset
We are going to emit blocks in a different order.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
800a4957bb aco/lower_branches: Consider branch target of nested conditional branches
Totals from 1470 (1.74% of 84383) affected shaders: (Navi48)

Instrs: 5128451 -> 5126842 (-0.03%)
CodeSize: 29359832 -> 29353656 (-0.02%); split: -0.02%, +0.00%
Latency: 41047203 -> 41040786 (-0.02%)
InvThroughput: 6040459 -> 6039619 (-0.01%); split: -0.01%, +0.00%
Branches: 146219 -> 144648 (-1.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
fbf2083b8f aco/isel: Don't emit ELSE side of divergent branches which jump
Totals from 50 (0.06% of 84383) affected shaders: (Navi48)

Instrs: 402490 -> 402444 (-0.01%); split: -0.01%, +0.00%
CodeSize: 2239024 -> 2238864 (-0.01%); split: -0.01%, +0.00%
SpillSGPRs: 1493 -> 1496 (+0.20%)
Latency: 5836785 -> 5836747 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1120893 -> 1120909 (+0.00%); split: -0.00%, +0.00%
Copies: 46128 -> 46082 (-0.10%)
VALU: 222708 -> 222715 (+0.00%); split: -0.00%, +0.00%
SALU: 53039 -> 52993 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ba32219cf8 aco/isel: Don't emit ELSE side of uniform branches which jump
Totals from 4 (0.00% of 84383) affected shaders: (Navi48)

Instrs: 16473 -> 16468 (-0.03%)
CodeSize: 85276 -> 85300 (+0.03%)
SpillSGPRs: 175 -> 176 (+0.57%)
Latency: 267907 -> 267885 (-0.01%)
InvThroughput: 36302 -> 36298 (-0.01%)
Copies: 1353 -> 1345 (-0.59%)
VALU: 9025 -> 9029 (+0.04%)
SALU: 2635 -> 2627 (-0.30%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
96a639918c aco: don't emit p_logical_start / p_logical_end after divergent branches
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:44 +00:00
Daniel Schürmann
3743230252 aco/isel: Do IF-simplification if that didn't happen during NIR optimizations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:43 +00:00
Daniel Schürmann
50b093ec90 aco/builder: Fix v_add_co_u32 carry-out to VCC if post_ra
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519>
2026-02-13 14:49:43 +00:00
Georg Lehmann
d7814bcad0 aco: remove redundant can_use_DPP declaration
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801>
2026-02-11 11:34:29 +00:00
Georg Lehmann
fc7b5d7eed aco/opt_postRA: don't optimize across calls
Could do better by checking which registers are clobbered/preserved,
but that's unlikely to be useful anyway.

Backport-to: 26.0

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801>
2026-02-11 11:34:29 +00:00
Georg Lehmann
10b12a6ee2 aco: handle all SALU that modifies PC in needs_exec_mask
Calls use swappc.

Backport-to: 26.0

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801>
2026-02-11 11:34:29 +00:00
Georg Lehmann
421a4dacf0 aco/lower_branches: consider jump target of conditional branches based on vcc
Cc: mesa-stable

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801>
2026-02-11 11:34:29 +00:00
Georg Lehmann
77d05ac1ba aco/optimizer: stop checking precise for med3
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
e873b8764a aco/optimizer: use nan preserve flag to prevent incorrect med3
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
0c46053c05 aco/optimzer: apply extract with any uses
Foz-DB Navi48:
Totals from 362 (0.44% of 82405) affected shaders:
MaxWaves: 5052 -> 5066 (+0.28%)
Instrs: 5297858 -> 5294009 (-0.07%); split: -0.09%, +0.01%
CodeSize: 30187188 -> 30177592 (-0.03%); split: -0.05%, +0.02%
VGPRs: 44280 -> 44172 (-0.24%)
Latency: 35632812 -> 35619796 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 7050206 -> 7041058 (-0.13%); split: -0.14%, +0.01%
VClause: 137780 -> 137794 (+0.01%); split: -0.01%, +0.02%
SClause: 114821 -> 114781 (-0.03%)
Copies: 466018 -> 465150 (-0.19%); split: -0.24%, +0.05%
Branches: 171990 -> 171988 (-0.00%)
PreVGPRs: 39268 -> 39084 (-0.47%)
VALU: 2557456 -> 2554297 (-0.12%); split: -0.15%, +0.02%
SALU: 893170 -> 893192 (+0.00%); split: -0.00%, +0.01%
VOPD: 393760 -> 394427 (+0.17%); split: +0.39%, -0.22%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:40 +00:00
Georg Lehmann
85c62f1515 aco/optimizer: only copy propagate p_split_vector if it can be eliminated
Foz-DB Navi48:
Totals from 402 (0.49% of 82405) affected shaders:
Instrs: 3078116 -> 3070117 (-0.26%); split: -0.28%, +0.02%
CodeSize: 17329444 -> 17240360 (-0.51%); split: -0.53%, +0.01%
VGPRs: 48960 -> 48924 (-0.07%); split: -0.12%, +0.05%
SpillVGPRs: 1683 -> 1687 (+0.24%)
Latency: 27758978 -> 27728451 (-0.11%); split: -0.17%, +0.06%
InvThroughput: 5748513 -> 5741761 (-0.12%); split: -0.18%, +0.06%
VClause: 69557 -> 69575 (+0.03%); split: -0.01%, +0.03%
SClause: 74850 -> 74866 (+0.02%)
Copies: 338241 -> 329400 (-2.61%); split: -2.71%, +0.10%
Branches: 118443 -> 118431 (-0.01%)
PreVGPRs: 44561 -> 44598 (+0.08%)
VALU: 1463081 -> 1455438 (-0.52%); split: -0.56%, +0.04%
SALU: 574113 -> 574013 (-0.02%); split: -0.03%, +0.01%
VMEM: 105789 -> 105797 (+0.01%)
VOPD: 140203 -> 139009 (-0.85%); split: +0.44%, -1.29%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
5ecc800edd aco/optimizer: add second copy prop for pseudo instructions
Foz-DB Navi48:
Totals from 28 (0.03% of 82405) affected shaders:
Instrs: 144993 -> 144645 (-0.24%); split: -0.26%, +0.02%
CodeSize: 784668 -> 783604 (-0.14%); split: -0.19%, +0.05%
SpillVGPRs: 215 -> 209 (-2.79%)
Latency: 2529900 -> 2526895 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 775379 -> 773859 (-0.20%); split: -0.20%, +0.00%
VClause: 2815 -> 2803 (-0.43%)
Copies: 23474 -> 23170 (-1.30%); split: -1.38%, +0.09%
Branches: 4638 -> 4632 (-0.13%)
VALU: 81924 -> 81620 (-0.37%); split: -0.40%, +0.03%
SALU: 23986 -> 23995 (+0.04%); split: -0.03%, +0.07%
VMEM: 3726 -> 3714 (-0.32%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
269007faf3 aco/optimizer: apply byte p_split_vector as extract
Foz-DB Navi48:
Totals from 80 (0.10% of 82405) affected shaders:
Instrs: 3022374 -> 3024178 (+0.06%); split: -0.00%, +0.06%
CodeSize: 17396984 -> 17403108 (+0.04%); split: -0.00%, +0.04%
Latency: 17685547 -> 17687073 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 3622683 -> 3622618 (-0.00%); split: -0.02%, +0.02%
VClause: 83840 -> 83841 (+0.00%)
Copies: 242072 -> 242528 (+0.19%); split: -0.01%, +0.20%
Branches: 81582 -> 81578 (-0.00%)
PreVGPRs: 7536 -> 7527 (-0.12%)
VALU: 1520822 -> 1521762 (+0.06%); split: -0.01%, +0.07%
VOPD: 294392 -> 293908 (-0.16%); split: +0.03%, -0.20%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
b21b36b6ab aco/optimizer: apply further extracts to v_cvt_f32_ubyte
Foz-DB Navi48:
Totals from 21 (0.03% of 82405) affected shaders:
Instrs: 2818255 -> 2817482 (-0.03%)
CodeSize: 16282360 -> 16273080 (-0.06%)
Latency: 14172672 -> 14172405 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2728551 -> 2728493 (-0.00%); split: -0.00%, +0.00%
Copies: 213703 -> 212973 (-0.34%)
VALU: 1407351 -> 1406585 (-0.05%)
VOPD: 291185 -> 291221 (+0.01%); split: +0.04%, -0.03%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
08f9bad0b5 aco/isel: avoid extracts for continuous alu src components
Helps fp8 FSR4, hurts parallel_rdp.

Foz-DB Navi48:
Totals from 23 (0.03% of 82405) affected shaders:
MaxWaves: 380 -> 383 (+0.79%)
Instrs: 71228 -> 71487 (+0.36%); split: -0.26%, +0.62%
CodeSize: 411500 -> 415004 (+0.85%); split: -0.21%, +1.06%
VGPRs: 2856 -> 2784 (-2.52%)
Latency: 1654160 -> 1665555 (+0.69%); split: -0.14%, +0.83%
InvThroughput: 354145 -> 361122 (+1.97%); split: -0.10%, +2.07%
VClause: 1557 -> 1541 (-1.03%); split: -1.41%, +0.39%
Copies: 9857 -> 10059 (+2.05%); split: -1.76%, +3.80%
PreVGPRs: 2285 -> 2182 (-4.51%); split: -4.73%, +0.22%
VALU: 38873 -> 39066 (+0.50%); split: -0.47%, +0.96%
VOPD: 1237 -> 1246 (+0.73%); split: +1.13%, -0.40%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
a0c663378c aco/isel: split vector into dwords/words first
Foz-DB Navi48:
Totals from 361 (0.44% of 82405) affected shaders:
MaxWaves: 5806 -> 5832 (+0.45%)
Instrs: 2343746 -> 2343762 (+0.00%); split: -0.04%, +0.04%
CodeSize: 13270504 -> 13267116 (-0.03%); split: -0.10%, +0.08%
VGPRs: 42008 -> 41708 (-0.71%)
SpillVGPRs: 308 -> 303 (-1.62%)
Scratch: 1574656 -> 1574400 (-0.02%)
Latency: 26571385 -> 22602486 (-14.94%); split: -14.95%, +0.01%
InvThroughput: 5474157 -> 4614777 (-15.70%); split: -15.70%, +0.00%
VClause: 57512 -> 57515 (+0.01%); split: -0.03%, +0.03%
SClause: 56313 -> 56319 (+0.01%)
Copies: 251626 -> 248707 (-1.16%); split: -1.24%, +0.08%
Branches: 89620 -> 89614 (-0.01%)
PreVGPRs: 37361 -> 36910 (-1.21%); split: -1.21%, +0.01%
VALU: 1111534 -> 1108507 (-0.27%); split: -0.29%, +0.02%
SALU: 443684 -> 443687 (+0.00%); split: -0.00%, +0.00%
VMEM: 85287 -> 85277 (-0.01%)
VOPD: 97987 -> 98091 (+0.11%); split: +0.30%, -0.20%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
1a3e627223 aco: improve emit_extract_vector for vector of vecs
No Foz-DB changes, but nessecary for dword first splits.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
1b491cc51a aco/optimizer: don't remove label_extract for splits
No Foz-DB changes, but will become nessecary with dword first splits.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
66f2a35954 aco/optimizer: repeat vector of split opt
Foz-DB Navi48:
Totals from 13 (0.02% of 82405) affected shaders:
Instrs: 12071 -> 12119 (+0.40%); split: -0.07%, +0.46%
CodeSize: 86908 -> 86960 (+0.06%); split: -0.29%, +0.35%
Latency: 104959 -> 105385 (+0.41%); split: -0.60%, +1.00%
InvThroughput: 46518 -> 46598 (+0.17%); split: -0.03%, +0.20%
VClause: 515 -> 506 (-1.75%); split: -3.11%, +1.36%
SClause: 32 -> 30 (-6.25%)
Copies: 973 -> 1038 (+6.68%); split: -0.82%, +7.50%
PreVGPRs: 1185 -> 1191 (+0.51%)
VALU: 7126 -> 7166 (+0.56%); split: -0.08%, +0.65%
SALU: 1127 -> 1129 (+0.18%)
VOPD: 1516 -> 1539 (+1.52%); split: +1.78%, -0.26%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
6951ddc43b aco: clean up emit_extract_vector a bit
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Rhys Perry
53ed863b88 aco/insert_waitcnt: improve s_setpc_b64/s_swappc_b64/end_with_regs a bit
Don't wait for any events which don't involve registers.

fossil-db (navi31):
Totals from 210 (0.25% of 84369) affected shaders:
Instrs: 106932 -> 106677 (-0.24%)
CodeSize: 604164 -> 603144 (-0.17%)
Latency: 726405 -> 720433 (-0.82%)
InvThroughput: 102048 -> 101504 (-0.53%); split: -0.54%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590>
2026-02-06 09:49:20 +00:00
Rhys Perry
63b18e9e5b aco: move return address to a clobbered register
It's placed in the preserved registers, but the p_call clobbers it, so
this change removes some special casing.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590>
2026-02-06 09:49:19 +00:00
Rhys Perry
ec74e34672 aco: add return address to call_clobbered_regs
It's better for handle_call() to make sure these SGPRs are clear.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590>
2026-02-06 09:49:18 +00:00
Rhys Perry
837afd7faf aco: use Program::stack_ptr instead of Program::static_scratch_rsrc
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590>
2026-02-06 09:49:17 +00:00
Rhys Perry
a6502b4a29 aco: use ABI::numClobbered() more
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590>
2026-02-06 09:49:17 +00:00
Georg Lehmann
146779d16d aco: remove unpack_half support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Marek Olšák
bac80013e6 ac: remove image_load buffer code from ACO & LLVM
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> (aco)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:55 +00:00
Marek Olšák
c05d340184 ac: remove txf buffer code from ACO & LLVM
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> (aco)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:55 +00:00
Marek Olšák
21d79dc22f aco,ac/llvm: force IDXEN=1 for buffer format opcodes on GFX9
This fixes txf and image_load lowered to buffer_load_amd.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:53 +00:00
Marek Olšák
4a7728c436 aco: handle ACCESS_SPARSE and ACCESS_SKIP_HELPERS for load_buffer_amd
buffer txf and buffer image_load will be lowered to load_buffer_amd

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:53 +00:00
Natalie Vock
ad23e02a28 aco: Don't exclude discardable parameters from register preservation
The original semantic of discardable parameters was "okay, nothing
actually uses this parameter, feel free to clobber it", but we were
only using it with tail calls from a function without discardable
parameters, which was broken.

Instead, slightly change the use-case and utilize the "discardable"
attribute to mark parameters that the callee will clobber in a tail
call. This makes doing tail calls safe when the tail callee receives a
modified set of parameters.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
2026-01-31 14:26:57 +00:00
Rhys Perry
0b0e124a73 aco: use lv1.resize() pattern
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537>
2026-01-28 16:46:30 +00:00
Rhys Perry
5f5032bb6a aco: use lv1/lv2 instead of v1/v2.as_linear()
This is just a search+replace then clang-format.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537>
2026-01-28 16:46:30 +00:00
Rhys Perry
c98204c963 aco: add lv1/lv2 as alias for v1/v2.as_linear()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537>
2026-01-28 16:46:29 +00:00
Georg Lehmann
2d38da94d4 aco: allow v_cmpx with DPP
The wording in the RDNA3 ISA doc was since clarified, v_cmpx with DPP
behaves exactly like one would expect:
FI controls whether the source value can be read from inactive lanes,
but inactive lanes always write a 0 bit. The same applies to v_cmp with DPP.

Foz-DB Navi48:
Totals from 987 (1.20% of 82405) affected shaders:
Instrs: 517003 -> 516445 (-0.11%); split: -0.11%, +0.00%
CodeSize: 2782688 -> 2780508 (-0.08%); split: -0.08%, +0.00%
Latency: 2059169 -> 2056327 (-0.14%); split: -0.14%, +0.00%
InvThroughput: 365374 -> 365328 (-0.01%); split: -0.03%, +0.01%
Copies: 64669 -> 65616 (+1.46%)
SALU: 70693 -> 70652 (-0.06%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
1c1bd9d090 aco: only apply DPP with 3 or less uses
Creating many new DPP instructions increases code size and decreases throughput.

Foz-DB Navi48:
Totals from 2196 (2.67% of 82179) affected shaders:
MaxWaves: 59930 -> 59960 (+0.05%); split: +0.08%, -0.03%
Instrs: 3718514 -> 3718298 (-0.01%); split: -0.08%, +0.07%
CodeSize: 20593544 -> 20507660 (-0.42%); split: -0.43%, +0.02%
VGPRs: 135924 -> 135744 (-0.13%); split: -0.17%, +0.04%
Latency: 33174704 -> 33163001 (-0.04%); split: -0.07%, +0.04%
InvThroughput: 6500723 -> 6491382 (-0.14%); split: -0.15%, +0.01%
VClause: 72348 -> 72343 (-0.01%); split: -0.06%, +0.05%
SClause: 83160 -> 83165 (+0.01%); split: -0.03%, +0.04%
Copies: 286592 -> 285575 (-0.35%); split: -0.45%, +0.09%
Branches: 99970 -> 99971 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 103280 -> 103279 (-0.00%)
PreVGPRs: 95590 -> 95440 (-0.16%); split: -0.30%, +0.14%
VALU: 1931369 -> 1931725 (+0.02%); split: -0.08%, +0.09%
SALU: 637663 -> 636780 (-0.14%); split: -0.15%, +0.01%
VOPD: 65236 -> 65589 (+0.54%); split: +0.91%, -0.37%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
bb6a3e2891 aco/optimizer: rework how dpp is applied
Using the common helpers means we can use VINTERP instead of DPP,
which has higher throughput and smaller CodeSize.

Foz-DB Navi48:
Totals from 986 (1.20% of 82405) affected shaders:
Instrs: 1985282 -> 1985545 (+0.01%); split: -0.01%, +0.02%
CodeSize: 11179700 -> 11151780 (-0.25%); split: -0.26%, +0.01%
Latency: 19899190 -> 19897694 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 4110650 -> 4104911 (-0.14%)
VClause: 44143 -> 44139 (-0.01%); split: -0.03%, +0.02%
Copies: 164340 -> 164344 (+0.00%); split: -0.02%, +0.02%
VALU: 1061904 -> 1061908 (+0.00%); split: -0.00%, +0.00%
SALU: 305980 -> 305974 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
228cb29dae aco/optimizer: allow DPP with scalar src1 in alu_opt_info_is_valid
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
d4c0318f48 aco: apply DPP with scalar src1 on gfx11.5+
Foz-DB Navi48:
Totals from 6261 (7.62% of 82179) affected shaders:
MaxWaves: 176284 -> 176236 (-0.03%); split: +0.01%, -0.03%
Instrs: 5850185 -> 5828451 (-0.37%); split: -0.41%, +0.04%
CodeSize: 31363324 -> 31419904 (+0.18%); split: -0.08%, +0.26%
VGPRs: 328284 -> 328200 (-0.03%); split: -0.07%, +0.05%
SpillSGPRs: 2268 -> 2256 (-0.53%)
Latency: 50235516 -> 50218816 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 8256243 -> 8242036 (-0.17%); split: -0.22%, +0.05%
VClause: 81000 -> 80975 (-0.03%); split: -0.11%, +0.08%
SClause: 136376 -> 136387 (+0.01%); split: -0.11%, +0.11%
Copies: 414021 -> 417894 (+0.94%); split: -0.13%, +1.07%
Branches: 105301 -> 105298 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 291360 -> 291432 (+0.02%)
PreVGPRs: 238593 -> 238729 (+0.06%); split: -0.02%, +0.08%
VALU: 3425446 -> 3403463 (-0.64%); split: -0.65%, +0.01%
SALU: 815505 -> 819372 (+0.47%); split: -0.02%, +0.50%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
3fe329b3d0 aco/ra: don't move sgpr into v_fmac_f32_dpp src0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:50 +00:00
Georg Lehmann
903d940fa9 aco: don't convert VOP3P to VOP3 when applying DPP
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:50 +00:00
Georg Lehmann
8ac7b9fc37 aco: undo operand swap if applying DPP fails
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:50 +00:00