Commit graph

559 commits

Author SHA1 Message Date
Georg Lehmann
77d05ac1ba aco/optimizer: stop checking precise for med3
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:03 +00:00
Georg Lehmann
e873b8764a aco/optimizer: use nan preserve flag to prevent incorrect med3
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641>
2026-02-10 18:42:02 +00:00
Georg Lehmann
0c46053c05 aco/optimzer: apply extract with any uses
Foz-DB Navi48:
Totals from 362 (0.44% of 82405) affected shaders:
MaxWaves: 5052 -> 5066 (+0.28%)
Instrs: 5297858 -> 5294009 (-0.07%); split: -0.09%, +0.01%
CodeSize: 30187188 -> 30177592 (-0.03%); split: -0.05%, +0.02%
VGPRs: 44280 -> 44172 (-0.24%)
Latency: 35632812 -> 35619796 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 7050206 -> 7041058 (-0.13%); split: -0.14%, +0.01%
VClause: 137780 -> 137794 (+0.01%); split: -0.01%, +0.02%
SClause: 114821 -> 114781 (-0.03%)
Copies: 466018 -> 465150 (-0.19%); split: -0.24%, +0.05%
Branches: 171990 -> 171988 (-0.00%)
PreVGPRs: 39268 -> 39084 (-0.47%)
VALU: 2557456 -> 2554297 (-0.12%); split: -0.15%, +0.02%
SALU: 893170 -> 893192 (+0.00%); split: -0.00%, +0.01%
VOPD: 393760 -> 394427 (+0.17%); split: +0.39%, -0.22%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:40 +00:00
Georg Lehmann
85c62f1515 aco/optimizer: only copy propagate p_split_vector if it can be eliminated
Foz-DB Navi48:
Totals from 402 (0.49% of 82405) affected shaders:
Instrs: 3078116 -> 3070117 (-0.26%); split: -0.28%, +0.02%
CodeSize: 17329444 -> 17240360 (-0.51%); split: -0.53%, +0.01%
VGPRs: 48960 -> 48924 (-0.07%); split: -0.12%, +0.05%
SpillVGPRs: 1683 -> 1687 (+0.24%)
Latency: 27758978 -> 27728451 (-0.11%); split: -0.17%, +0.06%
InvThroughput: 5748513 -> 5741761 (-0.12%); split: -0.18%, +0.06%
VClause: 69557 -> 69575 (+0.03%); split: -0.01%, +0.03%
SClause: 74850 -> 74866 (+0.02%)
Copies: 338241 -> 329400 (-2.61%); split: -2.71%, +0.10%
Branches: 118443 -> 118431 (-0.01%)
PreVGPRs: 44561 -> 44598 (+0.08%)
VALU: 1463081 -> 1455438 (-0.52%); split: -0.56%, +0.04%
SALU: 574113 -> 574013 (-0.02%); split: -0.03%, +0.01%
VMEM: 105789 -> 105797 (+0.01%)
VOPD: 140203 -> 139009 (-0.85%); split: +0.44%, -1.29%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
5ecc800edd aco/optimizer: add second copy prop for pseudo instructions
Foz-DB Navi48:
Totals from 28 (0.03% of 82405) affected shaders:
Instrs: 144993 -> 144645 (-0.24%); split: -0.26%, +0.02%
CodeSize: 784668 -> 783604 (-0.14%); split: -0.19%, +0.05%
SpillVGPRs: 215 -> 209 (-2.79%)
Latency: 2529900 -> 2526895 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 775379 -> 773859 (-0.20%); split: -0.20%, +0.00%
VClause: 2815 -> 2803 (-0.43%)
Copies: 23474 -> 23170 (-1.30%); split: -1.38%, +0.09%
Branches: 4638 -> 4632 (-0.13%)
VALU: 81924 -> 81620 (-0.37%); split: -0.40%, +0.03%
SALU: 23986 -> 23995 (+0.04%); split: -0.03%, +0.07%
VMEM: 3726 -> 3714 (-0.32%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
269007faf3 aco/optimizer: apply byte p_split_vector as extract
Foz-DB Navi48:
Totals from 80 (0.10% of 82405) affected shaders:
Instrs: 3022374 -> 3024178 (+0.06%); split: -0.00%, +0.06%
CodeSize: 17396984 -> 17403108 (+0.04%); split: -0.00%, +0.04%
Latency: 17685547 -> 17687073 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 3622683 -> 3622618 (-0.00%); split: -0.02%, +0.02%
VClause: 83840 -> 83841 (+0.00%)
Copies: 242072 -> 242528 (+0.19%); split: -0.01%, +0.20%
Branches: 81582 -> 81578 (-0.00%)
PreVGPRs: 7536 -> 7527 (-0.12%)
VALU: 1520822 -> 1521762 (+0.06%); split: -0.01%, +0.07%
VOPD: 294392 -> 293908 (-0.16%); split: +0.03%, -0.20%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
b21b36b6ab aco/optimizer: apply further extracts to v_cvt_f32_ubyte
Foz-DB Navi48:
Totals from 21 (0.03% of 82405) affected shaders:
Instrs: 2818255 -> 2817482 (-0.03%)
CodeSize: 16282360 -> 16273080 (-0.06%)
Latency: 14172672 -> 14172405 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2728551 -> 2728493 (-0.00%); split: -0.00%, +0.00%
Copies: 213703 -> 212973 (-0.34%)
VALU: 1407351 -> 1406585 (-0.05%)
VOPD: 291185 -> 291221 (+0.01%); split: +0.04%, -0.03%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
1b491cc51a aco/optimizer: don't remove label_extract for splits
No Foz-DB changes, but will become nessecary with dword first splits.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
66f2a35954 aco/optimizer: repeat vector of split opt
Foz-DB Navi48:
Totals from 13 (0.02% of 82405) affected shaders:
Instrs: 12071 -> 12119 (+0.40%); split: -0.07%, +0.46%
CodeSize: 86908 -> 86960 (+0.06%); split: -0.29%, +0.35%
Latency: 104959 -> 105385 (+0.41%); split: -0.60%, +1.00%
InvThroughput: 46518 -> 46598 (+0.17%); split: -0.03%, +0.20%
VClause: 515 -> 506 (-1.75%); split: -3.11%, +1.36%
SClause: 32 -> 30 (-6.25%)
Copies: 973 -> 1038 (+6.68%); split: -0.82%, +7.50%
PreVGPRs: 1185 -> 1191 (+0.51%)
VALU: 7126 -> 7166 (+0.56%); split: -0.08%, +0.65%
SALU: 1127 -> 1129 (+0.18%)
VOPD: 1516 -> 1539 (+1.52%); split: +1.78%, -0.26%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
1c1bd9d090 aco: only apply DPP with 3 or less uses
Creating many new DPP instructions increases code size and decreases throughput.

Foz-DB Navi48:
Totals from 2196 (2.67% of 82179) affected shaders:
MaxWaves: 59930 -> 59960 (+0.05%); split: +0.08%, -0.03%
Instrs: 3718514 -> 3718298 (-0.01%); split: -0.08%, +0.07%
CodeSize: 20593544 -> 20507660 (-0.42%); split: -0.43%, +0.02%
VGPRs: 135924 -> 135744 (-0.13%); split: -0.17%, +0.04%
Latency: 33174704 -> 33163001 (-0.04%); split: -0.07%, +0.04%
InvThroughput: 6500723 -> 6491382 (-0.14%); split: -0.15%, +0.01%
VClause: 72348 -> 72343 (-0.01%); split: -0.06%, +0.05%
SClause: 83160 -> 83165 (+0.01%); split: -0.03%, +0.04%
Copies: 286592 -> 285575 (-0.35%); split: -0.45%, +0.09%
Branches: 99970 -> 99971 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 103280 -> 103279 (-0.00%)
PreVGPRs: 95590 -> 95440 (-0.16%); split: -0.30%, +0.14%
VALU: 1931369 -> 1931725 (+0.02%); split: -0.08%, +0.09%
SALU: 637663 -> 636780 (-0.14%); split: -0.15%, +0.01%
VOPD: 65236 -> 65589 (+0.54%); split: +0.91%, -0.37%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
bb6a3e2891 aco/optimizer: rework how dpp is applied
Using the common helpers means we can use VINTERP instead of DPP,
which has higher throughput and smaller CodeSize.

Foz-DB Navi48:
Totals from 986 (1.20% of 82405) affected shaders:
Instrs: 1985282 -> 1985545 (+0.01%); split: -0.01%, +0.02%
CodeSize: 11179700 -> 11151780 (-0.25%); split: -0.26%, +0.01%
Latency: 19899190 -> 19897694 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 4110650 -> 4104911 (-0.14%)
VClause: 44143 -> 44139 (-0.01%); split: -0.03%, +0.02%
Copies: 164340 -> 164344 (+0.00%); split: -0.02%, +0.02%
VALU: 1061904 -> 1061908 (+0.00%); split: -0.00%, +0.00%
SALU: 305980 -> 305974 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
228cb29dae aco/optimizer: allow DPP with scalar src1 in alu_opt_info_is_valid
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
d4c0318f48 aco: apply DPP with scalar src1 on gfx11.5+
Foz-DB Navi48:
Totals from 6261 (7.62% of 82179) affected shaders:
MaxWaves: 176284 -> 176236 (-0.03%); split: +0.01%, -0.03%
Instrs: 5850185 -> 5828451 (-0.37%); split: -0.41%, +0.04%
CodeSize: 31363324 -> 31419904 (+0.18%); split: -0.08%, +0.26%
VGPRs: 328284 -> 328200 (-0.03%); split: -0.07%, +0.05%
SpillSGPRs: 2268 -> 2256 (-0.53%)
Latency: 50235516 -> 50218816 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 8256243 -> 8242036 (-0.17%); split: -0.22%, +0.05%
VClause: 81000 -> 80975 (-0.03%); split: -0.11%, +0.08%
SClause: 136376 -> 136387 (+0.01%); split: -0.11%, +0.11%
Copies: 414021 -> 417894 (+0.94%); split: -0.13%, +1.07%
Branches: 105301 -> 105298 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 291360 -> 291432 (+0.02%)
PreVGPRs: 238593 -> 238729 (+0.06%); split: -0.02%, +0.08%
VALU: 3425446 -> 3403463 (-0.64%); split: -0.65%, +0.01%
SALU: 815505 -> 819372 (+0.47%); split: -0.02%, +0.50%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
903d940fa9 aco: don't convert VOP3P to VOP3 when applying DPP
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:50 +00:00
Georg Lehmann
8ac7b9fc37 aco: undo operand swap if applying DPP fails
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:50 +00:00
Georg Lehmann
510dbbae7f aco/optimizer: use opcode_supports_dpp
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:49 +00:00
Georg Lehmann
e74323577f aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for salu
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:22 +00:00
Georg Lehmann
6cbd16daae aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx8+
Do this late because the v_cvt_pkrtz_f16_f32 can be applied to
its operand.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:22 +00:00
Georg Lehmann
57ca974d1d aco/optimizer: optimize pack(undef, f2f16_rtz(a)) for gfx6/7
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:21 +00:00
Georg Lehmann
ba73792de0 aco/optimizer: fix parsing salu p_insert as shift
Fixes: 88f7e3fff3 ("aco/optimizer: parse pseudo alu instructions")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:21 +00:00
Georg Lehmann
939b4a6476 aco/optimizer: apply v_cvt_pkrtz_f16_f32 as fma_mix to operands
Foz-DB Navi21:
Totals from 2085 (2.14% of 97591) affected shaders:
Instrs: 4880879 -> 4882355 (+0.03%); split: -0.04%, +0.07%
CodeSize: 26869332 -> 26881744 (+0.05%); split: -0.02%, +0.06%
VGPRs: 93944 -> 94160 (+0.23%); split: -0.06%, +0.29%
Latency: 40035558 -> 40035595 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 10333800 -> 10329093 (-0.05%); split: -0.06%, +0.01%
VClause: 139147 -> 139148 (+0.00%)
Copies: 454527 -> 454656 (+0.03%); split: -0.00%, +0.03%
VALU: 3214838 -> 3211105 (-0.12%)

Foz-DB Navi48:
Totals from 2349 (2.41% of 97637) affected shaders:
Instrs: 6471998 -> 6471817 (-0.00%); split: -0.05%, +0.05%
CodeSize: 34793372 -> 34808748 (+0.04%); split: -0.02%, +0.06%
VGPRs: 141804 -> 142560 (+0.53%)
Latency: 45225910 -> 45226000 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 9152634 -> 9149850 (-0.03%); split: -0.04%, +0.01%
VClause: 148536 -> 148537 (+0.00%)
Copies: 527206 -> 527336 (+0.02%); split: -0.01%, +0.03%
VALU: 3491701 -> 3487347 (-0.12%); split: -0.12%, +0.00%
VOPD: 669 -> 683 (+2.09%); split: +2.69%, -0.60%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:23 +00:00
Georg Lehmann
c6b74705dd aco/optimizer: support fma_mix with rtz
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:23 +00:00
Daniel Schürmann
cfb745592d amd: add ac_cu_info::has_mad32 flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Georg Lehmann
0478021fdc aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)
Foz-DB Navi48:
Totals from 2484 (2.54% of 97637) affected shaders:
Instrs: 10368279 -> 10361892 (-0.06%); split: -0.06%, +0.00%
CodeSize: 55161104 -> 55150752 (-0.02%); split: -0.02%, +0.00%
SpillSGPRs: 14665 -> 14666 (+0.01%)
Latency: 87694014 -> 87689324 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16595764 -> 16594448 (-0.01%); split: -0.01%, +0.00%
VClause: 209922 -> 209918 (-0.00%); split: -0.01%, +0.00%
SClause: 205195 -> 205251 (+0.03%); split: -0.01%, +0.04%
Copies: 843771 -> 843765 (-0.00%); split: -0.01%, +0.01%
Branches: 275985 -> 275962 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 170608 -> 170494 (-0.07%)
VALU: 5840893 -> 5838038 (-0.05%); split: -0.05%, +0.00%
SALU: 1481388 -> 1479037 (-0.16%); split: -0.16%, +0.00%
VOPD: 7496 -> 7485 (-0.15%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>
2025-12-17 08:41:32 +00:00
Georg Lehmann
a8f5ced670 aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)
Foz-DB Navi48:
Totals from 14608 (14.96% of 97637) affected shaders:
MaxWaves: 364201 -> 364421 (+0.06%)
Instrs: 28051720 -> 28022503 (-0.10%); split: -0.13%, +0.03%
CodeSize: 148938740 -> 148943480 (+0.00%); split: -0.04%, +0.04%
VGPRs: 994520 -> 994004 (-0.05%); split: -0.05%, +0.00%
SpillSGPRs: 45182 -> 45179 (-0.01%)
Latency: 187734461 -> 187725301 (-0.00%); split: -0.07%, +0.06%
InvThroughput: 33967002 -> 33949881 (-0.05%); split: -0.11%, +0.06%
VClause: 495237 -> 495207 (-0.01%); split: -0.03%, +0.02%
Copies: 2048324 -> 2047937 (-0.02%); split: -0.12%, +0.10%
Branches: 598445 -> 598431 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 877715 -> 877684 (-0.00%)
PreVGPRs: 778146 -> 776383 (-0.23%); split: -0.23%, +0.00%
VALU: 16413380 -> 16391508 (-0.13%); split: -0.15%, +0.01%
SALU: 3685279 -> 3677655 (-0.21%); split: -0.23%, +0.02%
VOPD: 26219 -> 25926 (-1.12%); split: +0.43%, -1.55%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>
2025-12-17 08:41:31 +00:00
Georg Lehmann
839a035564 aco/optimizer: propagate fixed regs to copy/extract/insert
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
d32dd5e1df aco/optimizer: propagate fixed registers
Foz-DB Navi48:
Totals from 351 (0.36% of 97637) affected shaders:
Instrs: 3568192 -> 3567166 (-0.03%)
CodeSize: 18890368 -> 18886304 (-0.02%)
Latency: 17047945 -> 17048185 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3185739 -> 3185813 (+0.00%); split: -0.00%, +0.00%
SClause: 61544 -> 61536 (-0.01%)
Copies: 271592 -> 270845 (-0.28%)
PreSGPRs: 17186 -> 17094 (-0.54%)
PreVGPRs: 21897 -> 21901 (+0.02%)
VALU: 2003976 -> 2003980 (+0.00%)
SALU: 468403 -> 467664 (-0.16%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
b798ace443 aco/optimizer: fix skip_smem_offset_align with non temp register operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
72e3071751 aco/optimizer: keep pass_flags valid for all instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:57 +00:00
Georg Lehmann
bb58ba2075 aco/optimizer: propagate salu fabs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Foz-DB Navi48:
Totals from 5156 (5.28% of 97637) affected shaders:
Instrs: 12713219 -> 12694317 (-0.15%); split: -0.15%, +0.00%
CodeSize: 67099236 -> 67037588 (-0.09%); split: -0.13%, +0.04%
VGPRs: 352620 -> 352608 (-0.00%)
SpillSGPRs: 22032 -> 22031 (-0.00%)
Latency: 68288972 -> 68271858 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 13639078 -> 13633997 (-0.04%); split: -0.04%, +0.00%
VClause: 235194 -> 235186 (-0.00%); split: -0.01%, +0.00%
SClause: 249057 -> 249012 (-0.02%); split: -0.03%, +0.01%
Copies: 963813 -> 960529 (-0.34%); split: -0.36%, +0.02%
Branches: 321041 -> 321039 (-0.00%)
PreSGPRs: 303392 -> 303295 (-0.03%); split: -0.03%, +0.00%
VALU: 7134563 -> 7134533 (-0.00%); split: -0.00%, +0.00%
SALU: 1913802 -> 1899948 (-0.72%); split: -0.72%, +0.00%
VOPD: 19914 -> 19885 (-0.15%); split: +0.01%, -0.15%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38723>
2025-12-10 10:07:12 +00:00
Georg Lehmann
04037c7af3 aco/optimizer: propagate salu fneg
Foz-DB Navi48:
Totals from 23796 (24.37% of 97637) affected shaders:
MaxWaves: 638922 -> 638898 (-0.00%)
Instrs: 32968990 -> 32880147 (-0.27%); split: -0.28%, +0.01%
CodeSize: 174252352 -> 173922400 (-0.19%); split: -0.20%, +0.01%
VGPRs: 1396472 -> 1396592 (+0.01%)
SpillSGPRs: 63672 -> 63599 (-0.11%)
Latency: 201025393 -> 200966204 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 37429702 -> 37411026 (-0.05%); split: -0.06%, +0.01%
VClause: 534241 -> 534115 (-0.02%); split: -0.05%, +0.02%
SClause: 831765 -> 831559 (-0.02%); split: -0.07%, +0.05%
Copies: 2404134 -> 2400539 (-0.15%); split: -0.29%, +0.14%
Branches: 728518 -> 728503 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1337403 -> 1336846 (-0.04%); split: -0.04%, +0.00%
PreVGPRs: 1017490 -> 1017521 (+0.00%); split: -0.00%, +0.00%
VALU: 18319620 -> 18318960 (-0.00%); split: -0.01%, +0.00%
SALU: 5069557 -> 5001384 (-1.34%); split: -1.38%, +0.03%
VOPD: 80235 -> 80172 (-0.08%); split: +0.13%, -0.21%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38723>
2025-12-10 10:07:12 +00:00
Georg Lehmann
8b1340a52c aco/optimizer: validate uses
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38724>
2025-12-10 09:40:13 +00:00
Georg Lehmann
ad3add311c aco/optimizer: fix uses in to_uniform_bool_instr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38724>
2025-12-10 09:40:13 +00:00
Georg Lehmann
d86f5f6bcb aco/optimizer: apply omod to pseudo scalar trans instructions
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Foz-DB Navi48:
Totals from 2062 (2.11% of 97637) affected shaders:
Instrs: 8061281 -> 8055482 (-0.07%); split: -0.07%, +0.00%
CodeSize: 42727968 -> 42696504 (-0.07%); split: -0.07%, +0.00%
Latency: 54739436 -> 54737749 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 10833704 -> 10833346 (-0.00%); split: -0.00%, +0.00%
VClause: 167276 -> 167275 (-0.00%)
SClause: 160183 -> 160163 (-0.01%); split: -0.02%, +0.01%
Copies: 684315 -> 683984 (-0.05%); split: -0.05%, +0.00%
PreSGPRs: 146747 -> 146746 (-0.00%)
VALU: 4377180 -> 4377168 (-0.00%); split: -0.00%, +0.00%
SALU: 1255321 -> 1251342 (-0.32%); split: -0.32%, +0.00%
VOPD: 16467 -> 16469 (+0.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:59 +00:00
Georg Lehmann
b82339d99e aco/optimizer: use new helpers for omod/clamp
Also resolves the old TODO about using omod for multiplication
with negative 0.5, 2.0 or 4.0.

Foz-DB Navi21:
Totals from 5680 (5.82% of 97591) affected shaders:
MaxWaves: 111976 -> 111974 (-0.00%)
Instrs: 12013419 -> 12003946 (-0.08%); split: -0.08%, +0.00%
CodeSize: 65379508 -> 65364884 (-0.02%); split: -0.04%, +0.02%
VGPRs: 375840 -> 375856 (+0.00%); split: -0.00%, +0.01%
Latency: 85804600 -> 85784850 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 20705698 -> 20692571 (-0.06%); split: -0.07%, +0.00%
VClause: 269772 -> 269606 (-0.06%); split: -0.09%, +0.03%
SClause: 324997 -> 324934 (-0.02%); split: -0.03%, +0.01%
Copies: 963255 -> 963264 (+0.00%); split: -0.06%, +0.06%
Branches: 326691 -> 326688 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 345106 -> 345109 (+0.00%)
PreVGPRs: 317681 -> 317729 (+0.02%)
VALU: 8372681 -> 8363374 (-0.11%); split: -0.11%, +0.00%
SALU: 1456669 -> 1456589 (-0.01%); split: -0.01%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:59 +00:00
Georg Lehmann
69b5767eee aco/optimizer: use new helpers to create v_fma_mixlo_f16
Foz-DB Navi21:
Totals from 69 (0.07% of 97591) affected shaders:
Instrs: 45091 -> 45057 (-0.08%)
CodeSize: 244016 -> 243932 (-0.03%); split: -0.12%, +0.09%
VGPRs: 1792 -> 1680 (-6.25%)
Latency: 133496 -> 133572 (+0.06%); split: -0.03%, +0.09%
InvThroughput: 35383 -> 35338 (-0.13%)
Copies: 4050 -> 4048 (-0.05%)
VALU: 30172 -> 30138 (-0.11%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:58 +00:00
Georg Lehmann
ee28801eae aco/optimizer: use new helpers to apply insert
Foz-DB Navi21:
Totals from 505 (0.52% of 97591) affected shaders:
Instrs: 1438254 -> 1436780 (-0.10%); split: -0.11%, +0.01%
CodeSize: 8063364 -> 8054192 (-0.11%); split: -0.13%, +0.01%
Latency: 18596788 -> 18597262 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 5213861 -> 5213061 (-0.02%); split: -0.02%, +0.01%
VClause: 37121 -> 37130 (+0.02%)
Copies: 174744 -> 175222 (+0.27%); split: -0.07%, +0.34%
Branches: 65722 -> 65718 (-0.01%)
VALU: 912967 -> 911074 (-0.21%); split: -0.21%, +0.00%
SALU: 251045 -> 251560 (+0.21%); split: -0.01%, +0.21%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:58 +00:00
Georg Lehmann
d60ce9ceef aco/optimizer: use new helpers to apply packed fsat
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:57 +00:00
Georg Lehmann
0a82c8cb13 aco/optimizer: back propagate modifiers through rcp
Foz-DB Navi21:
Totals from 5 (0.01% of 97591) affected shaders:
Instrs: 1473 -> 1468 (-0.34%)
CodeSize: 7664 -> 7660 (-0.05%)
Latency: 25897 -> 25863 (-0.13%)
InvThroughput: 2737 -> 2731 (-0.22%)
VALU: 1141 -> 1136 (-0.44%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:57 +00:00
Georg Lehmann
4442064449 aco/optimizer: use new helpers to apply neg/abs to output of instructions
Foz-DB Navi21:
Totals from 6765 (6.93% of 97591) affected shaders:
MaxWaves: 134398 -> 134408 (+0.01%)
Instrs: 9775725 -> 9768079 (-0.08%); split: -0.08%, +0.01%
CodeSize: 50785228 -> 50777880 (-0.01%); split: -0.02%, +0.01%
VGPRs: 445840 -> 445784 (-0.01%)
SpillSGPRs: 14483 -> 14476 (-0.05%)
Latency: 40232431 -> 40230284 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 10339051 -> 10329846 (-0.09%); split: -0.09%, +0.00%
VClause: 186785 -> 186788 (+0.00%); split: -0.01%, +0.01%
SClause: 157106 -> 157116 (+0.01%); split: -0.00%, +0.01%
Copies: 746817 -> 745378 (-0.19%); split: -0.26%, +0.07%
Branches: 189298 -> 189211 (-0.05%); split: -0.06%, +0.01%
PreSGPRs: 346169 -> 346158 (-0.00%)
PreVGPRs: 370712 -> 370660 (-0.01%); split: -0.02%, +0.00%
VALU: 6847295 -> 6839753 (-0.11%); split: -0.11%, +0.00%
SALU: 1139960 -> 1139942 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:56 +00:00
Georg Lehmann
58f407702d aco/optimizer: handle gfx11+ vinterp as fma special case
No effect on its own, but will be important for output modifiers.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:56 +00:00
Georg Lehmann
37d3c63a12 aco/optimizer: add new helpers for applying output modifiers
To replace the old instr_mod_labels.

Foz-DB Navi21:
Totals from 683 (0.70% of 97591) affected shaders:
Instrs: 3341288 -> 3340447 (-0.03%); split: -0.03%, +0.00%
CodeSize: 18522460 -> 18520212 (-0.01%); split: -0.01%, +0.00%
Latency: 34359519 -> 34358772 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 9229621 -> 9229494 (-0.00%); split: -0.00%, +0.00%
Copies: 368383 -> 368260 (-0.03%); split: -0.04%, +0.00%
PreSGPRs: 48060 -> 48061 (+0.00%)
SALU: 543991 -> 543150 (-0.15%); split: -0.16%, +0.00%

Changes are caused by optimizing not(salu) without killed scc.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:56 +00:00
Georg Lehmann
fc29821d3b aco/optimizer: move med3 -> add_clamp opt later
Soon we will apply omod later,
when the combine_instruction reaches the multiplication with constant.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:55 +00:00
Georg Lehmann
f5eb3fe9cb aco/optimizer: optimze cndmask(a, b, not(c)) to cndmask(b, a, c)
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Can happen with nir_op_bitz/b2f/b2i.

Foz-DB Navi48:
Totals from 3465 (4.20% of 82419) affected shaders:
Instrs: 7534077 -> 7527637 (-0.09%); split: -0.09%, +0.01%
CodeSize: 40017384 -> 39993008 (-0.06%); split: -0.07%, +0.01%
Latency: 38593071 -> 38582815 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 8519291 -> 8518620 (-0.01%); split: -0.01%, +0.00%
VClause: 151669 -> 151662 (-0.00%); split: -0.02%, +0.02%
SClause: 155781 -> 155772 (-0.01%); split: -0.01%, +0.01%
Copies: 628453 -> 628531 (+0.01%); split: -0.01%, +0.02%
Branches: 180429 -> 180430 (+0.00%)
PreSGPRs: 182855 -> 182801 (-0.03%)
VALU: 4315173 -> 4315241 (+0.00%); split: -0.00%, +0.00%
SALU: 992125 -> 986876 (-0.53%); split: -0.53%, +0.00%
VOPD: 15827 -> 15838 (+0.07%); split: +0.23%, -0.16%

Foz-DB Navi21:
Totals from 3341 (4.06% of 82387) affected shaders:
MaxWaves: 61924 -> 61950 (+0.04%)
Instrs: 6640276 -> 6635078 (-0.08%); split: -0.08%, +0.00%
CodeSize: 35932788 -> 35913760 (-0.05%); split: -0.06%, +0.00%
VGPRs: 205512 -> 205456 (-0.03%)
Latency: 40201463 -> 40194285 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 12379144 -> 12378028 (-0.01%); split: -0.01%, +0.00%
VClause: 151556 -> 151563 (+0.00%); split: -0.01%, +0.01%
SClause: 157470 -> 157472 (+0.00%); split: -0.00%, +0.01%
Copies: 645034 -> 644947 (-0.01%); split: -0.02%, +0.01%
Branches: 192070 -> 192071 (+0.00%)
PreSGPRs: 173368 -> 173311 (-0.03%)
VALU: 4554790 -> 4554782 (-0.00%); split: -0.00%, +0.00%
SALU: 881251 -> 876087 (-0.59%); split: -0.59%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:19 +00:00
Georg Lehmann
752f1fb4ae aco/optimizer: extend existing patterns to handle b2f/b2i(not(a))
The next commit will optimize b2f(not(a)) and b2i(not(a)),
so handle those in other patterns to prevent regressions.

No Foz-DB changes on its own.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:19 +00:00
Georg Lehmann
c538f47f03 aco/optimizer: create ff0/bcnt0
Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
Instrs: 350 -> 347 (-0.86%)
CodeSize: 1800 -> 1788 (-0.67%)
Latency: 2427 -> 2421 (-0.25%)
SALU: 80 -> 77 (-3.75%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:18 +00:00
Georg Lehmann
0f7a1ce23e aco/optimizer: some more mul opts
Foz-DB Navi48:
Totals from 1650 (2.00% of 82419) affected shaders:
Instrs: 975716 -> 970609 (-0.52%); split: -0.53%, +0.00%
CodeSize: 4986260 -> 4982916 (-0.07%); split: -0.09%, +0.02%
Latency: 2795394 -> 2793211 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 620892 -> 620914 (+0.00%); split: -0.00%, +0.01%
VClause: 18773 -> 18729 (-0.23%)
SClause: 13219 -> 13218 (-0.01%)
Copies: 53619 -> 53620 (+0.00%); split: -0.01%, +0.01%
VALU: 592094 -> 592096 (+0.00%); split: -0.00%, +0.00%
SALU: 96586 -> 93532 (-3.16%); split: -3.17%, +0.00%

Foz-DB Navi21:
Totals from 1647 (2.00% of 82387) affected shaders:
Instrs: 1104100 -> 1100149 (-0.36%); split: -0.36%, +0.00%
CodeSize: 5631092 -> 5637668 (+0.12%); split: -0.00%, +0.12%
Latency: 3503029 -> 3501621 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 1088494 -> 1088495 (+0.00%); split: -0.00%, +0.00%
VClause: 20898 -> 20885 (-0.06%)
Copies: 72641 -> 72635 (-0.01%); split: -0.02%, +0.01%
VALU: 725593 -> 725592 (-0.00%); split: -0.00%, +0.00%
SALU: 139046 -> 135175 (-2.78%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:17 +00:00
Georg Lehmann
92dbf42379 aco/optimizer: use cndmask for neg(b2i)
Foz-DB Navi48:
Totals from 1310 (1.59% of 82419) affected shaders:
Instrs: 1337622 -> 1338677 (+0.08%); split: -0.00%, +0.08%
CodeSize: 7039828 -> 7043996 (+0.06%); split: -0.00%, +0.06%
Latency: 7783135 -> 7782526 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 1587987 -> 1586644 (-0.08%)
Branches: 24320 -> 24318 (-0.01%)

Foz-DB Navi21:
Totals from 334 (0.41% of 82387) affected shaders:
Instrs: 666102 -> 666094 (-0.00%)
CodeSize: 3599748 -> 3599724 (-0.00%)
Latency: 6873870 -> 6873868 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2151773 -> 2151780 (+0.00%); split: -0.00%, +0.00%
Branches: 17419 -> 17411 (-0.05%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:17 +00:00
Georg Lehmann
0e4d4aeef7 aco/optimizer: add some bitop combining
Foz-DB Navi48:
Totals from 53 (0.06% of 82419) affected shaders:
Instrs: 172843 -> 172769 (-0.04%); split: -0.06%, +0.01%
CodeSize: 937308 -> 936924 (-0.04%); split: -0.04%, +0.00%
Latency: 454652 -> 454823 (+0.04%); split: -0.01%, +0.05%
InvThroughput: 89833 -> 89812 (-0.02%); split: -0.06%, +0.03%
PreSGPRs: 2926 -> 2929 (+0.10%)
PreVGPRs: 2920 -> 2919 (-0.03%); split: -0.07%, +0.03%
VALU: 76638 -> 76556 (-0.11%)
SALU: 37856 -> 37859 (+0.01%); split: -0.01%, +0.01%
VOPD: 10943 -> 10936 (-0.06%)

Foz-DB Navi21:
Totals from 59 (0.07% of 82387) affected shaders:
Instrs: 1047744 -> 1047578 (-0.02%)
CodeSize: 5641948 -> 5640780 (-0.02%)
Latency: 5116816 -> 5116957 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 1274035 -> 1274023 (-0.00%); split: -0.00%, +0.00%
VClause: 30744 -> 30745 (+0.00%)
PreSGPRs: 3329 -> 3333 (+0.12%)
PreVGPRs: 4130 -> 4129 (-0.02%); split: -0.05%, +0.02%
VALU: 689731 -> 689562 (-0.02%)
SALU: 162830 -> 162833 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:16 +00:00
Georg Lehmann
ee0354e0f1 aco/optimizer: use new helpers for bitwise n2 opts
Foz-DB Navi48:
Totals from 604 (0.73% of 82419) affected shaders:
Instrs: 2759878 -> 2758431 (-0.05%); split: -0.06%, +0.01%
CodeSize: 14801888 -> 14793412 (-0.06%); split: -0.06%, +0.01%
SpillSGPRs: 6237 -> 6233 (-0.06%)
Latency: 23509766 -> 23507853 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 7471297 -> 7471008 (-0.00%); split: -0.00%, +0.00%
Branches: 104979 -> 104977 (-0.00%)
PreSGPRs: 51506 -> 51408 (-0.19%); split: -0.20%, +0.01%
VALU: 1351564 -> 1351561 (-0.00%); split: -0.00%, +0.00%
SALU: 537430 -> 536266 (-0.22%); split: -0.23%, +0.01%
VOPD: 3834 -> 3833 (-0.03%)

Foz-DB Navi21:
Totals from 739 (0.90% of 82387) affected shaders:
Instrs: 2489644 -> 2488228 (-0.06%); split: -0.06%, +0.00%
CodeSize: 13930192 -> 13915972 (-0.10%); split: -0.11%, +0.00%
SpillSGPRs: 980 -> 976 (-0.41%)
Latency: 25027553 -> 25027845 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 8591377 -> 8591097 (-0.00%); split: -0.00%, +0.00%
SClause: 78380 -> 78382 (+0.00%)
Copies: 275433 -> 275393 (-0.01%); split: -0.02%, +0.01%
Branches: 113718 -> 113716 (-0.00%)
PreSGPRs: 48377 -> 48260 (-0.24%); split: -0.27%, +0.03%
VALU: 1589250 -> 1589240 (-0.00%)
SALU: 420348 -> 418962 (-0.33%); split: -0.34%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:15 +00:00