Georg Lehmann
788aafba2a
aco/sched_vopd: create dot2acc from VOP3P dot2
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
Georg Lehmann
6cef434478
aco/sched_vopd: convert fma with inline constants to fmamk/fmaak
...
This optimization was previously done in the post-RA optimizer,
but it is more fitting for the vopd scheduler.
Doing it here also has the benefit that we don't unnecessarily use
the constant bus when VOPD can't be used.
No Foz-DB changes on GFX12 until the next commit.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
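As a rough illustration of the conversion described in the commit above, the sketch below picks between the two literal-carrying opcodes based on where the constant sits in a fused multiply-add. It is a Python model, not ACO code: the `Fma` tuple and the `to_literal_form` helper are hypothetical names, and it assumes the operand semantics `v_fmamk: d = s0 * K + s1` and `v_fmaak: d = s0 * s1 + K`.

```python
from typing import NamedTuple, Optional, Union

Operand = Union[str, float]  # register name (str) or constant (float)


class Fma(NamedTuple):
    """Hypothetical model of d = a * b + c."""
    a: Operand
    b: Operand
    c: Operand


def is_const(op: Operand) -> bool:
    return isinstance(op, float)


def to_literal_form(fma: Fma) -> Optional[tuple]:
    """Return (opcode, s0, s1, K), or None when the shape does not fit.

    Exactly one constant is accepted, since each opcode has a single K slot.
    """
    consts = [op for op in fma if is_const(op)]
    if len(consts) != 1:
        return None
    if is_const(fma.c):
        # Constant addend: d = s0 * s1 + K
        return ("v_fmaak_f32", fma.a, fma.b, fma.c)
    # Constant multiplicand: d = s0 * K + s1 (multiplication commutes)
    reg = fma.b if is_const(fma.a) else fma.a
    const = fma.a if is_const(fma.a) else fma.b
    return ("v_fmamk_f32", reg, fma.c, const)
```

The real pass presumably also checks that the instruction can actually be paired into a VOPD; this model leaves that out.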
Marek Olšák
fae7aef5ca
ac: tidy up ac_hw_cache_flags
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022 >
2026-03-04 21:14:56 +00:00
Rhys Perry
5c3b5688a1
amd: rename ac_cu_info to ac_compiler_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042 >
2026-03-03 08:50:12 +00:00
Rhys Perry
8801ca188d
ac/nir: don't pass radeon_info to ac_nir_set_options
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042 >
2026-03-03 08:50:10 +00:00
Marek Olšák
f22f117d1a
amd: add meson variable idep_amd_generated_headers for all generated headers
...
Group all generated headers under the same variable.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40084 >
2026-02-28 05:23:59 +00:00
Daniel Schürmann
fbf2083b8f
aco/isel: Don't emit ELSE side of divergent branches which jump
...
Totals from 50 (0.06% of 84383) affected shaders: (Navi48)
Instrs: 402490 -> 402444 (-0.01%); split: -0.01%, +0.00%
CodeSize: 2239024 -> 2238864 (-0.01%); split: -0.01%, +0.00%
SpillSGPRs: 1493 -> 1496 (+0.20%)
Latency: 5836785 -> 5836747 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1120893 -> 1120909 (+0.00%); split: -0.00%, +0.00%
Copies: 46128 -> 46082 (-0.10%)
VALU: 222708 -> 222715 (+0.00%); split: -0.00%, +0.00%
SALU: 53039 -> 52993 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
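The percentages in these Foz-DB summaries appear to be plain relative changes rounded to two decimals. The helper below is a small sketch for re-checking them against the raw counts; `pct_change` and `share` are hypothetical names, not part of any Mesa tooling.

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from before to after, formatted like the report."""
    return f"{(after - before) / before * 100:+.2f}%"


def share(affected: int, total: int) -> str:
    """Fraction of affected shaders in the whole database."""
    return f"{affected / total * 100:.2f}%"


print(pct_change(402490, 402444))  # Instrs     -> -0.01%
print(pct_change(1493, 1496))      # SpillSGPRs -> +0.20%
print(share(50, 84383))            # affected   -> 0.06%
```

Very small changes round to `-0.00%` or `+0.00%`, which is why such entries still carry a sign in the tables.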
Daniel Schürmann
ba32219cf8
aco/isel: Don't emit ELSE side of uniform branches which jump
...
Totals from 4 (0.00% of 84383) affected shaders: (Navi48)
Instrs: 16473 -> 16468 (-0.03%)
CodeSize: 85276 -> 85300 (+0.03%)
SpillSGPRs: 175 -> 176 (+0.57%)
Latency: 267907 -> 267885 (-0.01%)
InvThroughput: 36302 -> 36298 (-0.01%)
Copies: 1353 -> 1345 (-0.59%)
VALU: 9025 -> 9029 (+0.04%)
SALU: 2635 -> 2627 (-0.30%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Rhys Perry
63b18e9e5b
aco: move return address to a clobbered register
...
It's placed in the preserved registers, but the p_call clobbers it, so
this change removes some special casing.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590 >
2026-02-06 09:49:19 +00:00
Rhys Perry
5f5032bb6a
aco: use lv1/lv2 instead of v1/v2.as_linear()
...
This is just a search+replace then clang-format.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537 >
2026-01-28 16:46:30 +00:00
Georg Lehmann
2d38da94d4
aco: allow v_cmpx with DPP
...
The wording in the RDNA3 ISA doc has since been clarified; v_cmpx with DPP
behaves exactly as one would expect:
FI controls whether the source value can be read from inactive lanes,
but inactive lanes always write a 0 bit. The same applies to v_cmp with DPP.
Foz-DB Navi48:
Totals from 987 (1.20% of 82405) affected shaders:
Instrs: 517003 -> 516445 (-0.11%); split: -0.11%, +0.00%
CodeSize: 2782688 -> 2780508 (-0.08%); split: -0.08%, +0.00%
Latency: 2059169 -> 2056327 (-0.14%); split: -0.14%, +0.00%
InvThroughput: 365374 -> 365328 (-0.01%); split: -0.03%, +0.01%
Copies: 64669 -> 65616 (+1.46%)
SALU: 70693 -> 70652 (-0.06%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516 >
2026-01-27 20:42:51 +00:00
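The lane behaviour described in the commit above can be modelled roughly as follows. This is a simplified Python sketch, not hardware-accurate: `v_cmp_dpp` and its parameters are hypothetical, and what an active lane sees when its DPP source is inactive and FI is unset is modelled as 0 purely as an assumption.

```python
def v_cmp_dpp(values, exec_mask, src_lane, fi, pred):
    """Model of a DPP compare producing one result bit per lane.

    values    -- per-lane source values
    exec_mask -- per-lane booleans, True for active lanes
    src_lane  -- src_lane[i] is the lane that lane i reads through DPP
    fi        -- fetch-inactive: allow reading values of inactive lanes
    pred      -- comparison applied to the fetched value
    """
    result = []
    for lane, active in enumerate(exec_mask):
        if not active:
            result.append(False)  # inactive lanes always write a 0 bit
            continue
        src = src_lane[lane]
        if exec_mask[src] or fi:
            fetched = values[src]
        else:
            fetched = 0  # ASSUMPTION: unreadable source modelled as 0
        result.append(pred(fetched))
    return result


values = [5, 9, 7, 3]
exec_mask = [True, False, True, True]
src_lane = [0, 0, 1, 2]  # hypothetical shift-by-one pattern
print(v_cmp_dpp(values, exec_mask, src_lane, False, lambda v: v > 2))
print(v_cmp_dpp(values, exec_mask, src_lane, True, lambda v: v > 2))
```

For v_cmpx the resulting mask would then be written to EXEC, so lane 1 stays disabled in both calls regardless of FI.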
Rhys Perry
f0f53e624c
aco/tests: remove vcc definitions from p_call
...
The version of instruction selection that got merged doesn't have vcc
definitions, so this shouldn't either.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390 >
2026-01-20 13:33:16 +00:00
Rhys Perry
ba798120c6
aco/ra: split blocking vectors if needed when handling fixed operands
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390 >
2026-01-20 13:33:16 +00:00
Rhys Perry
d2a9122cfa
aco/tests: add function call regalloc tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38811 >
2026-01-16 10:05:43 +00:00
Rhys Perry
bf30a57440
aco/ra: omit renaming when necessary when moving copy definitions
...
This should resolve the FIXME.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38811 >
2026-01-16 10:05:41 +00:00
Emma Anholt
ed8676dc28
nir: Rename the unit_test_*_amd intrinsics to be un-vendored.
...
We'll reuse these from the nir_opt_algebraic_pattern_test.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076 >
2026-01-15 19:09:37 +00:00
Georg Lehmann
daf235c607
aco/tests: don't destroy vk_device if it was never created
...
Happens if you only run one test that doesn't need a vk_device.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39268 >
2026-01-12 16:16:54 +00:00
Georg Lehmann
fad95030a7
aco/tests: test VALUMaskWriteHazard with v_cmpx
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:39 +00:00
Georg Lehmann
1d85552745
aco/tests: test VALUReadSGPRHazard with v_cmpx
...
To avoid regressing this in a future rework.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:39 +00:00
Daniel Schürmann
eb16f701a6
aco/tests: Add new test to pack 2x16 SGPRs into VGPR
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:54:00 +00:00
Daniel Schürmann
61c1ec541d
aco/tests: Add test for subdword extraction from SGPR
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:54:00 +00:00
Daniel Schürmann
d8481fd7cc
aco/lower_to_hw: Fix SGPR Operand RegClasses of subdword copies
...
Extracting from an SGPR could cause a wrong RegClass on
the operand which could later lead to selecting VOPD
instructions which falsely operate on the corresponding
VGPR.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:53:58 +00:00
Daniel Schürmann
7b1f6fa6fc
aco: remove radeon_family from aco::Program
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:48 +00:00
Daniel Schürmann
f791e46c47
aco: add ac_cu_info to aco_compiler_options
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
addd4ea59f
aco: pass aco_compiler_options to init_program()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
bf9bec07c2
aco/tests: don't pass CHIP_UNKNOWN to ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Georg Lehmann
17e597093d
radv: eliminate unused FS output channels
...
For formats that don't have all color channels, there is no reason to
output all of them.
Games often write to R-only or RGB formats with non-trivial remaining channels.
Foz-DB Navi21:
Totals from 10270 (10.55% of 97347) affected shaders:
MaxWaves: 249166 -> 250950 (+0.72%); split: +0.73%, -0.01%
Instrs: 8442016 -> 8354715 (-1.03%); split: -1.05%, +0.01%
CodeSize: 45939644 -> 45487156 (-0.98%); split: -1.01%, +0.02%
VGPRs: 472584 -> 463784 (-1.86%); split: -1.98%, +0.12%
SpillSGPRs: 1502 -> 1448 (-3.60%)
LDS: 6024192 -> 6011904 (-0.20%)
Inputs: 42463 -> 41773 (-1.62%)
Outputs: 24601 -> 23955 (-2.63%)
Latency: 78011745 -> 77653907 (-0.46%); split: -0.56%, +0.10%
InvThroughput: 19767826 -> 19274046 (-2.50%); split: -2.53%, +0.03%
VClause: 177891 -> 176681 (-0.68%); split: -0.80%, +0.12%
SClause: 236784 -> 235324 (-0.62%); split: -0.72%, +0.10%
Copies: 621048 -> 616096 (-0.80%); split: -1.03%, +0.23%
Branches: 202608 -> 201811 (-0.39%); split: -0.44%, +0.05%
PreSGPRs: 441032 -> 437698 (-0.76%); split: -0.77%, +0.01%
PreVGPRs: 378067 -> 369564 (-2.25%); split: -2.26%, +0.01%
VALU: 5906415 -> 5833179 (-1.24%); split: -1.25%, +0.01%
SALU: 973428 -> 968088 (-0.55%); split: -0.61%, +0.06%
VMEM: 298277 -> 296504 (-0.59%); split: -0.61%, +0.01%
SMEM: 402244 -> 399612 (-0.65%); split: -0.71%, +0.06%
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853 >
2025-12-12 17:00:51 +00:00
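The idea in the commit above can be pictured as intersecting the channels a fragment shader writes with the channels the render-target format stores. The sketch below is a simplified model under that assumption; the `FORMAT_CHANNELS` table and `needed_output_mask` are hypothetical and not RADV's actual data structures.

```python
# Bit 0 = R, bit 1 = G, bit 2 = B, bit 3 = A (hypothetical layout table)
FORMAT_CHANNELS = {
    "R": 0b0001,
    "RG": 0b0011,
    "RGB": 0b0111,
    "RGBA": 0b1111,
}


def needed_output_mask(shader_write_mask: int, fmt: str) -> int:
    """Channels that still have to be exported for the given format."""
    return shader_write_mask & FORMAT_CHANNELS[fmt]


# A shader writing all four channels to an R-only target keeps only R.
print(bin(needed_output_mask(0b1111, "R")))
```

Once the mask shrinks, the computations that fed only the dropped channels become dead code, which would account for the VGPR and instruction-count reductions reported above.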
Rhys Perry
156ae6195e
aco: print large p_parallelcopy using several lines
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695 >
2025-12-11 16:51:21 +00:00
Marek Olšák
9b011a7344
amd: rename most GFX115x definitions for released chips
...
addrlib changes match the original code.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38718 >
2025-12-03 13:29:07 +00:00
Georg Lehmann
d86f5f6bcb
aco/optimizer: apply omod to pseudo scalar trans instructions
...
Foz-DB Navi48:
Totals from 2062 (2.11% of 97637) affected shaders:
Instrs: 8061281 -> 8055482 (-0.07%); split: -0.07%, +0.00%
CodeSize: 42727968 -> 42696504 (-0.07%); split: -0.07%, +0.00%
Latency: 54739436 -> 54737749 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 10833704 -> 10833346 (-0.00%); split: -0.00%, +0.00%
VClause: 167276 -> 167275 (-0.00%)
SClause: 160183 -> 160163 (-0.01%); split: -0.02%, +0.01%
Copies: 684315 -> 683984 (-0.05%); split: -0.05%, +0.00%
PreSGPRs: 146747 -> 146746 (-0.00%)
VALU: 4377180 -> 4377168 (-0.00%); split: -0.00%, +0.00%
SALU: 1255321 -> 1251342 (-0.32%); split: -0.32%, +0.00%
VOPD: 16467 -> 16469 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:59 +00:00
Georg Lehmann
b82339d99e
aco/optimizer: use new helpers for omod/clamp
...
Also resolves the old TODO about using omod for multiplication
with negative 0.5, 2.0 or 4.0.
Foz-DB Navi21:
Totals from 5680 (5.82% of 97591) affected shaders:
MaxWaves: 111976 -> 111974 (-0.00%)
Instrs: 12013419 -> 12003946 (-0.08%); split: -0.08%, +0.00%
CodeSize: 65379508 -> 65364884 (-0.02%); split: -0.04%, +0.02%
VGPRs: 375840 -> 375856 (+0.00%); split: -0.00%, +0.01%
Latency: 85804600 -> 85784850 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 20705698 -> 20692571 (-0.06%); split: -0.07%, +0.00%
VClause: 269772 -> 269606 (-0.06%); split: -0.09%, +0.03%
SClause: 324997 -> 324934 (-0.02%); split: -0.03%, +0.01%
Copies: 963255 -> 963264 (+0.00%); split: -0.06%, +0.06%
Branches: 326691 -> 326688 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 345106 -> 345109 (+0.00%)
PreVGPRs: 317681 -> 317729 (+0.02%)
VALU: 8372681 -> 8363374 (-0.11%); split: -0.11%, +0.00%
SALU: 1456669 -> 1456589 (-0.01%); split: -0.01%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:59 +00:00
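One plausible reading of the resolved TODO is that a multiplication by -0.5, -2.0 or -4.0 can be split into a sign flip plus an output modifier. The sketch below illustrates only that arithmetic; `split_omod` is a hypothetical helper, and the floating-point preconditions under which omod is legal are deliberately left out.

```python
# Output modifiers in assembler-style notation, keyed by their scale factor.
OMOD_FACTORS = {0.5: "div:2", 2.0: "mul:2", 4.0: "mul:4"}


def split_omod(multiplier: float):
    """Split a constant multiplier into (negate, omod), or None if no omod fits."""
    omod = OMOD_FACTORS.get(abs(multiplier))
    if omod is None:
        return None
    return (multiplier < 0.0, omod)


print(split_omod(-2.0))  # (True, 'mul:2')
print(split_omod(3.0))   # None
```

When `negate` is true, the sign would have to be absorbed elsewhere, for example by a source negate modifier on the instruction producing the value.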
Georg Lehmann
69b5767eee
aco/optimizer: use new helpers to create v_fma_mixlo_f16
...
Foz-DB Navi21:
Totals from 69 (0.07% of 97591) affected shaders:
Instrs: 45091 -> 45057 (-0.08%)
CodeSize: 244016 -> 243932 (-0.03%); split: -0.12%, +0.09%
VGPRs: 1792 -> 1680 (-6.25%)
Latency: 133496 -> 133572 (+0.06%); split: -0.03%, +0.09%
InvThroughput: 35383 -> 35338 (-0.13%)
Copies: 4050 -> 4048 (-0.05%)
VALU: 30172 -> 30138 (-0.11%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:58 +00:00
Georg Lehmann
d60ce9ceef
aco/optimizer: use new helpers to apply packed fsat
...
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:57 +00:00
Georg Lehmann
e42be7536c
aco/optimizer: use new helpers for remaining add opts
...
Foz-DB Navi48:
Totals from 373 (0.45% of 82419) affected shaders:
Instrs: 542269 -> 542186 (-0.02%); split: -0.06%, +0.04%
CodeSize: 2872728 -> 2867204 (-0.19%); split: -0.21%, +0.02%
Latency: 3174435 -> 3174634 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 828783 -> 828600 (-0.02%); split: -0.03%, +0.01%
SClause: 11954 -> 11955 (+0.01%)
Copies: 49104 -> 49110 (+0.01%)
PreSGPRs: 15422 -> 15420 (-0.01%)
VALU: 262635 -> 262641 (+0.00%)
Foz-DB Navi21:
Totals from 426 (0.52% of 82387) affected shaders:
Instrs: 624744 -> 624754 (+0.00%); split: -0.00%, +0.00%
CodeSize: 3382728 -> 3385664 (+0.09%); split: -0.00%, +0.09%
Latency: 3841693 -> 3842101 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 1132036 -> 1132065 (+0.00%); split: -0.00%, +0.00%
VClause: 14008 -> 14011 (+0.02%)
Copies: 73104 -> 73114 (+0.01%); split: -0.00%, +0.02%
PreSGPRs: 19504 -> 19502 (-0.01%)
SALU: 131431 -> 131443 (+0.01%)
Foz-DB Polaris10:
Totals from 812 (1.31% of 61894) affected shaders:
Instrs: 610178 -> 609219 (-0.16%); split: -0.21%, +0.05%
CodeSize: 3142404 -> 3147304 (+0.16%); split: -0.02%, +0.17%
VGPRs: 38380 -> 38376 (-0.01%)
Latency: 8312085 -> 8307755 (-0.05%); split: -0.12%, +0.07%
InvThroughput: 3929970 -> 3924631 (-0.14%); split: -0.15%, +0.01%
VClause: 15714 -> 15632 (-0.52%); split: -0.67%, +0.15%
SClause: 14509 -> 14510 (+0.01%); split: -0.02%, +0.03%
Copies: 70197 -> 70388 (+0.27%); split: -0.61%, +0.89%
PreSGPRs: 26409 -> 26404 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 30448 -> 30436 (-0.04%)
VALU: 408184 -> 407068 (-0.27%); split: -0.29%, +0.01%
SALU: 95726 -> 95959 (+0.24%); split: -0.30%, +0.54%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:13 +00:00
Georg Lehmann
0359c8a901
aco/optimizer: use new helpers for v_add_u32 opts
...
Foz-DB Navi48:
Totals from 1554 (1.89% of 82419) affected shaders:
Instrs: 5154325 -> 5151499 (-0.05%); split: -0.08%, +0.02%
CodeSize: 27310012 -> 27318708 (+0.03%); split: -0.01%, +0.05%
VGPRs: 97236 -> 97200 (-0.04%); split: -0.05%, +0.01%
Latency: 34121873 -> 34120894 (-0.00%); split: -0.02%, +0.01%
InvThroughput: 6735276 -> 6730418 (-0.07%); split: -0.08%, +0.01%
VClause: 130106 -> 130090 (-0.01%); split: -0.05%, +0.04%
SClause: 90439 -> 90449 (+0.01%); split: -0.00%, +0.01%
Copies: 382920 -> 382401 (-0.14%); split: -0.18%, +0.05%
Branches: 130089 -> 130091 (+0.00%)
PreSGPRs: 67745 -> 67743 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 72710 -> 72674 (-0.05%)
VALU: 2941866 -> 2938129 (-0.13%); split: -0.13%, +0.00%
SALU: 651032 -> 651779 (+0.11%); split: -0.02%, +0.14%
VOPD: 2446 -> 2393 (-2.17%); split: +0.70%, -2.86%
Foz-DB Navi21:
Totals from 1534 (1.86% of 82387) affected shaders:
MaxWaves: 32481 -> 32479 (-0.01%)
Instrs: 4732755 -> 4730039 (-0.06%); split: -0.06%, +0.00%
CodeSize: 25305728 -> 25313148 (+0.03%); split: -0.00%, +0.03%
VGPRs: 84424 -> 84448 (+0.03%)
SpillVGPRs: 2420 -> 2419 (-0.04%)
Scratch: 180224 -> 179200 (-0.57%)
Latency: 36843383 -> 36846269 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 9252495 -> 9238142 (-0.16%); split: -0.17%, +0.02%
VClause: 146629 -> 146671 (+0.03%); split: -0.02%, +0.05%
SClause: 94502 -> 94512 (+0.01%); split: -0.00%, +0.01%
Copies: 403672 -> 403592 (-0.02%); split: -0.09%, +0.07%
Branches: 141145 -> 141137 (-0.01%)
PreSGPRs: 70003 -> 70001 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70835 -> 70800 (-0.05%)
VALU: 3114513 -> 3111338 (-0.10%); split: -0.10%, +0.00%
SALU: 651177 -> 651925 (+0.11%); split: -0.02%, +0.13%
VMEM: 271263 -> 271261 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:09 +00:00
Samuel Pitoiset
3889695e9f
aco/tests: switch to drm-shim
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38536 >
2025-11-20 09:53:29 +00:00
Georg Lehmann
4da74eed96
aco/tests: test packed fma opts
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:43 +00:00
Georg Lehmann
e8f5b9374b
aco/optimizer: use new helpers to optimize mul(b2f(a), b)
...
Foz-DB Navi48:
Totals from 979 (1.19% of 82419) affected shaders:
Instrs: 3630560 -> 3629463 (-0.03%); split: -0.03%, +0.00%
CodeSize: 19154176 -> 19147124 (-0.04%); split: -0.04%, +0.00%
Latency: 17700546 -> 17699505 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 3143808 -> 3143254 (-0.02%); split: -0.02%, +0.01%
SClause: 76410 -> 76405 (-0.01%); split: -0.01%, +0.00%
Copies: 256544 -> 256554 (+0.00%); split: -0.02%, +0.02%
PreVGPRs: 40868 -> 40835 (-0.08%)
VALU: 2003291 -> 2002466 (-0.04%); split: -0.04%, +0.00%
SALU: 514000 -> 514006 (+0.00%)
VOPD: 3254 -> 3256 (+0.06%); split: +0.12%, -0.06%
Foz-DB Navi21:
Totals from 926 (1.12% of 82387) affected shaders:
MaxWaves: 21538 -> 21542 (+0.02%)
Instrs: 2984216 -> 2983187 (-0.03%); split: -0.04%, +0.00%
CodeSize: 16104112 -> 16097272 (-0.04%); split: -0.05%, +0.00%
VGPRs: 46864 -> 46848 (-0.03%)
Latency: 15678064 -> 15677099 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3779550 -> 3778230 (-0.03%); split: -0.04%, +0.01%
VClause: 81590 -> 81598 (+0.01%)
SClause: 70753 -> 70751 (-0.00%); split: -0.01%, +0.00%
Copies: 240446 -> 240466 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 51121 -> 51062 (-0.12%)
PreVGPRs: 38538 -> 38505 (-0.09%)
VALU: 1978847 -> 1977777 (-0.05%); split: -0.06%, +0.00%
SALU: 439184 -> 439212 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
6fc250fc06
aco/optimizer: use new helpers for min3/max3/minmax/maxmin
...
Foz-DB Navi48:
Totals from 10453 (12.68% of 82419) affected shaders:
Instrs: 18676282 -> 18675798 (-0.00%); split: -0.00%, +0.00%
CodeSize: 100603268 -> 100603508 (+0.00%); split: -0.00%, +0.00%
Latency: 157036823 -> 157031708 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 28049331 -> 28048776 (-0.00%); split: -0.00%, +0.00%
Copies: 1452464 -> 1452503 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 458422 -> 458413 (-0.00%); split: -0.00%, +0.00%
VALU: 10429583 -> 10429353 (-0.00%); split: -0.00%, +0.00%
SALU: 2628403 -> 2628416 (+0.00%); split: -0.00%, +0.00%
VOPD: 21738 -> 21744 (+0.03%); split: +0.04%, -0.01%
Foz-DB Navi21:
Totals from 889 (1.08% of 82387) affected shaders:
MaxWaves: 15641 -> 15639 (-0.01%); split: +0.01%, -0.03%
Instrs: 2505527 -> 2505489 (-0.00%); split: -0.01%, +0.01%
CodeSize: 13975300 -> 13976516 (+0.01%); split: -0.00%, +0.01%
VGPRs: 65584 -> 65576 (-0.01%); split: -0.02%, +0.01%
Latency: 37135606 -> 37132577 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 10937032 -> 10935704 (-0.01%); split: -0.01%, +0.00%
VClause: 63136 -> 63140 (+0.01%); split: -0.01%, +0.01%
Copies: 256011 -> 256073 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 51804 -> 51809 (+0.01%)
PreVGPRs: 57905 -> 57890 (-0.03%); split: -0.03%, +0.00%
VALU: 1593523 -> 1593339 (-0.01%); split: -0.02%, +0.00%
SALU: 425116 -> 425134 (+0.00%); split: -0.00%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
5abc961514
aco/optimizer: use new helpers to create fma
...
Foz-DB Navi48:
Totals from 25949 (31.48% of 82419) affected shaders:
Instrs: 30904250 -> 30904153 (-0.00%); split: -0.00%, +0.00%
CodeSize: 164623100 -> 164604652 (-0.01%); split: -0.01%, +0.00%
Latency: 209402611 -> 209402684 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 36622293 -> 36622236 (-0.00%); split: -0.00%, +0.00%
Copies: 2252080 -> 2251998 (-0.00%); split: -0.00%, +0.00%
VALU: 16831507 -> 16831382 (-0.00%); split: -0.00%, +0.00%
VOPD: 28252 -> 28295 (+0.15%)
Foz-DB Navi21:
Totals from 56269 (68.30% of 82387) affected shaders:
Instrs: 43751754 -> 43746463 (-0.01%); split: -0.01%, +0.00%
CodeSize: 233615096 -> 233576912 (-0.02%); split: -0.02%, +0.00%
VGPRs: 2445528 -> 2445520 (-0.00%)
Latency: 276776920 -> 276761183 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 66406450 -> 66402214 (-0.01%); split: -0.01%, +0.00%
VClause: 902951 -> 902947 (-0.00%)
Copies: 3926260 -> 3926289 (+0.00%); split: -0.01%, +0.01%
VALU: 26924056 -> 26918783 (-0.02%); split: -0.02%, +0.00%
SALU: 6938335 -> 6938321 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Natalie Vock
1243d575a5
aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard
...
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.
This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38290 >
2025-11-18 18:43:00 +00:00
Georg Lehmann
22dc06798b
aco/optimizer: never unfuse fma
...
This shouldn't change anything in practice, and reducing precision
if precise isn't set is weird.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38183 >
2025-11-04 07:54:02 +00:00
Georg Lehmann
a17afd5edd
aco/tests: add some simple fp64 modifier tests
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011 >
2025-10-29 17:57:53 +00:00
Georg Lehmann
2572528d31
aco/optimizer: remove can_apply_extract
...
Foz-DB Navi21:
Totals from 10 (0.01% of 79789) affected shaders:
Latency: 426254 -> 426256 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 81782 -> 81784 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35272 >
2025-10-14 08:33:42 +00:00
Georg Lehmann
26da5cf8d9
aco/optimizer: apply sgprs/extract with new helpers
...
Foz-DB Navi21:
Totals from 387 (0.49% of 79789) affected shaders:
MaxWaves: 7332 -> 7324 (-0.11%)
Instrs: 3156365 -> 3155691 (-0.02%); split: -0.02%, +0.00%
CodeSize: 17013948 -> 17014456 (+0.00%); split: -0.01%, +0.01%
VGPRs: 24768 -> 24776 (+0.03%)
Latency: 28569179 -> 28568183 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 6530832 -> 6530566 (-0.00%); split: -0.00%, +0.00%
VClause: 90988 -> 90989 (+0.00%); split: -0.00%, +0.00%
Copies: 269074 -> 269060 (-0.01%); split: -0.01%, +0.01%
PreSGPRs: 22503 -> 22499 (-0.02%)
PreVGPRs: 22928 -> 22935 (+0.03%)
VALU: 2100245 -> 2099560 (-0.03%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35272 >
2025-10-14 08:33:41 +00:00
Georg Lehmann
859505d95a
aco/optimizer: use new helpers to apply literals
...
Foz-DB Navi21:
Totals from 21009 (26.33% of 79789) affected shaders:
MaxWaves: 495342 -> 495414 (+0.01%)
Instrs: 22345587 -> 22335371 (-0.05%); split: -0.05%, +0.00%
CodeSize: 122095820 -> 121795112 (-0.25%); split: -0.25%, +0.00%
VGPRs: 1025800 -> 1025480 (-0.03%)
Latency: 202876235 -> 203076272 (+0.10%); split: -0.04%, +0.14%
InvThroughput: 47599930 -> 47596113 (-0.01%); split: -0.03%, +0.02%
VClause: 475271 -> 475439 (+0.04%); split: -0.02%, +0.05%
SClause: 700679 -> 700629 (-0.01%); split: -0.01%, +0.01%
Copies: 1628498 -> 1618165 (-0.63%); split: -0.64%, +0.01%
Branches: 567199 -> 567216 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 952134 -> 952043 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 846614 -> 846272 (-0.04%)
VALU: 15572374 -> 15564050 (-0.05%); split: -0.05%, +0.00%
SALU: 2423329 -> 2421319 (-0.08%); split: -0.08%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Foz-DB Navi31:
Totals from 13049 (16.44% of 79395) affected shaders:
MaxWaves: 357242 -> 357268 (+0.01%)
Instrs: 19955572 -> 19944106 (-0.06%); split: -0.06%, +0.00%
CodeSize: 105689464 -> 105454348 (-0.22%); split: -0.23%, +0.00%
VGPRs: 765744 -> 764952 (-0.10%); split: -0.11%, +0.00%
Latency: 179063640 -> 179141591 (+0.04%); split: -0.02%, +0.07%
InvThroughput: 27978134 -> 27971318 (-0.02%); split: -0.03%, +0.01%
VClause: 386791 -> 386826 (+0.01%); split: -0.02%, +0.03%
SClause: 598113 -> 598106 (-0.00%); split: -0.01%, +0.01%
Copies: 1393111 -> 1383102 (-0.72%); split: -0.73%, +0.01%
Branches: 498533 -> 498535 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 573310 -> 573236 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 591459 -> 591043 (-0.07%)
VALU: 11623734 -> 11615755 (-0.07%); split: -0.07%, +0.00%
SALU: 1962055 -> 1960005 (-0.10%); split: -0.11%, +0.00%
VOPD: 3544 -> 3566 (+0.62%); split: +0.73%, -0.11%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35272 >
2025-10-14 08:33:39 +00:00
Georg Lehmann
0d8219f367
aco/tests: allow even more literals
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35272 >
2025-10-14 08:33:37 +00:00
Rhys Perry
dfa8ac6b91
aco: remove buffer_load_lds instructions
...
They don't exist
See https://github.com/llvm/llvm-project/pull/132916
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14041
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37716 >
2025-10-07 09:50:26 +00:00
Georg Lehmann
26e041e821
aco: remove existing dealloc_vgprs use
...
We didn't consider that s_sendmsg dealloc_vgpr waits for all counters
except vscnt.
Foz-DB Navi31:
Totals from 74090 (92.52% of 80084) affected shaders:
Instrs: 36031071 -> 35853573 (-0.49%)
CodeSize: 189233756 -> 188523764 (-0.38%)
Latency: 222378318 -> 222374890 (-0.00%)
InvThroughput: 33366893 -> 33362457 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508 >
2025-09-26 07:51:02 +00:00
Rhys Perry
e2181744c2
aco/tests: add barrier-to-waitcnt tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00