Commit graph

4153 commits

Author SHA1 Message Date
Natalie Vock
897c95c37e aco: Include arbitrarily fixed registers in max_reg_demand
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157>
2026-01-12 21:46:50 +00:00
Georg Lehmann
daf235c607 aco/tests: don't destroy vk_device if it was never created
Happens if you only run one test that doesn't need a vk_device.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39268>
2026-01-12 16:16:54 +00:00
Georg Lehmann
fad95030a7 aco/tests: test VALUMaskWriteHazard with v_cmpx
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252>
2026-01-12 15:48:39 +00:00
Georg Lehmann
1d85552745 aco/tests: test VALUReadSGPRHazard with v_cmpx
To avoid regressing this in a future rework.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252>
2026-01-12 15:48:39 +00:00
Georg Lehmann
3e10ab34e1 aco/insert_NOPs: explicitly wait for sa_sdst to resolve SALU -> VALU hazards
The assumption that these waits are not required has been proven incorrect
in at least some cases.

Totals from 190 (0.24% of 79825) affected shaders: (Navi31)
Instrs: 499718 -> 500491 (+0.15%)
CodeSize: 2658228 -> 2661916 (+0.14%)
Latency: 5964632 -> 5965453 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 794221 -> 794289 (+0.01%)

Totals from 17093 (21.41% of 79839) affected shaders: (Navi48)
Instrs: 22805214 -> 22854313 (+0.22%)
CodeSize: 121240428 -> 121432904 (+0.16%); split: -0.00%, +0.16%
Latency: 166500300 -> 166530529 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 28770053 -> 28772870 (+0.01%); split: -0.00%, +0.01%
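
The percentages in these totals can be recomputed from the before/after values. The following is a minimal sketch, assuming the report rounds the relative change to two decimal places; `percent_change` is a hypothetical helper, not part of any Mesa tooling.

```python
def percent_change(before: int, after: int) -> str:
    """Format the relative change between two totals as a signed percentage.

    Assumption: two decimal places, as in the totals quoted above.
    """
    return f"{(after - before) / before * 100:+.2f}%"


# Navi31 totals quoted above.
print("Instrs:  ", percent_change(499718, 500491))
print("CodeSize:", percent_change(2658228, 2661916))
```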

Fixes: 018f45f981 ("aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14516

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252>
2026-01-12 15:48:38 +00:00
Konstantin Seurer
39d58a55a7 aco: Add support to f2f16 with rtpi/rtni
Those rounding modes are needed when computing 16-bit bounding boxes
since the bounding box must not get smaller.
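
To illustrate why directed rounding matters here, the sketch below emulates round-toward-positive-infinity (rtpi) and round-toward-negative-infinity (rtni) conversions to 16-bit floats in plain Python. It is not ACO code: the function names are hypothetical, and it only handles finite values inside the half-precision range (`struct` raises `OverflowError` otherwise).

```python
import struct


def _f16_bits(h):
    """Return the raw 16-bit pattern of a value that is exactly representable as f16."""
    return struct.unpack("<H", struct.pack("<e", h))[0]


def _bits_f16(bits):
    """Reinterpret a 16-bit pattern as an f16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def f16_nearest(x):
    """Round to f16 using struct's default round-to-nearest-even."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def next_f16_up(h):
    """Smallest f16 value strictly greater than h (h must be a finite f16)."""
    if h == 0.0:
        return _bits_f16(0x0001)  # smallest positive subnormal
    bits = _f16_bits(h)
    # Negative values move toward zero, positive values away from it.
    return _bits_f16(bits - 1 if bits & 0x8000 else bits + 1)


def f2f16_rtpi(x):
    """Convert to f16, never rounding below x."""
    h = f16_nearest(x)
    return h if h >= x else next_f16_up(h)


def f2f16_rtni(x):
    """Convert to f16, never rounding above x."""
    return -f2f16_rtpi(-x)


# A 1D "bounding box": the lower bound rounds down, the upper bound rounds up,
# so the 16-bit box always contains the original interval.
lo, hi = f2f16_rtni(0.1), f2f16_rtpi(0.3)
print(lo, hi)
```

With round-to-nearest, the lower bound could land above 0.1 or the upper bound below 0.3, which would shrink the box.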

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:34:12 +01:00
Natalie Vock
60dd9d797e aco: Swizzle ray launch IDs in the RT prolog
This converts from 1D workgroups to 2D ray launch IDs entirely via
shader ALU, including handling partial/cut-off workgroups optimally.

Doing this entirely in-shader means it Just Works(TM) with indirect
dispatches as well. Previous approaches manipulating various things on
CPU depending on the dispatch size couldn't handle indirect dispatches.

The swizzle implemented here also swizzles with a recursive Z-order
pattern, which should be a little more optimal than arranging
invocations linearly within the wave.
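
A Z-order (Morton) swizzle of this kind can be sketched as follows. This is an illustration only, not the RT prolog's implementation: the 8x8 tile size, the row-major tile layout, and the function names are assumptions, and cut-off tiles are handled naively by marking out-of-range invocations inactive rather than repacking them.

```python
def compact_bits(v):
    """Gather bits 0, 2, 4, ... of v into a contiguous integer."""
    out = 0
    bit = 0
    while v:
        out |= (v & 1) << bit
        v >>= 2
        bit += 1
    return out


def z_order_decode(index):
    """Map a linear index to (x, y) along a recursive Z-order curve."""
    return compact_bits(index), compact_bits(index >> 1)


def launch_id(linear_id, width, height, tile=8):
    """Map a 1D invocation index to a 2D launch ID, or None if out of range.

    Assumption: tile x tile blocks are laid out row-major across the launch.
    """
    tiles_per_row = (width + tile - 1) // tile
    tile_index, local = divmod(linear_id, tile * tile)
    lx, ly = z_order_decode(local)
    x = (tile_index % tiles_per_row) * tile + lx
    y = (tile_index // tiles_per_row) * tile + ly
    return (x, y) if x < width and y < height else None


# The first four invocations form a 2x2 quad instead of a 4x1 row.
print([z_order_decode(i) for i in range(4)])
```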

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>
2026-01-08 19:49:55 +01:00
Natalie Vock
1f6ac3fa93 radv/rt,aco: Always dispatch 1D workgroups for RT
We will swizzle the workgroups ourselves in the next commit.
Removes the need for 1D dispatch workarounds.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>
2026-01-08 19:49:54 +01:00
Georg Lehmann
eb4737a1dd nir: add nir_alu_instr_is_exact helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:57 +00:00
Daniel Schürmann
2d0d5fc104 aco/validate: validate constant bus limit after register allocation based on PhysReg
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:54:00 +00:00
Daniel Schürmann
eb16f701a6 aco/tests: Add new test to pack 2x16 SGPRs into VGPR
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:54:00 +00:00
Daniel Schürmann
61c1ec541d aco/tests: Add test for subdword extraction from SGPR
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:54:00 +00:00
Daniel Schürmann
0674c9d30e aco/validate: Validate correct RegisterClasses after lowering to HW instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:53:59 +00:00
Daniel Schürmann
b087bf2fbf aco/lower_to_hw: Fix SGPR Operand RegClasses for pack_2x16
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:53:59 +00:00
Daniel Schürmann
9f5996ae8a aco/lower_to_hw: Don't use 2 SGPR operands before GFX10 in a single VOP3 instruction in do_pack_2x16()
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:53:58 +00:00
Daniel Schürmann
d8481fd7cc aco/lower_to_hw: Fix SGPR Operand RegClasses of subdword copies
Extracting from an SGPR could cause a wrong RegClass on
the operand which could later lead to selecting VOPD
instructions which falsely operate on the corresponding
VGPR.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107>
2026-01-05 14:53:58 +00:00
Georg Lehmann
0c42141299 aco: allow opsel for last v_alignbyte/bit operand
For completeness' sake.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13285
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39061>
2025-12-31 08:58:24 +00:00
Daniel Schürmann
7b1f6fa6fc aco: remove radeon_family from aco::Program
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:48 +00:00
Daniel Schürmann
1e8d367537 amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:48 +00:00
Daniel Schürmann
febc29907c amd: add and use ac_cu_info::has_gfx6_mrt_export_bug
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
7b7bdb76ab amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
cfb745592d amd: add ac_cu_info::has_mad32 flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Daniel Schürmann
1e3db50170 aco: use additional flags from ac_cu_info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f791e46c47 aco: add ac_cu_info to aco_compiler_options
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
addd4ea59f aco: pass aco_compiler_options to init_program()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
bf9bec07c2 aco/tests: don't pass CHIP_UNKNOWN to ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:46 +00:00
Daniel Schürmann
0db1ae1f01 aco: disable XNACK on all GPUs
Affects code generation on GFX8 and GFX9 APUs where we misunderstood
the feature. XNACK replay is not being used with graphics APIs.

Totals from 41759 (65.90% of 63370) affected shaders: (Raven)
MaxWaves: 298672 -> 299000 (+0.11%)
Instrs: 19200726 -> 19138227 (-0.33%); split: -0.33%, +0.00%
CodeSize: 98501904 -> 98253196 (-0.25%); split: -0.26%, +0.00%
SGPRs: 3058544 -> 2831492 (-7.42%)
VGPRs: 1644896 -> 1643660 (-0.08%)
Latency: 193383803 -> 193224047 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 92741082 -> 92698975 (-0.05%); split: -0.05%, +0.00%
SClause: 678580 -> 630107 (-7.14%); split: -7.15%, +0.00%
Copies: 1863375 -> 1863406 (+0.00%); split: -0.04%, +0.04%
VALU: 13791245 -> 13791267 (+0.00%); split: -0.00%, +0.00%
SALU: 2066726 -> 2066741 (+0.00%); split: -0.04%, +0.04%

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:43 +00:00
Georg Lehmann
0478021fdc aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)
Foz-DB Navi48:
Totals from 2484 (2.54% of 97637) affected shaders:
Instrs: 10368279 -> 10361892 (-0.06%); split: -0.06%, +0.00%
CodeSize: 55161104 -> 55150752 (-0.02%); split: -0.02%, +0.00%
SpillSGPRs: 14665 -> 14666 (+0.01%)
Latency: 87694014 -> 87689324 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16595764 -> 16594448 (-0.01%); split: -0.01%, +0.00%
VClause: 209922 -> 209918 (-0.00%); split: -0.01%, +0.00%
SClause: 205195 -> 205251 (+0.03%); split: -0.01%, +0.04%
Copies: 843771 -> 843765 (-0.00%); split: -0.01%, +0.01%
Branches: 275985 -> 275962 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 170608 -> 170494 (-0.07%)
VALU: 5840893 -> 5838038 (-0.05%); split: -0.05%, +0.00%
SALU: 1481388 -> 1479037 (-0.16%); split: -0.16%, +0.00%
VOPD: 7496 -> 7485 (-0.15%)
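
The identity behind this reassociation can be checked numerically. The sketch below assumes the output modifier (omod) scales an instruction's result by 2, 4, or 0.5, so a multiply by 0.5, 0.25, or 2 feeding a reciprocal can become a modifier on the reciprocal itself. The names are hypothetical, and the real optimizer has further legality conditions (for example around denormals and float modes) that are not modeled here.

```python
OMOD_FACTORS = (2.0, 4.0, 0.5)  # assumed set of output-modifier scale factors


def rcp(x):
    return 1.0 / x


def rcp_mul_const(a, c):
    """Original form: rcp(mul(a, c))."""
    return rcp(a * c)


def rcp_omod(a, c):
    """Reassociated form: rcp(a) scaled by 1/c, or None if 1/c is not an omod factor."""
    factor = 1.0 / c
    if factor not in OMOD_FACTORS:
        return None
    return rcp(a) * factor


for c in (0.5, 0.25, 2.0):
    print(c, rcp_mul_const(3.0, c), rcp_omod(3.0, c))
```

Because the constants are powers of two, scaling before or after the reciprocal gives the same rounded result as long as nothing overflows or underflows.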

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>
2025-12-17 08:41:32 +00:00
Georg Lehmann
a8f5ced670 aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)
Foz-DB Navi48:
Totals from 14608 (14.96% of 97637) affected shaders:
MaxWaves: 364201 -> 364421 (+0.06%)
Instrs: 28051720 -> 28022503 (-0.10%); split: -0.13%, +0.03%
CodeSize: 148938740 -> 148943480 (+0.00%); split: -0.04%, +0.04%
VGPRs: 994520 -> 994004 (-0.05%); split: -0.05%, +0.00%
SpillSGPRs: 45182 -> 45179 (-0.01%)
Latency: 187734461 -> 187725301 (-0.00%); split: -0.07%, +0.06%
InvThroughput: 33967002 -> 33949881 (-0.05%); split: -0.11%, +0.06%
VClause: 495237 -> 495207 (-0.01%); split: -0.03%, +0.02%
Copies: 2048324 -> 2047937 (-0.02%); split: -0.12%, +0.10%
Branches: 598445 -> 598431 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 877715 -> 877684 (-0.00%)
PreVGPRs: 778146 -> 776383 (-0.23%); split: -0.23%, +0.00%
VALU: 16413380 -> 16391508 (-0.13%); split: -0.15%, +0.01%
SALU: 3685279 -> 3677655 (-0.21%); split: -0.23%, +0.02%
VOPD: 26219 -> 25926 (-1.12%); split: +0.43%, -1.55%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730>
2025-12-17 08:41:31 +00:00
Alyssa Rosenzweig
079e9ae606 treewide: use BITSET_*_COUNT
A mix of a Coccinelle patch, manual fix-ups, sed, etc. It is probably best to review the diff
as if it were hand-written:

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38955>
2025-12-16 17:42:10 +00:00
Timur Kristóf
f001515c87 aco: Use only VGPR offset on buffer atomics on GFX6-7
SGPR offset is not included in the bounds check
according to the ISA documentation of GFX6-7 and
indeed it can trigger VM faults on OOB access.

Note that ACO already doesn't use the SGPR offset
on GFX6-7 for buffer loads and stores. This commit
just does the same for buffer atomics.
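
The effect described above can be shown with a toy model. This is a deliberately simplified sketch, not the hardware's actual addressing logic: it assumes a raw buffer, a range check that looks only at the VGPR offset plus the constant offset, and hypothetical parameter names.

```python
def access_faults(num_records, voffset, soffset, const_offset=0, size=4):
    """Return True if the access escapes the range check and lands out of bounds.

    Toy model of the GFX6-7 behaviour described above: soffset is added to
    the address but is NOT part of the bounds check.
    """
    checked = voffset + const_offset
    if checked + size > num_records:
        return False  # rejected by the range check, so no fault
    return checked + soffset + size > num_records


def fold_soffset(voffset, soffset):
    """Move the SGPR offset into the VGPR offset so the check sees all of it."""
    return voffset + soffset, 0


# A 64-byte buffer accessed 128 bytes past its start via the SGPR offset.
print(access_faults(64, voffset=0, soffset=128))
v, s = fold_soffset(0, 128)
print(access_faults(64, voffset=v, soffset=s))
```

Folding the offset costs an extra VALU add, which is consistent with the Copies and VALU increases in the stats below.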

This commit mitigates a ton of VM faults that are exposed by commit 24e75fea4b.

Fossil DB stats on Hawaii (GFX7):

Totals from 148 (0.24% of 61818) affected shaders:
Instrs: 324004 -> 327352 (+1.03%)
CodeSize: 1556468 -> 1514100 (-2.72%); split: -2.74%, +0.02%
Latency: 1271480 -> 1276894 (+0.43%)
InvThroughput: 396850 -> 397740 (+0.22%)
VClause: 6861 -> 6858 (-0.04%)
Copies: 34083 -> 37430 (+9.82%)
PreVGPRs: 5705 -> 5706 (+0.02%)
VALU: 147529 -> 150898 (+2.28%)
SALU: 98194 -> 98172 (-0.02%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38958>
2025-12-15 21:03:19 +00:00
Georg Lehmann
a2b70ce4ec aco/isel: remove uniform reduce/scan optimization
This is now done in NIR, with the exception of exclusive min/max/and/or scans.
But those are not really useful, and if we ever come across them we can
optimize them in NIR using write_invocation_amd.
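
The following sketch shows why reductions and scans of a uniform value are easy to simplify, and why exclusive min/max/and/or scans are the odd case. It models lanes as a Python list and is not NIR or ACO code; the helper names are hypothetical.

```python
from functools import reduce
from itertools import accumulate
import operator

INT32_MAX = 0x7FFFFFFF  # identity element for a signed 32-bit min


def inclusive_scan(op, values):
    return list(accumulate(values, op))


def exclusive_scan(op, identity, values):
    return [identity] + inclusive_scan(op, values)[:-1]


lanes = [7] * 8  # the same (uniform) value in 8 active lanes

# reduce(iadd) of a uniform value is value * active lane count.
print(reduce(operator.add, lanes), 7 * len(lanes))

# An inclusive min scan of a uniform value is that value in every lane.
print(inclusive_scan(min, lanes))

# An exclusive min scan is not uniform: the first lane gets the identity.
print(exclusive_scan(min, INT32_MAX, lanes))
```

In the exclusive case only the first active lane differs, which presumably is why a single-lane write such as write_invocation_amd would be enough to handle it.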

No Foz-DB changes on Navi21.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38902>
2025-12-15 12:22:32 +00:00
Georg Lehmann
17e597093d radv: eliminate unused FS output channels
For formats that don't have all color channels, there is no reason to
output all of them.
Games often write to R-only or RGB formats with non-trivial values in the remaining channels.
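
As a rough illustration of the idea, the sketch below trims a fragment shader's output write mask to the channels the render target format actually stores. The format table and function name are hypothetical and cover only a few formats; this is not how RADV represents formats internally.

```python
# Hypothetical table: channels stored by a few color formats.
FORMAT_CHANNELS = {
    "R8_UNORM": "r",
    "R8G8_UNORM": "rg",
    "B10G11R11_UFLOAT_PACK32": "rgb",
    "R8G8B8A8_UNORM": "rgba",
}

CHANNEL_BIT = {"r": 0x1, "g": 0x2, "b": 0x4, "a": 0x8}


def used_write_mask(fmt, shader_mask):
    """Keep only the shader output channels that exist in the target format."""
    format_mask = 0
    for channel in FORMAT_CHANNELS[fmt]:
        format_mask |= CHANNEL_BIT[channel]
    return shader_mask & format_mask


# A shader writing RGBA to an R-only target only needs to export R.
print(hex(used_write_mask("R8_UNORM", 0xF)))
```

Once a channel is dropped from the mask, the computations that only fed that channel become dead code, which is where the instruction and VGPR savings below would come from.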

Foz-DB Navi21:
Totals from 10270 (10.55% of 97347) affected shaders:
MaxWaves: 249166 -> 250950 (+0.72%); split: +0.73%, -0.01%
Instrs: 8442016 -> 8354715 (-1.03%); split: -1.05%, +0.01%
CodeSize: 45939644 -> 45487156 (-0.98%); split: -1.01%, +0.02%
VGPRs: 472584 -> 463784 (-1.86%); split: -1.98%, +0.12%
SpillSGPRs: 1502 -> 1448 (-3.60%)
LDS: 6024192 -> 6011904 (-0.20%)
Inputs: 42463 -> 41773 (-1.62%)
Outputs: 24601 -> 23955 (-2.63%)
Latency: 78011745 -> 77653907 (-0.46%); split: -0.56%, +0.10%
InvThroughput: 19767826 -> 19274046 (-2.50%); split: -2.53%, +0.03%
VClause: 177891 -> 176681 (-0.68%); split: -0.80%, +0.12%
SClause: 236784 -> 235324 (-0.62%); split: -0.72%, +0.10%
Copies: 621048 -> 616096 (-0.80%); split: -1.03%, +0.23%
Branches: 202608 -> 201811 (-0.39%); split: -0.44%, +0.05%
PreSGPRs: 441032 -> 437698 (-0.76%); split: -0.77%, +0.01%
PreVGPRs: 378067 -> 369564 (-2.25%); split: -2.26%, +0.01%
VALU: 5906415 -> 5833179 (-1.24%); split: -1.25%, +0.01%
SALU: 973428 -> 968088 (-0.55%); split: -0.61%, +0.06%
VMEM: 298277 -> 296504 (-0.59%); split: -0.61%, +0.01%
SMEM: 402244 -> 399612 (-0.65%); split: -0.71%, +0.06%

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>
2025-12-12 17:00:51 +00:00
Georg Lehmann
072815e5cb aco/gfx6: move mrtz writemask workaround to assembler and handle all mrt
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>
2025-12-12 17:00:51 +00:00
Rhys Perry
156ae6195e aco: print large p_parallelcopy using several lines
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695>
2025-12-11 16:51:21 +00:00
Rhys Perry
21414e0898 aco/ra: add first loop header phi operand to temp_to_phi_resources
If the first operand is a CSSA copy, we might want to add this to
temp_to_phi_resources, so that we later mark it as the last-seen phi
operand.

fossil-db (navi31):
Totals from 284 (0.36% of 79825) affected shaders:
Instrs: 4160233 -> 4157517 (-0.07%); split: -0.09%, +0.03%
CodeSize: 21546420 -> 21532884 (-0.06%); split: -0.09%, +0.02%
VGPRs: 31404 -> 31416 (+0.04%)
Latency: 40266308 -> 40253731 (-0.03%); split: -0.06%, +0.02%
InvThroughput: 8140751 -> 8139724 (-0.01%); split: -0.05%, +0.04%
VClause: 99849 -> 99835 (-0.01%); split: -0.02%, +0.01%
Copies: 344512 -> 341732 (-0.81%); split: -1.08%, +0.28%
Branches: 113620 -> 113629 (+0.01%); split: -0.02%, +0.03%
VALU: 2502619 -> 2499836 (-0.11%); split: -0.15%, +0.04%
SALU: 499245 -> 499341 (+0.02%); split: -0.02%, +0.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695>
2025-12-11 16:51:21 +00:00
Rhys Perry
43b3901362 aco/ra: copy vector_info to affinities
This eliminates some copies in BVH traversal loops.

fossil-db (navi31):
Totals from 200 (0.25% of 79825) affected shaders:
Instrs: 734931 -> 732521 (-0.33%); split: -0.34%, +0.01%
CodeSize: 3801080 -> 3791692 (-0.25%); split: -0.26%, +0.01%
VGPRs: 13704 -> 13728 (+0.18%); split: -0.44%, +0.61%
Latency: 6094605 -> 6082060 (-0.21%); split: -0.24%, +0.03%
InvThroughput: 1081982 -> 1080121 (-0.17%); split: -0.19%, +0.02%
VClause: 18835 -> 18837 (+0.01%); split: -0.01%, +0.02%
Copies: 64602 -> 62239 (-3.66%); split: -3.75%, +0.09%
Branches: 20111 -> 20112 (+0.00%); split: -0.01%, +0.02%
VALU: 438618 -> 436257 (-0.54%); split: -0.55%, +0.01%
SALU: 85092 -> 85085 (-0.01%); split: -0.01%, +0.00%
VOPD: 76 -> 74 (-2.63%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695>
2025-12-11 16:51:21 +00:00
Georg Lehmann
ef246aaf72 aco/isel: emit register copies for workgroup ids
This way, we don't overestimate SGPR pressure.

Foz-DB Navi48:
Totals from 1413 (1.45% of 97637) affected shaders:
Instrs: 3468375 -> 3468585 (+0.01%); split: -0.01%, +0.02%
CodeSize: 18643064 -> 18643520 (+0.00%); split: -0.01%, +0.01%
VGPRs: 71776 -> 71788 (+0.02%)
SpillSGPRs: 18575 -> 18561 (-0.08%)
Latency: 23207643 -> 23207998 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 8116806 -> 8116541 (-0.00%); split: -0.01%, +0.00%
VClause: 75250 -> 75252 (+0.00%); split: -0.00%, +0.00%
SClause: 65274 -> 65283 (+0.01%); split: -0.02%, +0.04%
Copies: 275750 -> 275942 (+0.07%); split: -0.03%, +0.10%
PreSGPRs: 70246 -> 69072 (-1.67%)
VALU: 1892111 -> 1892092 (-0.00%); split: -0.00%, +0.00%
SALU: 523460 -> 523648 (+0.04%); split: -0.02%, +0.05%
VOPD: 41097 -> 41102 (+0.01%)

Sadly the RA noise is slightly negative for instruction count.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:59 +00:00
Georg Lehmann
839a035564 aco/optimizer: propagate fixed regs to copy/extract/insert
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
d32dd5e1df aco/optimizer: propagate fixed registers
Foz-DB Navi48:
Totals from 351 (0.36% of 97637) affected shaders:
Instrs: 3568192 -> 3567166 (-0.03%)
CodeSize: 18890368 -> 18886304 (-0.02%)
Latency: 17047945 -> 17048185 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3185739 -> 3185813 (+0.00%); split: -0.00%, +0.00%
SClause: 61544 -> 61536 (-0.01%)
Copies: 271592 -> 270845 (-0.28%)
PreSGPRs: 17186 -> 17094 (-0.54%)
PreVGPRs: 21897 -> 21901 (+0.02%)
VALU: 2003976 -> 2003980 (+0.00%)
SALU: 468403 -> 467664 (-0.16%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
b798ace443 aco/optimizer: fix skip_smem_offset_align with non temp register operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
911e1ce168 aco/isel: emit exec copy for ballot(true)
Once copy propagated in the optimizer, this will allow
using nir_opt_uniform_subgroup without too many regressions.

Foz-DB Navi48:
Totals from 405 (0.41% of 97637) affected shaders:
Instrs: 3796716 -> 3796894 (+0.00%); split: -0.00%, +0.00%
CodeSize: 20116136 -> 20116652 (+0.00%); split: -0.00%, +0.00%
Latency: 18326661 -> 18327114 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3353206 -> 3353268 (+0.00%); split: -0.00%, +0.00%
Copies: 292307 -> 293830 (+0.52%)
SALU: 507523 -> 507738 (+0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:58 +00:00
Georg Lehmann
72e3071751 aco/optimizer: keep pass_flags valid for all instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>
2025-12-11 08:06:57 +00:00
Marek Olšák
308da55f1a radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND
This is slightly nicer.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>
2025-12-10 19:16:46 +00:00
Georg Lehmann
bb58ba2075 aco/optimizer: propagate salu fabs
Foz-DB Navi48:
Totals from 5156 (5.28% of 97637) affected shaders:
Instrs: 12713219 -> 12694317 (-0.15%); split: -0.15%, +0.00%
CodeSize: 67099236 -> 67037588 (-0.09%); split: -0.13%, +0.04%
VGPRs: 352620 -> 352608 (-0.00%)
SpillSGPRs: 22032 -> 22031 (-0.00%)
Latency: 68288972 -> 68271858 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 13639078 -> 13633997 (-0.04%); split: -0.04%, +0.00%
VClause: 235194 -> 235186 (-0.00%); split: -0.01%, +0.00%
SClause: 249057 -> 249012 (-0.02%); split: -0.03%, +0.01%
Copies: 963813 -> 960529 (-0.34%); split: -0.36%, +0.02%
Branches: 321041 -> 321039 (-0.00%)
PreSGPRs: 303392 -> 303295 (-0.03%); split: -0.03%, +0.00%
VALU: 7134563 -> 7134533 (-0.00%); split: -0.00%, +0.00%
SALU: 1913802 -> 1899948 (-0.72%); split: -0.72%, +0.00%
VOPD: 19914 -> 19885 (-0.15%); split: +0.01%, -0.15%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38723>
2025-12-10 10:07:12 +00:00
Georg Lehmann
04037c7af3 aco/optimizer: propagate salu fneg
Foz-DB Navi48:
Totals from 23796 (24.37% of 97637) affected shaders:
MaxWaves: 638922 -> 638898 (-0.00%)
Instrs: 32968990 -> 32880147 (-0.27%); split: -0.28%, +0.01%
CodeSize: 174252352 -> 173922400 (-0.19%); split: -0.20%, +0.01%
VGPRs: 1396472 -> 1396592 (+0.01%)
SpillSGPRs: 63672 -> 63599 (-0.11%)
Latency: 201025393 -> 200966204 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 37429702 -> 37411026 (-0.05%); split: -0.06%, +0.01%
VClause: 534241 -> 534115 (-0.02%); split: -0.05%, +0.02%
SClause: 831765 -> 831559 (-0.02%); split: -0.07%, +0.05%
Copies: 2404134 -> 2400539 (-0.15%); split: -0.29%, +0.14%
Branches: 728518 -> 728503 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1337403 -> 1336846 (-0.04%); split: -0.04%, +0.00%
PreVGPRs: 1017490 -> 1017521 (+0.00%); split: -0.00%, +0.00%
VALU: 18319620 -> 18318960 (-0.00%); split: -0.01%, +0.00%
SALU: 5069557 -> 5001384 (-1.34%); split: -1.38%, +0.03%
VOPD: 80235 -> 80172 (-0.08%); split: +0.13%, -0.21%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38723>
2025-12-10 10:07:12 +00:00
Georg Lehmann
8b1340a52c aco/optimizer: validate uses
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38724>
2025-12-10 09:40:13 +00:00
Georg Lehmann
ad3add311c aco/optimizer: fix uses in to_uniform_bool_instr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38724>
2025-12-10 09:40:13 +00:00
Natalie Vock
6d799ac283 aco: Add pass for spilling call-related registers
This is a post-RA pass that tracks registers that are preserved by the
ABI, but clobbered by shader code. The pass inserts scratch spills and
reloads in appropriate locations to ensure the register values at the
end of the shader are the same as they were at the start.
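
The general shape of such a pass can be sketched as below. This is a simplified model under stated assumptions, not the ACO pass: instructions are plain dictionaries, register names are strings, and every spill is placed at the very start and every reload at the very end, whereas the real pass chooses "appropriate locations".

```python
def insert_preserved_spills(instrs, preserved):
    """Wrap a straight-line shader with spills/reloads of clobbered preserved registers.

    instrs:    list of {"op": str, "defs": [register names written]}
    preserved: set of register names the (hypothetical) ABI requires to be unchanged
    """
    written = {reg for ins in instrs for reg in ins["defs"]}
    clobbered = sorted(written & preserved)

    # One scratch slot per clobbered register.
    spills = [{"op": "spill", "reg": r, "slot": i, "defs": []}
              for i, r in enumerate(clobbered)]
    reloads = [{"op": "reload", "reg": r, "slot": i, "defs": [r]}
               for i, r in enumerate(clobbered)]
    return spills + instrs + reloads


shader = [
    {"op": "s_mov", "defs": ["s30"]},
    {"op": "v_add", "defs": ["v0"]},
    {"op": "s_add", "defs": ["s31", "scc"]},
]
for ins in insert_preserved_spills(shader, preserved={"s30", "s31", "s32"}):
    print(ins)
```

Only s30 and s31 are spilled: s32 is preserved but never written, and v0 and scc are written but not preserved in this example.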

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38281>
2025-12-08 19:12:55 +00:00
Natalie Vock
93a5919cee aco/util: Add aco::unordered_set
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38281>
2025-12-08 19:12:55 +00:00