Natalie Vock
897c95c37e
aco: Include arbitrarily fixed registers in max_reg_demand
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Georg Lehmann
daf235c607
aco/tests: don't destroy vk_device if it was never created
...
Happens if you only run one test that doesn't need a vk_device.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39268 >
2026-01-12 16:16:54 +00:00
Georg Lehmann
fad95030a7
aco/tests: test VALUMaskWriteHazard with v_cmpx
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:39 +00:00
Georg Lehmann
1d85552745
aco/tests: test VALUReadSGPRHazard with v_cmpx
...
To avoid regressing this in a future rework.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:39 +00:00
Georg Lehmann
3e10ab34e1
aco/insert_NOPs: explicitly wait for sa_sdst to resolve SALU -> VALU hazards
...
The assumption that these waits are not required has been proven incorrect
in at least some cases.
Totals from 190 (0.24% of 79825) affected shaders: (Navi31)
Instrs: 499718 -> 500491 (+0.15%)
CodeSize: 2658228 -> 2661916 (+0.14%)
Latency: 5964632 -> 5965453 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 794221 -> 794289 (+0.01%)
Totals from 17093 (21.41% of 79839) affected shaders: (Navi48)
Instrs: 22805214 -> 22854313 (+0.22%)
CodeSize: 121240428 -> 121432904 (+0.16%); split: -0.00%, +0.16%
Latency: 166500300 -> 166530529 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 28770053 -> 28772870 (+0.01%); split: -0.00%, +0.01%
Fixes: 018f45f981 ("aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14516
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:38 +00:00
Konstantin Seurer
39d58a55a7
aco: Add support to f2f16 with rtpi/rtni
...
Those rounding modes are needed when computing 16-bit bounding boxes
since the bounding box must not get smaller.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883 >
2026-01-10 11:34:12 +01:00
Natalie Vock
60dd9d797e
aco: Swizzle ray launch IDs in the RT prolog
...
This converts from 1D workgroups to 2D ray launch IDs entirely via
shader ALU, including handling partial/cut-off workgroups optimally.
Doing this entirely in-shader means it Just Works(TM) with indirect
dispatches as well. Previous approaches manipulating various things on
CPU depending on the dispatch size couldn't handle indirect dispatches.
The swizzle implemented here also swizzles with a recursive Z-order
pattern, which should be a little more optimal than arranging
invocations linearly within the wave.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142 >
2026-01-08 19:49:55 +01:00
Natalie Vock
1f6ac3fa93
radv/rt,aco: Always dispatch 1D workgroups for RT
...
We will swizzle the workgroups ourselves in the next commit.
Removes the need for 1D dispatch workarounds.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142 >
2026-01-08 19:49:54 +01:00
Georg Lehmann
eb4737a1dd
nir: add nir_alu_instr_is_exact helper
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103 >
2026-01-07 09:40:57 +00:00
Daniel Schürmann
2d0d5fc104
aco/validate: validate constant bus limit after register allocation based on PhysReg
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:54:00 +00:00
Daniel Schürmann
eb16f701a6
aco/tests: Add new test to pack 2x16 SGPRs into VGPR
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:54:00 +00:00
Daniel Schürmann
61c1ec541d
aco/tests: Add test for subdword extraction from SGPR
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:54:00 +00:00
Daniel Schürmann
0674c9d30e
aco/validate: Validate correct RegisterClasses after lowering to HW instructions
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:53:59 +00:00
Daniel Schürmann
b087bf2fbf
aco/lower_to_hw: Fix SGPR Operand RegClasses for pack_2x16
...
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:53:59 +00:00
Daniel Schürmann
9f5996ae8a
aco/lower_to_hw: Don't use 2 SGPR operands before GFX10 in a single VOP3 instruction in do_pack_2x16()
...
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:53:58 +00:00
Daniel Schürmann
d8481fd7cc
aco/lower_to_hw: Fix SGPR Operand RegClasses of subdword copies
...
Extracting from an SGPR could cause a wrong RegClass on
the operand which could later lead to selecting VOPD
instructions which falsely operate on the corresponding
VGPR.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39107 >
2026-01-05 14:53:58 +00:00
Georg Lehmann
0c42141299
aco: allow opsel for last v_alignbyte/bit operand
...
For completeness' sake.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13285
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39061 >
2025-12-31 08:58:24 +00:00
Daniel Schürmann
7b1f6fa6fc
aco: remove radeon_family from aco::Program
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:48 +00:00
Daniel Schürmann
1e8d367537
amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:48 +00:00
Daniel Schürmann
febc29907c
amd: add and use ac_cu_info::has_gfx6_mrt_export_bug
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:47 +00:00
Daniel Schürmann
7b7bdb76ab
amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:47 +00:00
Daniel Schürmann
cfb745592d
amd: add ac_cu_info::has_mad32 flag and use in ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:47 +00:00
Daniel Schürmann
1e3db50170
aco: use additional flags from ac_cu_info
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
f791e46c47
aco: add ac_cu_info to aco_compiler_options
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
addd4ea59f
aco: pass aco_compiler_options to init_program()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
bf9bec07c2
aco/tests: don't pass CHIP_UNKNOWN to ACO
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:46 +00:00
Daniel Schürmann
0db1ae1f01
aco: disable XNACK on all GPUs
...
Affects code generation on GFX8 and GFX9 APUs where we misunderstood
the feature. XNACK replay is not being used with graphics APIs.
Totals from 41759 (65.90% of 63370) affected shaders: (Raven)
MaxWaves: 298672 -> 299000 (+0.11%)
Instrs: 19200726 -> 19138227 (-0.33%); split: -0.33%, +0.00%
CodeSize: 98501904 -> 98253196 (-0.25%); split: -0.26%, +0.00%
SGPRs: 3058544 -> 2831492 (-7.42%)
VGPRs: 1644896 -> 1643660 (-0.08%)
Latency: 193383803 -> 193224047 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 92741082 -> 92698975 (-0.05%); split: -0.05%, +0.00%
SClause: 678580 -> 630107 (-7.14%); split: -7.15%, +0.00%
Copies: 1863375 -> 1863406 (+0.00%); split: -0.04%, +0.04%
VALU: 13791245 -> 13791267 (+0.00%); split: -0.00%, +0.00%
SALU: 2066726 -> 2066741 (+0.00%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701 >
2025-12-22 07:34:43 +00:00
Georg Lehmann
0478021fdc
aco/optimizer: reassociate rcp(mul(a, const)) into rcp_omod(a)
...
Foz-DB Navi48:
Totals from 2484 (2.54% of 97637) affected shaders:
Instrs: 10368279 -> 10361892 (-0.06%); split: -0.06%, +0.00%
CodeSize: 55161104 -> 55150752 (-0.02%); split: -0.02%, +0.00%
SpillSGPRs: 14665 -> 14666 (+0.01%)
Latency: 87694014 -> 87689324 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 16595764 -> 16594448 (-0.01%); split: -0.01%, +0.00%
VClause: 209922 -> 209918 (-0.00%); split: -0.01%, +0.00%
SClause: 205195 -> 205251 (+0.03%); split: -0.01%, +0.04%
Copies: 843771 -> 843765 (-0.00%); split: -0.01%, +0.01%
Branches: 275985 -> 275962 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 170608 -> 170494 (-0.07%)
VALU: 5840893 -> 5838038 (-0.05%); split: -0.05%, +0.00%
SALU: 1481388 -> 1479037 (-0.16%); split: -0.16%, +0.00%
VOPD: 7496 -> 7485 (-0.15%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730 >
2025-12-17 08:41:32 +00:00
Georg Lehmann
a8f5ced670
aco/optimizer: reassociate mul(mul(a, const), b) into mul_omod(a, b)
...
Foz-DB Navi48:
Totals from 14608 (14.96% of 97637) affected shaders:
MaxWaves: 364201 -> 364421 (+0.06%)
Instrs: 28051720 -> 28022503 (-0.10%); split: -0.13%, +0.03%
CodeSize: 148938740 -> 148943480 (+0.00%); split: -0.04%, +0.04%
VGPRs: 994520 -> 994004 (-0.05%); split: -0.05%, +0.00%
SpillSGPRs: 45182 -> 45179 (-0.01%)
Latency: 187734461 -> 187725301 (-0.00%); split: -0.07%, +0.06%
InvThroughput: 33967002 -> 33949881 (-0.05%); split: -0.11%, +0.06%
VClause: 495237 -> 495207 (-0.01%); split: -0.03%, +0.02%
Copies: 2048324 -> 2047937 (-0.02%); split: -0.12%, +0.10%
Branches: 598445 -> 598431 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 877715 -> 877684 (-0.00%)
PreVGPRs: 778146 -> 776383 (-0.23%); split: -0.23%, +0.00%
VALU: 16413380 -> 16391508 (-0.13%); split: -0.15%, +0.01%
SALU: 3685279 -> 3677655 (-0.21%); split: -0.23%, +0.02%
VOPD: 26219 -> 25926 (-1.12%); split: +0.43%, -1.55%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38730 >
2025-12-17 08:41:31 +00:00
Alyssa Rosenzweig
079e9ae606
treewide: use BITSET_*_COUNT
...
Mix of Coccinelle patch, manual fix ups, sed, etc. Probably best to review the diff
as-if hand written:
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38955 >
2025-12-16 17:42:10 +00:00
Timur Kristóf
f001515c87
aco: Use only VGPR offset on buffer atomics on GFX6-7
...
SGPR offset is not included in the bounds check
according to the ISA documentation of GFX6-7 and
indeed it can trigger VM faults on OOB access.
Note that ACO already doesn't use the SGPR offset
on GFX6-7 for buffer loads and stores. This commit
just does the same for buffer atomics.
This commit mitigates a ton of VM faults that are exposed by:
24e75fea4b
Fossil DB stats on Hawaii (GFX7):
Totals from 148 (0.24% of 61818) affected shaders:
Instrs: 324004 -> 327352 (+1.03%)
CodeSize: 1556468 -> 1514100 (-2.72%); split: -2.74%, +0.02%
Latency: 1271480 -> 1276894 (+0.43%)
InvThroughput: 396850 -> 397740 (+0.22%)
VClause: 6861 -> 6858 (-0.04%)
Copies: 34083 -> 37430 (+9.82%)
PreVGPRs: 5705 -> 5706 (+0.02%)
VALU: 147529 -> 150898 (+2.28%)
SALU: 98194 -> 98172 (-0.02%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38958 >
2025-12-15 21:03:19 +00:00
Georg Lehmann
a2b70ce4ec
aco/isel: remove uniform reduce/scan optimization
...
This is now done in NIR, with the exception of exclusive min/max/and/or scans.
But those are not really useful, and if we ever come across them we can
optimize them in NIR using write_invocation_amd.
No Foz-DB changes on Navi21.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38902 >
2025-12-15 12:22:32 +00:00
Georg Lehmann
17e597093d
radv: eliminate unused FS output channels
...
For formats that don't have all color channels, there is no reason to
output all of them.
Games often write to R only or RGB formats with non trivial remaining channels.
Foz-DB Navi21:
Totals from 10270 (10.55% of 97347) affected shaders:
MaxWaves: 249166 -> 250950 (+0.72%); split: +0.73%, -0.01%
Instrs: 8442016 -> 8354715 (-1.03%); split: -1.05%, +0.01%
CodeSize: 45939644 -> 45487156 (-0.98%); split: -1.01%, +0.02%
VGPRs: 472584 -> 463784 (-1.86%); split: -1.98%, +0.12%
SpillSGPRs: 1502 -> 1448 (-3.60%)
LDS: 6024192 -> 6011904 (-0.20%)
Inputs: 42463 -> 41773 (-1.62%)
Outputs: 24601 -> 23955 (-2.63%)
Latency: 78011745 -> 77653907 (-0.46%); split: -0.56%, +0.10%
InvThroughput: 19767826 -> 19274046 (-2.50%); split: -2.53%, +0.03%
VClause: 177891 -> 176681 (-0.68%); split: -0.80%, +0.12%
SClause: 236784 -> 235324 (-0.62%); split: -0.72%, +0.10%
Copies: 621048 -> 616096 (-0.80%); split: -1.03%, +0.23%
Branches: 202608 -> 201811 (-0.39%); split: -0.44%, +0.05%
PreSGPRs: 441032 -> 437698 (-0.76%); split: -0.77%, +0.01%
PreVGPRs: 378067 -> 369564 (-2.25%); split: -2.26%, +0.01%
VALU: 5906415 -> 5833179 (-1.24%); split: -1.25%, +0.01%
SALU: 973428 -> 968088 (-0.55%); split: -0.61%, +0.06%
VMEM: 298277 -> 296504 (-0.59%); split: -0.61%, +0.01%
SMEM: 402244 -> 399612 (-0.65%); split: -0.71%, +0.06%
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853 >
2025-12-12 17:00:51 +00:00
Georg Lehmann
072815e5cb
aco/gfx6: move mrtz writemask workaround to assembler and handle all mrt
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853 >
2025-12-12 17:00:51 +00:00
Rhys Perry
156ae6195e
aco: print large p_parallelcopy using several lines
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695 >
2025-12-11 16:51:21 +00:00
Rhys Perry
21414e0898
aco/ra: add first loop header phi operand to temp_to_phi_resources
...
If the first operand is a CSSA copy, we might want to add this to
temp_to_phi_resources, so that we later mark it as the last-seen phi
operand.
fossil-db (navi31):
Totals from 284 (0.36% of 79825) affected shaders:
Instrs: 4160233 -> 4157517 (-0.07%); split: -0.09%, +0.03%
CodeSize: 21546420 -> 21532884 (-0.06%); split: -0.09%, +0.02%
VGPRs: 31404 -> 31416 (+0.04%)
Latency: 40266308 -> 40253731 (-0.03%); split: -0.06%, +0.02%
InvThroughput: 8140751 -> 8139724 (-0.01%); split: -0.05%, +0.04%
VClause: 99849 -> 99835 (-0.01%); split: -0.02%, +0.01%
Copies: 344512 -> 341732 (-0.81%); split: -1.08%, +0.28%
Branches: 113620 -> 113629 (+0.01%); split: -0.02%, +0.03%
VALU: 2502619 -> 2499836 (-0.11%); split: -0.15%, +0.04%
SALU: 499245 -> 499341 (+0.02%); split: -0.02%, +0.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695 >
2025-12-11 16:51:21 +00:00
Rhys Perry
43b3901362
aco/ra: copy vector_info to affinities
...
This eliminates some copies in BVH traversal loops.
fossil-db (navi31):
Totals from 200 (0.25% of 79825) affected shaders:
Instrs: 734931 -> 732521 (-0.33%); split: -0.34%, +0.01%
CodeSize: 3801080 -> 3791692 (-0.25%); split: -0.26%, +0.01%
VGPRs: 13704 -> 13728 (+0.18%); split: -0.44%, +0.61%
Latency: 6094605 -> 6082060 (-0.21%); split: -0.24%, +0.03%
InvThroughput: 1081982 -> 1080121 (-0.17%); split: -0.19%, +0.02%
VClause: 18835 -> 18837 (+0.01%); split: -0.01%, +0.02%
Copies: 64602 -> 62239 (-3.66%); split: -3.75%, +0.09%
Branches: 20111 -> 20112 (+0.00%); split: -0.01%, +0.02%
VALU: 438618 -> 436257 (-0.54%); split: -0.55%, +0.01%
SALU: 85092 -> 85085 (-0.01%); split: -0.01%, +0.00%
VOPD: 76 -> 74 (-2.63%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38695 >
2025-12-11 16:51:21 +00:00
Georg Lehmann
ef246aaf72
aco/isel: emit register copies for workgroup ids
...
This way, we don't overestimate SGPR pressure.
Foz-DB Navi48:
Totals from 1413 (1.45% of 97637) affected shaders:
Instrs: 3468375 -> 3468585 (+0.01%); split: -0.01%, +0.02%
CodeSize: 18643064 -> 18643520 (+0.00%); split: -0.01%, +0.01%
VGPRs: 71776 -> 71788 (+0.02%)
SpillSGPRs: 18575 -> 18561 (-0.08%)
Latency: 23207643 -> 23207998 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 8116806 -> 8116541 (-0.00%); split: -0.01%, +0.00%
VClause: 75250 -> 75252 (+0.00%); split: -0.00%, +0.00%
SClause: 65274 -> 65283 (+0.01%); split: -0.02%, +0.04%
Copies: 275750 -> 275942 (+0.07%); split: -0.03%, +0.10%
PreSGPRs: 70246 -> 69072 (-1.67%)
VALU: 1892111 -> 1892092 (-0.00%); split: -0.00%, +0.00%
SALU: 523460 -> 523648 (+0.04%); split: -0.02%, +0.05%
VOPD: 41097 -> 41102 (+0.01%)
Sadly the RA noise is slightly negative for instruction count.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830 >
2025-12-11 08:06:59 +00:00
Georg Lehmann
839a035564
aco/optimizer: propagate fixed regs to copy/extract/insert
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830 >
2025-12-11 08:06:58 +00:00
Georg Lehmann
d32dd5e1df
aco/optimizer: propagate fixed registers
...
Foz-DB Navi48:
Totals from 351 (0.36% of 97637) affected shaders:
Instrs: 3568192 -> 3567166 (-0.03%)
CodeSize: 18890368 -> 18886304 (-0.02%)
Latency: 17047945 -> 17048185 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3185739 -> 3185813 (+0.00%); split: -0.00%, +0.00%
SClause: 61544 -> 61536 (-0.01%)
Copies: 271592 -> 270845 (-0.28%)
PreSGPRs: 17186 -> 17094 (-0.54%)
PreVGPRs: 21897 -> 21901 (+0.02%)
VALU: 2003976 -> 2003980 (+0.00%)
SALU: 468403 -> 467664 (-0.16%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830 >
2025-12-11 08:06:58 +00:00
Georg Lehmann
b798ace443
aco/optimizer: fix skip_smem_offset_align with non temp register operands
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830 >
2025-12-11 08:06:58 +00:00
Georg Lehmann
911e1ce168
aco/isel: emit exec copy for ballot(true)
...
Once copy propagated in the optimizer, this will allow
using nir_opt_uniform_subgroup without too many regressions.
Foz-DB Navi48:
Totals from 405 (0.41% of 97637) affected shaders:
Instrs: 3796716 -> 3796894 (+0.00%); split: -0.00%, +0.00%
CodeSize: 20116136 -> 20116652 (+0.00%); split: -0.00%, +0.00%
Latency: 18326661 -> 18327114 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3353206 -> 3353268 (+0.00%); split: -0.00%, +0.00%
Copies: 292307 -> 293830 (+0.52%)
SALU: 507523 -> 507738 (+0.04%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830 >
2025-12-11 08:06:58 +00:00
Georg Lehmann
72e3071751
aco/optimizer: keep pass_flags valid for all instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830 >
2025-12-11 08:06:57 +00:00
Marek Olšák
308da55f1a
radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND
...
this is slightly nicer
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604 >
2025-12-10 19:16:46 +00:00
Georg Lehmann
bb58ba2075
aco/optimizer: propagate salu fabs
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Foz-DB Navi48:
Totals from 5156 (5.28% of 97637) affected shaders:
Instrs: 12713219 -> 12694317 (-0.15%); split: -0.15%, +0.00%
CodeSize: 67099236 -> 67037588 (-0.09%); split: -0.13%, +0.04%
VGPRs: 352620 -> 352608 (-0.00%)
SpillSGPRs: 22032 -> 22031 (-0.00%)
Latency: 68288972 -> 68271858 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 13639078 -> 13633997 (-0.04%); split: -0.04%, +0.00%
VClause: 235194 -> 235186 (-0.00%); split: -0.01%, +0.00%
SClause: 249057 -> 249012 (-0.02%); split: -0.03%, +0.01%
Copies: 963813 -> 960529 (-0.34%); split: -0.36%, +0.02%
Branches: 321041 -> 321039 (-0.00%)
PreSGPRs: 303392 -> 303295 (-0.03%); split: -0.03%, +0.00%
VALU: 7134563 -> 7134533 (-0.00%); split: -0.00%, +0.00%
SALU: 1913802 -> 1899948 (-0.72%); split: -0.72%, +0.00%
VOPD: 19914 -> 19885 (-0.15%); split: +0.01%, -0.15%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38723 >
2025-12-10 10:07:12 +00:00
Georg Lehmann
04037c7af3
aco/optimizer: propagate salu fneg
...
Foz-DB Navi48:
Totals from 23796 (24.37% of 97637) affected shaders:
MaxWaves: 638922 -> 638898 (-0.00%)
Instrs: 32968990 -> 32880147 (-0.27%); split: -0.28%, +0.01%
CodeSize: 174252352 -> 173922400 (-0.19%); split: -0.20%, +0.01%
VGPRs: 1396472 -> 1396592 (+0.01%)
SpillSGPRs: 63672 -> 63599 (-0.11%)
Latency: 201025393 -> 200966204 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 37429702 -> 37411026 (-0.05%); split: -0.06%, +0.01%
VClause: 534241 -> 534115 (-0.02%); split: -0.05%, +0.02%
SClause: 831765 -> 831559 (-0.02%); split: -0.07%, +0.05%
Copies: 2404134 -> 2400539 (-0.15%); split: -0.29%, +0.14%
Branches: 728518 -> 728503 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1337403 -> 1336846 (-0.04%); split: -0.04%, +0.00%
PreVGPRs: 1017490 -> 1017521 (+0.00%); split: -0.00%, +0.00%
VALU: 18319620 -> 18318960 (-0.00%); split: -0.01%, +0.00%
SALU: 5069557 -> 5001384 (-1.34%); split: -1.38%, +0.03%
VOPD: 80235 -> 80172 (-0.08%); split: +0.13%, -0.21%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38723 >
2025-12-10 10:07:12 +00:00
Georg Lehmann
8b1340a52c
aco/optimizer: validate uses
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38724 >
2025-12-10 09:40:13 +00:00
Georg Lehmann
ad3add311c
aco/optimizer: fix uses in to_uniform_bool_instr
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38724 >
2025-12-10 09:40:13 +00:00
Natalie Vock
6d799ac283
aco: Add pass for spilling call-related registers
...
This is a post-RA pass that tracks registers that are preserved by the
ABI, but clobbered by shader code. The pass inserts scratch spills and
reloads in appropriate locations to ensure the register values at the
end of the shader are the same as they were at the start.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38281 >
2025-12-08 19:12:55 +00:00
Natalie Vock
93a5919cee
aco/util: Add aco::unordered_set
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38281 >
2025-12-08 19:12:55 +00:00