Rhys Perry
463e3643f2
nir: add and use block predecessor helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40242>
2026-04-08 15:06:32 +00:00
Georg Lehmann
d1ed4e1774
aco/optimizer: do not try to create 3 byte constant operands
...
Operand::get_const will assert.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15239
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40828>
2026-04-08 09:17:26 +00:00
Georg Lehmann
792ce7ddf6
aco/isel: optimize 16/64bit non constant valu bit test
...
By using the constant path we can combine the v_and and the v_cmp.
Foz-DB GFX1201:
Totals from 2 (0.00% of 205032) affected shaders:
Instrs: 2833 -> 2831 (-0.07%)
Latency: 27385 -> 27367 (-0.07%)
InvThroughput: 1712 -> 1710 (-0.12%)
VALU: 1301 -> 1299 (-0.15%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40705>
2026-04-08 08:44:20 +00:00
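The percentages in the Foz-DB totals of the entry above are relative changes against the "before" value. A minimal sketch for re-checking them, assuming two-decimal rounding; the helper name `pct_change` is hypothetical and not part of any Mesa or Foz-DB tooling:

```python
def pct_change(before: int, after: int) -> str:
    """Format the relative change from `before` to `after` as a signed percentage."""
    return f"{(after - before) / before * 100:+.2f}%"


# Values copied from the GFX1201 totals of the bit-test commit above.
stats = {
    "Instrs": (2833, 2831),
    "Latency": (27385, 27367),
    "InvThroughput": (1712, 1710),
    "VALU": (1301, 1299),
}

for name, (before, after) in stats.items():
    print(f"{name}: {before} -> {after} ({pct_change(before, after)})")
```

The same helper applies to any other stats line in this log, for example the LDS line `2048 -> 45056`, which it formats as `+2100.00%`.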
Natalie Vock
fded5e321d
aco: Nuke ACO-side prolog selection
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>
2026-04-07 11:28:05 +00:00
Natalie Vock
b53dc3f052
aco/lower_to_hw_instr: Run p_init_scratch if the program has a call
...
Callees may use scratch even if the caller doesn't.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>
2026-04-07 11:28:05 +00:00
Natalie Vock
378c9536de
aco/isel: Fix stack_ptr synthesis
...
info.stack_ptr.is_reg is always true. We have a stack pointer to use
if and only if the program is a callee.
Also, apply_scratch_offset needs to be true in a few more places.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>
2026-04-07 11:28:05 +00:00
Natalie Vock
31e08322d7
aco/spill_preserved: Only compute preserved registers if in a callee
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40008>
2026-04-07 11:28:05 +00:00
Georg Lehmann
5453419086
aco/isel: use s_bitcmp1 for 1bit ubfe
...
Avoid the s_pack at the cost of having to use scc.
Foz-DB GFX1201:
Totals from 1514 (0.74% of 205032) affected shaders:
Instrs: 3443431 -> 3434096 (-0.27%); split: -0.27%, +0.00%
CodeSize: 19062100 -> 19024320 (-0.20%); split: -0.20%, +0.00%
Latency: 22343329 -> 22342802 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 4471707 -> 4471632 (-0.00%); split: -0.00%, +0.00%
Copies: 280191 -> 279645 (-0.19%); split: -0.21%, +0.01%
PreSGPRs: 71333 -> 71327 (-0.01%)
VALU: 1598064 -> 1598058 (-0.00%); split: -0.00%, +0.00%
SALU: 691458 -> 686437 (-0.73%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40707>
2026-03-31 10:42:33 +00:00
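For the `s_bitcmp1` entry above: a 1-bit unsigned bitfield extract yields the selected bit, which is the same information a bit-compare leaves in SCC. A minimal sketch of that equivalence on 32-bit values. The function names are illustrative, and the `s_bitcmp1_b32` model (SCC is set when the selected bit is 1) is an assumption based on the instruction's usual description, not ACO code:

```python
MASK32 = 0xFFFFFFFF


def ubfe(value: int, offset: int, bits: int) -> int:
    """Unsigned bitfield extract: `bits` bits of a 32-bit value starting at `offset`."""
    return ((value & MASK32) >> offset) & ((1 << bits) - 1)


def s_bitcmp1_b32(value: int, bit: int) -> bool:
    """Assumed model: the result (SCC) is true when bit `bit & 31` of `value` is 1."""
    return (((value & MASK32) >> (bit & 31)) & 1) == 1


# The 1-bit extract and the bit-compare agree for every bit position.
for value in (0x0, 0x1, 0x80000000, 0xDEADBEEF, 0xFFFFFFFF):
    for offset in range(32):
        assert ubfe(value, offset, 1) == int(s_bitcmp1_b32(value, offset))
```

The trade-off named in the commit message still applies: the result lands in SCC rather than in a general SGPR.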
Karol Herbst
5bb3c9f69c
nir: rename fsin_amd and fcos_amd to a more generic name
...
Nvidia implements both the same way as AMD does, so it makes sense to
allow for code sharing here.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40541>
2026-03-31 01:47:29 +02:00
Georg Lehmann
ae2968c4ec
aco: allow spilling to LDS in RT shaders without stack pointer
...
No Foz-DB changes because most RT shaders use function calls now.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36367>
2026-03-27 13:08:44 +00:00
Georg Lehmann
133ef9f94b
aco: spill VGPRs to LDS if it doesn't further limit occupancy
...
Only use LDS for VGPR spilling if we can use addtid access, to avoid having a VGPR addr.
Limit to single wave workgroups, to avoid needing the wave_id for the offset.
If we have a scratch stack pointer, don't use LDS at all.
Limit LDS spilling to not reduce occupancy further.
Note that in theory, this can still limit occupancy of other shaders running
on the CU at the same time, but that's unlikely and impossible to know at this point.
Removes all scratch usage in emulated FSR4 and parallel_rdp.
Besides that, only a single GoW shader is affected.
Foz-DB Navi31:
Totals from 9 (0.01% of 114641) affected shaders:
Instrs: 68863 -> 68830 (-0.05%); split: -0.07%, +0.02%
CodeSize: 416108 -> 416000 (-0.03%); split: -0.05%, +0.02%
LDS: 2048 -> 45056 (+2100.00%)
Scratch: 261888 -> 220672 (-15.74%)
Latency: 727951 -> 657155 (-9.73%); split: -9.73%, +0.00%
InvThroughput: 418644 -> 383269 (-8.45%)
VClause: 1506 -> 1200 (-20.32%)
Copies: 10651 -> 10624 (-0.25%)
VALU: 48700 -> 48684 (-0.03%)
SALU: 6200 -> 6199 (-0.02%); split: -0.05%, +0.03%
VMEM: 4139 -> 3589 (-13.29%)
VOPD: 580 -> 574 (-1.03%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36367>
2026-03-27 13:08:44 +00:00
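The conditions listed in the LDS spilling entry above can be read as a single predicate. A rough sketch under stated assumptions: `SpillContext` and all of its field names are hypothetical stand-ins for whatever state the spiller actually consults, and the occupancy check is simplified to a comparison of two precomputed wave counts:

```python
from dataclasses import dataclass


@dataclass
class SpillContext:
    """Hypothetical summary of the facts the commit message mentions."""
    has_addtid_access: bool          # addtid access avoids a VGPR address
    waves_per_workgroup: int         # single-wave workgroups need no wave_id offset
    has_scratch_stack_pointer: bool  # with a stack pointer, LDS is not used at all
    occupancy_without_lds_spill: int
    occupancy_with_lds_spill: int


def may_spill_vgprs_to_lds(ctx: SpillContext) -> bool:
    """True only if every condition from the commit message holds."""
    return (
        ctx.has_addtid_access
        and ctx.waves_per_workgroup == 1
        and not ctx.has_scratch_stack_pointer
        and ctx.occupancy_with_lds_spill >= ctx.occupancy_without_lds_spill
    )


example = SpillContext(True, 1, False, 8, 8)
print(may_spill_vgprs_to_lds(example))
```

As the commit message notes, a check like this only sees the shader being compiled; it cannot account for other shaders sharing the CU.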
Georg Lehmann
17a9ee7152
aco/optimizer: apply dpp to v_dot before RA for gfx10.3
...
This is a bit unusual, as we otherwise only use the VOP2 codesize
optimization opcodes in the register allocator.
But unless we change the scheduler to not split v_mov_b32_dpp and
v_dot, we have no other choice.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40510>
2026-03-24 09:05:40 +00:00
Emre Cecanpunar
c60e5df798
aco: drop optimizer peephole TODO comment
...
The remaining items are either handled elsewhere or unlikely to be
implemented in the optimizer.
Signed-off-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40497>
2026-03-23 11:03:59 +00:00
Georg Lehmann
559a35dcb3
aco: skip fract for sin/cos on gfx6-8 if the src is already in range
...
Foz-DB Polaris10:
Totals from 1301 (1.86% of 69950) affected shaders:
Instrs: 1447217 -> 1445610 (-0.11%); split: -0.11%, +0.00%
CodeSize: 7775988 -> 7769588 (-0.08%); split: -0.08%, +0.00%
SGPRs: 101712 -> 101776 (+0.06%)
SpillSGPRs: 931 -> 927 (-0.43%)
Latency: 16119433 -> 16115293 (-0.03%); split: -0.03%, +0.01%
InvThroughput: 9605952 -> 9577042 (-0.30%); split: -0.31%, +0.01%
VClause: 24591 -> 24593 (+0.01%); split: -0.01%, +0.02%
SClause: 29656 -> 29655 (-0.00%)
Copies: 133968 -> 134001 (+0.02%); split: -0.01%, +0.03%
VALU: 1157855 -> 1156235 (-0.14%)
SALU: 124626 -> 124639 (+0.01%); split: -0.00%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40545>
2026-03-23 09:27:32 +00:00
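For the sin/cos entry above: reducing the argument with fract does not change the result of a periodic function, so it can be dropped when the source is already known to lie in the range the hardware accepts. A small sketch of that reasoning. It assumes the argument is expressed in turns (already multiplied by 1/(2π)) and that the valid input range is [-256, 256]; both are assumptions made for illustration, and `needs_fract` is a hypothetical helper rather than ACO's actual range analysis:

```python
import math

# Assumed valid input range of the hardware sin/cos, in turns.
VALID_MIN, VALID_MAX = -256.0, 256.0


def fract(x: float) -> float:
    """Fractional part, x - floor(x), always in [0, 1)."""
    return x - math.floor(x)


def sin_turns(t: float) -> float:
    """Sine of an angle given in turns (1.0 == 2*pi radians)."""
    return math.sin(2.0 * math.pi * t)


def needs_fract(lo: float, hi: float) -> bool:
    """Range reduction is only needed if the source range can leave the valid range."""
    return lo < VALID_MIN or hi > VALID_MAX


# Range reduction preserves the result, so skipping it for in-range inputs is safe.
for t in (0.25, -3.75, 1000.25):
    assert abs(sin_turns(fract(t)) - sin_turns(t)) < 1e-9

print(needs_fract(0.0, 1.0), needs_fract(-1000.0, 1000.0))
```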
Marek Olšák
2283244975
nir: change export_amd intrinsics to use target instead of base
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Marek Olšák
b75a3112fd
nir: change export_amd intrinsics to use enabled_channels instead of write_mask
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Daniel Schürmann
4b238690cb
aco/tests: add and lower loop continue constructs in all tests which use continues
...
We are going to disallow continue statements without
loop continue constructs.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Rhys Perry
e2ebcba11b
aco/tests: fix assembler/isel tests with LLVM 23
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40513>
2026-03-20 10:24:06 +00:00
Rhys Perry
0826685f1b
aco/tests: fix assembler tests with LLVM 22
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40513>
2026-03-20 10:24:06 +00:00
Samuel Pitoiset
639207701d
aco,radv,radeonsi: remove debug report support in ACO
...
This doesn't seem very useful since ACO will abort and print the
error messages to stderr.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40379>
2026-03-16 11:55:45 +00:00
Georg Lehmann
d7348ea501
aco/ra: don't tie definition when the operand is in a preserved reg
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
2026-03-10 14:21:56 +00:00
Georg Lehmann
444eb3dce5
aco/ra: try to allocate registers for dot2 to allow VOPD
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
2026-03-10 14:21:56 +00:00
Georg Lehmann
788aafba2a
aco/sched_vopd: create dot2acc from VOP3P dot2
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
2026-03-10 14:21:56 +00:00
Georg Lehmann
47599b2c38
aco/opt_postRA: remove try_convert_fma_to_vop2
...
This is now done directly in the VOPD scheduler.
Foz-DB GFX1201:
Totals from 600 (0.52% of 114655) affected shaders:
no stats changed
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
2026-03-10 14:21:56 +00:00
Georg Lehmann
6cef434478
aco/sched_vopd: convert fma with inline constants to fmamk/fmaak
...
This optimization was previously done in the post-RA optimizer,
but it is more fitting for the vopd scheduler.
Doing it here also has the benefit that we don't unnecessarily use
the constant bus when VOPD can't be used.
No Foz-DB changes on GFX12 until the next commit.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
2026-03-10 14:21:56 +00:00
Georg Lehmann
1ae9931145
aco/sched_vopd: make VOPDInfo more flexible by adding a swizzle
...
No Foz-DB changes on GFX1201.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225>
2026-03-10 14:21:55 +00:00
Georg Lehmann
08cac48170
aco/isel: skip min/max for SALU fsat if possible
...
Foz-DB Navi48:
Totals from 789 (0.95% of 82636) affected shaders:
Instrs: 4144156 -> 4141345 (-0.07%); split: -0.07%, +0.00%
CodeSize: 23345212 -> 23333960 (-0.05%); split: -0.05%, +0.00%
Latency: 22988205 -> 22986666 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 4378321 -> 4377874 (-0.01%); split: -0.01%, +0.00%
Copies: 302311 -> 302313 (+0.00%); split: -0.00%, +0.00%
SALU: 647622 -> 645901 (-0.27%); split: -0.27%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>
2026-03-07 05:01:44 +00:00
Rhys Perry
82420ebc2c
aco: fix PS epilog dual-source blending with only one color output
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005>
2026-03-05 09:38:23 +00:00
Marek Olšák
fae7aef5ca
ac: tidy up ac_hw_cache_flags
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022>
2026-03-04 21:14:56 +00:00
Rhys Perry
5c3b5688a1
amd: rename ac_cu_info to ac_compiler_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:12 +00:00
Rhys Perry
8801ca188d
ac/nir: don't pass radeon_info to ac_nir_set_options
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>
2026-03-03 08:50:10 +00:00
Rhys Perry
17b18496f6
aco: perform dce for blocks skipped for process_block()
...
We might need to DCE users of dead instructions removed by
process_block().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40091>
2026-03-02 13:38:16 +00:00
Marek Olšák
f22f117d1a
amd: add meson variable idep_amd_generated_headers for all generated headers
...
Group all generated headers under the same variable.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40084>
2026-02-28 05:23:59 +00:00
Rhys Perry
43603f9b1d
amd: add ac_cu_info::local_invocation_ids_packed
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992>
2026-02-26 15:49:15 +00:00
Rhys Perry
29f8237d30
amd: move various flags to ac_cu_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992>
2026-02-26 15:49:14 +00:00
Georg Lehmann
8f4de30d05
aco/insert_fp_mode: don't skip setting round for fract
...
fract(-FLT_MIN) is < 1.0 with rtz but 1.0 with rtne.
Fixes: 7212a75c5e ("aco/insert_fp_mode: exclude some instructions that will never round")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40078>
2026-02-26 00:20:02 +00:00
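The rounding dependence described in the fract entry above can be reproduced with exact arithmetic. The exact value of fract(-FLT_MIN) is 1 - 2^-126, which is not representable in float32, so the result depends on the rounding mode. A minimal sketch using Python's `fractions` module; the helper `round_f32_near_one` is hypothetical and only models the float32 grid just below 1.0 (spacing 2^-24), not a full float32 implementation:

```python
import math
from fractions import Fraction

FLT_MIN = Fraction(1, 2**126)       # smallest normal float32
ULP_BELOW_ONE = Fraction(1, 2**24)  # float32 spacing in [0.5, 1)


def fract_exact(x: Fraction) -> Fraction:
    """Exact fract: x - floor(x), before any rounding to float32."""
    return x - math.floor(x)


def round_f32_near_one(exact: Fraction, mode: str) -> Fraction:
    """Round an exact value in [0.5, 1] to the float32 grid with the given mode."""
    scaled = exact / ULP_BELOW_ONE
    if mode == "rtz":
        steps = math.floor(scaled)  # toward zero (value is positive)
    elif mode == "rtne":
        steps = round(scaled)       # nearest, ties to even
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return steps * ULP_BELOW_ONE


exact = fract_exact(-FLT_MIN)       # 1 - 2**-126
print(round_f32_near_one(exact, "rtz") < 1)    # rtz stays below 1.0
print(round_f32_near_one(exact, "rtne") == 1)  # rtne rounds up to exactly 1.0
```

Because the two modes give different results, fract cannot be treated as an instruction that never rounds, which is the point of the fix.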
Rhys Perry
613b4fe407
aco: resolve hazards before calls
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825>
2026-02-24 13:20:55 +00:00
Rhys Perry
dfda890ae8
aco: reset all vgpr_used_by_vmem_ in resolve_all_gfx11
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825>
2026-02-24 13:20:55 +00:00
Rhys Perry
72923ad2f0
aco: fix VALUReadSGPRHazard with s_call_b64/s_swappc_b64
...
This probably doesn't do anything because sgpr_read_by_valu are all set
already for raytracing shaders.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825>
2026-02-24 13:20:54 +00:00
Georg Lehmann
dd067088ef
aco: allow dpp for fp8/bf8 dot4
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003>
2026-02-24 08:55:53 +00:00
Georg Lehmann
a033cd95a4
aco: allow modifiers for fp16 dot
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003>
2026-02-24 08:55:53 +00:00
Georg Lehmann
3238e64d3c
aco/ra: create v_dot2c_f32_f16
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003>
2026-02-24 08:55:53 +00:00
Georg Lehmann
237b8ca205
aco: mixed float dot product opcodes
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003>
2026-02-24 08:55:52 +00:00
Rhys Perry
af27fb23f3
aco/ra: don't modify parallelcopies if get_reg_for_affinity fails
...
Fixes baldurs_gate_3/60c8b7ff623fbb18 with vega10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 310f588f92 ("aco/ra: move variables from affinity register to avoid waitcnt")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39986>
2026-02-20 08:40:55 +00:00
Rhys Perry
75722da909
aco: fix gfx6-8 store_scratch() with function calls
...
Might happen with radv_emulate_rt=true.
Fixes the_great_circle/a6079328b8df7712 with polaris10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: e006f68b11 ("aco/isel: Don't add scratch offset as gfx8- soffset if no offsets exist")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39986>
2026-02-20 08:40:55 +00:00
Rhys Perry
f81aaee7f1
aco/ra: create vectors for affinities of split definitions
...
For example:
a = ...
b = ...
if {
c, d = split
}
phi(a, c)
phi(b, d)
This patch will allocate 'a' and 'b' as a vector.
fossil-db (navi31):
Totals from 2556 (3.20% of 79825) affected shaders:
MaxWaves: 59957 -> 59955 (-0.00%)
Instrs: 9170941 -> 9154954 (-0.17%); split: -0.19%, +0.02%
CodeSize: 48245956 -> 48182620 (-0.13%); split: -0.15%, +0.02%
VGPRs: 189372 -> 189900 (+0.28%); split: -0.04%, +0.32%
Latency: 85469322 -> 85262360 (-0.24%); split: -0.32%, +0.08%
InvThroughput: 14515911 -> 14486970 (-0.20%); split: -0.27%, +0.07%
VClause: 197980 -> 197959 (-0.01%); split: -0.02%, +0.01%
Copies: 787838 -> 774288 (-1.72%); split: -1.91%, +0.19%
Branches: 271810 -> 271799 (-0.00%); split: -0.01%, +0.01%
VALU: 5331813 -> 5318566 (-0.25%); split: -0.28%, +0.03%
SALU: 1133559 -> 1133054 (-0.04%); split: -0.05%, +0.01%
VOPD: 2435 -> 2418 (-0.70%); split: +0.12%, -0.82%
fossil-db (navi21):
Totals from 37513 (46.99% of 79825) affected shaders:
Instrs: 26734825 -> 26681225 (-0.20%); split: -0.23%, +0.03%
CodeSize: 141353284 -> 141144360 (-0.15%); split: -0.17%, +0.02%
VGPRs: 1556760 -> 1556384 (-0.02%); split: -0.21%, +0.18%
Latency: 146201548 -> 146156473 (-0.03%); split: -0.20%, +0.17%
InvThroughput: 33921803 -> 33867398 (-0.16%); split: -0.23%, +0.07%
VClause: 502263 -> 502209 (-0.01%); split: -0.27%, +0.26%
SClause: 593142 -> 593155 (+0.00%); split: -0.00%, +0.00%
Copies: 2600995 -> 2551257 (-1.91%); split: -2.16%, +0.25%
Branches: 857910 -> 857787 (-0.01%); split: -0.03%, +0.02%
VALU: 15674532 -> 15625013 (-0.32%); split: -0.35%, +0.04%
SALU: 4635548 -> 4634680 (-0.02%); split: -0.04%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00
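The example in the split-definition entry above can be restated as a small lookup: the definitions of a split occupy adjacent registers, so the phi partners of those definitions are best placed adjacently as well. A toy sketch of that idea only; `vector_candidates` and its tuple-based inputs are hypothetical and do not reflect the register allocator's real data structures:

```python
def vector_candidates(splits, phis):
    """Return tuples of temporaries that would benefit from adjacent registers.

    splits: tuples of split definitions, in register order, e.g. ("c", "d")
    phis:   tuples of phi operands, e.g. ("a", "c")
    """
    # Map every temporary to the other operands of the phis it appears in.
    partners = {}
    for operands in phis:
        for op in operands:
            partners.setdefault(op, []).extend(o for o in operands if o != op)

    candidates = []
    for defs in splits:
        vector = []
        for d in defs:
            others = partners.get(d, [])
            if len(others) != 1:  # keep the toy model simple: exactly one partner
                break
            vector.append(others[0])
        else:
            candidates.append(tuple(vector))
    return candidates


# The situation from the commit message: c, d = split; phi(a, c); phi(b, d)
print(vector_candidates([("c", "d")], [("a", "c"), ("b", "d")]))
```

For the commit's example this yields the pair `('a', 'b')`, matching the statement that 'a' and 'b' are allocated as a vector.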
Rhys Perry
86f0195f5c
aco/ra: prefer phi operands which don't create waitcnt
...
fossil-db (navi31):
Totals from 89 (0.11% of 79825) affected shaders:
Instrs: 343443 -> 343384 (-0.02%); split: -0.10%, +0.09%
CodeSize: 1792948 -> 1792668 (-0.02%); split: -0.10%, +0.08%
Latency: 2656294 -> 2656490 (+0.01%); split: -0.02%, +0.02%
InvThroughput: 517696 -> 517691 (-0.00%); split: -0.01%, +0.01%
SClause: 9213 -> 9215 (+0.02%); split: -0.01%, +0.03%
Copies: 39138 -> 39089 (-0.13%); split: -0.84%, +0.71%
Branches: 10863 -> 10872 (+0.08%); split: -0.05%, +0.13%
SALU: 49185 -> 49136 (-0.10%); split: -0.67%, +0.57%
fossil-db (navi21):
Totals from 34490 (43.21% of 79825) affected shaders:
Instrs: 23005853 -> 22956529 (-0.21%); split: -0.25%, +0.04%
CodeSize: 120532004 -> 120341412 (-0.16%); split: -0.19%, +0.03%
VGPRs: 1396928 -> 1397520 (+0.04%); split: -0.07%, +0.11%
Latency: 108740068 -> 108499644 (-0.22%); split: -0.53%, +0.30%
InvThroughput: 25286526 -> 25358695 (+0.29%); split: -0.11%, +0.39%
VClause: 421179 -> 421132 (-0.01%); split: -0.29%, +0.27%
SClause: 446414 -> 446423 (+0.00%); split: -0.00%, +0.00%
Copies: 2242236 -> 2243168 (+0.04%); split: -0.42%, +0.46%
Branches: 724556 -> 724903 (+0.05%); split: -0.02%, +0.07%
VALU: 13321078 -> 13321940 (+0.01%); split: -0.07%, +0.08%
SALU: 4069929 -> 4070580 (+0.02%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00
Rhys Perry
310f588f92
aco/ra: move variables from affinity register to avoid waitcnt
...
If we don't use this affinity register, we're likely to end up moving the
temporary later. If it's a memory instruction destination, that's probably
more expensive than just copying the blocking variables.
fossil-db (navi31):
Totals from 504 (0.63% of 79825) affected shaders:
Instrs: 4108284 -> 4109026 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21226764 -> 21229764 (+0.01%); split: -0.01%, +0.02%
Latency: 26931635 -> 26806989 (-0.46%); split: -0.47%, +0.00%
InvThroughput: 8443520 -> 8439235 (-0.05%); split: -0.06%, +0.01%
VClause: 99209 -> 99314 (+0.11%); split: -0.00%, +0.11%
SClause: 85089 -> 85085 (-0.00%)
Copies: 340323 -> 340993 (+0.20%); split: -0.06%, +0.26%
Branches: 117225 -> 117209 (-0.01%); split: -0.02%, +0.00%
VALU: 2421859 -> 2422529 (+0.03%); split: -0.01%, +0.04%
SALU: 503465 -> 503470 (+0.00%); split: -0.00%, +0.00%
fossil-db (navi21):
Totals from 582 (0.73% of 79825) affected shaders:
Instrs: 3714908 -> 3714990 (+0.00%); split: -0.02%, +0.02%
CodeSize: 19977880 -> 19973076 (-0.02%); split: -0.04%, +0.01%
VGPRs: 40480 -> 40496 (+0.04%)
Latency: 26028895 -> 25772711 (-0.98%); split: -0.99%, +0.00%
InvThroughput: 9827389 -> 9818194 (-0.09%); split: -0.10%, +0.01%
VClause: 103702 -> 103815 (+0.11%); split: -0.02%, +0.13%
SClause: 90861 -> 90857 (-0.00%)
Copies: 335276 -> 335992 (+0.21%); split: -0.09%, +0.30%
Branches: 123912 -> 123897 (-0.01%); split: -0.02%, +0.00%
VALU: 2466032 -> 2466748 (+0.03%); split: -0.01%, +0.04%
SALU: 533658 -> 533667 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00
Rhys Perry
681ec4cba7
aco/ra: track cost of moving variables
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00
Rhys Perry
69bc4efa37
aco/sched_ilp: improve scheduling with VMEM/DS->VALU WaW
...
This improves scheduling with one side of a divergent branch writing to a
VGPR using VMEM/DS, and the other writing using VALU. At the merge block,
it will properly consider that the VGPR was written by a VMEM/DS.
fossil-db (navi31):
Totals from 1224 (1.53% of 79825) affected shaders:
Instrs: 5264815 -> 5267604 (+0.05%); split: -0.00%, +0.06%
CodeSize: 27406404 -> 27422132 (+0.06%); split: -0.00%, +0.06%
Latency: 48325204 -> 48293975 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8923880 -> 8919191 (-0.05%); split: -0.07%, +0.02%
fossil-db (navi21):
Totals from 1267 (1.59% of 79825) affected shaders:
Instrs: 4628583 -> 4629190 (+0.01%); split: -0.00%, +0.01%
CodeSize: 24974672 -> 24977188 (+0.01%); split: -0.00%, +0.01%
Latency: 45080476 -> 44998120 (-0.18%); split: -0.20%, +0.02%
InvThroughput: 12288202 -> 12269634 (-0.15%); split: -0.16%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00