Marek Olšák
b75a3112fd
nir: change export_amd intrinsics to use enabled_channels instead of write_mask
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415 >
2026-03-23 06:10:49 +00:00
Daniel Schürmann
4b238690cb
aco/tests: add and lower loop continue constructs in all tests which use continues
...
We are going to disallow continue statements without
loop continue constructs.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Rhys Perry
e2ebcba11b
aco/tests: fix assembler/isel tests with LLVM 23
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40513 >
2026-03-20 10:24:06 +00:00
Rhys Perry
0826685f1b
aco/tests: fix assembler tests with LLVM 22
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40513 >
2026-03-20 10:24:06 +00:00
Samuel Pitoiset
639207701d
aco,radv,radeonsi: remove debug report support in ACO
...
This doesn't seem very useful since ACO will abort and print the
error messages to stderr.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40379 >
2026-03-16 11:55:45 +00:00
Georg Lehmann
d7348ea501
aco/ra: don't tie definition when the operand is in a preserved reg
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
Georg Lehmann
444eb3dce5
aco/ra: try to allocate registers for dot2 to allow VOPD
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
Georg Lehmann
788aafba2a
aco/sched_vopd: create dot2acc from VOP3P dot2
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
Georg Lehmann
47599b2c38
aco/opt_postRA: remove try_convert_fma_to_vop2
...
This is now done directly in the VOPD scheduler.
Foz-DB GFX1201:
Totals from 600 (0.52% of 114655) affected shaders:
no stats changed
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
Georg Lehmann
6cef434478
aco/sched_vopd: convert fma with inline constants to fmamk/fmaak
...
This optimization was previously done in the post-RA optimizer,
but it is more fitting for the vopd scheduler.
Doing it here also has the benefit that we don't unnecessarily use
the constant bus when VOPD can't be used.
No Foz-DB changes on GFX12 until the next commit.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:56 +00:00
Georg Lehmann
1ae9931145
aco/scheld_vopd: make VOPDInfo more flexible by adding a swizzle
...
No Foz-DB changes on GFX1201.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40225 >
2026-03-10 14:21:55 +00:00
Georg Lehmann
08cac48170
aco/isel: skip min/max for SALU fsat if possible
...
Foz-DB Navi48:
Totals from 789 (0.95% of 82636) affected shaders:
Instrs: 4144156 -> 4141345 (-0.07%); split: -0.07%, +0.00%
CodeSize: 23345212 -> 23333960 (-0.05%); split: -0.05%, +0.00%
Latency: 22988205 -> 22986666 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 4378321 -> 4377874 (-0.01%); split: -0.01%, +0.00%
Copies: 302311 -> 302313 (+0.00%); split: -0.00%, +0.00%
SALU: 647622 -> 645901 (-0.27%); split: -0.27%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987 >
2026-03-07 05:01:44 +00:00
Rhys Perry
82420ebc2c
aco: fix PS epilog dual-source blending with only one color output
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40005 >
2026-03-05 09:38:23 +00:00
Marek Olšák
fae7aef5ca
ac: tidy up ac_hw_cache_flags
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40022 >
2026-03-04 21:14:56 +00:00
Rhys Perry
5c3b5688a1
amd: rename ac_cu_info to ac_compiler_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042 >
2026-03-03 08:50:12 +00:00
Rhys Perry
8801ca188d
ac/nir: don't pass radeon_info to ac_nir_set_options
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042 >
2026-03-03 08:50:10 +00:00
Rhys Perry
17b18496f6
aco: perform dce for blocks skipped for process_block()
...
We might need to DCE users of dead instructions removed by
process_block().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40091 >
2026-03-02 13:38:16 +00:00
Marek Olšák
f22f117d1a
amd: add meson variable idep_amd_generated_headers for all generated headers
...
group all generated header under the same variable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40084 >
2026-02-28 05:23:59 +00:00
Rhys Perry
43603f9b1d
amd: add ac_cu_info::local_invocation_ids_packed
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992 >
2026-02-26 15:49:15 +00:00
Rhys Perry
29f8237d30
amd: move various flags to ac_cu_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992 >
2026-02-26 15:49:14 +00:00
Georg Lehmann
8f4de30d05
aco/insert_fp_mode: don't skip setting round for fract
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
fract(-FLT_MIN) is < 1.0 with rtz but 1.0 with rtne.
Fixes: 7212a75c5e ("aco/insert_fp_mode: exclude some instructions that will never round")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40078 >
2026-02-26 00:20:02 +00:00
Rhys Perry
613b4fe407
aco: resolve hazards before calls
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825 >
2026-02-24 13:20:55 +00:00
Rhys Perry
dfda890ae8
aco: reset all vgpr_used_by_vmem_ in resolve_all_gfx11
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825 >
2026-02-24 13:20:55 +00:00
Rhys Perry
72923ad2f0
aco: fix VALUReadSGPRHazard with s_call_b64/s_swappc_b64
...
This probably doesn't do anything because sgpr_read_by_valu are all set
already for raytracing shaders.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39825 >
2026-02-24 13:20:54 +00:00
Georg Lehmann
dd067088ef
aco: allow dpp for fp8/bf8 dot4
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003 >
2026-02-24 08:55:53 +00:00
Georg Lehmann
a033cd95a4
aco: allow modifiers for fp16 dot
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003 >
2026-02-24 08:55:53 +00:00
Georg Lehmann
3238e64d3c
aco/ra: create v_dot2c_f32_f16
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003 >
2026-02-24 08:55:53 +00:00
Georg Lehmann
237b8ca205
aco: mixed float dot product opcodes
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40003 >
2026-02-24 08:55:52 +00:00
Rhys Perry
af27fb23f3
aco/ra: don't modify parallelcopies if get_reg_for_affinity fails
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes baldurs_gate_3/60c8b7ff623fbb18 with vega10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 310f588f92 ("aco/ra: move variables from affinity register to avoid waitcnt")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39986 >
2026-02-20 08:40:55 +00:00
Rhys Perry
75722da909
aco: fix gfx6-8 store_scratch() with function calls
...
Might happen with radv_emulate_rt=true.
Fixes the_great_circle/a6079328b8df7712 with polaris10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: e006f68b11 ("aco/isel: Don't add scratch offset as gfx8- soffset if no offsets exist")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39986 >
2026-02-20 08:40:55 +00:00
Rhys Perry
f81aaee7f1
aco/ra: create vectors for affinities of split definitions
...
For example:
a = ...
b = ...
if {
c, d = split
}
phi(a, c)
phi(b, d)
This patch will allocate 'a' and 'b' as a vector.
fossil-db (navi31):
Totals from 2556 (3.20% of 79825) affected shaders:
MaxWaves: 59957 -> 59955 (-0.00%)
Instrs: 9170941 -> 9154954 (-0.17%); split: -0.19%, +0.02%
CodeSize: 48245956 -> 48182620 (-0.13%); split: -0.15%, +0.02%
VGPRs: 189372 -> 189900 (+0.28%); split: -0.04%, +0.32%
Latency: 85469322 -> 85262360 (-0.24%); split: -0.32%, +0.08%
InvThroughput: 14515911 -> 14486970 (-0.20%); split: -0.27%, +0.07%
VClause: 197980 -> 197959 (-0.01%); split: -0.02%, +0.01%
Copies: 787838 -> 774288 (-1.72%); split: -1.91%, +0.19%
Branches: 271810 -> 271799 (-0.00%); split: -0.01%, +0.01%
VALU: 5331813 -> 5318566 (-0.25%); split: -0.28%, +0.03%
SALU: 1133559 -> 1133054 (-0.04%); split: -0.05%, +0.01%
VOPD: 2435 -> 2418 (-0.70%); split: +0.12%, -0.82%
fossil-db (navi21):
Totals from 37513 (46.99% of 79825) affected shaders:
Instrs: 26734825 -> 26681225 (-0.20%); split: -0.23%, +0.03%
CodeSize: 141353284 -> 141144360 (-0.15%); split: -0.17%, +0.02%
VGPRs: 1556760 -> 1556384 (-0.02%); split: -0.21%, +0.18%
Latency: 146201548 -> 146156473 (-0.03%); split: -0.20%, +0.17%
InvThroughput: 33921803 -> 33867398 (-0.16%); split: -0.23%, +0.07%
VClause: 502263 -> 502209 (-0.01%); split: -0.27%, +0.26%
SClause: 593142 -> 593155 (+0.00%); split: -0.00%, +0.00%
Copies: 2600995 -> 2551257 (-1.91%); split: -2.16%, +0.25%
Branches: 857910 -> 857787 (-0.01%); split: -0.03%, +0.02%
VALU: 15674532 -> 15625013 (-0.32%); split: -0.35%, +0.04%
SALU: 4635548 -> 4634680 (-0.02%); split: -0.04%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
86f0195f5c
aco/ra: prefer phi operands which don't create waitcnt
...
fossil-db (navi31):
Totals from 89 (0.11% of 79825) affected shaders:
Instrs: 343443 -> 343384 (-0.02%); split: -0.10%, +0.09%
CodeSize: 1792948 -> 1792668 (-0.02%); split: -0.10%, +0.08%
Latency: 2656294 -> 2656490 (+0.01%); split: -0.02%, +0.02%
InvThroughput: 517696 -> 517691 (-0.00%); split: -0.01%, +0.01%
SClause: 9213 -> 9215 (+0.02%); split: -0.01%, +0.03%
Copies: 39138 -> 39089 (-0.13%); split: -0.84%, +0.71%
Branches: 10863 -> 10872 (+0.08%); split: -0.05%, +0.13%
SALU: 49185 -> 49136 (-0.10%); split: -0.67%, +0.57%
fossil-db (navi21):
Totals from 34490 (43.21% of 79825) affected shaders:
Instrs: 23005853 -> 22956529 (-0.21%); split: -0.25%, +0.04%
CodeSize: 120532004 -> 120341412 (-0.16%); split: -0.19%, +0.03%
VGPRs: 1396928 -> 1397520 (+0.04%); split: -0.07%, +0.11%
Latency: 108740068 -> 108499644 (-0.22%); split: -0.53%, +0.30%
InvThroughput: 25286526 -> 25358695 (+0.29%); split: -0.11%, +0.39%
VClause: 421179 -> 421132 (-0.01%); split: -0.29%, +0.27%
SClause: 446414 -> 446423 (+0.00%); split: -0.00%, +0.00%
Copies: 2242236 -> 2243168 (+0.04%); split: -0.42%, +0.46%
Branches: 724556 -> 724903 (+0.05%); split: -0.02%, +0.07%
VALU: 13321078 -> 13321940 (+0.01%); split: -0.07%, +0.08%
SALU: 4069929 -> 4070580 (+0.02%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
310f588f92
aco/ra: move variables from affinity register to avoid waitcnt
...
If we don't use this affinity register, we're likely to end up moving the
temporary later. If it's a memory instruction destination, that's probably
more expensive than just copying the blocking variables.
fossil-db (navi31):
Totals from 504 (0.63% of 79825) affected shaders:
Instrs: 4108284 -> 4109026 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21226764 -> 21229764 (+0.01%); split: -0.01%, +0.02%
Latency: 26931635 -> 26806989 (-0.46%); split: -0.47%, +0.00%
InvThroughput: 8443520 -> 8439235 (-0.05%); split: -0.06%, +0.01%
VClause: 99209 -> 99314 (+0.11%); split: -0.00%, +0.11%
SClause: 85089 -> 85085 (-0.00%)
Copies: 340323 -> 340993 (+0.20%); split: -0.06%, +0.26%
Branches: 117225 -> 117209 (-0.01%); split: -0.02%, +0.00%
VALU: 2421859 -> 2422529 (+0.03%); split: -0.01%, +0.04%
SALU: 503465 -> 503470 (+0.00%); split: -0.00%, +0.00%
fossil-db (navi21):
Totals from 582 (0.73% of 79825) affected shaders:
Instrs: 3714908 -> 3714990 (+0.00%); split: -0.02%, +0.02%
CodeSize: 19977880 -> 19973076 (-0.02%); split: -0.04%, +0.01%
VGPRs: 40480 -> 40496 (+0.04%)
Latency: 26028895 -> 25772711 (-0.98%); split: -0.99%, +0.00%
InvThroughput: 9827389 -> 9818194 (-0.09%); split: -0.10%, +0.01%
VClause: 103702 -> 103815 (+0.11%); split: -0.02%, +0.13%
SClause: 90861 -> 90857 (-0.00%)
Copies: 335276 -> 335992 (+0.21%); split: -0.09%, +0.30%
Branches: 123912 -> 123897 (-0.01%); split: -0.02%, +0.00%
VALU: 2466032 -> 2466748 (+0.03%); split: -0.01%, +0.04%
SALU: 533658 -> 533667 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
681ec4cba7
aco/ra: track cost of moving variables
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
69bc4efa37
aco/sched_ilp: improve scheduling with VMEM/DS->VALU WaW
...
This improves scheduling with one side of a divergent branch writing to a
VGPR using VMEM/DS, and the other writing using VALU. At the merge block,
it will properly consider that the VGPR was written by a VMEM/DS.
fossil-db (navi31):
Totals from 1224 (1.53% of 79825) affected shaders:
Instrs: 5264815 -> 5267604 (+0.05%); split: -0.00%, +0.06%
CodeSize: 27406404 -> 27422132 (+0.06%); split: -0.00%, +0.06%
Latency: 48325204 -> 48293975 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8923880 -> 8919191 (-0.05%); split: -0.07%, +0.02%
fossil-db (navi21):
Totals from 1267 (1.59% of 79825) affected shaders:
Instrs: 4628583 -> 4629190 (+0.01%); split: -0.00%, +0.01%
CodeSize: 24974672 -> 24977188 (+0.01%); split: -0.00%, +0.01%
Latency: 45080476 -> 44998120 (-0.18%); split: -0.20%, +0.02%
InvThroughput: 12288202 -> 12269634 (-0.15%); split: -0.16%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
88b6b6db17
aco: only consider cost of memory loads at waitcnt
...
We don't run this code before waitcnt insertion, so this isn't necessary.
This change improves accuracy in these two situations, because the waitcnt
insertion pass is more aware of divergent control flow:
v0 = valu
if (divergent) {
v0 = vmem
} else {
use(v0)
}
v0 = vmem
if (divergent) {
wait vmcnt(0)
} else {
wait vmcnt(0)
}
use(v0)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
6963c8dd80
radv,aco/gfx11: preserve s2 when NGG_WAVE_ID_EN=1
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
According to the ISA doc, this is needed for hang recovery.
This works by just avoiding putting temporaries in s0-3 unless they're
precolored there.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39720 >
2026-02-16 14:33:58 +00:00
Rhys Perry
b60bff0429
aco: consider 64-bit transcendental normal valu for s_delay_alu
...
https://github.com/llvm/llvm-project/pull/180940
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39851 >
2026-02-13 17:03:34 +00:00
Marek Olšák
d1e6a5c1c8
ac: lower load_num_workgroups in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
a9e47751d2
ac: lower load_subgroup_id for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
0a9bdcac79
ac: lower load_workgroup_ids for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Daniel Schürmann
97f095f6e0
aco/lower_branches: Add try_rotate_latch_block() optimization
...
This optimization looks for unconditional back-edges and aims
to rotate the loop in a way that the final block is emitted
before the loop header, essentially turning
BB1:
if ()
goto BB3;
BB2:
<loop body>
goto BB1;
BB3:
...
into
goto BB1;
BB2:
<loop body>
BB1:
if(!cond)
goto BB2;
BB3:
...
Totals from 4969 (5.89% of 84383) affected shaders: (Navi48)
Instrs: 15253038 -> 15254019 (+0.01%); split: -0.00%, +0.01%
CodeSize: 81225300 -> 81227696 (+0.00%); split: -0.02%, +0.02%
Latency: 320796283 -> 320693480 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 51395922 -> 51376156 (-0.04%); split: -0.04%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ade5e300ab
aco/insert_delay_alu: handle loop latch block before loop body
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
102aca9843
aco/assembler: emit block_kind_loop_latch before the loop header
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
da1594f8bb
aco: introduce notion of block_kind_loop_latch
...
A block annotated with block_kind_loop_latch denotes a block
the re-entry point for a loop back-edge. It is emitted after
the loop preheader and (potentially) before the loop header.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
9887ce6709
aco/print_asm: Sort block markers by block offset
...
We are going to emit blocks in a different order.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
800a4957bb
aco/lower_branches: Consider branch target of nested conditional branches
...
Totals from 1470 (1.74% of 84383) affected shaders: (Navi48)
Instrs: 5128451 -> 5126842 (-0.03%)
CodeSize: 29359832 -> 29353656 (-0.02%); split: -0.02%, +0.00%
Latency: 41047203 -> 41040786 (-0.02%)
InvThroughput: 6040459 -> 6039619 (-0.01%); split: -0.01%, +0.00%
Branches: 146219 -> 144648 (-1.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
fbf2083b8f
aco/isel: Don't emit ELSE side of divergent branches which jump
...
Totals from 50 (0.06% of 84383) affected shaders: (Navi48)
Instrs: 402490 -> 402444 (-0.01%); split: -0.01%, +0.00%
CodeSize: 2239024 -> 2238864 (-0.01%); split: -0.01%, +0.00%
SpillSGPRs: 1493 -> 1496 (+0.20%)
Latency: 5836785 -> 5836747 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1120893 -> 1120909 (+0.00%); split: -0.00%, +0.00%
Copies: 46128 -> 46082 (-0.10%)
VALU: 222708 -> 222715 (+0.00%); split: -0.00%, +0.00%
SALU: 53039 -> 52993 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ba32219cf8
aco/isel: Don't emit ELSE side of uniform branches which jump
...
Totals from 4 (0.00% of 84383) affected shaders: (Navi48)
Instrs: 16473 -> 16468 (-0.03%)
CodeSize: 85276 -> 85300 (+0.03%)
SpillSGPRs: 175 -> 176 (+0.57%)
Latency: 267907 -> 267885 (-0.01%)
InvThroughput: 36302 -> 36298 (-0.01%)
Copies: 1353 -> 1345 (-0.59%)
VALU: 9025 -> 9029 (+0.04%)
SALU: 2635 -> 2627 (-0.30%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
96a639918c
aco: don't emit p_logical_start / p_logical_end after divergent branches
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00