Commit graph

136 commits

Author SHA1 Message Date
Rhys Perry
53ed863b88 aco/insert_waitcnt: improve s_setpc_b64/s_swappc_b64/end_with_regs a bit
Don't wait for any events which don't involve registers.

fossil-db (navi31):
Totals from 210 (0.25% of 84369) affected shaders:
Instrs: 106932 -> 106677 (-0.24%)
CodeSize: 604164 -> 603144 (-0.17%)
Latency: 726405 -> 720433 (-0.82%)
InvThroughput: 102048 -> 101504 (-0.53%); split: -0.54%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39590>
2026-02-06 09:49:20 +00:00
Natalie Vock
0a1911b220 radv,aco: Use function call structure for RT programs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
2026-01-14 14:19:07 +00:00
Natalie Vock
548062f10e aco/insert_waitcnt: Don't determine linearity by reg number
VGPRs can be linear too, and RT function calls will add VMEM
instructions acting on linear VGPRs. Using the linear VGPR in a block
with only linear preds will cause the pass to incorrectly skip inserting
s_waitcnt.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157>
2026-01-12 21:46:50 +00:00
Daniel Schürmann
7b7bdb76ab amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Georg Lehmann
a173e51541 aco/insert_waitcnt: don't merge waitcnts for LDS clauses
We form LDS clauses because heavily interleaving LDS and VALU leads to false
dependencies. But LDS is completely uncached, so splitting the clause with
waitcnts shouldn't hurt, it might even be beneficial because the first
LDS store can start earlier.

Foz-DB Navi48:
Totals from 170 (0.21% of 80287) affected shaders:
Instrs: 239633 -> 240148 (+0.21%)
CodeSize: 1276584 -> 1278532 (+0.15%)
Latency: 3788507 -> 3789876 (+0.04%); split: -0.01%, +0.04%
InvThroughput: 841637 -> 841694 (+0.01%); split: -0.01%, +0.02%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37701>
2025-10-07 13:12:45 +00:00
Georg Lehmann
8e03505782 aco: don't insert s_sendmsg dealloc_vgprs with little vgprs allocated
Reduces message bus traffic when the benefit is small.

Foz-DB Navi31:
Totals from 3752 (4.67% of 80273) affected shaders:
Instrs: 1999755 -> 1992249 (-0.38%)
CodeSize: 10531824 -> 10501800 (-0.29%)
Latency: 14935247 -> 14935147 (-0.00%)
InvThroughput: 5976053 -> 5975262 (-0.01%)

Foz-DB Navi33:
Totals from 2614 (3.26% of 80273) affected shaders:
Instrs: 969475 -> 964247 (-0.54%)
CodeSize: 5171240 -> 5150328 (-0.40%)
Latency: 7891519 -> 7891434 (-0.00%)
InvThroughput: 4815008 -> 4814287 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
27cc6317f9 aco: dealloc vgprs if there is a pending non scratch store and no pending export
Because s_sendmsg dealloc_vgprs waits for every counter except vs_count,
and the message bus has limited throughput, we should only insert the dealloc
when we know that it's beneficial.

Foz-DB Navi31:
Totals from 5280 (6.58% of 80273) affected shaders:
Instrs: 4186851 -> 4197416 (+0.25%)
CodeSize: 21910004 -> 21952264 (+0.19%)
Latency: 31679067 -> 31679173 (+0.00%)
InvThroughput: 9182625 -> 9183417 (+0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
26e041e821 aco: remove existing dealloc_vgprs use
We didn't consider that s_sendmsg dealloc_vgpr waits for all counters
expect vscnt.

Foz-DB Navi31:
Totals from 74090 (92.52% of 80084) affected shaders:
Instrs: 36031071 -> 35853573 (-0.49%)
CodeSize: 189233756 -> 188523764 (-0.38%)
Latency: 222378318 -> 222374890 (-0.00%)
InvThroughput: 33366893 -> 33362457 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Rhys Perry
0f32b573a4 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%

fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
ac882985c0 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
145b178de2 aco: fix workgroup-scope barrier between vmem and lds
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
02718fd4c5 aco: use a separate event for sendmsg_rtn
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
5812c2ea89 aco: update waitcnt events for exports
Include primitive, dual source blend and POS4 exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
711c023b55 aco: remove waitcnt code for POPS
We now insert barriers around these instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
005694fe1f aco: remove waitcnt code for SMEM stores
These were removed in GFX10.3 and we haven't used them in a while.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
20cd5cf5f7 aco: delay barrier waitcnt until they are needed
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
843acfa50b aco: add a separate barrier_info for release/acquire barriers
These can wait for different sets of accesses.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
6c446c2f83 aco: refactor waitcnt pass to use barrier_info
Currently there's just barrier_info_all, but more will be added later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Natalie Vock
9707b30965 nir,aco: Add ds_bvh_stack_rtn
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:39 +00:00
Rhys Perry
0094e6c32a aco: optimize lds-only or vmem-only flat access
fossil-db (polaris10):
Totals from 138 (0.22% of 62070) affected shaders:
Instrs: 233452 -> 234436 (+0.42%)
CodeSize: 1209392 -> 1213220 (+0.32%)
Latency: 3934496 -> 3928089 (-0.16%); split: -0.17%, +0.00%
InvThroughput: 3040782 -> 3038562 (-0.07%); split: -0.07%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:08 +00:00
Rhys Perry
d705b6198c aco: simplify waitcnt insertion for flat access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465>
2025-07-11 12:15:08 +00:00
Rhys Perry
86ccceb4de aco: don't consider gfx1153 to have point sample acceleration
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:55:13 +01:00
Rhys Perry
f10b49781d aco: make all wait entries linear
If we remove exec skips, then we can wait for an entry on all paths in the
linear cfg, but not the logical cfg.

fossil-db (gfx1201):
Totals from 0 (0.00% of 79653) affected shaders:

fossil-db (navi31):
Totals from 0 (0.00% of 79653) affected shaders:

fossil-db (navi21):
Totals from 1586 (1.99% of 79653) affected shaders:
Instrs: 5118897 -> 5113206 (-0.11%); split: -0.11%, +0.00%
CodeSize: 28365852 -> 28343696 (-0.08%); split: -0.08%, +0.00%
Latency: 47820341 -> 47799532 (-0.04%); split: -0.09%, +0.05%
InvThroughput: 9904391 -> 9908653 (+0.04%); split: -0.02%, +0.06%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:55:13 +01:00
Rhys Perry
1088ac49db aco: sometimes join linear wait entries on logical edges
fossil-db (gfx1201):
Totals from 1303 (1.64% of 79653) affected shaders:
Instrs: 6920949 -> 6917692 (-0.05%); split: -0.06%, +0.01%
CodeSize: 37112404 -> 37095728 (-0.04%); split: -0.05%, +0.01%
Latency: 70471343 -> 70365986 (-0.15%); split: -0.15%, +0.00%
InvThroughput: 11515673 -> 11504666 (-0.10%); split: -0.10%, +0.01%

fossil-db (navi31):
Totals from 1293 (1.62% of 79653) affected shaders:
Instrs: 6500186 -> 6496761 (-0.05%); split: -0.06%, +0.01%
CodeSize: 34562712 -> 34549236 (-0.04%); split: -0.04%, +0.01%
Latency: 68604746 -> 68666532 (+0.09%); split: -0.15%, +0.24%
InvThroughput: 11276591 -> 11284914 (+0.07%); split: -0.10%, +0.17%

fossil-db (navi21):
Totals from 811 (1.02% of 79653) affected shaders:
Instrs: 4110953 -> 4108788 (-0.05%); split: -0.05%, +0.00%
CodeSize: 22955984 -> 22948064 (-0.03%); split: -0.03%, +0.00%
Latency: 35070231 -> 35064448 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 6945610 -> 6945053 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
c1f8537131 aco: skip waitcnt between two vmem writing different lanes
fossil-db (gfx1201):
Totals from 1382 (1.74% of 79653) affected shaders:
Instrs: 6531704 -> 6523935 (-0.12%); split: -0.12%, +0.00%
CodeSize: 34992076 -> 34933568 (-0.17%); split: -0.17%, +0.01%
Latency: 70183360 -> 69616066 (-0.81%); split: -0.81%, +0.00%
InvThroughput: 11155445 -> 11068667 (-0.78%); split: -0.78%, +0.00%

fossil-db (navi31):
Totals from 46 (0.06% of 79653) affected shaders:
Instrs: 1833768 -> 1833732 (-0.00%)
CodeSize: 9468788 -> 9468716 (-0.00%)
Latency: 11683092 -> 11667865 (-0.13%)
InvThroughput: 2274377 -> 2272872 (-0.07%)

fossil-db (navi21):
Totals from 0 (0.00% of 79653) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
9649deb50e aco: skip waitcnt between two vmem writing different halves
fossil-db (gfx1201):
Totals from 4 (0.01% of 79653) affected shaders:
Instrs: 41374 -> 41380 (+0.01%); split: -0.01%, +0.02%
CodeSize: 238912 -> 238924 (+0.01%); split: -0.01%, +0.01%
Latency: 706714 -> 706410 (-0.04%)
InvThroughput: 352269 -> 352118 (-0.04%)
VClause: 803 -> 798 (-0.62%)

fossil-db (navi31):
Totals from 0 (0.00% of 79653) affected shaders:

fossil-db (navi21):
Totals from 0 (0.00% of 79653) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13028
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
9a38ad3ca7 aco: add wait_entry::logical_events
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
bb99de00f7 aco: add wait_entry::vm_mask
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
b70ecfa588 aco: only join barrier_imm/barrier_events for logical edges
fossil-db (gfx1201):
Totals from 3 (0.00% of 79653) affected shaders:
Instrs: 2904 -> 2893 (-0.38%)
CodeSize: 14944 -> 14900 (-0.29%)
Latency: 14703 -> 14248 (-3.09%)
InvThroughput: 1237 -> 1210 (-2.18%)

fossil-db (navi31):
Totals from 3 (0.00% of 79653) affected shaders:
Instrs: 2742 -> 2731 (-0.40%)
CodeSize: 14136 -> 14092 (-0.31%)
Latency: 14744 -> 14287 (-3.10%)
InvThroughput: 1241 -> 1213 (-2.26%)

fossil-db (navi21):
Totals from 3 (0.00% of 79653) affected shaders:
Instrs: 2326 -> 2315 (-0.47%)
CodeSize: 12472 -> 12428 (-0.35%)
Latency: 14921 -> 14465 (-3.06%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
62a9b4b976 aco: set vmem_types for args_pending_vmem
fossil-db (gfx1201):
Totals from 0 (0.00% of 79653) affected shaders:

fossil-db (navi31):
Totals from 11 (0.01% of 79653) affected shaders:
Instrs: 4543 -> 4554 (+0.24%)
CodeSize: 23256 -> 23300 (+0.19%)

fossil-db (navi21):
Totals from 8 (0.01% of 79653) affected shaders:
Instrs: 2333 -> 2341 (+0.34%)
CodeSize: 12328 -> 12360 (+0.26%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:51:08 +01:00
Rhys Perry
e7a7d9ea2e aco: fix wait_entry::join() when changing vmem_types
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is a bitmask, not a boolean.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935>
2025-05-14 11:22:13 +00:00
Rhys Perry
171920ceed aco/gfx115: consider point sample acceleration
Like 15428e0d786939a5c7629a9978947c8a9112ce96 in LLVM.

fossil-db (gfx1150):
Totals from 909 (1.14% of 79653) affected shaders:
Instrs: 5840489 -> 5840705 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31133460 -> 31134296 (+0.00%); split: -0.00%, +0.00%
Latency: 52982280 -> 53438577 (+0.86%); split: -0.00%, +0.86%
InvThroughput: 10841454 -> 10942682 (+0.93%); split: -0.00%, +0.93%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935>
2025-05-14 11:22:13 +00:00
Rhys Perry
b03e071583 aco/gfx11: create waitcnt for workgroup vmem barriers
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It seems this is necessary on GFX11.

Similar to 576a2e798c

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34634>
2025-04-25 10:41:52 +00:00
Konstantin Seurer
978e9b670e aco,nir: Add support for new GFX12 ray tracing instructions
Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can
handle the new BVH format. Both instructions write up to 10 VGPRs so
they need to use a vec16 definition in nir.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>
2025-04-17 20:20:40 +00:00
Georg Lehmann
576a2e798c aco/gfx12: don't assume memory operations complete in order
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32569>
2024-12-11 12:22:59 +00:00
Rhys Perry
fd19ff0b9e aco: force linear for event_vmem_sample and event_vmem_bvh
I don't know if this issue affects GFX12, but workaround it anyway to be
safe.

fossil-db (gfx1200):
Totals from 3463 (4.36% of 79395) affected shaders:
Instrs: 9794280 -> 9833253 (+0.40%); split: -0.00%, +0.40%
CodeSize: 52306040 -> 52457988 (+0.29%); split: -0.01%, +0.30%
Latency: 90549385 -> 93617517 (+3.39%); split: -0.00%, +3.39%
InvThroughput: 13189030 -> 13602942 (+3.14%); split: -0.00%, +3.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32373>
2024-12-02 10:13:39 +00:00
Rhys Perry
86c63b29bc aco/gfx12: insert wait between VMEM WaW
https://github.com/llvm/llvm-project/pull/105549

fossil-db (gfx1200):
Totals from 1783 (2.25% of 79395) affected shaders:
Instrs: 7398391 -> 7404566 (+0.08%); split: -0.00%, +0.08%
CodeSize: 38862456 -> 38886364 (+0.06%); split: -0.00%, +0.06%
Latency: 83191513 -> 84211504 (+1.23%); split: -0.00%, +1.23%
InvThroughput: 15185936 -> 15345744 (+1.05%); split: -0.01%, +1.06%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32373>
2024-12-02 10:13:39 +00:00
Rhys Perry
5375d77488 aco: wait for scratch stores to complete before dealloc_vgprs
fossil-db (navi31):
Totals from 392 (0.49% of 79395) affected shaders:
Instrs: 5052043 -> 5054100 (+0.04%)
CodeSize: 26701200 -> 26709428 (+0.03%)
Latency: 43614861 -> 43615368 (+0.00%)
InvThroughput: 7353147 -> 7353216 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884>
2024-11-06 09:58:05 +00:00
Rhys Perry
0ad713ca9f aco: add waitcnt build helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884>
2024-11-06 09:58:04 +00:00
Rhys Perry
807651561e aco: split insert_wait_states into two
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>
2024-08-22 13:57:00 +00:00
Georg Lehmann
b0ad3c2160 aco: fix s_delay_alu with salu and trans dependency
These events were silently truncated in get_counters_for_event.
The integer types in this pass are a bit all over the place, maybe we should
consider using typedefs for clarity or a different solution with type safety.

Fixes: 9e9cabd2fa ("aco/waitcnt: support GFX12 in waitcnt pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30163>
2024-07-15 12:02:35 +00:00
Rhys Perry
f01cac835f aco/stats: support GFX12 in collect_preasm_stats()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:39 +00:00
Rhys Perry
9e9cabd2fa aco/waitcnt: support GFX12 in waitcnt pass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:39 +00:00
Rhys Perry
75532d8687 aco: add wait_imm::unpack and wait_imm::max
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
f3e461d643 aco/waitcnt: refactor for indexable wait_imm
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
ff2e3ef5eb aco/waitcnt: add target_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
5b1b09ad42 aco/waitcnt: fix DS/VMEM ordered writes when mixed
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Rhys Perry
03938804f1 aco: avoid breaking clauses with waitcnt
fossil-db (navi31):
Totals from 3573 (4.50% of 79395) affected shaders:
Instrs: 6172096 -> 6170009 (-0.03%); split: -0.04%, +0.01%
CodeSize: 31448052 -> 31439660 (-0.03%); split: -0.03%, +0.01%
Latency: 37317302 -> 37307935 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 6820967 -> 6819930 (-0.02%); split: -0.02%, +0.00%
VClause: 163424 -> 157705 (-3.50%)
SClause: 135441 -> 135295 (-0.11%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28433>
2024-03-29 12:04:13 +00:00
Daniel Schürmann
a863c7951e aco: remove create_instruction() template parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00