Rhys Perry
20cd5cf5f7
aco: delay barrier waitcnt until they are needed
...
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
843acfa50b
aco: add a separate barrier_info for release/acquire barriers
...
These can wait for different sets of accesses.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
6c446c2f83
aco: refactor waitcnt pass to use barrier_info
...
Currently there's just barrier_info_all, but more will be added later.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Natalie Vock
9707b30965
nir,aco: Add ds_bvh_stack_rtn
...
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:39 +00:00
Rhys Perry
0094e6c32a
aco: optimize lds-only or vmem-only flat access
...
fossil-db (polaris10):
Totals from 138 (0.22% of 62070) affected shaders:
Instrs: 233452 -> 234436 (+0.42%)
CodeSize: 1209392 -> 1213220 (+0.32%)
Latency: 3934496 -> 3928089 (-0.16%); split: -0.17%, +0.00%
InvThroughput: 3040782 -> 3038562 (-0.07%); split: -0.07%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465 >
2025-07-11 12:15:08 +00:00
Rhys Perry
d705b6198c
aco: simplify waitcnt insertion for flat access
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35465 >
2025-07-11 12:15:08 +00:00
Rhys Perry
86ccceb4de
aco: don't consider gfx1153 to have point sample acceleration
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:55:13 +01:00
Rhys Perry
f10b49781d
aco: make all wait entries linear
...
If we remove exec skips, then we can wait for an entry on all paths in the
linear cfg, but not the logical cfg.
fossil-db (gfx1201):
Totals from 0 (0.00% of 79653) affected shaders:
fossil-db (navi31):
Totals from 0 (0.00% of 79653) affected shaders:
fossil-db (navi21):
Totals from 1586 (1.99% of 79653) affected shaders:
Instrs: 5118897 -> 5113206 (-0.11%); split: -0.11%, +0.00%
CodeSize: 28365852 -> 28343696 (-0.08%); split: -0.08%, +0.00%
Latency: 47820341 -> 47799532 (-0.04%); split: -0.09%, +0.05%
InvThroughput: 9904391 -> 9908653 (+0.04%); split: -0.02%, +0.06%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:55:13 +01:00
Rhys Perry
1088ac49db
aco: sometimes join linear wait entries on logical edges
...
fossil-db (gfx1201):
Totals from 1303 (1.64% of 79653) affected shaders:
Instrs: 6920949 -> 6917692 (-0.05%); split: -0.06%, +0.01%
CodeSize: 37112404 -> 37095728 (-0.04%); split: -0.05%, +0.01%
Latency: 70471343 -> 70365986 (-0.15%); split: -0.15%, +0.00%
InvThroughput: 11515673 -> 11504666 (-0.10%); split: -0.10%, +0.01%
fossil-db (navi31):
Totals from 1293 (1.62% of 79653) affected shaders:
Instrs: 6500186 -> 6496761 (-0.05%); split: -0.06%, +0.01%
CodeSize: 34562712 -> 34549236 (-0.04%); split: -0.04%, +0.01%
Latency: 68604746 -> 68666532 (+0.09%); split: -0.15%, +0.24%
InvThroughput: 11276591 -> 11284914 (+0.07%); split: -0.10%, +0.17%
fossil-db (navi21):
Totals from 811 (1.02% of 79653) affected shaders:
Instrs: 4110953 -> 4108788 (-0.05%); split: -0.05%, +0.00%
CodeSize: 22955984 -> 22948064 (-0.03%); split: -0.03%, +0.00%
Latency: 35070231 -> 35064448 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 6945610 -> 6945053 (-0.01%); split: -0.01%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
c1f8537131
aco: skip waitcnt between two vmem writing different lanes
...
fossil-db (gfx1201):
Totals from 1382 (1.74% of 79653) affected shaders:
Instrs: 6531704 -> 6523935 (-0.12%); split: -0.12%, +0.00%
CodeSize: 34992076 -> 34933568 (-0.17%); split: -0.17%, +0.01%
Latency: 70183360 -> 69616066 (-0.81%); split: -0.81%, +0.00%
InvThroughput: 11155445 -> 11068667 (-0.78%); split: -0.78%, +0.00%
fossil-db (navi31):
Totals from 46 (0.06% of 79653) affected shaders:
Instrs: 1833768 -> 1833732 (-0.00%)
CodeSize: 9468788 -> 9468716 (-0.00%)
Latency: 11683092 -> 11667865 (-0.13%)
InvThroughput: 2274377 -> 2272872 (-0.07%)
fossil-db (navi21):
Totals from 0 (0.00% of 79653) affected shaders:
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
9649deb50e
aco: skip waitcnt between two vmem writing different halves
...
fossil-db (gfx1201):
Totals from 4 (0.01% of 79653) affected shaders:
Instrs: 41374 -> 41380 (+0.01%); split: -0.01%, +0.02%
CodeSize: 238912 -> 238924 (+0.01%); split: -0.01%, +0.01%
Latency: 706714 -> 706410 (-0.04%)
InvThroughput: 352269 -> 352118 (-0.04%)
VClause: 803 -> 798 (-0.62%)
fossil-db (navi31):
Totals from 0 (0.00% of 79653) affected shaders:
fossil-db (navi21):
Totals from 0 (0.00% of 79653) affected shaders:
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13028
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
9a38ad3ca7
aco: add wait_entry::logical_events
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
bb99de00f7
aco: add wait_entry::vm_mask
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
b70ecfa588
aco: only join barrier_imm/barrier_events for logical edges
...
fossil-db (gfx1201):
Totals from 3 (0.00% of 79653) affected shaders:
Instrs: 2904 -> 2893 (-0.38%)
CodeSize: 14944 -> 14900 (-0.29%)
Latency: 14703 -> 14248 (-3.09%)
InvThroughput: 1237 -> 1210 (-2.18%)
fossil-db (navi31):
Totals from 3 (0.00% of 79653) affected shaders:
Instrs: 2742 -> 2731 (-0.40%)
CodeSize: 14136 -> 14092 (-0.31%)
Latency: 14744 -> 14287 (-3.10%)
InvThroughput: 1241 -> 1213 (-2.26%)
fossil-db (navi21):
Totals from 3 (0.00% of 79653) affected shaders:
Instrs: 2326 -> 2315 (-0.47%)
CodeSize: 12472 -> 12428 (-0.35%)
Latency: 14921 -> 14465 (-3.06%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
62a9b4b976
aco: set vmem_types for args_pending_vmem
...
fossil-db (gfx1201):
Totals from 0 (0.00% of 79653) affected shaders:
fossil-db (navi31):
Totals from 11 (0.01% of 79653) affected shaders:
Instrs: 4543 -> 4554 (+0.24%)
CodeSize: 23256 -> 23300 (+0.19%)
fossil-db (navi21):
Totals from 8 (0.01% of 79653) affected shaders:
Instrs: 2333 -> 2341 (+0.34%)
CodeSize: 12328 -> 12360 (+0.26%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978 >
2025-06-06 11:51:08 +01:00
Rhys Perry
e7a7d9ea2e
aco: fix wait_entry::join() when changing vmem_types
...
This is a bitmask, not a boolean.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935 >
2025-05-14 11:22:13 +00:00
Rhys Perry
171920ceed
aco/gfx115: consider point sample acceleration
...
Like 15428e0d786939a5c7629a9978947c8a9112ce96 in LLVM.
fossil-db (gfx1150):
Totals from 909 (1.14% of 79653) affected shaders:
Instrs: 5840489 -> 5840705 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31133460 -> 31134296 (+0.00%); split: -0.00%, +0.00%
Latency: 52982280 -> 53438577 (+0.86%); split: -0.00%, +0.86%
InvThroughput: 10841454 -> 10942682 (+0.93%); split: -0.00%, +0.93%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34935 >
2025-05-14 11:22:13 +00:00
Rhys Perry
b03e071583
aco/gfx11: create waitcnt for workgroup vmem barriers
...
It seems this is necessary on GFX11.
Similar to 576a2e798c
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34634 >
2025-04-25 10:41:52 +00:00
Konstantin Seurer
978e9b670e
aco,nir: Add support for new GFX12 ray tracing instructions
...
Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can
handle the new BVH format. Both instructions write up to 10 VGPRs so
they need to use a vec16 definition in nir.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273 >
2025-04-17 20:20:40 +00:00
Georg Lehmann
576a2e798c
aco/gfx12: don't assume memory operations complete in order
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32569 >
2024-12-11 12:22:59 +00:00
Rhys Perry
fd19ff0b9e
aco: force linear for event_vmem_sample and event_vmem_bvh
...
I don't know if this issue affects GFX12, but work around it anyway to be
safe.
fossil-db (gfx1200):
Totals from 3463 (4.36% of 79395) affected shaders:
Instrs: 9794280 -> 9833253 (+0.40%); split: -0.00%, +0.40%
CodeSize: 52306040 -> 52457988 (+0.29%); split: -0.01%, +0.30%
Latency: 90549385 -> 93617517 (+3.39%); split: -0.00%, +3.39%
InvThroughput: 13189030 -> 13602942 (+3.14%); split: -0.00%, +3.14%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32373 >
2024-12-02 10:13:39 +00:00
Rhys Perry
86c63b29bc
aco/gfx12: insert wait between VMEM WaW
...
https://github.com/llvm/llvm-project/pull/105549
fossil-db (gfx1200):
Totals from 1783 (2.25% of 79395) affected shaders:
Instrs: 7398391 -> 7404566 (+0.08%); split: -0.00%, +0.08%
CodeSize: 38862456 -> 38886364 (+0.06%); split: -0.00%, +0.06%
Latency: 83191513 -> 84211504 (+1.23%); split: -0.00%, +1.23%
InvThroughput: 15185936 -> 15345744 (+1.05%); split: -0.01%, +1.06%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32373 >
2024-12-02 10:13:39 +00:00
Rhys Perry
5375d77488
aco: wait for scratch stores to complete before dealloc_vgprs
...
fossil-db (navi31):
Totals from 392 (0.49% of 79395) affected shaders:
Instrs: 5052043 -> 5054100 (+0.04%)
CodeSize: 26701200 -> 26709428 (+0.03%)
Latency: 43614861 -> 43615368 (+0.00%)
InvThroughput: 7353147 -> 7353216 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884 >
2024-11-06 09:58:05 +00:00
Rhys Perry
0ad713ca9f
aco: add waitcnt build helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884 >
2024-11-06 09:58:04 +00:00
Rhys Perry
807651561e
aco: split insert_wait_states into two
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337 >
2024-08-22 13:57:00 +00:00
Georg Lehmann
b0ad3c2160
aco: fix s_delay_alu with salu and trans dependency
...
These events were silently truncated in get_counters_for_event.
The integer types in this pass are a bit all over the place, maybe we should
consider using typedefs for clarity or a different solution with type safety.
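As a rough illustration of the truncation described above (a minimal sketch:
event_mask, event_lo, event_hi and all four helper functions are hypothetical
names, not ACO's actual definitions), passing a 16-bit event mask through an
8-bit parameter drops the high bits with no diagnostic by default, while one
shared typedef keeps the width consistent:

```cpp
#include <cstdint>

// Hypothetical typedef: one name for the event mask width, used everywhere.
using event_mask = uint16_t;

// Hypothetical event bits; the second one does not fit into 8 bits.
constexpr event_mask event_lo = 1u << 0;
constexpr event_mask event_hi = 1u << 9;

// Buggy shape: the parameter is narrower than the mask, so bit 9 is
// lost in the implicit conversion before the function body runs.
bool any_event_narrow(uint8_t events) { return events != 0; }

// Fixed shape: the parameter uses the shared typedef.
bool any_event(event_mask events) { return events != 0; }

// Small drivers that pass the wide event through each helper.
bool demo_narrow()
{
   event_mask e = event_hi;
   return any_event_narrow(e); // implicit uint16_t -> uint8_t conversion
}

bool demo_wide()
{
   event_mask e = event_hi;
   return any_event(e);
}
```

In this sketch demo_narrow() reports no event while demo_wide() reports one,
which is the kind of silent loss the commit message refers to.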
Fixes: 9e9cabd2fa ("aco/waitcnt: support GFX12 in waitcnt pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30163 >
2024-07-15 12:02:35 +00:00
Rhys Perry
f01cac835f
aco/stats: support GFX12 in collect_preasm_stats()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225 >
2024-05-20 10:45:39 +00:00
Rhys Perry
9e9cabd2fa
aco/waitcnt: support GFX12 in waitcnt pass
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225 >
2024-05-20 10:45:39 +00:00
Rhys Perry
75532d8687
aco: add wait_imm::unpack and wait_imm::max
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981 >
2024-05-10 11:53:08 +00:00
Rhys Perry
f3e461d643
aco/waitcnt: refactor for indexable wait_imm
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981 >
2024-05-10 11:53:08 +00:00
Rhys Perry
ff2e3ef5eb
aco/waitcnt: add target_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981 >
2024-05-10 11:53:08 +00:00
Rhys Perry
5b1b09ad42
aco/waitcnt: fix DS/VMEM ordered writes when mixed
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981 >
2024-05-10 11:53:08 +00:00
Samuel Pitoiset
7a69d78ba2
aco: use SPDX-License-Identifier
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622 >
2024-04-08 15:49:25 +00:00
Rhys Perry
03938804f1
aco: avoid breaking clauses with waitcnt
...
fossil-db (navi31):
Totals from 3573 (4.50% of 79395) affected shaders:
Instrs: 6172096 -> 6170009 (-0.03%); split: -0.04%, +0.01%
CodeSize: 31448052 -> 31439660 (-0.03%); split: -0.03%, +0.01%
Latency: 37317302 -> 37307935 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 6820967 -> 6819930 (-0.02%); split: -0.02%, +0.00%
VClause: 163424 -> 157705 (-3.50%)
SClause: 135441 -> 135295 (-0.11%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28433 >
2024-03-29 12:04:13 +00:00
Daniel Schürmann
a863c7951e
aco: remove create_instruction() template parameter
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370 >
2024-03-28 11:25:43 +00:00
Daniel Schürmann
9b0ebcc39b
aco: change return type of create_instruction() to Instruction*
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370 >
2024-03-28 11:25:43 +00:00
Daniel Schürmann
1187189235
aco: unify different SALU types into single struct SALU_instruction
...
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction
and their corresponding methods.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370 >
2024-03-28 11:25:43 +00:00
Daniel Schürmann
5d265257a0
aco: remove SOPP_instruction::block member
...
Re-use SOPP_instruction::imm instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370 >
2024-03-28 11:25:43 +00:00
Georg Lehmann
e49c413a86
aco: use null operand for SOPK s_waitcnt
...
Both null def and op result in the same correct encoding, but these
instructions optionally read a sgpr, so it makes more sense to use an operand.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163 >
2023-11-15 12:35:32 +00:00
Bas Nieuwenhuizen
5e7c828c0e
aco: Add WMMA instructions.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24683 >
2023-10-24 13:24:18 +00:00
Qiang Yu
5ef7c54829
aco: wait memory ops done before go to next shader part
...
The next part doesn't know whether the p_end_with_regs args were loaded
by memory ops, so we need to wait for them to complete here.
Other memory loads need to be waited on too, for example:
    a = load_mem()
    b = ...
    if (...) {
        wait_mem(a)
        store_mem(a)
    }
    p_end_with_regs(b)
"a" still needs to be waited on, otherwise the next shader part's regs
may be overwritten by unfinished memory loads.
Memory stores are waited on too. On >=gfx10, when the last VGT has no
parameter export, we need to wait for all memory stores to finish before
the pos export (see ac_nir_export_position). So when a merged shader
(ES+GS or VS+GS) is partially built, the first stage needs to wait for
all memory stores to finish, otherwise the second stage doesn't know
whether any memory stores are still pending.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:34 +00:00
Rhys Perry
ae9a476c42
aco/waitcnt: add print helpers
...
These may be useful in the future.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25373 >
2023-09-27 13:43:11 +00:00
Rhys Perry
0d0a8c4365
aco/waitcnt: replace wait_cnt::*_cnt with booleans
...
Previously, a loop could be revisited until a counter reaches its
maximum:
    loop {
        store()
    }
Each visit of that loop would increase vs_cnt until it reaches max.
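The effect can be sketched with a toy fixed-point iteration over the loop
above. This is an illustration under stated assumptions, not the pass's
actual code: kMaxCnt, visits_with_counter() and visits_with_flag() are
hypothetical, and the join rules (max for a counter, OR for a flag) are
assumed for the sake of the example.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed saturation value, chosen for the illustration only.
constexpr uint8_t kMaxCnt = 63;

// Counts how often the loop body is visited until the state at the loop
// header stops changing, when the state is a saturating counter.
unsigned visits_with_counter()
{
   uint8_t header = 0; // vs_cnt known at the loop header
   unsigned visits = 0;
   for (;;) {
      ++visits;
      // store(): one more outstanding store, saturating at the maximum.
      uint8_t out = static_cast<uint8_t>(std::min<unsigned>(header + 1u, kMaxCnt));
      // Join the back-edge state into the header state.
      uint8_t joined = std::max(header, out);
      if (joined == header)
         break; // fixed point reached
      header = joined;
   }
   return visits;
}

// Same loop, but the state only records whether a store is outstanding.
unsigned visits_with_flag()
{
   bool header = false;
   unsigned visits = 0;
   for (;;) {
      ++visits;
      bool out = true; // store(): a store is outstanding
      bool joined = header || out;
      if (joined == header)
         break; // fixed point reached
      header = joined;
   }
   return visits;
}
```

Under these assumptions the counter version needs kMaxCnt + 1 visits to
converge, while the boolean version converges on the second visit.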
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25373 >
2023-09-27 13:43:11 +00:00
Qiang Yu
a2484b20f9
aco: add pending_lds_access option for insert waitcnt
...
Used by the tcs epilog, which adds a p_barrier at the beginning to sync
with the main shader part's tess factor LDS writes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00
Vitaliy Triang3l Kuzmin
e0f4b52559
aco: Add Primitive Ordered Pixel Shading waitcnt rules
...
When letting the overlapping waves enter their ordered sections, there must
be no memory accesses to resources which need primitive-ordered access that
are still pending, or there would be a race between the current wave and
the overlapping waves.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:04 +00:00
Vitaliy Triang3l Kuzmin
6082e126eb
aco: Skip waitcnt insertion in the discard early exit block
...
Waits are needed for early exits from inside a Primitive Ordered Pixel
Shading ordered section, but that code doesn't insert them reliably anyway
because it doesn't obtain the counters for the exact locations of the
jumps, which may be anywhere inside the predecessor blocks.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:04 +00:00
Eric Engestrom
6b21653ab4
aco: reformat according to its .clang-format
...
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23253 >
2023-06-16 19:59:52 +00:00
Rhys Perry
94958e637d
aco: improve printing of s_delay_alu
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00
Rhys Perry
54c0088629
aco: insert s_delay_alu on the linear CFG
...
fossil-db (gfx1100):
Totals from 10498 (7.87% of 133428) affected shaders:
Instrs: 22274711 -> 22277717 (+0.01%); split: -0.01%, +0.03%
CodeSize: 114557040 -> 114569064 (+0.01%); split: -0.01%, +0.02%
Latency: 236505186 -> 236497338 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 33425052 -> 33423876 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00
Rhys Perry
d7f48a61ec
aco: use pass_flags to recover s_delay_alu cycles
...
This is simpler and more accurate.
fossil-db (gfx1100):
Totals from 11678 (8.75% of 133428) affected shaders:
Instrs: 25448655 -> 25436028 (-0.05%)
CodeSize: 130364728 -> 130314220 (-0.04%)
Latency: 325247603 -> 325231064 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 45901166 -> 45900022 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213 >
2023-05-30 12:42:00 +00:00