Commit graph

119 commits

Author SHA1 Message Date
Georg Lehmann
3e10ab34e1 aco/insert_NOPs: explicitly wait for sa_sdst to resolve SALU -> VALU hazards
The assumption that these waits are not required has been proven incorrect
in at least some cases.

Totals from 190 (0.24% of 79825) affected shaders: (Navi31)
Instrs: 499718 -> 500491 (+0.15%)
CodeSize: 2658228 -> 2661916 (+0.14%)
Latency: 5964632 -> 5965453 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 794221 -> 794289 (+0.01%)

Totals from 17093 (21.41% of 79839) affected shaders: (Navi48)
Instrs: 22805214 -> 22854313 (+0.22%)
CodeSize: 121240428 -> 121432904 (+0.16%); split: -0.00%, +0.16%
Latency: 166500300 -> 166530529 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 28770053 -> 28772870 (+0.01%); split: -0.00%, +0.01%

Fixes: 018f45f981 ("aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14516

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252>
2026-01-12 15:48:38 +00:00
Daniel Schürmann
7b7bdb76ab amd: add ac_cu_info::has_point_sample_accel flag and use in ACO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>
2025-12-22 07:34:47 +00:00
Alyssa Rosenzweig
079e9ae606 treewide: use BITSET_*_COUNT
Mix of Coccinelle patch, manual fix ups, sed, etc. Probably best to review the diff
as-if hand written:

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38955>
2025-12-16 17:42:10 +00:00
Natalie Vock
1243d575a5 aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.

This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38290>
2025-11-18 18:43:00 +00:00
Georg Lehmann
018f45f981 aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits
Mostly removes SALU->VALU waits if the VALU writes a sgpr.

Foz-DB GFX1201:
Totals from 18553 (22.51% of 82419) affected shaders:
Instrs: 27388414 -> 27321118 (-0.25%)
CodeSize: 145389276 -> 145118128 (-0.19%); split: -0.19%, +0.00%
Latency: 200288087 -> 200252583 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 36311237 -> 36307369 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445>
2025-11-17 16:28:36 +00:00
Georg Lehmann
b1d730982e aco/insert_NOPs: remove redundant VALUMaskWriteHazard waits
This removes a lot of VALU->SALU waits.

Foz-DB Navi31:
Totals from 8908 (10.84% of 82179) affected shaders:
Instrs: 17118986 -> 17084870 (-0.20%)
CodeSize: 91057212 -> 90919300 (-0.15%); split: -0.15%, +0.00%
Latency: 154044128 -> 154036848 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 26608698 -> 26607933 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445>
2025-11-17 16:28:36 +00:00
Georg Lehmann
b2172467d1 aco/gfx10_3: work around NSA hazard
4+ dword NSA can hang if exec becomes non-zero again directly before
the instruction.

Foz-DB Navi21:
Totals from 608 (0.74% of 82161) affected shaders:
Instrs: 945138 -> 946431 (+0.14%)
CodeSize: 5171580 -> 5176864 (+0.10%)
Latency: 13356895 -> 13357113 (+0.00%)
InvThroughput: 3043234 -> 3043236 (+0.00%); split: -0.00%, +0.00%

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9852
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13981
Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38215>
2025-11-05 10:06:04 +00:00
Yonggang Luo
fc1b26f4dc aco: Fixes warning note: ambiguity is between a regular call to this operator and a call with the argument order reversed
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
../../src/amd/compiler/aco_util.h:300:9: note: ambiguity is between a regular call to this operator and a call with the argument order reversed
  300 |    bool operator==(const monotonic_buffer_resource& other) { return buffer == other.buffer; }

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36722>
2025-08-13 19:49:37 +00:00
Georg Lehmann
46c1bd1147 aco: add a dedicated pass for better float MODE insertion
Foz-DB Navi48:
Totals from 14 (0.02% of 80251) affected shaders:
Instrs: 13998 -> 11684 (-16.53%)
CodeSize: 104464 -> 86260 (-17.43%)
Latency: 108722 -> 106667 (-1.89%)
InvThroughput: 100332 -> 100324 (-0.01%)
VClause: 621 -> 595 (-4.19%); split: -4.99%, +0.81%
VALU: 6875 -> 6871 (-0.06%)
SALU: 3256 -> 1015 (-68.83%)
VOPD: 1328 -> 1332 (+0.30%)

Removes the s_setreg spam in FSR4.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35746>
2025-07-10 13:48:50 +00:00
Rhys Perry
34f1a8f707 aco: handle FPAtomicToDenormModeHazard
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is quite unlikely to happen, but I guess it might be possible and
it's relatively simple to work around.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35884>
2025-07-07 13:02:43 +00:00
Rhys Perry
00dd0d0dd1 aco: update VALUReadSGPRHazard comment
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>
2025-06-09 10:12:25 +00:00
Rhys Perry
a714a19e16 aco/gfx12: fix VALUReadSGPRHazard with carry-out
fossil-db (gfx1201):
Totals from 370 (0.46% of 79653) affected shaders:
Instrs: 3933639 -> 3935914 (+0.06%)
CodeSize: 20743448 -> 20752068 (+0.04%); split: -0.00%, +0.04%
Latency: 26261246 -> 26261921 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 5363675 -> 5363760 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 65f95ae74e ("aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35387>
2025-06-09 10:12:25 +00:00
Rhys Perry
86ccceb4de aco: don't consider gfx1153 to have point sample acceleration
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34978>
2025-06-06 11:55:13 +01:00
Georg Lehmann
4fa3fb87c7 aco/insert_NOPs: allow WMMA with constant C matrix
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34396>
2025-04-22 16:08:56 +00:00
Rhys Perry
427479c040 aco: remove va_vdst/vm_vsrc/sa_sdst variables
Use the "wait" variable instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34529>
2025-04-17 17:28:22 +00:00
Rhys Perry
3d6fa6996c aco: init vm_vsrc/sa_sdst from depctr_wait
fossil-db (navi31):
Totals from 5805 (7.31% of 79377) affected shaders:
Instrs: 14229621 -> 14207115 (-0.16%); split: -0.16%, +0.00%
CodeSize: 75358724 -> 75268624 (-0.12%); split: -0.12%, +0.00%
Latency: 133637034 -> 133624262 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 22067819 -> 22066213 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34529>
2025-04-17 17:28:22 +00:00
Rhys Perry
ce2be5ab8e aco: combine VALU lanemask hazard into VALUMaskWriteHazard
This is now basically the same as the original VALUMaskWriteHazard, except
it now considers both VALU and SALU writes.

Now that it's a part of VALUMaskWriteHazard, differences from the original
VALU lanemask workaround are:
- it includes SALU reads after the write
- it includes VALU writes and SALU/VALU reads after the write which are
  not lanemasks
- it combines s_waitcnt_depctr instructions when it's a read after both a
  SALU write and a VALU write
- non-exec VALU SGPR reads reset the SGPRs read by VALU as a lanemask
- exec SGPRs are ignored

resolve_all_gfx11() is also finished.

fossil-db (navi31):
Totals from 21538 (27.13% of 79377) affected shaders:
Instrs: 27628855 -> 27552972 (-0.27%); split: -0.30%, +0.03%
CodeSize: 145968448 -> 145667616 (-0.21%); split: -0.23%, +0.02%
Latency: 209537805 -> 209509519 (-0.01%); split: -0.02%, +0.00%
InvThroughput: 36304270 -> 36301624 (-0.01%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12623
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11480
Backport-to: 25.0
Backport-to: 25.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34529>
2025-04-17 17:28:22 +00:00
Rhys Perry
0ec174afd5 aco: insert dependency waits in certain situations
This seems to fix some artifacts, but we're not sure why, so it might not
be a correct or optimal solution.

fossil-db (navi31):
Totals from 28424 (35.81% of 79377) affected shaders:
Instrs: 30112910 -> 30348977 (+0.78%); split: -0.00%, +0.78%
CodeSize: 159542980 -> 160485336 (+0.59%); split: -0.00%, +0.59%
Latency: 221438396 -> 221500856 (+0.03%); split: -0.00%, +0.03%
InvThroughput: 38154231 -> 38159984 (+0.02%); split: -0.00%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33853>
2025-03-05 16:22:54 +00:00
Daniel Schürmann
65f95ae74e aco/insert_NOPs: implement VALU -> VALU case for VALUReadSGPRHazard on GFX12
Totals from 36918 (46.50% of 79395) affected shaders: (GFX1200)

Instrs: 34997889 -> 35296429 (+0.85%); split: -0.00%, +0.85%
CodeSize: 186161112 -> 187334364 (+0.63%); split: -0.00%, +0.63%
Latency: 250265551 -> 250330784 (+0.03%); split: -0.00%, +0.03%
InvThroughput: 41185298 -> 41192503 (+0.02%); split: -0.00%, +0.02%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32682>
2025-01-30 03:13:16 +00:00
Daniel Schürmann
6c7355f0e6 aco/insert_NOPs: refactor VALUReadSGPRHazard detection
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32682>
2025-01-30 03:13:16 +00:00
Rhys Perry
4c3809e7fc aco: use small_vec in RegCounterMap
This seems to be a little faster.

insert_NOPs (navi31):
Difference at 95.0% confidence
	-11.484 +/- 6.13377
	-1.62767% +/- 0.860593%
	(Student's t, pooled s = 5.71913)

insert_NOPs (gfx1200):
Difference at 95.0% confidence
	-35.6745 +/- 4.97972
	-8.1236% +/- 1.10453%
	(Student's t, pooled s = 4.6431)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32374>
2024-11-28 17:07:34 +00:00
Daniel Schürmann
bb87832ce0 aco/insert_NOPs: add early exit to handle_valu_partial_forwarding_hazard_instr
No need to continue if there was already a hazard found in
a different control flow path.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32191>
2024-11-22 08:46:32 +00:00
Daniel Schürmann
07df37ba01 aco/insert_NOPs: use RegCounterMap as replacement for the CounterMap implementation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32191>
2024-11-22 08:46:32 +00:00
Daniel Schürmann
fb5e5adfb3 aco/insert_NOPs: implement vector-based RegCounterMap as replacement for VGPRCounterMap
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32191>
2024-11-22 08:46:32 +00:00
Rhys Perry
295b7d606f aco: insert NOP before dealloc_vgpr in the insert_NOPs pass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24884>
2024-11-06 09:58:05 +00:00
Rhys Perry
47e0f468cf aco: workaround VALUReadSGPRHazard
fossil-db (gfx1200):
Totals from 65112 (82.01% of 79395) affected shaders:
Instrs: 41732906 -> 42987198 (+3.01%); split: -0.00%, +3.01%
CodeSize: 222451964 -> 226942644 (+2.02%); split: -0.01%, +2.03%
Latency: 290411063 -> 290944688 (+0.18%); split: -0.00%, +0.18%
InvThroughput: 45854913 -> 45910275 (+0.12%); split: -0.00%, +0.12%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478>
2024-10-24 16:08:07 +00:00
Rhys Perry
9ab0c4b047 aco: minor CounterMap::operator== fix
I don't think this matters for how we use CounterMap::operator==.

The BITSET_TEST() was unnecessary because of the BITSET_EQUAL above.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478>
2024-10-24 16:08:07 +00:00
Rhys Perry
f5b871f825 aco: split CounterMap off from VGPRCounterMap
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478>
2024-10-24 16:08:07 +00:00
Georg Lehmann
977f435f4c aco/ir: add function to parse depctr waits
No Foz-DB changes on Navi31.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Rhys Perry
11262a01ce aco: preserve bitsets after a lane mask is written
fossil-db (navi31):
Totals from 4840 (6.10% of 79395) affected shaders:
Instrs: 13733449 -> 13761177 (+0.20%); split: -0.00%, +0.21%
CodeSize: 71997868 -> 72102520 (+0.15%); split: -0.00%, +0.15%
Latency: 128385177 -> 128408780 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 21105847 -> 21109475 (+0.02%); split: -0.00%, +0.02%
VALU: 7741209 -> 7741210 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
61e73c2323 aco: check SALU writing lanemask later for VALUMaskWriteHazard
This should be done after reads are checked and
sgpr_read_by_valu_as_lanemask_then_wr_by_salu is reset. The old version
also skipped checking the reads if the write check passed.

fossil-db (navi31):
Totals from 193 (0.24% of 79395) affected shaders:
Instrs: 3212435 -> 3212735 (+0.01%)
CodeSize: 16462868 -> 16463848 (+0.01%); split: -0.00%, +0.01%
Latency: 19492377 -> 19492462 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 4419705 -> 4419718 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
b1ba7d1b99 aco: don't consider sa_sdst=0 before SALU write to fix VALUMaskWriteHazard
LLVM does but that's probably a bug.

fossil-db (navi31):
Totals from 311 (0.39% of 79395) affected shaders:
Instrs: 380453 -> 381075 (+0.16%)
CodeSize: 1961012 -> 1964744 (+0.19%)
Latency: 4799095 -> 4800313 (+0.03%)
InvThroughput: 958358 -> 958904 (+0.06%)
VALU: 242322 -> 242633 (+0.13%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
8f5ee70d85 aco: also consider VALU reads for VALUMaskWriteHazard
fossil-db (navi31):
Totals from 9776 (12.31% of 79395) affected shaders:
Instrs: 19348258 -> 19383680 (+0.18%); split: -0.00%, +0.19%
CodeSize: 101223460 -> 101366964 (+0.14%); split: -0.01%, +0.15%
Latency: 172853115 -> 172866070 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 27590468 -> 27592390 (+0.01%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11550
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11436
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11337
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11738
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11741
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
ee648326d9 aco: ignore exec and literals when mitigating VALUMaskWriteHazard
LLVM ignores exec and literals don't seem to work in some cases.

fossil-db (navi31):
Totals from 2676 (3.37% of 79395) affected shaders:
Instrs: 10638979 -> 10646019 (+0.07%); split: -0.00%, +0.07%
CodeSize: 55929640 -> 55959416 (+0.05%); split: -0.00%, +0.06%
Latency: 107707408 -> 107712893 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 18119843 -> 18120442 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
7b92e11e16 aco: forget valu delays after certain s_waitcnt_depctr/LDSDIR
fossil-db (navi31):
Totals from 55242 (69.58% of 79395) affected shaders:
Instrs: 40507666 -> 40138006 (-0.91%); split: -0.91%, +0.00%
CodeSize: 212516104 -> 211025880 (-0.70%); split: -0.70%, +0.00%
Latency: 281643258 -> 281628053 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 46370668 -> 46369637 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>
2024-08-22 13:57:01 +00:00
Rhys Perry
0919ce1ac4 aco/gfx11.5: workaround export priority issue
https://github.com/llvm/llvm-project/pull/99273

fossil-db (gfx1150):
Totals from 73996 (93.20% of 79395) affected shaders:
Instrs: 36015357 -> 36807177 (+2.20%)
CodeSize: 189072544 -> 192238748 (+1.67%)
Latency: 245845181 -> 246790550 (+0.38%); split: -0.00%, +0.38%
InvThroughput: 45068018 -> 45116177 (+0.11%); split: -0.00%, +0.11%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30241>
2024-07-23 13:14:51 +00:00
Rhys Perry
17758f0a02 aco: fix wmma raw hazard
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
872dda2bc5 aco: support GFX12 in insert_NOPs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
75532d8687 aco: add wait_imm::unpack and wait_imm::max
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Daniel Schürmann
a863c7951e aco: remove create_instruction() template parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
9b0ebcc39b aco: change return type of create_instruction() to Instruction*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
1187189235 aco: unify different SALU types into single struct SALU_instruction
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction

and their corresponding methods.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
5d265257a0 aco: remove SOPP_instruction::block member
Re-use SOPP_instruction::imm instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Georg Lehmann
b90ec971d7 aco/gfx11: resolve VcmpxPermlaneHazard for v_permlane64
The GFX11 ISA docs description of this hazard says it's about v_permlane in
general, not just v_permlane(x)16.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
19876386e2 aco/gfx11: use v_nop to resolve VcmpxPermlaneHazard
The GFX11 ISA doc explicitly recommends using v_nop in
7.2.8. PERMLANE Specific Rules.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Daniel Schürmann
42e9ba1c70 aco: remove VCCZ and EXECZ register handling
We don't use these registers and since RDNA3 removed the explicit usage,
it is unlikely that we will properly support them in the future.
Removing the registers from the ACO IR prevents accidentally using them
without proper support.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26664>
2023-12-14 20:08:28 +00:00
Georg Lehmann
2f4e53b22a aco: fix detecting sgprs read by SMEM hazard
s_waitcnt_lgkmcnt is SOPK, not SOPP and there are other SOPK instructions
that don't mitigate the hazard.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163>
2023-11-15 12:35:32 +00:00
Georg Lehmann
e49c413a86 aco: use null operand for SOPK s_waitcnt
Both null def and op result in the same correct encoding, but these
instructions optionally read a sgpr, so it makes more sense to use an operand.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163>
2023-11-15 12:35:32 +00:00
Rhys Perry
690cb0b211 aco: resolve all possible hazards at the end of shader parts
fossil-db (vega10):
Totals from 1266 (2.01% of 63055) affected shaders:
Instrs: 707116 -> 708382 (+0.18%)
CodeSize: 3512452 -> 3517516 (+0.14%)
Latency: 6661724 -> 6666788 (+0.08%)
InvThroughput: 4393626 -> 4393904 (+0.01%); split: -0.00%, +0.01%

fossil-db (navi10):
Totals from 1305 (2.07% of 63015) affected shaders:
Instrs: 719699 -> 722009 (+0.32%)
CodeSize: 3650836 -> 3660076 (+0.25%)
Latency: 5691633 -> 5693933 (+0.04%)
InvThroughput: 1532010 -> 1532024 (+0.00%); split: -0.00%, +0.00%

fossil-db (navi31):
Totals from 1580 (1.99% of 79332) affected shaders:
Instrs: 1678242 -> 1679879 (+0.10%)
CodeSize: 8463464 -> 8470168 (+0.08%)
Latency: 14273661 -> 14275298 (+0.01%)
InvThroughput: 3668049 -> 3668080 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25374>
2023-10-11 15:14:04 +00:00