Commit graph

3082 commits

Author SHA1 Message Date
Rhys Perry
ef74407577 aco/gfx12: use ttmp9/ttmp7 for workgroup id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
c8123b67e0 aco/gfx12: don't create v_fmac_legacy_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
e79a8219d2 aco/gfx12: sign-extend s_getpc_b64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
ae18c88409 aco/gfx12: implement workgroup barrier
Same sequence LLVM uses for llvm.amdgcn.s.barrier.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
fae2a85d57 aco/gfx12: implement subgroup shader clock
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
872dda2bc5 aco: support GFX12 in insert_NOPs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Samuel Pitoiset
3d6957268b aco: use new common helpers for building buffer descriptors
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29268>
2024-05-22 08:31:39 +00:00
Rhys Perry
b99c48b011 aco/lower_phis: don't create boolean loop header phis in some situations
If we have a loop with continue_or_break and no divergent exits, there is
no need for a loop header phi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry
4ae8a558b2 aco: remove nir_to_aco
This isn't used anymore

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry
b1964f03e7 aco: use scalar phi lowering for lcssa workaround
This lets us use non-undef for the last operand, if necessary
(demonstrated in the test).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry
bbe4652430 aco: create lcssa phis for continue_or_break loops when necessary
These might not exist because adding would decrease the quality of
divergence analysis. They are necessary for continue_or_break though, so
add them later, where they won't affect divergence analysis.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10623
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry
3fc7207f50 aco/lower_phis: create loop header phis for non-boolean loop exit phis
These might be necessary if continue_or_break and divergent breaks are both used:
   loop {
      if (divergent) {
         a = loop_invariant_sgpr
         break
      }
      discard_if
   }
   b = phi a
If we break because discard_if makes exec empty but only did so in
previous iterations, then the phi should use "a" from those previous
iterations.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Georg Lehmann
cc404d45ff aco: remove perfwarn
This didn't do anything useful.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29270>
2024-05-21 13:31:23 +00:00
Georg Lehmann
ea3e5bcc99 aco/optimizer: remove ineffective undef opt
No stats changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29270>
2024-05-21 13:31:23 +00:00
Georg Lehmann
bd699b5d88 aco/optimizer: remove ineffective vcc opt
No stats changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29270>
2024-05-21 13:31:23 +00:00
Dylan Baker
46644ba371 meson: use glslang --depfile argument when possible
This reduces the amount of manual dependency tracking developers need to
do. This is turned on if glslang >= 11.3.0 is used, or 11.9.0 on
Windows, but otherwise the status quo is maintained. This means I have
not removed any use of `depend_files`. We could make make these hard
requirements and remove the use of `depend_files` too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28329>
2024-05-20 17:34:17 +00:00
Rhys Perry
418fed1805 aco: update VS prolog waitcnt for GFX12
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:39 +00:00
Rhys Perry
f01cac835f aco/stats: support GFX12 in collect_preasm_stats()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:39 +00:00
Rhys Perry
9e9cabd2fa aco/waitcnt: support GFX12 in waitcnt pass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:39 +00:00
Rhys Perry
cadce0f3b7 aco: add GFX12 wait counters
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:38 +00:00
Rhys Perry
4abe5b7927 aco/gfx12: disable s_cmpk optimization
These opcodes were removed.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:28 +00:00
Rhys Perry
2c4f561708 aco: don't change prefetch mode on GFX11.5+
This instruction was effectively removed:
https://github.com/llvm/llvm-project/commit/ba131b7017ce

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry
5e58e32832 aco/tests: add GFX12 assembler tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry
e1e5bc0dd0 aco: support GFX12 in assembler
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry
74aa6437d6 aco: add GFX11.5+ opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry
97698e564a aco: add SFPU/ValuPseudoScalarTrans instr class
The latency is from LLVM's SISchedule.td

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry
e9a25151fa aco/tests: support GFX12
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Friedrich Vock
590ea76104 aco/spill: Insert p_start_linear_vgpr right after p_logical_end
If p_start_linear_vgpr allocates a VGPR that is already blocked, RA
will try moving the blocking VGPR somewhere else. If
p_start_linear_vgpr is inserted right before the branch, that move will
be inserted after exec has been overwritten, which might cause the move
to be skipped for some threads.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28041>
2024-05-14 13:54:34 +00:00
Friedrich Vock
84c1870b65 aco/tests: Insert p_logical_start/end in reduce_temp tests
Linear VGPR insertion will depend on a p_logical_end existing in the
blocks the VGPR is inserted in.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28041>
2024-05-14 13:54:34 +00:00
Rhys Perry
cb81ec7a61 aco: don't count certain pseudo towards VMEM_STORE_CLAUSE_MAX_GRAB_DIST
fossil-db (navi31):
Totals from 1023 (1.29% of 79395) affected shaders:
MaxWaves: 29258 -> 29240 (-0.06%)
Instrs: 1134024 -> 1133163 (-0.08%); split: -0.10%, +0.02%
CodeSize: 5682108 -> 5678696 (-0.06%); split: -0.08%, +0.02%
VGPRs: 60248 -> 60272 (+0.04%); split: -0.08%, +0.12%
Latency: 3797510 -> 3792797 (-0.12%); split: -0.18%, +0.05%
InvThroughput: 781270 -> 781239 (-0.00%); split: -0.03%, +0.03%
VClause: 24701 -> 23976 (-2.94%); split: -3.55%, +0.61%
Copies: 75177 -> 75169 (-0.01%); split: -0.20%, +0.19%
VALU: 659939 -> 659962 (+0.00%); split: -0.02%, +0.02%
VOPD: 2040 -> 2009 (-1.52%); split: +0.29%, -1.81%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28948>
2024-05-14 12:30:54 +00:00
Georg Lehmann
80b8bbf0c5 aco/gfx11: use v_swap_b16
I tested that v_swap_b16 can be encoded as VOP3, because the ISA doc doesn't list
it as a possible VOP3 opcode. VOP3 is nessecary to access v128+.

Foz-DB Navi31:
Totals from 32 (0.04% of 79395) affected shaders:
Instrs: 201865 -> 195168 (-3.32%)
CodeSize: 1082220 -> 1031228 (-4.71%); split: -4.71%, +0.00%
Latency: 2258198 -> 2238586 (-0.87%)
InvThroughput: 796731 -> 788934 (-0.98%)
Copies: 34514 -> 29220 (-15.34%)
VALU: 122457 -> 117163 (-4.32%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29143>
2024-05-13 18:42:19 +00:00
Rhys Perry
6ddd675168 aco/util: improve small_vec assertion
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Rhys Perry
869253b66c aco: support VS prologs with unaligned access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Rhys Perry
9ec2fa392f aco: copy VS prolog constants after loads
This way, the loads start earlier.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Rhys Perry
46b8ba8154 aco: form hard clauses in VS prologs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Rhys Perry
d48c8905f1 radv: keep track of unaligned dynamic vertex access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Marek Olšák
58a5de5c34 amd: add gfx12 register definitions into the register header generator
The generator renamed some definitions to resolve conflicts.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29007>
2024-05-11 22:14:05 -04:00
Rhys Perry
75532d8687 aco: add wait_imm::unpack and wait_imm::max
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
c894c9ab1b aco/stats: refactor for indexable wait_imm
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
f3e461d643 aco/waitcnt: refactor for indexable wait_imm
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
ff2e3ef5eb aco/waitcnt: add target_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
20b4e30e25 aco: make wait_imm indexable
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
5b1b09ad42 aco/waitcnt: fix DS/VMEM ordered writes when mixed
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
16eae62f0d aco/stats: don't use VS counter pre-GFX10
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:08 +00:00
Rhys Perry
16a9f6e2a4 aco/stats: fix s_waitcnt parsing
This was broken for GFX11 (s_waitcnt encoding changed) and s_waitcnt_vscnt
(waited for vm/lgkm/exp to be 0).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28981>
2024-05-10 11:53:07 +00:00
Georg Lehmann
be7c137229 aco/gfx11+: optimize v_fma_mix throughput
Foz-DB Navi31:
Totals from 18677 (23.58% of 79206) affected shaders:
Latency: 83613889 -> 83558801 (-0.07%)
InvThroughput: 12696661 -> 12635199 (-0.48%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29047>
2024-05-08 19:36:07 +00:00
Samuel Pitoiset
53a142ad23 aco: add support for remapping color attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27263>
2024-05-07 10:35:04 +00:00
Georg Lehmann
e7b942393a aco/tests: simplify small constant copy test
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045>
2024-05-06 13:38:14 +00:00
Georg Lehmann
44cc0d31b8 aco/gfx10: use v_add_u16 with literal for constant copies
This also means the v_perm_b32 path is now unused.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045>
2024-05-06 13:38:14 +00:00
Georg Lehmann
7823065f64 aco/gfx11+: use v_cvt_pk_u8_f32 for 8bit constant copies
Foz-DB Navi31:
Totals from 201 (0.25% of 79395) affected shaders:
Instrs: 186869 -> 186857 (-0.01%)
CodeSize: 1026760 -> 1026700 (-0.01%); split: -0.01%, +0.00%
Latency: 2302050 -> 2301969 (-0.00%)
InvThroughput: 739466 -> 739431 (-0.00%)
Copies: 26467 -> 26454 (-0.05%)
VALU: 93529 -> 93516 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045>
2024-05-06 13:38:14 +00:00