Commit graph

223 commits

Author SHA1 Message Date
Rhys Perry
9f1a5645cf aco: completely skip branches if they're never taken
fossil-db (navi21):
Totals from 196 (0.25% of 79395) affected shaders:
Instrs: 101902 -> 101706 (-0.19%)
CodeSize: 576988 -> 576232 (-0.13%)
Latency: 750344 -> 750280 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 119170 -> 119161 (-0.01%)
Branches: 3933 -> 3737 (-4.98%)

fossil-db (vega10):
Totals from 585 (0.93% of 63053) affected shaders:
Instrs: 346877 -> 346292 (-0.17%)
CodeSize: 1810600 -> 1808260 (-0.13%)
Latency: 1817743 -> 1814233 (-0.19%)
InvThroughput: 652142 -> 651944 (-0.03%)
Branches: 5087 -> 4502 (-11.50%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30321>
2024-08-15 16:00:19 +00:00
Rhys Perry
c29d9f1184 aco: only remove branch jumping over SMEM/barrier if it's never taken
SMEM might be an invalid access, and barriers are probably expensive.

fossil-db (navi21):
Totals from 126 (0.16% of 79395) affected shaders:
Instrs: 2764965 -> 2765377 (+0.01%)
CodeSize: 15155348 -> 15156788 (+0.01%)
Latency: 17604293 -> 17604296 (+0.00%)
Branches: 105211 -> 105623 (+0.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30321>
2024-08-15 16:00:19 +00:00
Rhys Perry
b934255510 aco: split selection_control_remove into rarely_taken and never_taken
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30321>
2024-08-15 16:00:18 +00:00
Georg Lehmann
e0818cb87b aco/gfx11+: don't use VOP3 v_swap_b16
v_swap_b16 is not offically supported as VOP3, so it can't be used with v128-255.
Tests show that VOP3 appears to work correctly, but according to AMD that should
not be relied on.

https://github.com/llvm/llvm-project/pull/100442#discussion_r1703929676

Foz-DB Navi31:
Totals from 6 (0.01% of 79395) affected shaders:
Instrs: 64799 -> 65932 (+1.75%)
CodeSize: 360180 -> 368440 (+2.29%)
Latency: 1364648 -> 1365922 (+0.09%)
InvThroughput: 635843 -> 636475 (+0.10%)
Copies: 14766 -> 15698 (+6.31%)
VALU: 38743 -> 39675 (+2.41%)

Fixes: 80b8bbf0c5 ("aco/gfx11: use v_swap_b16")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30515>
2024-08-06 20:40:12 +00:00
Georg Lehmann
53155ba12d aco: add CompilationProgress::after_lower_to_hw
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30399>
2024-07-29 18:35:33 +00:00
Georg Lehmann
4399c7bac3 aco: add aco_opcode::p_s_cvt_f16_f32_rtne
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29245>
2024-07-18 08:36:14 +00:00
Rhys Perry
1643c933ef aco/gfx11: don't use v_bfrev_b32 with wave64
v_mov_b32 can be dual issued.

fossil-db (navi31):
Totals from 1792 (2.26% of 79395) affected shaders:
CodeSize: 27462476 -> 27470308 (+0.03%); split: -0.00%, +0.03%
Latency: 29403214 -> 29402713 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 5005863 -> 5004702 (-0.02%); split: -0.02%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
f842bd81ca aco: use s_pack_*_b32_b16 more in p_insert/p_extract lowering
This opcode doesn't write SCC, which gives later passes more freedom to
move instructions.

fossil-db (navi21):
Totals from 727 (0.92% of 79395) affected shaders:
Latency: 14943483 -> 14942704 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3225790 -> 3225766 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
9fe3af1e2a aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
Forgot about this one.

fossil-db (gfx1100):
Totals from 3920 (2.94% of 133461) affected shaders:
Instrs: 6632088 -> 6636008 (+0.06%)
CodeSize: 34165376 -> 34181056 (+0.05%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 37fbfa655a ("aco: insert s_nop before VGPR deallocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29770>
2024-06-18 20:17:38 +00:00
Georg Lehmann
046414e061 aco: add more anonymous namespaces
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Georg Lehmann
2b56a97374 aco/lower_to_hw: optimize split 64bit constant copies
Foz-DB Navi21:
Totals from 3209 (4.04% of 79395) affected shaders:
Instrs: 6502065 -> 6496612 (-0.08%)
CodeSize: 35578300 -> 35556596 (-0.06%)
Latency: 66092924 -> 66092668 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 16968953 -> 16968900 (-0.00%); split: -0.00%, +0.00%
SClause: 198651 -> 198647 (-0.00%)
Copies: 597323 -> 591872 (-0.91%)
SALU: 930918 -> 925467 (-0.59%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00
Georg Lehmann
5910a46101 aco/lower_to_hw: use copy_constant_sgpr for masks
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00
Georg Lehmann
23d88e68fc aco: small constant copy optimizations
Foz-DB Navi21:
Totals from 13 (0.02% of 79395) affected shaders:
CodeSize: 93432 -> 93376 (-0.06%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00
Georg Lehmann
54ad07c32a aco/lower_to_hw: add copy_constant_sgpr
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00
Georg Lehmann
56354c6cd7 aco: don't pass program to emit_bpermute
Also change the param order, because the builder typically comes first.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29422>
2024-05-29 11:59:22 +00:00
Rhys Perry
e79a8219d2 aco/gfx12: sign-extend s_getpc_b64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
ae18c88409 aco/gfx12: implement workgroup barrier
Same sequence LLVM uses for llvm.amdgcn.s.barrier.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
fae2a85d57 aco/gfx12: implement subgroup shader clock
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
74aa6437d6 aco: add GFX11.5+ opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Georg Lehmann
80b8bbf0c5 aco/gfx11: use v_swap_b16
I tested that v_swap_b16 can be encoded as VOP3, because the ISA doc doesn't list
it as a possible VOP3 opcode. VOP3 is nessecary to access v128+.

Foz-DB Navi31:
Totals from 32 (0.04% of 79395) affected shaders:
Instrs: 201865 -> 195168 (-3.32%)
CodeSize: 1082220 -> 1031228 (-4.71%); split: -4.71%, +0.00%
Latency: 2258198 -> 2238586 (-0.87%)
InvThroughput: 796731 -> 788934 (-0.98%)
Copies: 34514 -> 29220 (-15.34%)
VALU: 122457 -> 117163 (-4.32%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29143>
2024-05-13 18:42:19 +00:00
Georg Lehmann
44cc0d31b8 aco/gfx10: use v_add_u16 with literal for constant copies
This also means the v_perm_b32 path is now unused.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045>
2024-05-06 13:38:14 +00:00
Georg Lehmann
7823065f64 aco/gfx11+: use v_cvt_pk_u8_f32 for 8bit constant copies
Foz-DB Navi31:
Totals from 201 (0.25% of 79395) affected shaders:
Instrs: 186869 -> 186857 (-0.01%)
CodeSize: 1026760 -> 1026700 (-0.01%); split: -0.01%, +0.00%
Latency: 2302050 -> 2301969 (-0.00%)
InvThroughput: 739466 -> 739431 (-0.00%)
Copies: 26467 -> 26454 (-0.05%)
VALU: 93529 -> 93516 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29045>
2024-05-06 13:38:14 +00:00
Georg Lehmann
d4084f7f09 aco/lower_to_hw: remove gfx6/7 subdword paths
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836>
2024-05-02 11:09:36 +00:00
Georg Lehmann
6b35de971c aco/lower_to_hw: don't use regClass to identify subdword reductions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28836>
2024-05-02 11:09:35 +00:00
Georg Lehmann
47d824a644 aco/lower_to_hw: fix 16bit p_insert on gfx8
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881>
2024-04-25 09:47:19 +00:00
Georg Lehmann
bb80ac7a70 aco/lower_to_hw: fix v_cvt_pk_u16_u32 instruction format
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881>
2024-04-25 09:47:18 +00:00
Georg Lehmann
4b5016a537 aco: support high_16bits FS IO
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28435>
2024-04-10 07:49:27 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Rhys Perry
66153f7bfe aco: always emit float mode for merged shaders compiled separately
We don't know what the float mode was by the end of the previous shader,
so we should always set it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28392>
2024-03-28 21:02:29 +00:00
Daniel Schürmann
a863c7951e aco: remove create_instruction() template parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
9b0ebcc39b aco: change return type of create_instruction() to Instruction*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
1187189235 aco: unify different SALU types into single struct SALU_instruction
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction

and their corresponding methods.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
5d265257a0 aco: remove SOPP_instruction::block member
Re-use SOPP_instruction::imm instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
cef01e817d aco: use instr_class::branch to identify SOPP branches
Also changes the instr_class of s_trap to instr_class::other.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Rhys Perry
3b28ba8239 aco: optimize for purely linear VGPR copies
fossil-db:
Totals from 2 (0.00% of 79242) affected shaders:
Instrs: 1344 -> 1340 (-0.30%)
CodeSize: 6968 -> 6952 (-0.23%)
Latency: 4414 -> 4410 (-0.09%)
InvThroughput: 1018 -> 1020 (+0.20%)
Copies: 60 -> 56 (-6.67%)
SALU: 40 -> 36 (-10.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
2024-03-06 12:55:46 +00:00
Rhys Perry
5e17a39b15 aco: allow p_start_linear_vgpr to use multiple operands
Merging the p_create_vector into the p_start_linear_vgpr is useful since
we stopped attempting to place the p_start_linear_vgpr definition in the
same registers as the operand.

fossil-db (navi31):
Totals from 927 (1.17% of 79242) affected shaders:
MaxWaves: 26412 -> 26442 (+0.11%)
Instrs: 938328 -> 938181 (-0.02%); split: -0.14%, +0.13%
CodeSize: 4891448 -> 4890820 (-0.01%); split: -0.11%, +0.10%
VGPRs: 47016 -> 47004 (-0.03%); split: -0.13%, +0.10%
SpillSGPRs: 222 -> 226 (+1.80%)
Latency: 5076065 -> 5075191 (-0.02%); split: -0.12%, +0.10%
InvThroughput: 712316 -> 712421 (+0.01%); split: -0.09%, +0.10%
SClause: 27992 -> 27972 (-0.07%); split: -0.09%, +0.02%
Copies: 38042 -> 38104 (+0.16%); split: -1.95%, +2.12%
PreVGPRs: 39448 -> 39369 (-0.20%)
VALU: 570157 -> 570224 (+0.01%); split: -0.13%, +0.14%
SALU: 51672 -> 51678 (+0.01%); split: -0.01%, +0.02%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
2024-03-06 12:55:45 +00:00
Rhys Perry
f99443a68b aco: don't combine linear and normal VGPR copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27697>
2024-03-06 12:55:45 +00:00
Georg Lehmann
cd6d9c5918 aco: don't remove branches that skip v_writelane_b32
Cc: mesa-stable

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27537>
2024-02-09 21:55:02 +00:00
Rhys Perry
174e37afb9 aco: fix >8 byte linear vgpr copies
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27436>
2024-02-06 15:40:58 +00:00
Georg Lehmann
1d167d187e aco/gfx10+: don't use v_cmpx with VCC def
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163>
2023-11-15 12:35:32 +00:00
Georg Lehmann
e49c413a86 aco: use null operand for SOPK s_waitcnt
Both null def and op result in the same correct encoding, but these
instructions optionally read a sgpr, so it makes more sense to use an operand.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163>
2023-11-15 12:35:32 +00:00
Qiang Yu
5ef7c54829 aco: wait memory ops done before go to next shader part
Next part don't know whether p_end_with_regs args are loaded from
memory ops or not, need to wait it's done here.

Other memory load needs to be waited too like:
  a = load_mem()
  b = ...
  if (...) {
    wait_mem(a)
    store_mem(a)
  }
  p_end_with_regs(b)

"a" still needs to be waited, otherwise next shader part regs may
be overwritten by unfinished memory loads.

Memory stores are waited too. When >=gfx10 and last VGT has no
parameter export, we need to wait all memeory stores done before
pos export (see ac_nir_export_position). So when merged shader
(ES+GS or VS+GS) is partially built, first stage needs to wait
all memory stores done, otherwise second stage don't know if
any memory stores pending before.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signe-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Qiang Yu
5ba68f92b4 aco: create exit block for p_end_with_regs to branch to
To handle ps discard in radeonsi part mode shader.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Georg Lehmann
34d8fa6185 aco/gfx11: optimize dual source export
We can combine dpp with the v_cndmask_b32.

Foz-DB Navi31:
Totals from 222 (0.28% of 79330) affected shaders:
Instrs: 564392 -> 563373 (-0.18%); split: -0.19%, +0.01%
CodeSize: 2867040 -> 2864728 (-0.08%); split: -0.09%, +0.01%
Latency: 4278957 -> 4275199 (-0.09%); split: -0.09%, +0.00%
InvThroughput: 586636 -> 585824 (-0.14%); split: -0.14%, +0.00%
SClause: 20210 -> 20211 (+0.00%); split: -0.02%, +0.02%
Copies: 39763 -> 39778 (+0.04%); split: -0.13%, +0.17%
PreVGPRs: 13924 -> 13922 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25541>
2023-10-05 10:37:34 +00:00
Rhys Perry
26fce534b5 aco: shrink DPP8_instruction
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Georg Lehmann
4ea611bca0 aco: fix p_extract with v1 dst and s1 operand
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f14023666c ("aco: Allow p_extract to have different definition and operand sizes.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25403>
2023-09-27 14:12:29 +00:00
Daniel Schürmann
040142684c aco: make p_wqm a marker instruction without Operands/Definitions
Totals from 28277 (36.93% of 76572) affected shaders: (GFX11)

MaxWaves: 833930 -> 833898 (-0.00%); split: +0.01%, -0.01%
Instrs: 21366950 -> 21353346 (-0.06%); split: -0.11%, +0.05%
CodeSize: 112855368 -> 112610508 (-0.22%); split: -0.24%, +0.03%
VGPRs: 1157748 -> 1158540 (+0.07%); split: -0.10%, +0.17%
SpillSGPRs: 2465 -> 2463 (-0.08%); split: -0.16%, +0.08%
Latency: 168339886 -> 168383646 (+0.03%); split: -0.10%, +0.12%
InvThroughput: 25164895 -> 25158376 (-0.03%); split: -0.08%, +0.06%
VClause: 347660 -> 346256 (-0.40%); split: -0.55%, +0.15%
SClause: 794460 -> 799521 (+0.64%); split: -0.33%, +0.97%
Copies: 1151908 -> 1148370 (-0.31%); split: -0.54%, +0.23%
Branches: 359447 -> 359437 (-0.00%); split: -0.01%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Rhys Perry
1d29a1e2fc aco: add adjust_bpermute_dst helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>
2023-08-23 12:36:46 +01:00
Rhys Perry
9169fbf83c aco: clarify bpermute pseudo opcode names
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>
2023-08-23 12:36:46 +01:00
Rhys Perry
8a024c985f aco: fix p_bpermute_gfx6's exec save/restore with wave32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>
2023-08-23 12:36:46 +01:00