Rhys Perry
6547e17e60
aco: add VOPD format
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367 >
2024-02-05 10:51:14 +00:00
Daniel Schürmann
023e78b4d7
aco: add new post-RA scheduler for ILP
...
Totals from 77247 (97.37% of 79330) affected shaders: (GFX11)
Instrs: 44371374 -> 43215723 (-2.60%); split: -2.64%, +0.03%
CodeSize: 227819532 -> 223188224 (-2.03%); split: -2.06%, +0.03%
Latency: 301016823 -> 290147626 (-3.61%); split: -3.70%, +0.09%
InvThroughput: 48551749 -> 47646212 (-1.87%); split: -1.88%, +0.01%
VClause: 870581 -> 834655 (-4.13%); split: -4.13%, +0.00%
SClause: 1487061 -> 1340851 (-9.83%); split: -9.83%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676 >
2024-01-06 11:30:42 +00:00
Daniel Schürmann
72a5c659d4
aco: form clauses for LDS instructions
...
No fossil-db changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25676 >
2024-01-06 11:30:42 +00:00
Daniel Schürmann
42e9ba1c70
aco: remove VCCZ and EXECZ register handling
...
We don't use these registers and since RDNA3 removed the explicit usage,
it is unlikely that we will properly support them in the future.
Removing the registers from the ACO IR prevents accidentally using them
without proper support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26664 >
2023-12-14 20:08:28 +00:00
Daniel Schürmann
5ebba87772
aco: rename max_wave64_per_simd -> max_waves_per_simd
...
and update usage. Changes are because the scheduler targets
a different number of waves.
Totals from 195 (0.25% of 79330) affected shaders: (GFX11)
MaxWaves: 3120 -> 3108 (-0.38%)
Instrs: 71202 -> 71070 (-0.19%); split: -0.27%, +0.09%
CodeSize: 383272 -> 382828 (-0.12%); split: -0.21%, +0.10%
VGPRs: 7392 -> 7752 (+4.87%)
Latency: 2280141 -> 2262487 (-0.77%); split: -0.79%, +0.02%
InvThroughput: 4759022 -> 5725442 (+20.31%); split: -0.01%, +20.32%
VClause: 1737 -> 1741 (+0.23%); split: -3.11%, +3.34%
SClause: 2385 -> 2376 (-0.38%); split: -0.80%, +0.42%
Copies: 5257 -> 5274 (+0.32%); split: -0.25%, +0.57%
Branches: 1213 -> 1212 (-0.08%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26521 >
2023-12-11 10:39:50 +00:00
Georg Lehmann
91539713bb
aco: add src/def count and size for all ALU opcodes
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163 >
2023-11-15 12:35:32 +00:00
Georg Lehmann
6cd78281f6
aco: deduplicate Format definition
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25943 >
2023-11-06 23:16:38 +00:00
Georg Lehmann
6e0bf33a89
aco: deduplicate instr_class definition
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25943 >
2023-11-06 23:16:38 +00:00
Georg Lehmann
1b9a3b7466
aco: stop using cstdint
...
We use stdint.h everywhere else.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25943 >
2023-11-06 23:16:38 +00:00
Bas Nieuwenhuizen
5e7c828c0e
aco: Add WMMA instructions.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24683 >
2023-10-24 13:24:18 +00:00
Rhys Perry
e4842c0270
aco: consider exec_hi in reads_exec()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25374 >
2023-10-11 15:14:04 +00:00
Qiang Yu
49250f9fc5
aco: add ps prolog generation for radeonsi
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:33 +00:00
Qiang Yu
80728a2e71
aco: do not eliminate final exec write when p_end_with_regs block
...
p_end_with_regs just partially end the program, next part need
exec mask to be set correctly. For example p_end_wqm will generate
a exec restore from WQM mode after p_end_with_regs.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:33 +00:00
Rhys Perry
0e79f76aa5
aco: add fetch_inactive field to DPP instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525 >
2023-10-04 18:53:43 +00:00
Rhys Perry
26fce534b5
aco: shrink DPP8_instruction
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525 >
2023-10-04 18:53:43 +00:00
Rhys Perry
6518d09601
aco: don't combine DPP into v_cmpx
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25471 >
2023-09-29 18:23:21 +00:00
Rhys Perry
ae9a476c42
aco/waitcnt: add print helpers
...
These may be useful in the future.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25373 >
2023-09-27 13:43:11 +00:00
Rhys Perry
41b6020ff3
aco: remove fast path in insert_exec_mask's process_instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038 >
2023-09-14 09:25:22 +00:00
Qiang Yu
c8687a4b09
aco: do not fix_exports when program is prolog
...
Otherwise fix_export() will abort when find no export.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712 >
2023-08-31 02:39:16 +00:00
Qiang Yu
a2ba50aee6
aco: add vs prolog instruction selection for radeonsi
...
Port from llvm si_llvm_build_vs_prolog().
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712 >
2023-08-31 02:39:15 +00:00
Qiang Yu
d3333609e6
aco: don't emit s_endpgm for tcs with epilog
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00
Qiang Yu
a2484b20f9
aco: add pending_lds_access option for insert waitcnt
...
For tcs epilog to add p_barrier at the beginning to sync
main shader part tess factor LDS write.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00
Rhys Perry
c9b177db0e
aco: don't create sendmsg(dealloc_vgprs) if scratch is used
...
LLVM does something similar: https://reviews.llvm.org/D153295
fossil-db (gfx1100):
Totals from 21 (0.02% of 133461) affected shaders:
Instrs: 147428 -> 147396 (-0.02%)
CodeSize: 797188 -> 797060 (-0.02%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2930317cea ("aco/gfx11: deallocate VGPRs at the end of the shader")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24669 >
2023-08-15 11:00:31 +00:00
Samuel Pitoiset
f433d39935
aco: add infra for compiling TCS epilogs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24417 >
2023-08-02 07:29:50 +00:00
Qiang Yu
572625ea6c
aco: extract aco_compile_shader_part from aco_compile_ps_epilog
...
Will be shared with radeonsi tcs epilog and other shader parts build.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24417 >
2023-08-02 07:29:50 +00:00
Konstantin Seurer
20beebb041
amd: Move ac_hw_stage to its own file
...
Otherwise ACO has to include ac_shader_util.h which also includes NIR.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23906 >
2023-07-03 21:12:45 +00:00
Vitaliy Triang3l Kuzmin
e0f4b52559
aco: Add Primitive Ordered Pixel Shading waitcnt rules
...
When letting the overlapping waves enter their ordered sections, there must
be no memory accesses to resources which need primitive-ordered access that
are still pending, or there would be a race between the current wave and
the overlapping waves.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:04 +00:00
Vitaliy Triang3l Kuzmin
a87628cd08
aco: Send MSG_ORDERED_PS_DONE where necessary
...
If the wave has set the Primitive Ordered Pixel Shading packer ID hardware
register, it must send MSG_ORDERED_PS_DONE once before the program ends.
It's also safe to send the message if the packer ID register hasn't been
set yet, therefore the message may be sent conservatively. For simplicity,
to ensure that it's sent on all execution paths after setting the packer ID
register, always sending it from a top-level block. This is required for
GFX9-10.3 POPS.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:04 +00:00
Vitaliy Triang3l Kuzmin
94d2888da2
aco: Add s_wait_event argument bit definitions
...
A wait for export_ready (if the corresponding bit is not set in the
instruction) is done to enter the Primitive Ordered Pixel Shading ordered
section on GFX11.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:04 +00:00
Vitaliy Triang3l Kuzmin
308a5ea43a
aco: Support pops_exiting_wave_id PhysReg usage
...
pops_exiting_wave_id is a volatile ALU source operand containing the ID of
the latest wave that hasn't exited yet, for comparing with the newest
overlapped wave ID in overlapping waves.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250 >
2023-06-26 15:58:03 +00:00
Timur Kristóf
05928f4200
aco: Use ac_hw_stage instead of aco-specific HWStage.
...
The new ac_hw_stage is going to be used by drivers as well.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597 >
2023-06-23 12:49:04 +00:00
Eric Engestrom
6b21653ab4
aco: reformat according to its .clang-format
...
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23253 >
2023-06-16 19:59:52 +00:00
Georg Lehmann
b9854a9097
aco: move cfg validation to its own function
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23507 >
2023-06-12 19:43:17 +00:00
Daniel Schürmann
f66f274304
aco: implement nir_intrinsic_load_resume_shader_address_amd
...
Similar to p_constaddr but targeting BBs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096 >
2023-06-08 00:37:03 +00:00
Rhys Perry
417990b19e
aco: consider position/primitive exports around memory barriers
...
This is needed to create barriers which ensure stores finish before
position/primitive exports.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21624 >
2023-06-07 13:19:41 +00:00
Georg Lehmann
0f023d90c0
aco/ir: return true in hasRegClass for Operand(reg, rc)
...
Makes isOfType() usable after lower_to_hw_instr.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21402 >
2023-06-07 12:30:11 +00:00
Rhys Perry
35c133a77b
aco: add MIMG_instruction::strict_wqm
...
This lets us use linear VGPRs for part of the texture sample's address.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636 >
2023-05-25 16:29:16 +00:00
Rhys Perry
ab885a011a
aco: remove unused RegType
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22636 >
2023-05-25 16:29:16 +00:00
Rhys Perry
446d0dd658
aco: add get_op_fixed_to_def() helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22446 >
2023-05-24 18:58:15 +00:00
Georg Lehmann
644c5e95a0
aco: use get_operand_size for dpp opt
...
This matters now that v_fma_mixlo_f16/v_fma_mixhi_f16 can use dpp.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059 >
2023-05-18 22:57:24 +00:00
Georg Lehmann
0ffc9bccfa
aco: add helper function for can_use_input_modifiers
...
Some instructions have restrictions that can't be expressed with the bitfield.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059 >
2023-05-18 22:57:24 +00:00
Georg Lehmann
a3d6335742
aco: add withoutVOP3 helper
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059 >
2023-05-18 22:57:24 +00:00
Georg Lehmann
6a53af3fc8
aco: introduce helper to swap valu operands with modifiers
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23059 >
2023-05-18 22:57:23 +00:00
Georg Lehmann
151bcc1e8b
aco: use VOP3+DPP
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22698 >
2023-05-12 13:31:16 +00:00
Georg Lehmann
2548f28ab3
aco/assembler: support VOP3P with DPP
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22698 >
2023-05-12 13:31:15 +00:00
Timur Kristóf
16a05f1903
aco: Don't allow any VALU instruction to write m0.
...
Fixes: d5398b62da
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22885 >
2023-05-10 12:41:25 +00:00
Rhys Perry
d5398b62da
aco/ra: create M0-affinities for s_sendmsg
...
v2 by Timur Kristóf:
Do not add the affinity for instructions that can't write m0
reliably, such as readlane-like instructions on GFX8.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690 >
2023-05-04 19:08:58 +00:00
Qiang Yu
360176b671
aco,radv: support symbol relocation in aco
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22727 >
2023-04-28 11:33:28 +08:00
Rhys Perry
1a6095b36e
aco: remove SMEM_instruction::prevent_overflow
...
This doesn't seem useful anymore, and it seems we forgot to set it in a
few places.
This commit changes the behaviour of the optimizer so that
prevent_overflow is always true.
fossil-db (navi21):
Totals from 7421 (5.47% of 135636) affected shaders:
Instrs: 5402823 -> 5440126 (+0.69%); split: -0.00%, +0.69%
CodeSize: 28731300 -> 28974152 (+0.85%); split: -0.00%, +0.85%
VGPRs: 317528 -> 317552 (+0.01%)
SpillSGPRs: 419 -> 415 (-0.95%)
Latency: 40712478 -> 40783115 (+0.17%); split: -0.01%, +0.19%
InvThroughput: 7612708 -> 7616751 (+0.05%); split: -0.00%, +0.06%
VClause: 123824 -> 123848 (+0.02%); split: -0.09%, +0.11%
SClause: 161915 -> 172741 (+6.69%); split: -0.03%, +6.71%
Copies: 393015 -> 394429 (+0.36%); split: -0.20%, +0.56%
PreSGPRs: 288658 -> 289603 (+0.33%); split: -0.04%, +0.36%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8864
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22553 >
2023-04-19 19:29:48 +00:00
Harri Nieminen
aea48a4ff1
amd: fix typos
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22432 >
2023-04-13 23:08:22 +00:00