Commit graph

164 commits

Author SHA1 Message Date
Konstantin Seurer
978e9b670e aco,nir: Add support for new GFX12 ray tracing instructions
Adds image_bvh_dual_intersect_ray and image_bvh8_intersect_ray which can
handle the new BVH format. Both instructions write up to 10 VGPRs so
they need to use a vec16 definition in nir.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>
2025-04-17 20:20:40 +00:00
Daniel Schürmann
713396ec8e aco/assembler: Don't insert chained branches into otherwise empty blocks
No fossil changes, but keeps block offsets of the empty blocks intact.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
2025-02-27 10:40:01 +00:00
Daniel Schürmann
6659db285a aco/assembler: Fix short jumps over chained branches
If we insert

   <code>
   s_branch 1
   s_branch Target

at the end of some block, and later hide an additional chained branch
after the existing one, then we have to update the 's_branch 1' to
also jump over the newly added branch.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Closes: #12673
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
2025-02-27 10:40:01 +00:00
Daniel Schürmann
de1e38e214 aco/assembler: Find loop exits using the successor's loop nest depth
Previously, we just used the next block after a loop that
has a back-edge. This assumes that loop-exit blocks can
only be removed when falling through to the next block,
when in fact it can also be a jump to somewhere else,
in future even to some block before the actual loop.

12 (0.02% of 79395) affected shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32477>
2025-01-23 00:11:06 +00:00
Daniel Schürmann
22881712c8 aco/assembler: Don't emit target basic block index when chaining branches
This could erroneously cause an assertion to fail if the
target block index was larger than UINT16_MAX.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32599>
2024-12-11 23:28:55 +00:00
Daniel Schürmann
cab5639a09 aco/assembler: chain branches instead of emitting long jumps
As regular branch instructions cannot jump further than
32768 dwords, previously we used long jumps as fallback
solution. The disadvantage of that is that an extra SGPR
pair must be provided in order to temporarily store the PC.

This patch changes that to chained branch instructions by
inserting an artificial extra block into the code to be
targeted by the original branch. This block contains a
single branch instruction jumping to the original target.
Before the block, if necessary, we insert a <branch 1>
instruction for the existing code in order to jump over
the newly inserted block.

Only a few RT shaders are affected.

Totals from 29 (0.04% of 79395) affected shaders: (Navi31)

CodeSize: 17281176 -> 17276332 (-0.03%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
c3d777d8ac aco/assembler: change ctx.loop_header to uint32_t instead of Block*
We are about to add new blocks during assembly which makes
pointers into a vector unreliable.
Also, only set it if the loop has no back-edge.

Totals from 126 (0.16% of 79206) affected shaders: (Navi31)

CodeSize: 1486152 -> 1488152 (+0.13%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
592f3fd994 aco/assembler: Actually insert s_inst_prefetch instructions when aligning blocks for loops
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
b92afdbd28 aco/assembler: constify assembly functions
Ensure that instruction formats and special operands
are not manipulated during assembly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Konstantin Seurer
f8ef1afec8 aco: Handle nir_debug_info_instr
Propagated debug info using p_debug_info and Program::debug_info.
Offsets into the shader binary are gathered during assembly.
This will be usefull for mapping back the disassembled shader to
nir, glsl or spirv.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Rhys Perry
7f092cbd91 aco: workaround hazards in emit_long_jump
fossil-db (navi31):
Totals from 29 (0.04% of 79395) affected shaders:
CodeSize: 17612888 -> 17615096 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31316>
2024-09-30 09:04:35 +00:00
Georg Lehmann
970503a0b9 aco/assembler: support ds_append/ds_*_addtid
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>
2024-09-19 16:21:48 +00:00
Samuel Pitoiset
26d8f1a306 aco,radv,radeonsi: move has_epilog to the fragment shader info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31150>
2024-09-16 07:53:00 +00:00
Daniel Schürmann
677c9d9e93 aco/assembler: fix GFX67 MTBUF opcode encoding
Fixes: 56ac6f26e0 ('aco/assembler: slightly refactor MTBUF assembly for more readability')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29708>
2024-06-13 09:18:05 +00:00
Daniel Schürmann
56ac6f26e0 aco/assembler: slightly refactor MTBUF assembly for more readability
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Daniel Schürmann
14f4906e53 aco/assembler: fix MTBUF opcode encoding on GFX11
We have accidentally set the tfe bit for some opcodes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Rhys Perry
00eccf524f aco: use GFX12 scope/temporal-hint
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
b41f0f6cc1 aco: use ac_hw_cache_flags
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
e79a8219d2 aco/gfx12: sign-extend s_getpc_b64
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry
2c4f561708 aco: don't change prefetch mode on GFX11.5+
This instruction was effectively removed:
https://github.com/llvm/llvm-project/commit/ba131b7017ce

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry
e1e5bc0dd0 aco: support GFX12 in assembler
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Georg Lehmann
80b8bbf0c5 aco/gfx11: use v_swap_b16
I tested that v_swap_b16 can be encoded as VOP3, because the ISA doc doesn't list
it as a possible VOP3 opcode. VOP3 is nessecary to access v128+.

Foz-DB Navi31:
Totals from 32 (0.04% of 79395) affected shaders:
Instrs: 201865 -> 195168 (-3.32%)
CodeSize: 1082220 -> 1031228 (-4.71%); split: -4.71%, +0.00%
Latency: 2258198 -> 2238586 (-0.87%)
InvThroughput: 796731 -> 788934 (-0.98%)
Copies: 34514 -> 29220 (-15.34%)
VALU: 122457 -> 117163 (-4.32%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29143>
2024-05-13 18:42:19 +00:00
Georg Lehmann
e2cb9c57a2 aco: use v_interp_p2_f16 opsel
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28435>
2024-04-10 07:49:27 +00:00
Georg Lehmann
81a334a594 aco/assembler: add vintrp high_16bit support
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28435>
2024-04-10 07:49:26 +00:00
Samuel Pitoiset
7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Daniel Schürmann
1187189235 aco: unify different SALU types into single struct SALU_instruction
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction

and their corresponding methods.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
5d265257a0 aco: remove SOPP_instruction::block member
Re-use SOPP_instruction::imm instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann
cef01e817d aco: use instr_class::branch to identify SOPP branches
Also changes the instr_class of s_trap to instr_class::other.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Rhys Perry
588b1ce533 aco: split instruction assembly into functions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27913>
2024-03-22 10:29:43 +00:00
Rhys Perry
5651aa7644 aco/gfx11: fix scratch ST mode assembly
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27913>
2024-03-22 10:29:43 +00:00
Rhys Perry
24c02dbfa6 aco: improve printing of VOPD instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27485>
2024-02-08 12:15:23 +00:00
Rhys Perry
6547e17e60 aco: add VOPD format
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367>
2024-02-05 10:51:14 +00:00
Qiang Yu
71fd3c2a35 aco: do not fix_exports when separately compiled ngg vs or es
For radeonsi not abort when this case, as it does not jump at
last.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25631>
2023-10-26 02:01:49 +00:00
Qiang Yu
6eb0910d45 aco: do not fix_exports when program has epilog
PS with epilog does not need to fix_exports. And radeonsi use
p_end_with_regs so does not have jump instruction at last.

radeonsi may also have exec restore instruction, so may break
before reach to p_end_with_regs.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Rhys Perry
0e79f76aa5 aco: add fetch_inactive field to DPP instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Rhys Perry
26fce534b5 aco: shrink DPP8_instruction
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Rhys Perry
ea633c128c aco/optimizer_postRA: don't combine DPP across exec on GFX8/9
GFX8/9 seem to use FI=0 behaviour.

fossil-db (vega10):
Totals from 1 (0.00% of 63053) affected shaders:
Instrs: 542 -> 570 (+5.17%)
CodeSize: 2928 -> 3040 (+3.83%)
Latency: 2087 -> 2118 (+1.49%)
InvThroughput: 1103 -> 1143 (+3.63%)

Affected shader is from Cyberpunk 2077 fossil.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: 23.2 <mesa-stable>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9784
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25471>
2023-09-29 18:23:21 +00:00
Rhys Perry
21db2e7017 aco: reset prefetch in the correct block after removing the exit
fossil-db (navi31):
Totals from 279 (0.35% of 79332) affected shaders:
(no stat changes)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: c778803d67 ("aco/assembler: change prefetch mode on GFX10.3+ during loops if beneficial")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25312>
2023-09-25 14:18:46 +00:00
Qiang Yu
b5eaec6c80 aco,radv,radeonsi: rename is_monolithic to merged_shader_compiled_separately
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24990>
2023-09-04 10:53:44 +08:00
Samuel Pitoiset
ee8ba0f98f aco: adjust fix_exports() for VS/TES as NGG and non-monolithic shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24907>
2023-09-01 06:52:40 +00:00
Qiang Yu
c8687a4b09 aco: do not fix_exports when program is prolog
Otherwise fix_export() will abort when find no export.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712>
2023-08-31 02:39:16 +00:00
Qiang Yu
4756388038 aco,radeonsi: save const addr to symbol
For radeonsi to relocation const data when combine multiple
shader parts to a single one. So the final shader binary will
begin with exec code of all parts then const data.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
2023-08-16 02:27:45 +00:00
Qiang Yu
d3333609e6 aco: don't emit s_endpgm for tcs with epilog
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
2023-08-16 02:27:45 +00:00
Samuel Pitoiset
f4ec2e7bb3 radv,aco: move has_epilog to radv_shader_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24404>
2023-08-02 16:59:18 +00:00
Daniel Schürmann
7e4870e8e5 amd: Do shader binary alignment for prefetch at memory allocation time.
This makes it consistent between drivers and compilers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23799>
2023-07-11 12:01:45 +00:00
Daniel Schürmann
437bf4fccb amd: move end-of-code marker padding to ACO.
This makes it consistent between drivers and compilers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23799>
2023-07-11 12:01:45 +00:00
Daniel Schürmann
c778803d67 aco/assembler: change prefetch mode on GFX10.3+ during loops if beneficial
Totals from 8864 (6.68% of 132726) affected shaders: GFX11

CodeSize: 90776128 -> 90923760 (+0.16%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23748>
2023-07-11 07:15:43 +00:00
Daniel Schürmann
b9c5b273b0 aco/assembler: align loops if it reduces the number of cache lines
This is especially beneficial on GFX6-9.

Totals from 11229 (8.46% of 132726) affected shaders: GFX11

CodeSize: 109608640 -> 109840916 (+0.21%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23748>
2023-07-11 07:15:43 +00:00
Daniel Schürmann
de8ecc127e aco/assembler: align resume shaders with cache lines
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23748>
2023-07-11 07:15:43 +00:00
Timur Kristóf
05928f4200 aco: Use ac_hw_stage instead of aco-specific HWStage.
The new ac_hw_stage is going to be used by drivers as well.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
2023-06-23 12:49:04 +00:00