Commit graph

15670 commits

Author SHA1 Message Date
Assadian, Navid
cb32bcd3fe amd/vpelib: Add 420 semi-planar 12bit handling
Adds semi-planar 420 12-bit formats.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:15 +00:00
Brendan
fcad791d07 amd/vpelib: Create virtual stream concept
[Why]
Need to create streams that don't come from input params (ex. for bg
gen) to prepare for future concepts.

[How]
Add enum for stream type, create helper functions to populate virtual
streams, and add custom functions where virtual stream function varies
from input stream function.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Brendan Leder <brendansteve.leder@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky
b670701b65 amd/vpelib: Increase the CD field in vpe descriptor programming
Introduce the vpe desc writer hook.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Shih, Jude
cb9175a7af amd/vpelib: Update Plane Descriptor Writer
Refactor to support new plane descriptor hook, and update enum
vpe_scan_direction.

Co-authored-by: Jesse Agate <jesse.agate@amd.com>
Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Patel, Utpal
18dae30b17 amd/vpelib: Add resource function hooks for checking support
Add function hooks for checking support including rotation, background
color, DCC capability and input/output support check.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Utpal Patel <utpal.patel@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Alan Liu
06097ad64d amd/vpelib: Remove unused structs
Remove the definition of unused structs:
- struct x_axis_config
- struct point_config
- struct curve_points32
- struct lut_point
- struct pwl_parameter2

Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Chang, Tomson
6483c2c786 amd/vpelib: Add and fix collaborate sync data
[Why&How]
The original implementation always has sync data == 1.
Make it increase, with 4 random bits mixed in, to help debug
collaborate sync issues across multiple contexts.
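
A minimal sketch of the idea, written in Python rather than vpelib's C. The field layout (counter in the upper bits, 4 random bits in the lowest) and every name here are assumptions for illustration, not vpelib's actual encoding.

```python
import random

RANDOM_BITS = 4  # the commit mentions 4 random bits; their position is assumed


class SyncDataGenerator:
    """Hypothetical generator: an increasing counter tagged with 4 random bits."""

    def __init__(self, rng=None):
        self._counter = 0
        self._rng = rng or random.Random()

    def next(self):
        # The counter part grows on every call, so consecutive sync values
        # stay ordered; the random tag helps tell contexts apart in a dump.
        self._counter += 1
        tag = self._rng.getrandbits(RANDOM_BITS)
        return (self._counter << RANDOM_BITS) | tag


def counter_of(sync_data):
    """Recover the counter part of a value produced by SyncDataGenerator."""
    return sync_data >> RANDOM_BITS
```

With a constant value of 1, two contexts waiting on each other look identical in a trace; a value that changes per submission makes a mismatch visible.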

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Tomson Chang <tomson.chang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky
015b1b52c8 amd/vpelib: Remove extra collaborate sync commands in IB
Remove extra collaborate sync commands and fix coding format.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky
e9e2fe389f amd/vpelib: Use VPE_IP_LEVEL_1_0 for VPE IP 6.1.3
Use VPE_IP_LEVEL_1_0 for VPE IP version 6.1.0 and 6.1.3.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Patel, Utpal
73d112f372 amd/vpelib: Add input pixel format support
Add input pixel format support for VPE.

Signed-off-by: Utpal Patel <utpal.patel@amd.com>
Reviewed-by: Jesse Agate <jesse.agate@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Hsieh, Mike
0164bfda65 amd/vpelib: Add cache mechanism for 3D Lut command
[WHY & HOW]
Converting 3D Lut parameters into vpe command takes time.
The 3D Lut does not change every frame, so adding a cache mechanism improves efficiency.
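
A minimal sketch of such a cache, in Python for brevity. The class name, the single-entry policy and the use of a SHA-256 digest as the key are all assumptions; the commit does not say how vpelib keys or stores its cached command.

```python
import hashlib


class Lut3dCommandCache:
    """Hypothetical single-entry cache for an expensive LUT-to-command conversion."""

    def __init__(self, build_command):
        self._build_command = build_command  # the slow conversion function
        self._key = None
        self._command = None
        self.misses = 0

    def get(self, lut_bytes):
        # Rebuild the command only when the LUT contents actually changed.
        key = hashlib.sha256(lut_bytes).digest()
        if key != self._key:
            self._command = self._build_command(lut_bytes)
            self._key = key
            self.misses += 1
        return self._command
```

A frame loop would call `get()` every frame and pay for the conversion only on the frames where the LUT differs from the previous one.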

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Mike Hsieh <mike.hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Kovac, Krunoslav
9817793cd9 amd/vpelib: Reuse existing float to reg format conversion
Remove vpe_fixpt_from_float and use existing conversion
for double(float)->reg custom 1.6.12 format.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Rhys Perry
dea1fedf51 aco/tests: add more VALUMaskWriteHazard tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
11262a01ce aco: preserve bitsets after a lane mask is written
fossil-db (navi31):
Totals from 4840 (6.10% of 79395) affected shaders:
Instrs: 13733449 -> 13761177 (+0.20%); split: -0.00%, +0.21%
CodeSize: 71997868 -> 72102520 (+0.15%); split: -0.00%, +0.15%
Latency: 128385177 -> 128408780 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 21105847 -> 21109475 (+0.02%); split: -0.00%, +0.02%
VALU: 7741209 -> 7741210 (+0.00%)
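
The percentages in these fossil-db reports are plain relative changes. The helpers below are a hypothetical sketch for re-deriving them and are not part of fossil-db itself.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def format_change(before, after):
    """Format a change in the signed, two-decimal style used in the report."""
    return f"{percent_change(before, after):+.2f}%"


def format_share(affected, total):
    """Share of affected shaders, in percent with two decimals."""
    return f"{affected / total * 100.0:.2f}%"
```

For the totals above, `format_change(13733449, 13761177)` gives `+0.20%` and `format_share(4840, 79395)` gives `6.10%`, matching the Instrs line and the header.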

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
61e73c2323 aco: check SALU writing lanemask later for VALUMaskWriteHazard
This should be done after reads are checked and
sgpr_read_by_valu_as_lanemask_then_wr_by_salu is reset. The old version
also skipped checking the reads if the write check passed.

fossil-db (navi31):
Totals from 193 (0.24% of 79395) affected shaders:
Instrs: 3212435 -> 3212735 (+0.01%)
CodeSize: 16462868 -> 16463848 (+0.01%); split: -0.00%, +0.01%
Latency: 19492377 -> 19492462 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 4419705 -> 4419718 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
b1ba7d1b99 aco: don't consider sa_sdst=0 before SALU write to fix VALUMaskWriteHazard
LLVM does but that's probably a bug.

fossil-db (navi31):
Totals from 311 (0.39% of 79395) affected shaders:
Instrs: 380453 -> 381075 (+0.16%)
CodeSize: 1961012 -> 1964744 (+0.19%)
Latency: 4799095 -> 4800313 (+0.03%)
InvThroughput: 958358 -> 958904 (+0.06%)
VALU: 242322 -> 242633 (+0.13%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
8f5ee70d85 aco: also consider VALU reads for VALUMaskWriteHazard
fossil-db (navi31):
Totals from 9776 (12.31% of 79395) affected shaders:
Instrs: 19348258 -> 19383680 (+0.18%); split: -0.00%, +0.19%
CodeSize: 101223460 -> 101366964 (+0.14%); split: -0.01%, +0.15%
Latency: 172853115 -> 172866070 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 27590468 -> 27592390 (+0.01%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11550
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11436
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11337
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11738
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11741
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry
ee648326d9 aco: ignore exec and literals when mitigating VALUMaskWriteHazard
LLVM ignores exec and literals don't seem to work in some cases.

fossil-db (navi31):
Totals from 2676 (3.37% of 79395) affected shaders:
Instrs: 10638979 -> 10646019 (+0.07%); split: -0.00%, +0.07%
CodeSize: 55929640 -> 55959416 (+0.05%); split: -0.00%, +0.06%
Latency: 107707408 -> 107712893 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 18119843 -> 18120442 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Daniel Schürmann
14de650d58 aco: call nir_copy_prop() and nir_opt_dce() before instruction selection
Totals from 1037 (1.31% of 79395) affected shaders: (Navi21)

MaxWaves: 18760 -> 18960 (+1.07%)
Instrs: 4865258 -> 4860063 (-0.11%); split: -0.11%, +0.00%
CodeSize: 27094112 -> 27089224 (-0.02%); split: -0.06%, +0.04%
VGPRs: 68816 -> 68000 (-1.19%)
SpillVGPRs: 2140 -> 2105 (-1.64%)
Scratch: 4237312 -> 4234240 (-0.07%)
Latency: 55894512 -> 55748035 (-0.26%); split: -0.31%, +0.05%
InvThroughput: 11611286 -> 11372897 (-2.05%); split: -2.09%, +0.03%
VClause: 145331 -> 145285 (-0.03%); split: -0.04%, +0.01%
SClause: 150339 -> 150338 (-0.00%)
Copies: 472476 -> 468470 (-0.85%); split: -0.88%, +0.03%
Branches: 206562 -> 206067 (-0.24%); split: -0.24%, +0.00%
PreVGPRs: 61747 -> 61361 (-0.63%)
VALU: 3116434 -> 3112660 (-0.12%); split: -0.13%, +0.00%
SALU: 723154 -> 722887 (-0.04%); split: -0.04%, +0.01%
VMEM: 238656 -> 238586 (-0.03%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30786>
2024-08-26 12:59:00 +00:00
Samuel Pitoiset
cc5d481f41 radv/ci: enable RADV_PERFTEST=transfer_queue on GFX9+
To avoid breaking this because it's not enabled by default.

There are a couple of failures because MSAA is still broken with SDMA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30778>
2024-08-26 09:26:52 +00:00
Samuel Pitoiset
731523a10b radv/ci: update flakes lists for NAVI21/VANGOGH
Found these when I did a stress test with RADV_PERFTEST=transfer_queue
enabled but they are existing flakes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30778>
2024-08-26 09:26:52 +00:00
Dave Airlie
68cd36d9b4 radv/video: fix reporting video format props for encode.
When encode isn't enabled, refuse the image usage, also use
the correct error on the decode check.

Fixes: 05cd42417f ("radv/video: enable video encoding behind perftest flag")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30838>
2024-08-26 08:49:54 +00:00
Samuel Pitoiset
7f7ecaf08c radv: optimize NOPs padding with DGC
There are two different alignment requirements:
a) IB VA must be aligned to ib_alignment
b) IB size must be aligned to ib_pad_dw_mask

RADV was always aligning the DGC cmdbuf to ib_alignment, but this is
unnecessary. Using the optimal padding size for the DGC cmdbuf removes a
bunch of useless NOPs.
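
The difference between the two requirements can be sketched as follows, in Python for brevity. The function names and the example values (a mask of 7, an alignment of 256 bytes) are assumptions for illustration; the real values come from the kernel.

```python
def is_va_aligned(va, ib_alignment):
    """Requirement a): the IB virtual address is a multiple of ib_alignment."""
    return va % ib_alignment == 0


def pad_dwords_needed(size_dw, ib_pad_dw_mask):
    """Requirement b): NOP dwords needed so that (size_dw & mask) == 0."""
    return (-size_dw) & ib_pad_dw_mask


def pad_dwords_to_alignment(size_dw, ib_alignment):
    """Old behaviour as described: pad the size up to ib_alignment bytes."""
    align_dw = ib_alignment // 4
    return (-size_dw) % align_dw
```

Under these assumed values, a 13-dword cmdbuf needs 3 padding dwords to satisfy the size mask, but 51 if its size is rounded up to the 256-byte address alignment.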

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30768>
2024-08-26 08:22:06 +00:00
Samuel Pitoiset
a7547a9781 radv/amdgpu: assert that the DGC IB VA is correctly aligned
It must be aligned to what the kernel returns.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30768>
2024-08-26 08:22:06 +00:00
Qiang Yu
58e412014a ac,radv,radeonsi: stop using quad vote any/all when llvm
ClusteredAnd with a bool argument and cluster_size==4 will be lowered
to quad_vote_all. The same goes for ALU nir_iand/ior ops with a bool src.

OpenGL and Vulkan subgroup clustered_and tests with a bool argument
fail when using LLVM. It seems LLVM has a bug when a bool quad vote
is in complex control flow. So stop using it for now.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:15 +08:00
Qiang Yu
a37933b721 ac/llvm: build wqm for quad intrinsics only when fragment shader
Otherwise we get wrong results in non-fragment shaders.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:11 +08:00
Karol Herbst
74dafa3c79 ac/llvm: fix umul_high
LLVM optimizes umul_hi with a constant to v_mul_hi_i32_i24_e32 which isn't
always what we need here. This causes miscalculations. To prevent LLVM from
applying this optimization, we insert an optimization barrier.
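
Why the substitution can miscalculate is easiest to see with a small model, given here in Python. The 24-bit behaviour below (low 24 bits of each operand, sign-extended, multiplied, high 32 bits returned) is an assumption based on the instruction's name and should be checked against the ISA documentation.

```python
MASK32 = 0xFFFFFFFF


def umul_high(a, b):
    """High 32 bits of the unsigned 32x32 -> 64 bit product."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def sign_extend_24(x):
    """Interpret the low 24 bits of x as a signed integer."""
    x &= 0xFFFFFF
    return x - (1 << 24) if x & 0x800000 else x


def mul_hi_i24(a, b):
    """Assumed model of a signed 24-bit multiply-high."""
    product = sign_extend_24(a) * sign_extend_24(b)
    return (product >> 32) & MASK32
```

The two agree while both operands are small non-negative values, but diverge once an operand uses bits above bit 23: `umul_high(0x80000000, 4)` is 2, whereas the 24-bit model sees a zero operand and returns 0.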

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11761
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30810>
2024-08-24 16:10:20 +00:00
Samuel Pitoiset
28c957409f radv/amdgpu: do not check that a CS is aligned if no padding is added
Some video queues don't require padding.

Fixes: d5efbc7f1c ("radv/amdgpu: fix CS padding for non-GFX/COMPUTE queues")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30795>
2024-08-23 13:48:51 +00:00
Samuel Pitoiset
3af0f0129c radv: fix DRLR with subpass input attachments and feedback loops
Dynamic rendering local read allows the application to use subpass input
attachments with feedback loops. But unlike legacy RPs, where it's
possible to determine feedback loops at creation time, with dynamic
rendering it's not possible.

To fix that, the driver needs to determine at draw time if a feedback
loop is present, and it needs to decompress DCC/HTILE if necessary.
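
Conceptually, the draw-time check is a set intersection. The sketch below is in Python with hypothetical names and string image handles; it is not RADV's code, only an illustration of the condition described above.

```python
def feedback_loop_images(input_attachments, render_targets):
    """Images that are both read as input attachments and bound for rendering."""
    return set(input_attachments) & set(render_targets)


def images_to_decompress(input_attachments, render_targets, compressed):
    """Images in a feedback loop that still carry DCC/HTILE compression."""
    loop = feedback_loop_images(input_attachments, render_targets)
    return sorted(loop & set(compressed))
```

With legacy render passes this overlap is known when the pass is created; with dynamic rendering the bound attachments are only known when the draw is recorded, which is why the check has to move there.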

See https://gitlab.khronos.org/vulkan/vulkan/-/issues/3928 for more
information.

Note that VKCTS is still missing coverage but this has been reported.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11127
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>
2024-08-23 12:20:02 +00:00
Samuel Pitoiset
4a191e34c9 radv: add support for input attachment indices with DRLR
They will be used to detect feedback loops.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>
2024-08-23 12:20:02 +00:00
Samuel Pitoiset
ab2c8af634 radv: add radv_shader_info::ps::uses_fbfetch_output
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>
2024-08-23 12:20:02 +00:00
Samuel Pitoiset
541a204733 radv: use the Mesa-specific dynamic rendering flag for meta operations
Meta operations never use subpass input attachments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>
2024-08-23 12:20:02 +00:00
Samuel Pitoiset
421c42170e radv: stop emitting DB_COUNT_CONTROL in the GFX preamble
This is already emitted as part of the occlusion query state and this
state is dirty when a cmdbuf begins.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>
2024-08-23 09:50:40 +00:00
Samuel Pitoiset
e3e28bb514 radv: stop emitting PA_SC_CLIPRECT_RULE in the GFX preamble
It's already emitted as part of the discard rectangle state and all
dynamic states are dirty when a cmdbuf begins.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>
2024-08-23 09:50:40 +00:00
Samuel Pitoiset
4662483535 radv: stop emitting DB_RENDER_OVERRIDE in the GFX preamble
It's already emitted as part of the depth clamp enable state and all
dynamic states are dirty when a cmdbuf begins.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>
2024-08-23 09:50:40 +00:00
Samuel Pitoiset
cd57411aaa radv: remove redundant PA_SU_PRIM_FILTER_CNTL in the GFX preamble
It's already emitted below.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>
2024-08-23 09:50:40 +00:00
David Rosca
6e2ae9c581 radeonsi/vcn: Use pipe header params in H264 header encoder
This now supports writing all fields as we get them on input from
packed headers.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 10:00:02 +02:00
David Rosca
af849516f0 radeonsi/vcn: Use pipe header params in HEVC header encoder
This now supports writing all fields as we get them on input from
packed headers.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 10:00:02 +02:00
David Rosca
32c6a61e2b radeonsi/vcn: Switch to app DPB management for H264 and HEVC encode
This removes the internal DPB management logic, which was unnecessary as
it was duplicating what applications already do, and it was also causing
issues when the internal DPB would de-sync from the application DPB (e.g.
the driver removes a reference that the application still intends to use).

DPB is now dynamically resized instead of using fixed number of slots.
This also saves a lot of memory with HEVC encoding, as that was always
using the max_references which va frontend sets to 15.

Move reconstructed pictures to the end of the context and meta buffers
to ensure resizing works correctly.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>
2024-08-23 09:59:58 +02:00
Timothy Arceri
038b3c24d7 ci: bump piglit version
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30752>
2024-08-23 14:55:21 +10:00
Daniel Stone
4bcd57b0b5 ci/amd: Move manual/nightly jobs to postmerge stage
Create a new stage called amd-postmerge and move the full and manual
jobs over there, to avoid entanglement with the pre-merge jobs.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30784>
2024-08-22 15:35:18 +00:00
Rhys Perry
7b92e11e16 aco: forget valu delays after certain s_waitcnt_depctr/LDSDIR
fossil-db (navi31):
Totals from 55242 (69.58% of 79395) affected shaders:
Instrs: 40507666 -> 40138006 (-0.91%); split: -0.91%, +0.00%
CodeSize: 212516104 -> 211025880 (-0.70%); split: -0.70%, +0.00%
Latency: 281643258 -> 281628053 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 46370668 -> 46369637 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>
2024-08-22 13:57:01 +00:00
Rhys Perry
30396ba604 aco: move insert_delay_alu to after insert_NOPs
s_delay_alu doesn't affect any hazards, but hazard workarounds don't
update s_delay_alu and so can make the s_delay_alu affect the wrong
instructions.

fossil-db (navi31):
Totals from 55777 (70.25% of 79395) affected shaders:
Instrs: 40740011 -> 40765017 (+0.06%)
CodeSize: 213768484 -> 213870856 (+0.05%); split: -0.00%, +0.05%
Latency: 283713083 -> 283714959 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 46551791 -> 46551835 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>
2024-08-22 13:57:01 +00:00
Rhys Perry
807651561e aco: split insert_wait_states into two
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>
2024-08-22 13:57:00 +00:00
Samuel Pitoiset
d5efbc7f1c radv/amdgpu: fix CS padding for non-GFX/COMPUTE queues
I forgot that SDMA and VIDEO existed somehow.

Fixes: d690f293c6 ("radv/winsys: pad gfx and compute IBs with only one NOP")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30769>
2024-08-22 11:10:29 +00:00
Samuel Pitoiset
322227ba17 radv: use a sized NOP packet for the DGC preamble
This is faster than a pile of 1-dword NOPs. Note that GFX6 actually
supports type-3 NOP as long as the size is more than the header which
is always the case for the DGC preamble.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30748>
2024-08-21 18:08:45 +00:00
Samuel Pitoiset
6fa1bf3b88 radv: pad GFX preambles IBs with only one NOP
This is optimal.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30743>
2024-08-21 14:55:04 +00:00
Samuel Pitoiset
d690f293c6 radv/winsys: pad gfx and compute IBs with only one NOP
1-dword NOPs are slow and it's better to emit a sized NOP packet when
possible.
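
The two padding strategies can be sketched as follows, in Python for brevity. The type-3 header layout and the NOP opcode value follow my reading of Mesa's AMD packet macros and are assumptions here, as are the function names and the zero filler for the packet body.

```python
PKT3_NOP = 0x10  # assumed NOP opcode


def pkt3_header(opcode, count):
    """Type-3 packet header; count is the number of body dwords minus one."""
    return (3 << 30) | ((count & 0x3FFF) << 16) | ((opcode & 0xFF) << 8)


def pad_with_one_dword_nops(num_dw):
    """Old style: one header-only NOP per padding dword."""
    return [pkt3_header(PKT3_NOP, 0x3FFF)] * num_dw


def pad_with_sized_nop(num_dw):
    """New style: a single NOP packet whose body covers all the padding."""
    if num_dw == 0:
        return []
    if num_dw == 1:
        # A sized packet needs at least one body dword, so fall back.
        return pad_with_one_dword_nops(1)
    return [pkt3_header(PKT3_NOP, num_dw - 2)] + [0] * (num_dw - 1)
```

Both variants occupy the same number of dwords; the sized form gives the command processor one packet to parse instead of one per dword.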

Based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30743>
2024-08-21 14:55:04 +00:00
Konstantin
19d633af0b radv: Handle repeated instructions when splitting disassembly
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30734>
2024-08-21 13:38:53 +00:00
Konstantin
1cf507b806 radv: Handle instruction encodings > 8 bytes when splitting disassembly
Choosing the wrong instruction length prevents
radv_dump_annotated_shader from matching waves.

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30734>
2024-08-21 13:38:53 +00:00