fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-16 22:48:05 +02:00

Author	SHA1	Message	Date
Assadian, Navid	cb32bcd3fe	amd/vpelib: Add 420 semi-planar 12bit handling Adds semi-Planar 420 12 bits formats. Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Navid Assadian <navid.assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:15 +00:00
Brendan	fcad791d07	amd/vpelib: Create virtual stream concept [Why] Need to create streams that don't come from input params (ex. for bg gen) to prepare for future concepts. [How] Add enum for stream type, create helper functions to populate virtual streams, and add custom functions where virtual stream function varies from input stream function. Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Brendan Leder <brendansteve.leder@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Lin, Ricky	b670701b65	amd/vpelib: Increase the CD field in vpe descriptor programming Introduce the vpe desc writer hook. Co-authored-by: Roy Chan <roy.chan@amd.com> Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Ricky Lin <ricky.lin@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Shih, Jude	cb9175a7af	amd/vpelib: Update Plane Descriptor Writer Refactor to support new plane descriptor hook, and update enum vpe_scan_direction. Co-authored-by: Jesse Agate <jesse.agate@amd.com> Co-authored-by: Roy Chan <roy.chan@amd.com> Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Jude Shih <shenshih@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Patel, Utpal	18dae30b17	amd/vpelib: Add resource function hooks for checking support Add function hooks for checking support including rotation, background color, DCC capability and input/output support check. Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Utpal Patel <utpal.patel@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Alan Liu	06097ad64d	amd/vpelib: Remove unused structs Remove the definition of unused structs: - struct x_axis_config - struct point_config - struct curve_points32 - struct lut_point - struct pwl_parameter2 Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Alan Liu <haoping.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Chang, Tomson	6483c2c786	amd/vpelib: Add and fix collaborate sync data [Why&How] The original implementation always has sync data == 1. Make it increasing with some 4 bits in random to help debug collaborate sync issues across multiple contexts. Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Tomson Chang <tomson.chang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Lin, Ricky	015b1b52c8	amd/vpelib: Remove extra collaborate sync commands in IB Remove extra collaborate sync commands and fix coding format. Co-authored-by: Roy Chan <roy.chan@amd.com> Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Ricky Lin <ricky.lin@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Lin, Ricky	e9e2fe389f	amd/vpelib: Use VPE_IP_LEVEL_1_0 for VPE IP 6.1.3 Use VPE_IP_LEVEL_1_0 for VPE IP version 6.1.0 and 6.1.3. Reviewed-by: Tomson Chang <tomson.chang@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Ricky Lin <ricky.lin@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Patel, Utpal	73d112f372	amd/vpelib: Add input pixel format support Add input pixel format support for VPE. Signed-off-by: Utpal Patel <utpal.patel@amd.com> Reviewed-by: Jesse Agate <jesse.agate@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Hsieh, Mike	0164bfda65	amd/vpelib: Add cache mechanism for 3D Lut command [WHY & HOW] Converting 3D Lut parameters into vpe command takes time. The 3D Lut will not change every frame, so adding a cache mechanism can improve efficiency. Reviewed-by: Tomson Chang <tomson.chang@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Mike Hsieh <mike.hsieh@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Kovac, Krunoslav	9817793cd9	amd/vpelib: Reuse existing float to reg format conversion Remove vpe_fixpt_from_float and use existing conversion for double(float)->reg custom 1.6.12 format. Reviewed-by: Roy Chan <roy.chan@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Krunoslav Kovac <krunoslav.kovac@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>	2024-08-26 19:57:14 +00:00
Rhys Perry	dea1fedf51	aco/tests: add more VALUMaskWriteHazard tests Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>	2024-08-26 19:16:34 +00:00
Rhys Perry	11262a01ce	aco: preserve bitsets after a lane mask is written fossil-db (navi31): Totals from 4840 (6.10% of 79395) affected shaders: Instrs: 13733449 -> 13761177 (+0.20%); split: -0.00%, +0.21% CodeSize: 71997868 -> 72102520 (+0.15%); split: -0.00%, +0.15% Latency: 128385177 -> 128408780 (+0.02%); split: -0.00%, +0.02% InvThroughput: 21105847 -> 21109475 (+0.02%); split: -0.00%, +0.02% VALU: 7741209 -> 7741210 (+0.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Backport-to: 24.1 Backport-to: 24.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>	2024-08-26 19:16:34 +00:00
Rhys Perry	61e73c2323	aco: check SALU writing lanemask later for VALUMaskWriteHazard This should be done after reads are checked and sgpr_read_by_valu_as_lanemask_then_wr_by_salu is reset. The old version also skipped checking the reads if the write check passed. fossil-db (navi31): Totals from 193 (0.24% of 79395) affected shaders: Instrs: 3212435 -> 3212735 (+0.01%) CodeSize: 16462868 -> 16463848 (+0.01%); split: -0.00%, +0.01% Latency: 19492377 -> 19492462 (+0.00%); split: -0.00%, +0.00% InvThroughput: 4419705 -> 4419718 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Backport-to: 24.1 Backport-to: 24.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>	2024-08-26 19:16:34 +00:00
Rhys Perry	b1ba7d1b99	aco: don't consider sa_sdst=0 before SALU write to fix VALUMaskWriteHazard LLVM does but that's probably a bug. fossil-db (navi31): Totals from 311 (0.39% of 79395) affected shaders: Instrs: 380453 -> 381075 (+0.16%) CodeSize: 1961012 -> 1964744 (+0.19%) Latency: 4799095 -> 4800313 (+0.03%) InvThroughput: 958358 -> 958904 (+0.06%) VALU: 242322 -> 242633 (+0.13%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Backport-to: 24.1 Backport-to: 24.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>	2024-08-26 19:16:34 +00:00
Rhys Perry	8f5ee70d85	aco: also consider VALU reads for VALUMaskWriteHazard fossil-db (navi31): Totals from 9776 (12.31% of 79395) affected shaders: Instrs: 19348258 -> 19383680 (+0.18%); split: -0.00%, +0.19% CodeSize: 101223460 -> 101366964 (+0.14%); split: -0.01%, +0.15% Latency: 172853115 -> 172866070 (+0.01%); split: -0.01%, +0.01% InvThroughput: 27590468 -> 27592390 (+0.01%); split: -0.00%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11550 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11436 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11337 Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11738 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11741 Backport-to: 24.1 Backport-to: 24.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>	2024-08-26 19:16:34 +00:00
Rhys Perry	ee648326d9	aco: ignore exec and literals when mitigating VALUMaskWriteHazard LLVM ignores exec and literals don't seem to work in some cases. fossil-db (navi31): Totals from 2676 (3.37% of 79395) affected shaders: Instrs: 10638979 -> 10646019 (+0.07%); split: -0.00%, +0.07% CodeSize: 55929640 -> 55959416 (+0.05%); split: -0.00%, +0.06% Latency: 107707408 -> 107712893 (+0.01%); split: -0.00%, +0.01% InvThroughput: 18119843 -> 18120442 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Backport-to: 24.1 Backport-to: 24.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>	2024-08-26 19:16:34 +00:00
Daniel Schürmann	14de650d58	aco: call nir_copy_prop() and nir_opt_dce() before instruction selection Totals from 1037 (1.31% of 79395) affected shaders: (Navi21) MaxWaves: 18760 -> 18960 (+1.07%) Instrs: 4865258 -> 4860063 (-0.11%); split: -0.11%, +0.00% CodeSize: 27094112 -> 27089224 (-0.02%); split: -0.06%, +0.04% VGPRs: 68816 -> 68000 (-1.19%) SpillVGPRs: 2140 -> 2105 (-1.64%) Scratch: 4237312 -> 4234240 (-0.07%) Latency: 55894512 -> 55748035 (-0.26%); split: -0.31%, +0.05% InvThroughput: 11611286 -> 11372897 (-2.05%); split: -2.09%, +0.03% VClause: 145331 -> 145285 (-0.03%); split: -0.04%, +0.01% SClause: 150339 -> 150338 (-0.00%) Copies: 472476 -> 468470 (-0.85%); split: -0.88%, +0.03% Branches: 206562 -> 206067 (-0.24%); split: -0.24%, +0.00% PreVGPRs: 61747 -> 61361 (-0.63%) VALU: 3116434 -> 3112660 (-0.12%); split: -0.13%, +0.00% SALU: 723154 -> 722887 (-0.04%); split: -0.04%, +0.01% VMEM: 238656 -> 238586 (-0.03%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30786>	2024-08-26 12:59:00 +00:00
Samuel Pitoiset	cc5d481f41	radv/ci: enable RADV_PERFTEST=transfer_queue on GFX9+ To avoid breaking this because it's not enabled by default. There are a couple of failures because MSAA is still broken with SDMA. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30778>	2024-08-26 09:26:52 +00:00
Samuel Pitoiset	731523a10b	radv/ci: update flakes lists for NAVI21/VANGOGH Found these when I did a stress test with RADV_PERFTEST=transfer_queue enabled but they are existing flakes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30778>	2024-08-26 09:26:52 +00:00
Dave Airlie	68cd36d9b4	radv/video: fix reporting video format props for encode. When encode isn't enabled, refuse the image usage, also use the correct error on the decode check. Fixes: `05cd42417f` ("radv/video: enable video encoding behind perftest flag") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30838>	2024-08-26 08:49:54 +00:00
Samuel Pitoiset	7f7ecaf08c	radv: optimize NOPs padding with DGC There are two different alignment requirements: a) IB VA must be aligned to ib_alignment b) IB size must be aligned to ib_pad_dw_mask RADV was always aligning the DGC cmdbuf to ib_alignment, but this is unnecessary. Using the optimal padding size for DGC cmdbuf removes a bunch of useless NOPs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30768>	2024-08-26 08:22:06 +00:00
Samuel Pitoiset	a7547a9781	radv/amdgpu: assert that the DGC IB VA is correctly aligned It must be aligned to what the kernel returns. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30768>	2024-08-26 08:22:06 +00:00
Qiang Yu	58e412014a	ac,radv,radeonsi: stop using quad vote any/all when llvm ClusteredAnd with a bool argument and cluster_size==4 will be lowered to quad_vote_all. So will an ALU nir_iand/ior op with a bool src. OpenGL and Vulkan subgroup clustered_and tests with a bool argument fail when using LLVM. It seems LLVM has a bug when quad vote bool is in complex control flow. So stop using it for now. Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>	2024-08-26 10:46:15 +08:00
Qiang Yu	a37933b721	ac/llvm: build wqm for quad intrinsics only when fragment shader Otherwise we get wrong result when non-fragment shader. Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>	2024-08-26 10:46:11 +08:00
Karol Herbst	74dafa3c79	ac/llvm: fix umul_high LLVM optimizes umul_hi with a constant to v_mul_hi_i32_i24_e32 which isn't always what we need here. This causes miscalculations. To prevent LLVM from applying this optimization, we insert an optimization barrier. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11761 Suggested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30810>	2024-08-24 16:10:20 +00:00
Samuel Pitoiset	28c957409f	radv/amdgpu: do not check that a CS is aligned if no padding is added Some video queues don't require padding. Fixes: `d5efbc7f1c` ("radv/amdgpu: fix CS padding for non-GFX/COMPUTE queues") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30795>	2024-08-23 13:48:51 +00:00
Samuel Pitoiset	3af0f0129c	radv: fix DRLR with subpass input attachments and feedback loops Dynamic rendering local read allows the application to use subpass input attachments with feedback loops. But unlike legacy RPs, where it's possible to determine feedback loops at creation time, with dynamic rendering it's not possible. To fix that, the driver needs to determine at draw time if a feedback loop is present, and it needs to decompress DCC/HTILE if necessary. See https://gitlab.khronos.org/vulkan/vulkan/-/issues/3928 for more information. Note that VKCTS is still missing coverage but this has been reported. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11127 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>	2024-08-23 12:20:02 +00:00
Samuel Pitoiset	4a191e34c9	radv: add support for input attachment indices with DRLR They will be used to detect feedback loops. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>	2024-08-23 12:20:02 +00:00
Samuel Pitoiset	ab2c8af634	radv: add radv_shader_info::ps::uses_fbfetch_output Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>	2024-08-23 12:20:02 +00:00
Samuel Pitoiset	541a204733	radv: use the Mesa-specific dynamic rendering flag for meta operations Meta operations never use subpass input attachments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30124>	2024-08-23 12:20:02 +00:00
Samuel Pitoiset	421c42170e	radv: stop emitting DB_COUNT_CONTROL in the GFX preamble This is already emitted as part of the occlusion query state and this state is dirty when a cmdbuf begins. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>	2024-08-23 09:50:40 +00:00
Samuel Pitoiset	e3e28bb514	radv: stop emitting PA_SC_CLIPRECT_RULE in the GFX preamble It's already emitted as part of the discard rectangle state and all dynamic states are dirty when a cmdbuf begins Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>	2024-08-23 09:50:40 +00:00
Samuel Pitoiset	4662483535	radv: stop emitting DB_RENDER_OVERRIDE in the GFX preamble It's already emitted as part of the depth clamp enable state and all dynamic states are dirty when a cmdbuf begins. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>	2024-08-23 09:50:40 +00:00
Samuel Pitoiset	cd57411aaa	radv: remove redundant PA_SU_PRIM_FILTER_CNTL in the GFX preamble It's already emitted below. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30788>	2024-08-23 09:50:40 +00:00
David Rosca	6e2ae9c581	radeonsi/vcn: Use pipe header params in H264 header encoder This now supports writing all fields as we get them on input from packed headers. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>	2024-08-23 10:00:02 +02:00
David Rosca	af849516f0	radeonsi/vcn: Use pipe header params in HEVC header encoder This now supports writing all fields as we get them on input from packed headers. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>	2024-08-23 10:00:02 +02:00
David Rosca	32c6a61e2b	radeonsi/vcn: Switch to app DPB management for H264 and HEVC encode This removes the internal DPB management logic, which was unnecessary as it was duplicating what applications already do, and it was also causing issues when the internal DPB would de-sync from application DPB (eg. driver removes reference that application still intends to use). DPB is now dynamically resized instead of using fixed number of slots. This also saves a lot of memory with HEVC encoding, as that was always using the max_references which va frontend sets to 15. Move reconstructed pictures to the end of the context and meta buffers to ensure resizing works correctly. Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30672>	2024-08-23 09:59:58 +02:00
Timothy Arceri	038b3c24d7	ci: bump piglit version Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30752>	2024-08-23 14:55:21 +10:00
Daniel Stone	4bcd57b0b5	ci/amd: Move manual/nightly jobs to postmerge stage Create a new stage called amd-postmerge and move the full and manual jobs over there, to avoid entanglement with the pre-merge jobs. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30784>	2024-08-22 15:35:18 +00:00
Rhys Perry	7b92e11e16	aco: forget valu delays after certain s_waitcnt_depctr/LDSDIR fossil-db (navi31): Totals from 55242 (69.58% of 79395) affected shaders: Instrs: 40507666 -> 40138006 (-0.91%); split: -0.91%, +0.00% CodeSize: 212516104 -> 211025880 (-0.70%); split: -0.70%, +0.00% Latency: 281643258 -> 281628053 (-0.01%); split: -0.01%, +0.00% InvThroughput: 46370668 -> 46369637 (-0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>	2024-08-22 13:57:01 +00:00
Rhys Perry	30396ba604	aco: move insert_delay_alu to after insert_NOPs s_delay_alu doesn't affect any hazards, but hazard workarounds don't update s_delay_alu and so can make the s_delay_alu affect the wrong instructions. fossil-db (navi31): Totals from 55777 (70.25% of 79395) affected shaders: Instrs: 40740011 -> 40765017 (+0.06%) CodeSize: 213768484 -> 213870856 (+0.05%); split: -0.00%, +0.05% Latency: 283713083 -> 283714959 (+0.00%); split: -0.00%, +0.00% InvThroughput: 46551791 -> 46551835 (+0.00%); split: -0.00%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>	2024-08-22 13:57:01 +00:00
Rhys Perry	807651561e	aco: split insert_wait_states into two No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23337>	2024-08-22 13:57:00 +00:00
Samuel Pitoiset	d5efbc7f1c	radv/amdgpu: fix CS padding for non-GFX/COMPUTE queues I forgot that SDMA and VIDEO existed somehow. Fixes: `d690f293c6` ("radv/winsys: pad gfx and compute IBs with only one NOP") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30769>	2024-08-22 11:10:29 +00:00
Samuel Pitoiset	322227ba17	radv: use a sized NOP packet for the DGC preamble This is faster than a pile of 1-dword NOPs. Note that GFX6 actually supports type-3 NOP as long as the size is more than the header which is always the case for the DGC preamble. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30748>	2024-08-21 18:08:45 +00:00
Samuel Pitoiset	6fa1bf3b88	radv: pad GFX preambles IBs with only one NOP This is optimal. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30743>	2024-08-21 14:55:04 +00:00
Samuel Pitoiset	d690f293c6	radv/winsys: pad gfx and compute IBs with only one NOP 1-dword NOPs are slow and it's better to emit a sized NOP packet when possible. Based on RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30743>	2024-08-21 14:55:04 +00:00
Konstantin	19d633af0b	radv: Handle repeated instructions when splitting disassembly cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30734>	2024-08-21 13:38:53 +00:00
Konstantin	1cf507b806	radv: Handle instruction encodings > 8 bytes when splitting disassembly Choosing the wrong instruction length prevents radv_dump_annotated_shader from matching waves. cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30734>	2024-08-21 13:38:53 +00:00

15670 commits