fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 11:18:11 +02:00

Author	SHA1	Message	Date
Timur Kristóf	766617e8da	radv: Enable NGG culling by default on GFX10. We never took the time to actually test this, but it works fine. Improves performance on Navi 10 in the following test cases: Baldur's Gate 3 Vulkan: up to 10% Witcher 3 D3D11: around 4% Granite primitive stress test: 107% FSR2 sample app: 57% Notes: NGG is still disabled on Navi 14. Not tested on Navi 12. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31971>	2024-11-06 03:16:54 +00:00
Timur Kristóf	6bf19b2d70	radv: Increase NGG culling PS param limit to 12 on GFX10. Helps performance in Baldur's Gate 3 on Navi 10 when NGG culling is enabled. Also fix the description of the RADV_PERFTEST=nggc env var. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31971>	2024-11-06 03:16:53 +00:00
Evan	c3c80491f9	amd/vpelib: Input Format Adjustment Reviewed-by: Jiali Zhao <Jiali.Zhao@amd.com> Reviewed-by: Jesse Agate <Jesse.Agate@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Signed-off-by: Evan <evan.damphousse@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>	2024-11-06 02:19:39 +00:00
Chang, Tomson	d1b790c028	amd/vpelib: Fix color fill performance issue on VPE1.1 (#419) [WHY] For color fill only case we see performance on Vpe1.1 are not doubled due to CD are all 0, no odd CD [HOW] 1. Dummy stream dst rect should be in the middle of target rect so the two (dummy seg + bg only seg) are balanced, instead of target at upper left corner which makes it imbalance 2. BG gap generation should consider more for collaboration mode num_multiple 3. When pure bg case, skip dummy stream handling and go ahead do BG gap generation 4. Update memory requirement for the new pure BG case flow to avoid run out of embedded buffer 4. Additional -- fix the random Collaborate data generation bug (benign) [TESTING] Vpelibtest app + nv12torgb case with debug flag bgcolorfill set on in vpelibtestapp Media player with/without bgcolorfillonly flag Teams Reviewed-by: Roy Chan <Roy.Chan@amd.com> Reviewed-by: Navid Assadian <Navid.Assadian@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Signed-off-by: Tomson Chang <tomson.chang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>	2024-11-06 02:19:39 +00:00
Visan, Tiberiu	4661bf3659	amd/vpelib: Remove TODO comments and legacy check (#421) [WHY] 1. Remove TODO comments that don't need action item 2. Delete the legacy command number check as it is now using a vector (i.e. without hard limit) [HOW] Remove TODO comments and delete the legacy command number check Signed off by <tvisan@amd.com> Reviewed-by: Roy Chan <Roy.Chan@amd.com> Reviewed-by: Jesse Agate <Jesse.Agate@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>	2024-11-06 02:19:39 +00:00
Chenyu Chen	e0754a6dc7	amd/vpelib: Remove unused define macro Reviewed-by: Roy Chan <Roy.Chan@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Signed-off-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>	2024-11-06 02:19:39 +00:00
Samuel Pitoiset	64774f9c19	radv: cleanup tools related resources when destroying logical device This was missing. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31986>	2024-11-05 15:31:00 +00:00
Marek Olšák	5882b5b93b	amd/ci: adjust stoney traces checksums Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>	2024-11-05 14:13:40 +00:00
Samuel Pitoiset	32a537b25b	aco: use inlined constant offsets for storing SGPRs in the trap handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31976>	2024-11-05 11:55:24 +00:00
Samuel Pitoiset	9bcf17ef5a	aco: add support for the trap handler shader on GFX11 This has been verified on navi31. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960>	2024-11-05 07:58:38 +00:00
Samuel Pitoiset	6d5a2ae928	aco: clear the current wave exception in the trap handler This is required to re-enable VALU instructions in this wave, only float exceptions seem to be affected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960>	2024-11-05 07:58:38 +00:00
Samuel Pitoiset	e85fc0f869	aco: fix validation for VOP1 instructions without any dest/src Like v_clrexcp. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960>	2024-11-05 07:58:38 +00:00
Samuel Pitoiset	81f4670ed6	radv,aco: dump all SGPRS from the trap handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31960>	2024-11-05 07:58:38 +00:00
Samuel Pitoiset	45d56d9395	radv: set missing shader info values for the trap handler This fixes an assert in radv_precompute_registers_pgm() on GFX11 because it was considered a vertex shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31962>	2024-11-05 07:11:30 +00:00
Marek Olšák	755fb7a262	amd: move Tonga and Iceland TC-compat HTILE workarounds to ac_gpu_info.c Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31910>	2024-11-04 19:45:54 +00:00
Georg Lehmann	a7f6294f90	radv: use nir_opt_frag_coord_to_pixel_coord Foz-DB Navi21: Totals from 1648 (2.08% of 79395) affected shaders: MaxWaves: 44918 -> 44950 (+0.07%); split: +0.09%, -0.02% Instrs: 1004193 -> 1001179 (-0.30%); split: -0.33%, +0.03% CodeSize: 5486412 -> 5486592 (+0.00%); split: -0.08%, +0.09% VGPRs: 56664 -> 56552 (-0.20%); split: -0.93%, +0.73% Latency: 15430894 -> 15435320 (+0.03%); split: -0.12%, +0.15% InvThroughput: 3097789 -> 3092861 (-0.16%); split: -0.20%, +0.04% VClause: 18757 -> 18793 (+0.19%); split: -0.13%, +0.32% SClause: 34475 -> 34495 (+0.06%); split: -0.11%, +0.17% Copies: 66195 -> 66150 (-0.07%); split: -0.88%, +0.81% Branches: 23035 -> 23033 (-0.01%) PreVGPRs: 42235 -> 41724 (-1.21%); split: -1.32%, +0.11% VALU: 709730 -> 706662 (-0.43%); split: -0.47%, +0.04% SALU: 111731 -> 111722 (-0.01%); split: -0.02%, +0.01% VMEM: 25988 -> 25987 (-0.00%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:31 +00:00
Georg Lehmann	a58d2b59e9	aco: implement load_pixel_coord Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:30 +00:00
Georg Lehmann	42d5cb62bb	ac/llvm: implement load_pixel_coord Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:30 +00:00
Georg Lehmann	a2a9e93e72	radv: add support for load_pixel_coord Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:30 +00:00
Samuel Pitoiset	1fa0fe1e0c	aco: add support for the trap handler shader on GFX9-GFX10.3 This has been tested on navi21. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31926>	2024-11-04 10:48:52 +00:00
Samuel Pitoiset	281eb14df8	aco: fix reading registers from the trap handler shader It should read 32-bit values, otherwise some MSB are 0 and it's missing some information. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31926>	2024-11-04 10:48:52 +00:00
Samuel Pitoiset	f7636b611a	radv: add a struct that describes the trap handler layout Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31934>	2024-11-01 15:40:25 +00:00
Samuel Pitoiset	49682fc0cb	radv,aco: save SQ_WAVE_GPR_ALLOC from the trap handler This would be used to dump SGPRs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31934>	2024-11-01 15:40:25 +00:00
Samuel Pitoiset	31fc3199dd	radv: fix dumping the faulty shader detected by the trap handler on GFX9+ The most significant bits need to be cleared. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>	2024-11-01 15:01:35 +00:00
Samuel Pitoiset	7b4da7f736	radv: only emit the TBA/TMA registers on GFX8 On GFX9+, these registers are privileged and the kernel needs to configure them. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>	2024-11-01 15:01:35 +00:00
Samuel Pitoiset	930395c5e4	radv: check for has_trap_handler_support instead of asserting Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>	2024-11-01 15:01:35 +00:00
Samuel Pitoiset	e27ba67d33	ac: add ac_gpu_info::has_trap_handler_support Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>	2024-11-01 15:01:35 +00:00
Samuel Pitoiset	b23cc8c1d3	radv: add missing L2 non-coherent image case for mipmaps with DCC/HTILE on GFX11 According to PAL, an image with DCC/HTILE and mipmaps isn't coherent with L2 when the mip level is in the metadata mip-tail region. This fix isn't super optimal because the driver should rely on the subresource range to determine if the mip level is in the mip-tail, but it's easier to backport. Upcoming commits will optimize that. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11939 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31920>	2024-11-01 14:36:55 +00:00
David Rosca	c9ade8c3b5	radeonsi/vcn: Enable VCN4 AV1 encode WA Cc: mesa-stable Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31889>	2024-11-01 14:05:04 +00:00
Samuel Pitoiset	01f329ec82	radv/ci: skip dEQP-VK.api.command_buffers.many_indirect_disps_on_secondary It can also hang randomly on VanGogh, let's skip it by default for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31922>	2024-10-31 11:44:12 +00:00
Samuel Pitoiset	77e59eefc1	radv: add an option to configure the trap handler exceptions This introduces RADV_TRAP_HANDLER_EXCP to configure the various shader exceptions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31902>	2024-10-31 06:58:15 +00:00
Samuel Pitoiset	6b5a0f57ba	radv: fix configuring the memory violation exception for the compute stage The compute stage has two EXCP_EN fields and the memory violation bit is in EXCP_EN_MSB. Confirmed by writing a small test on GFX8. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31902>	2024-10-31 06:58:14 +00:00
Timur Kristóf	96b95c8427	radv: Flush L2 cache for non-L2-coherent images in EndCommandBuffer. This fixes a CTS hang on Hawaii. We previously only did a CB/DB flush, but that doesn't include a L2 cache flush. Also fix the comment that said this is for GFX9+. Fixes: `7c62f6fa01` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31906>	2024-10-30 17:46:50 +00:00
Samuel Pitoiset	7015e22cb6	ac/nir: cull triangles/lines when all W positions are zero/NaN It looks like the fixed-func hardware is very slow to cull primitives with zero pos.w but shader based culling helps a lot. This fixes a massive performance gap with the FSR2 demo compared to AMDGPU-PRO, +228% on RDNA2. Based on my investigation, AMDGPU-PRO seems to always cull these primitives. Note that disabling NGG culling with AMDGPU-PRO reports the same performance as RADV without that fix. Also note that the FSR2 sample doesn't specify any cull mode (ie. VK_CULL_MODE_NONE is used), so this is the only reason PRO was culling more than RADV. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7260 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31891>	2024-10-30 17:09:37 +00:00
Samuel Pitoiset	fc0545e6a7	radv: fix wrong index in radv_skip_graphics_pipeline_compile() Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12089 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31901>	2024-10-30 11:25:59 +00:00
Daniel Schürmann	62715984f8	aco/README: add descriptions of recently added passes ... and less recent ones. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>	2024-10-30 09:23:54 +00:00
Daniel Schürmann	21ceeb22ed	aco: move jump threading optimization into separate pass Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>	2024-10-30 09:23:54 +00:00
Daniel Schürmann	87a3c08df1	aco/ssa_elimination: remove some redundant checks during jump threading Since phis got already lowered to parallelcopies by this point, there is no need to cross-check. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>	2024-10-30 09:23:54 +00:00
Daniel Schürmann	a6c38f706d	aco/ssa_elimination: perform jump threading after parallelcopy insertion Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888>	2024-10-30 09:23:54 +00:00
Samuel Pitoiset	4459a1d210	radv: resize the SPM bo when it's too small This used to abort (see the previous commit) when the hardware wasn't able to sample all SPM counters because the BO was too small. The SPM BO can now be resized like the SQTT BO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883>	2024-10-29 18:33:17 +00:00
Samuel Pitoiset	e14511f77d	ac/spm: do not abort when the SPM BO is too small It needs to be resized instead, like the SQTT BO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883>	2024-10-29 18:33:17 +00:00
Marek Olšák	4f096b994d	ac/nir,radeonsi: use load_cull_line_viewport_xy_scale_and_offset_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	0f39d44f1b	ac/nir,radeonsi: use load_cull_small_line_precision_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	10c6f87adb	ac/nir,radeonsi: use load_cull_small_lines_enabled_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	ee452129c6	nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset for radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	2227f5be9d	nir: rename load_cull_small_primitive_precision -> triangle, add line_precision for radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Marek Olšák	0914e0d02f	nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines for radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>	2024-10-29 16:47:44 +00:00
Lu Yao	0442a6c292	ac/radeonsi: compute htile for tile mode RADEON_SURF_MODE_1D on GFX6-8 Computing 'htile_size/meta_size' is allowed for RADEON_SURF_MODE_1D when RADEON_SURF_TC_COMPATIBLE_HTILE isn't set. Lacking of computing causes performance degradation in some scenarios. Fixes: `d4d9ec55c5` ("radeonsi: implement TC-compatible HTILE") Signed-off-by: Lu Yao <yaolu@kylinos.cn> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31617>	2024-10-29 16:23:51 +00:00
Georg Lehmann	938f5ec7ce	radv: use nir_opt_fragdepth Cyberpunk 2077 writes unmodified depth. Foz-DB Navi21: Totals from 28 (0.04% of 79395) affected shaders: Instrs: 6484 -> 6448 (-0.56%) CodeSize: 36016 -> 35784 (-0.64%) Latency: 58517 -> 58400 (-0.20%) InvThroughput: 7719 -> 7717 (-0.03%) Branches: 129 -> 119 (-7.75%) PreVGPRs: 394 -> 372 (-5.58%) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31874>	2024-10-29 15:15:24 +00:00
Georg Lehmann	695d2414cd	nir,radv: optimize shared atomic offsets Foz-DB Navi21: Totals from 87 (0.11% of 79395) affected shaders: Instrs: 140877 -> 140873 (-0.00%) CodeSize: 747760 -> 747164 (-0.08%); split: -0.09%, +0.01% Latency: 4528171 -> 4528162 (-0.00%) InvThroughput: 826358 -> 826349 (-0.00%) Copies: 10888 -> 10884 (-0.04%) VALU: 84634 -> 84630 (-0.00%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080>	2024-10-29 09:31:08 +00:00

16124 commits