fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 09:08:07 +02:00

Author	SHA1	Message	Date
Samuel Pitoiset	b3d4d65f5a	radv: fix CP DMA clears/copies on GFX12 CP DMA on GFX12 doesn't always use L2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32971>	2025-01-13 08:07:58 +00:00
Samuel Pitoiset	603541f1a2	ac/gpu_info: add cp_dma_use_L2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32971>	2025-01-13 08:07:58 +00:00
Rhys Perry	2b10930b48	aco: use VOP3 v_mov_b16 if necessary Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Backport-to: 24.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32944>	2025-01-10 15:05:00 +00:00
Rhys Perry	46787fc2d0	aco/util: fix bit_reference::operator&= Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Backport-to: 24.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32944>	2025-01-10 15:05:00 +00:00
Timur Kristóf	dd980d2b28	radv: Only print "testing use only" message on GFX12+. This message has been confusing users, especially now that popular toolkits such as Gtk started using a Vulkan renderer. Printing a message on non-conformant implementations is also actually not required. So let's remove it. We haven't fully finished the GFX12 implementation yet, but on all other hardware, RADV should work just fine, and is definitely not meant for "testing use only". Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12314 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32930>	2025-01-09 23:16:48 +00:00
Marek Olšák	e640d5a9c3	amd: vectorize SMEM loads aggressively, allow overfetching for ACO If there is a 4-byte hole between 2 loads, they are vectorized. Example: load 4 + hole 4 + load 8 -> load 16 This helps GLSL uniform loads, which are often sparse. See the code for more info. RADV could get better code by vectorizing later. radeonsi+ACO - TOTALS FROM AFFECTED SHADERS (45482/58355) Spilled SGPRs: 841 -> 747 (-11.18 %) Code Size: 67552396 -> 65291092 (-3.35 %) bytes Max Waves: 714439 -> 714520 (0.01 %) This should have no effect on LLVM because ac_build_buffer_load scalarizes SMEM, but it's improved for some reason: radeonsi+LLVM - TOTALS FROM AFFECTED SHADERS (4673/58355) Spilled SGPRs: 1450 -> 1282 (-11.59 %) Spilled VGPRs: 106 -> 107 (0.94 %) Scratch size: 101 -> 102 (0.99 %) dwords per thread Code Size: 14994624 -> 14956316 (-0.26 %) bytes Max Waves: 66679 -> 66735 (0.08 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>	2025-01-09 22:01:54 +00:00
Marek Olšák	abd5216ae8	ac,radeonsi: scalarize overfetching loads There is nothing preventing ACO from generating loads with unused components. This happens often with GLSL uniforms. Some of those loads are partially re-vectorized after this. radeonsi+ACO: TOTALS FROM AFFECTED SHADERS (19564/58918) VGPRs: 732900 -> 728448 (-0.61 %) Spilled SGPRs: 429 -> 433 (0.93 %) Code Size: 38446004 -> 38485612 (0.10 %) bytes Max Waves: 305440 -> 305549 (0.04 %) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>	2025-01-09 22:01:54 +00:00
Marek Olšák	58a88bbdb9	ac/nir/ngg: export positions after streamout to improve performance Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>	2025-01-09 20:47:16 +00:00
Marek Olšák	fc73749d6c	ac/nir/ngg: fold so_vertex_index * so_stride into immediate offset Instead of using a different voffset VGPR per streamout vertex, point voffset to the first vertex for all 3 vertices because the stride and vertex index are constant and can be in the immediate offset. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>	2025-01-09 20:47:16 +00:00
Marek Olšák	97e82af162	ac/nir/ngg: vectorize streamout stores for NGG optimally Walk the whole vertex stride thanks to XFB info sorted by offset, gather individual components from same or different outputs, and once we have gathered 4, store them as vec4. It also removes the memory_modes field from VMEM stores because I don't think it's needed. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>	2025-01-09 20:47:16 +00:00
Marek Olšák	4f2e2e10bc	ac/nir: vectorize streamout stores for legacy pipeline optimally Walk the whole vertex stride thanks to XFB info sorted by offset, gather individual components from same or different outputs, and once we have gathered 4, store them as vec4. It also removes the COHERENT flag from VMEM stores because NGG streamout doesn't use it either and I don't think it's needed. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>	2025-01-09 20:47:16 +00:00
Marek Olšák	e399f3bed9	ac/nir: sort xfb info to facilitate vectorization of xfb stores xfb stores are not vectorized properly, leading to a random soup of b32, b64, b96, and b128 stores. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32686>	2025-01-09 20:47:16 +00:00
Samuel Pitoiset	f09f31d093	ac/nir: fix a comment typo in load_subgroup_id_lowered() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>	2025-01-09 08:02:19 +00:00
Samuel Pitoiset	44ba856089	ac/nir: fix lowering subgroup ID for compute shaders on GFX12 This is lowered in backend compilers (LLVM or ACO) because it needs to access ttmp registers which aren't exposed to NIR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32940>	2025-01-09 08:02:19 +00:00
Samuel Pitoiset	bc1374355b	radv: program DB_RENDER_OVERRIDE correctly on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32941>	2025-01-09 07:39:23 +00:00
Rhys Perry	8ac4744706	aco/tests: fix skip_lines=True with remaining characters in matches If the remaining character check fails, we should try a later line if skip_lines=True. So the check has to be done earlier. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32902>	2025-01-08 15:28:37 +00:00
Friedrich Vock	71392fff25	aco: Fix dead instruction/index handling for try_insert_saveexec_out_of_loop The loop checking if exec is overwritten didn't check for NULL instructions, and didn't fix up reg write indices after inserting instructions. Fixes: `fcd94a8c` ("aco: move try_optimize_branching_sequence() to postRA optimizations") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32746>	2025-01-08 10:48:01 +00:00
Georg Lehmann	208d8cd715	radv: run peephole_select in optimize_nir_algebraic Foz-DB Navi21: Totals from 451 (0.57% of 79395) affected shaders: MaxWaves: 8680 -> 8616 (-0.74%) Instrs: 689610 -> 688225 (-0.20%); split: -0.21%, +0.01% CodeSize: 3524580 -> 3521740 (-0.08%); split: -0.11%, +0.03% VGPRs: 28512 -> 28584 (+0.25%) Latency: 1906219 -> 1892124 (-0.74%); split: -0.91%, +0.17% InvThroughput: 481931 -> 483570 (+0.34%); split: -0.00%, +0.34% VClause: 10317 -> 10296 (-0.20%) SClause: 18105 -> 18088 (-0.09%); split: -0.17%, +0.07% Copies: 69532 -> 67579 (-2.81%); split: -2.85%, +0.04% Branches: 21353 -> 20501 (-3.99%) PreSGPRs: 27004 -> 27005 (+0.00%) VALU: 436235 -> 436334 (+0.02%); split: -0.01%, +0.03% SALU: 102349 -> 101944 (-0.40%); split: -0.61%, +0.21% Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32792>	2025-01-08 09:56:39 +00:00
Marek Olšák	c20c46cf7b	ac: update ATOMIC_MEM definitions Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32877>	2025-01-07 20:24:19 +00:00
Pierre-Eric Pelloux-Prayer	dd11eec06b	gl/spirv: update subgroup_size if GroupNonUniform is used This is similar to what link_intrastage_shaders is doing and it fixes the following test: KHR-Single-GL46.subgroups.builtin_var.compute.subgroupsize_compute Which was failing with SPIRV but passing with GLSL, the diff being: - SPIRV: "subgroup_size: 1" - GLSL: "subgroup_size: 2" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32698>	2025-01-07 19:32:43 +00:00
Pierre-Eric Pelloux-Prayer	dc293ffe50	radeonsi: fallback to util_blitter_draw_rectangle The blitter VS expects coords to fit in a signed int16. When this is not the case, use util_blitter_draw_rectangle instead. Since util_blitter_draw_rectangle sets vertex elements, we need to make sure they're properly restored. The alternative to this fallback would be to pass coordinates unpacked (so 4 SGPRs instead of 2), but this doesn't fix the fbo-blit-check-limits test because of a uv interpolation precision issue. Using 2 triangles instead of a rectangle + disabling window_space_position helps, but then this breaks some GLES3 tests, like dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x (which doesn't pass either if u_blitter is used for all cases). Using a single triangle covering the whole rectangle fixes all cases, but it then requires setting up scissors to avoid writing too many pixels... So, instead of adding so much complexity, let's use u_blitter for the "large coordinates" fallback, and keep the rectangle blit for the other cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32698>	2025-01-07 19:32:43 +00:00
Samuel Pitoiset	7f50162424	radv: fix programming WALK_ALIGN8_PRIM_FITS_ST on GFX12 This also needs to be disabled when a VRS image is used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32914>	2025-01-07 18:56:24 +00:00
Samuel Pitoiset	d7bc370b9e	radv: configure the VRS surface swizzle mode on GFX12 GFX11 allowed only one swizzle mode for the VRS image but GFX12 allows all 2D non-linear swizzle modes and PC_SC_VRS_INFO needs to be configured. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32914>	2025-01-07 18:56:24 +00:00
Samuel Pitoiset	0b53e645a0	radv: disable VRS coarse shading with 8x MSAA on GFX12 This isn't supported and the hw always clamps to 1x1. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32914>	2025-01-07 18:56:24 +00:00
Samuel Pitoiset	f94bd67b82	aco: fix VS prologs on GFX12 MTBUF/MUBUF instructions must use zero for SOFFSET, use const_offset instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32904>	2025-01-07 13:44:32 +00:00
Feng Jiang	701600fb11	radv/rt: Fix memleak in radv_init_header() Fixes: `f8b584d` ("vulkan/runtime,radv: Add shared BVH building framework") Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32887>	2025-01-07 09:49:56 +00:00
Samuel Pitoiset	c5fe9dcf16	ac/descriptors: fix configuring NBC views on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32892>	2025-01-07 09:15:12 +00:00
Chia-I Wu	f6332ca650	radv: use common calibrated timestamp support Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32689>	2025-01-07 03:39:29 +00:00
Martin Roukala (né Peres)	f1a6af133a	radeonsi/ci: run a fraction of glcts-vangogh in pre-merge Now that ACO has become the default on pre-RDNA GPUs, all pre-merge CI coverage of radeonsi+LLVM has disappeared. Let's fix this by making our post-merge glcts-vangogh-valve job run in pre-merge pipelines. However, we are limited in vangogh capacity, so rather than running the full glcts/piglit test suites we run a fraction of it to stay under 15 minutes of execution time on a single Steam Deck. Suggested-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32865>	2025-01-06 11:55:22 +00:00
Martin Roukala (né Peres)	0c538f82bc	radeonsi/ci: run on ACO changes Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32865>	2025-01-06 11:55:22 +00:00
Martin Roukala (né Peres)	bec7f09e76	radeonsi/ci: update the vangogh expectations Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32865>	2025-01-06 11:55:21 +00:00
David Rosca	e33452a6d3	ac/surface: Don't force linear for VIDEO_REFERENCE with emulated image opcodes This caused a regression by using a higher pitch than needed on compute-only devices, resulting in video decode errors. Fixes: `308bae950f` ("ac/surface: Add RADEON_SURF_VIDEO_REFERENCE") Tested-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32863>	2025-01-04 09:13:44 +00:00
Samuel Pitoiset	03b037a0e3	radv: disable logic op for float/srgb formats The Vulkan spec says: "The application can enable a logical operation between the fragment’s color values and the existing value in the framebuffer attachment. This logical operation is applied prior to updating the framebuffer attachment. Logical operations are applied only for signed and unsigned integer and normalized integer framebuffers. Logical operations are not applied to floating-point or sRGB format color attachments." Missing VKCTS coverage has been reported. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12345 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32826>	2025-01-03 09:35:45 +00:00
Samuel Pitoiset	0019900312	radv/meta: do not create redundant pipeline layout objects Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32834>	2025-01-03 09:11:59 +00:00
Samuel Pitoiset	105e809a9d	radv/meta: add radv_meta_get_noop_pipeline_layout() To avoid duplicated objects. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32834>	2025-01-03 09:11:59 +00:00
Samuel Pitoiset	dd7343f278	radv/meta: reduce length of some cache keys For faster hashing. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32834>	2025-01-03 09:11:59 +00:00
Samuel Pitoiset	c8d2614113	radv/meta: fix loading the meta pipeline cache This has been removed by mistake. Fixes: `f528c9e8f5` ("radv/meta: stop initializing RT accel structs") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32838>	2025-01-03 08:49:42 +00:00
Samuel Pitoiset	370e392313	radv: fix adding the BO to cmdbuf list when emitting buffer markers Found by inspection. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32840>	2025-01-03 08:19:23 +00:00
David Rosca	3474978d52	radv: Fix sampling from image layers of video decode target Video decode target needs custom height alignment, but tex descriptor still needs to be set to the original size the image was created with. This makes the descriptor wrong for layer > 0, so we need to calculate the layer offset and add it to bo address for this case. Fixes: `5deb476095` ("radv: align video images internal width/height inside the driver.") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32069>	2025-01-03 01:28:07 +00:00
David Rosca	9d477fae68	radv/video: Remove dt_field_mode handling code This would be used for decoding into an interlaced buffer, but since that's not supported, it is dead code. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32069>	2025-01-03 01:28:07 +00:00
David Rosca	ca0cb78869	radv/video: Use correct array index for decode target and DPB images Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12057 Cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32069>	2025-01-03 01:28:07 +00:00
David Rosca	8dabb480e2	radv/video: Fix DPB tier2 surface params Fixes: `3e2c768aa8` ("radv/vcn: enable dynamic dpb tier 2 for h264/h265 on navi21+") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32069>	2025-01-03 01:28:07 +00:00
Marek Olšák	7fbca998b1	amd: optimize atomics before lowering intrinsics ac_nir_lower_intrinsics_to_args will lower most system values. I have to keep the divergence analysis in ACO, otherwise it goes haywire. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:56 +00:00
Marek Olšák	5dd9171765	ac/nir: set upper ranges for range analysis while lowering system values Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	0d5b03f2b9	ac/nir: split local_invocation_ids to 3 separate VGPR inputs so that we can set the upper range per VGPR. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	65d241c947	ac/nir: set arg_upper_bound_u32 for vs_rel_patch_id Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	1d9fbe5387	ac/nir: add helper ac_nir_load_arg_upper_bound Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	cfeaa45dc6	ac/nir: clean up ac_nir_lower_indirect_derefs IO variables can't occur here anymore. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	ae22da2ff8	ac/nir: lower more loads in ac_nir_lower_intrinsics_to_args instead of drivers Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	dc8a40ff3e	ac/llvm: remove already lowered cases Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
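
The overfetching rule described in e640d5a9c3 ("load 4 + hole 4 + load 8 -> load 16") can be illustrated with a small sketch. This is not the Mesa implementation: the function name `merge_loads` and the `max_hole` and `max_bytes` parameters are hypothetical, and the real pass applies further constraints (alignment, valid load sizes) that are not modeled here.

```python
def merge_loads(loads, max_hole=4, max_bytes=16):
    """Merge (offset, size) byte ranges separated by at most max_hole bytes.

    Hypothetical illustration only: max_hole and max_bytes are assumed
    parameters, not values taken from the Mesa source.
    """
    merged = []
    for offset, size in sorted(loads):
        if merged:
            prev_offset, prev_size = merged[-1]
            prev_end = prev_offset + prev_size
            gap = offset - prev_end
            # Size of a single load covering both ranges and the hole.
            new_size = max(prev_end, offset + size) - prev_offset
            if gap <= max_hole and new_size <= max_bytes:
                merged[-1] = (prev_offset, new_size)
                continue
        merged.append((offset, size))
    return merged


# The example from the commit message: 4 bytes at offset 0, a 4-byte hole,
# then 8 bytes at offset 8 become one 16-byte load.
print(merge_loads([(0, 4), (8, 8)]))   # [(0, 16)]
# An 8-byte hole is too large under the assumed max_hole, so nothing merges.
print(merge_loads([(0, 4), (12, 4)]))  # [(0, 4), (12, 4)]
```

The trade-off is that the merged load fetches bytes nobody reads, which the commit accepts because sparse uniform loads otherwise produce many small loads.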
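
The streamout store vectorization in 97e82af162 and 4f2e2e10bc (walk the vertex stride in offset order, gather components from the same or different outputs, store once four are gathered) might look roughly like the sketch below. The name `vectorize_stores` and the data layout (dword offsets paired with component labels) are assumptions made for illustration, not the NIR code.

```python
def vectorize_stores(components, max_vec=4):
    """Group (dword_offset, value) pairs into contiguous stores of up to max_vec.

    Hypothetical illustration: offsets are in dwords and values are just
    labels standing in for output components.
    """
    stores = []
    current = []

    def flush():
        # Emit one store: (starting dword offset, gathered component values).
        stores.append((current[0][0], [value for _, value in current]))

    for offset, value in sorted(components):
        contiguous = bool(current) and offset == current[-1][0] + 1
        if current and (not contiguous or len(current) == max_vec):
            flush()
            current = []
        current.append((offset, value))
    if current:
        flush()
    return stores


xfb = [(0, "pos.x"), (1, "pos.y"), (2, "pos.z"), (3, "color.r"),
       (4, "color.g"), (6, "uv.x")]
for store in vectorize_stores(xfb):
    print(store)
# (0, ['pos.x', 'pos.y', 'pos.z', 'color.r'])
# (4, ['color.g'])
# (6, ['uv.x'])
```

Sorting by offset first, as e399f3bed9 does for the XFB info, is what lets components from different outputs land in the same vec4 store.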


16622 commits