fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-29 10:40:42 +02:00

Author	SHA1	Message	Date
Rhys Perry	01fae0c5c2	ac/llvm: use ds_bpermute_b32 for GFX12 wave64 It works. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35489>	2025-06-14 13:59:11 +00:00
Rhys Perry	9a5073e3a4	ac/llvm: rewrite shuffle waterfall loop This can't break until we have read all lanes, otherwise it might read from an inactive lane. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35489>	2025-06-14 13:59:11 +00:00
Rhys Perry	2ff53fd97c	ac/llvm: convert to integer after reductions These return floating point types for floating point ops. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.1 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35489>	2025-06-14 13:59:10 +00:00
Rhys Perry	8609008aeb	ac/llvm: fix mul24 intrinsic overloading Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `e3edc6029b` ("ac/llvm: use mul24 intrinsics") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35489>	2025-06-14 13:59:10 +00:00
Rhys Perry	3c2b3fbd03	ac/llvm: fix overloading of intrinsic names Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.1 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35489>	2025-06-14 13:59:10 +00:00
Karol Herbst	4ff66b4343	ac/llvm: fix bitfield ops Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35423>	2025-06-13 07:33:03 +00:00
Pierre-Eric Pelloux-Prayer	4a84ebfcb1	ac/llvm: rework component trimming in visit_tex The referenced commit was a step in the right direction, but not complete. ac_build_image_opcode returns a vec<4> or a struct<vec<4>, int> so we can simplify visit_tex. We just need to map these 4/5 values to the expected layout from NIR. e.g.: depth + TFE would produce "<d, x, x, x>, t" so it has to be transformed into <d, t>. nir_texop_fragment_mask_fetch_amd + sparse doesn't exist, so it's another opportunity for simplification. This is required to get KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup_texture_2d_depth_component16 working properly. The same test fails with ACO so it probably needs a change in the same area. Fixes: `c0ef2aa7f8` ("DEPENDENCY: ac/llvm: fix sparse code handling") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35206>	2025-06-11 12:11:28 +00:00
Marek Olšák	447d744833	ac/llvm: allocate LLVM PS output variables on demand This stops relying on si_shader_info, allowing further cleanup of si_shader_info. radv_load_output was unused. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35392>	2025-06-10 03:31:20 +00:00
Marek Olšák	c3034fa82c	amd: replace most u_bit_consecutive* with BITFIELD_MASK/RANGE Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35346>	2025-06-04 17:46:38 +00:00
Rhys Perry	d0a09b6ff7	ac/llvm: correctly set alignment of vector global load/store For coherent/volatile access, this would be too high for vector access. Even when we didn't set the alignment, LLVM seemed to assume too high of an alignment for 8/16-bit vector access. Fixes generated_tests/cl/vload/vload-char-constant.cl Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Backport-to: 25.0 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>	2025-05-12 10:51:57 +00:00
Rhys Perry	c1ecad2b11	ac/llvm: correctly split vector 8/16-bit stores This assumes that the start of the load is 32-bit aligned. For example, a vec3 16-bit store with align_offset=2 should split off the first component, not the last. This probably also fixed splitting with 8-bit stores. Fixes arb_copy_buffer-overlap Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Backport-to: 25.0 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34903>	2025-05-12 10:51:57 +00:00
Georg Lehmann	44be05cc45	ac/llvm: support nir_op_bfdot2_bfadd Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:26 +00:00
Karol Herbst	e3edc6029b	ac/llvm: use mul24 intrinsics With the current code in clpeak, LLVM ended up generating v_mad_u64_u32 instructions; with this we get nice v_mad_u32_s24 ones instead and a 4x performance increase in the int24 benchmark. Suggested-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34630>	2025-04-23 01:11:48 +00:00
Georg Lehmann	7c6a2b16e0	ac/llvm: support mul24_relaxed I didn't find dedicated 24bit intrinsics, but I also haven't looked that hard. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33871>	2025-03-27 06:24:16 +00:00
Natalie Vock	0e7c94b2b3	ac/llvm: Don't use getTriple() on LLVM21+ setTargetTriple() takes a Triple now. Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33981>	2025-03-11 20:54:34 +00:00
Timur Kristóf	a91f105e5b	ac: Don't include full nir.h anymore. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33439>	2025-02-12 22:33:07 +01:00
Marek Olšák	82047fa82f	amd: drop support for LLVM 15, 16, 17 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33211>	2025-02-01 04:22:30 +00:00
Marek Olšák	de790c3c5f	Revert "ac/llvm: enable wqm for ac_build_quad_swizzle from ac_build_fs_interp_mov" This reverts commit `9d4d9e6150`. It breaks on Navi31: * KHR-GL46.shaders.uniform_block.instance_array_basic_type.shared.bvec3,Fail * KHR-GL46.shaders.uniform_block.instance_array_basic_type.std140.bvec3,Fail * KHR-GL46.shaders.uniform_block.random.all_per_block_buffers.13,Fail * KHR-GL46.shaders.uniform_block.random.all_per_block_buffers.3,Fail * KHR-GL46.shaders.uniform_block.single_basic_array.shared.bvec3,Fail * KHR-GL46.shaders.uniform_block.single_basic_array.std140.bvec3,Fail * KHR-GLES3.shaders.uniform_block.instance_array_basic_type.shared.bvec3,Fail * KHR-GLES3.shaders.uniform_block.instance_array_basic_type.std140.bvec3,Fail * KHR-GLES3.shaders.uniform_block.random.all_per_block_buffers.13,Fail * KHR-GLES3.shaders.uniform_block.random.all_per_block_buffers.3,Fail * KHR-GLES3.shaders.uniform_block.single_basic_array.shared.bvec3,Fail * KHR-GLES3.shaders.uniform_block.single_basic_array.std140.bvec3,Fail * dEQP-GLES3.functional.ubo.instance_array_basic_type.shared.bvec3_both,Fail * dEQP-GLES3.functional.ubo.instance_array_basic_type.std140.bvec3_both,Fail * dEQP-GLES3.functional.ubo.random.vector_types.24,Fail * dEQP-GLES3.functional.ubo.single_basic_array.shared.bvec3_both,Fail * dEQP-GLES3.functional.ubo.single_basic_array.std140.bvec3_both,Fail Fixes: `9d4d9e6150` Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33118>	2025-01-21 11:58:37 +00:00
Pierre-Eric Pelloux-Prayer	9d4d9e6150	ac/llvm: enable wqm for ac_build_quad_swizzle from ac_build_fs_interp_mov Without this, WQM is only used for the lds_param_load like this: s_wqm_b64 exec, exec lds_param_load v5, attr0.x wait_vdst:15 s_mov_b64 exec, s[0:1] v_mov_b32_dpp v5, v5 quad_perm:[0,0,0,0] row_mask:0xf bank_mask:0xf With this change we get: s_wqm_b64 exec, exec lds_param_load v5, attr0.x wait_vdst:15 s_mov_b64 exec, s[0:1] ... s_wqm_b64 exec, exec v_mov_b32_dpp v5, v5 quad_perm:[0,0,0,0] row_mask:0xf bank_mask:0xf s_mov_b64 exec, s[0:1] This fixes KHR-GL46.shaders.uniform_block.random.nested_structs_instance_arrays.0 and other similar tests with LLVM. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32959>	2025-01-17 09:55:45 +00:00
Pierre-Eric Pelloux-Prayer	182d662ccf	ac/llvm: add wqm param to ac_build_quad_swizzle And to ac_build_dpp because it's used from quad_swizzle. No functional changes but will be used in the next commit. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32959>	2025-01-17 09:55:45 +00:00
Marek Olšák	1f5220b03d	ac/llvm: remove the low-optimizing compiler option Not needed with ACO. It was used for big shaders on old APUs that took forever to compile. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Marek Olšák	08a47fa05c	ac/llvm: lower vector load_const in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Marek Olšák	d160252270	ac: use Z_EXPORT_FORMAT=32_AR for Z + Alpha mrtz exports This should be faster than 32_ABGR. Also, stencil exports are changed from UINT16_ABGR to 32_GR, which should have no effect on performance. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Timur Kristóf	cc0166462e	ac/nir: Move ac_nir_get_mem_access_flags to ac_nir.c And change its name to indicate that it is NIR specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:45:30 +01:00
Marek Olšák	dc8a40ff3e	ac/llvm: remove already lowered cases Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	ceb6f8fc32	amd: lower load_tess_rel_patch_id/primitive_id/tess_coord and overwrite.. in NIR The overwrite instruction complicates it a little, which is why these intrinsics are lowered together. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	61bfb4fa06	amd: lower load_subgroup_invocation in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	e69f47faee	amd: lower load_local_invocation_index in NIR This is the last intrinsic that needed the LS VGPR bug workaround in ACO and ac_nir_to_llvm. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	342dcbdc8b	amd: lower load_vertex_id/instance_id and overwrite_vs_arguments in NIR 2 things complicate this: - overwrite_vs_arguments_amd - the LS VGPR bug workaround Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	66dd70adc5	amd: lower load_gs_wave_id_amd in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	923f59c971	amd: lower load_barycentric_at_offset in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	16ab05fad1	amd: lower load_barycentric_pixel/centroid/sample in NIR radeonsi needs to preserve interp_mode in the arg load. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	7e83f6ca8b	amd: lower load_front_face in NIR radeonsi must do this after si_lower_nir_abi, which optimizes front_face, but doesn't lower it. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	6ad5225b2a	amd: lower load_frag_shading_rate in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	6d2e29ff6e	amd: lower load_sample_pos in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	110e474b4f	amd: lower load_sample_id in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	684c8da553	amd: lower load_invocation_id in NIR ACO can't look for it because it's lowered there. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	d281240c57	amd: lower load_first_vertex/base_instance/draw_id/view_index in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	0d372b043b	amd: lower load_local_invocation_id in NIR This is based on ACO. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	13cb5c7b72	amd: lower load_frag_coord in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Marek Olšák	58cb155068	amd: lower load_pixel_coord in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32782>	2025-01-02 17:36:55 +00:00
Georg Lehmann	43fca7fffe	amd: support load_front_face_fsign Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Georg Lehmann	aee0c7274c	amd: switch to FRONT_FACE_ALL_BITS(0) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:34 +00:00
Marek Olšák	19c00c586e	ac/llvm: remove unused code Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780>	2024-12-26 10:12:43 +00:00
Marek Olšák	08abddd235	radeonsi/gfx11: fix alpha-to-coverage + alpha-to-one used together alpha-to-coverage must be applied before alpha-to-one. The only way to do that is to export alpha for alpha-to-coverage via mrtz, and export 1 via mrt0.a. ACO and monolithic shader support is already in place thanks to RADV, so we only need to change the LLVM PS epilog and the shader key. Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:20 +00:00
Pierre-Eric Pelloux-Prayer	c0ef2aa7f8	DEPENDENCY: ac/llvm: fix sparse code handling The existing code produced an incorrectly sized result from visit_tex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>	2024-12-24 12:02:19 +00:00
Rhys Perry	0619e4db63	nir,aco,ac/llvm: add nir_op_alignbyte_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Georg Lehmann	2cd8a9fef7	amd: lower gl_FragCoord.w rcp in NIR This allows NIR to remove the rcps if the application uses rcp(gl_FragCoord.w). D3D provides w, not 1/w like GL/VK in the shader, so this is commonly used. Foz-DB Navi21: Totals from 2068 (2.61% of 79206) affected shaders: MaxWaves: 45636 -> 45652 (+0.04%) Instrs: 2173444 -> 2169671 (-0.17%); split: -0.18%, +0.00% CodeSize: 11881304 -> 11867208 (-0.12%); split: -0.12%, +0.01% VGPRs: 118000 -> 117968 (-0.03%) Latency: 35689676 -> 35675909 (-0.04%); split: -0.06%, +0.02% InvThroughput: 9167199 -> 9159801 (-0.08%); split: -0.08%, +0.00% VClause: 45076 -> 45078 (+0.00%); split: -0.01%, +0.02% SClause: 92503 -> 92366 (-0.15%); split: -0.31%, +0.17% Copies: 140282 -> 140303 (+0.01%); split: -0.13%, +0.14% Branches: 53347 -> 53346 (-0.00%); split: -0.01%, +0.00% PreVGPRs: 96495 -> 96465 (-0.03%) VALU: 1522980 -> 1519252 (-0.24%); split: -0.25%, +0.01% SALU: 213451 -> 213460 (+0.00%); split: -0.02%, +0.02% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31967>	2024-11-06 12:57:08 +00:00
Georg Lehmann	42d5cb62bb	ac/llvm: implement load_pixel_coord Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>	2024-11-04 12:34:30 +00:00
Georg Lehmann	a2baff4810	ac/llvm: handle shared atomic base offset Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080>	2024-10-29 09:31:08 +00:00


774 commits