fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 08:48:07 +02:00

Author	SHA1	Message	Date
Marek Olšák	bfb6c41b64	amd: remove unnecessary and transitive #includes Reported by clang tools. See: https://clangd.llvm.org/guides/include-cleaner struct ac_cmdbuf had to be moved to ac_cmdbuf_base.h because we can't include ac_cmdbuf.h->sid.h->amdgfxregs.h in radeon_winsys.h for r300. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41091>	2026-04-24 21:53:07 +00:00
Rhys Perry	5c3b5688a1	amd: rename ac_cu_info to ac_compiler_info Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>	2026-03-03 08:50:12 +00:00
Rhys Perry	a65089dfce	ac/nir: pass ac_cu_info to ac_nir_compute_tess_wg_info Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40042>	2026-03-03 08:50:11 +00:00
Marek Olšák	13cfd0176c	ac/gpu_info: add #define AMD_MEMCHANNEL_INTERLEAVE_BYTES radeon_info::pipe_interleave_bytes is renamed to r600_pipe_interleave_bytes where it can be 512 on some chips. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Marek Olšák	92133bb0ab	amd: demystify various optimizations we already have for memory channels Explain why we do what we do, and use the radeon_info field properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Daniel Schürmann	1e8d367537	amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>	2025-12-22 07:34:48 +00:00
Marek Olšák	9bd2c6dcb2	ac/nir: allow smaller workgroups for GS It's not good for performance, but it's possible to use for debugging. Running single-wave GS workgroups could work around any LDS race conditions. Setting the workgroup size to 64 reliably works around GLCTS primitive_counterline failures, indicating streamout data corruption with multi-wave GS workgroups. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38328>	2025-12-12 04:27:32 +00:00
Georg Lehmann	9ed94371f7	amd: stop using custom gl_access_qualifier for access type Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36764>	2025-08-15 08:26:10 +00:00
Georg Lehmann	f17cb6b714	amd: replace ACCESS_TYPE_SMEM with ACCESS_SMEM_AMD Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36764>	2025-08-15 08:26:10 +00:00
Qiang Yu	196569b1a4	all: rename gl_shader_stage to mesa_shader_stage It's not only for GL, change to a generic name. Use command: find . -type f -not -path '/.git/' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} + Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:40 +08:00
Antonio Ospite	ddf2aa3a4d	build: avoid redefining unreachable() which is standard in C23 In the C23 standard unreachable() is now a predefined function-like macro in <stddef.h> See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in And this causes build errors when building for C23: ----------------------------------------------------------------------- In file included from ../src/util/log.h:30, from ../src/util/log.c:30: ../src/util/macros.h:123:9: warning: "unreachable" redefined 123 | #define unreachable(str) \ | ^~~~~~~~~~~ In file included from ../src/util/macros.h:31: /usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition 456 | #define unreachable() (__builtin_unreachable ()) | ^~~~~~~~~~~ ----------------------------------------------------------------------- So don't redefine it with the same name, but use the name UNREACHABLE() to also signify it's a macro. Using a different name also makes sense because the behavior of the macro was extending the one of __builtin_unreachable() anyway, and it also had a different signature, accepting one argument, compared to the standard unreachable() with no arguments. This change improves the chances of building mesa with the C23 standard, which for instance is the default in recent AOSP versions. All the instances of the macro, including the definition, were updated with the following command line: git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \ while read file; \ do \ sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \ done && \ sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>	2025-07-31 17:49:42 +00:00
Marek Olšák	65c5ee1628	radeonsi: stop using LLVM LDS linking logic for the GS out LDS offset This will enable large code removal. shader->config.lds_size is now always computed the same as ACO except for compute shaders. We have to add a new 8-bit user SGPR bitfield called GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset that was previously set by the relocation. Since the offset must be a multiple of 256, we have to add padding to the LDS size computation to make sure the alignment to 256 for the ESGS LDS size doesn't cause us to exceed the maximum LDS size. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>	2025-07-12 10:28:20 +00:00
Marek Olšák	76ce37058d	radv: set the maximum possible workgroup size for legacy GS before linking The optimal workgroup size will be set after lowering. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35473>	2025-07-12 05:20:00 +00:00
Marek Olšák	098d33766a	ac: add legacy GS subgroup size computation from radeonsi Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:43 +00:00
Marek Olšák	fa8db1ccd3	ac: add NGG subgroup size computation from radeonsi RADV will use it. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>	2025-07-02 20:27:42 +00:00
Marek Olšák	5994e08f8b	ac: set LDS limit for TCS to 32K for all chips Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	c1237256cb	ac/nir/tess: execute the tess level workgroup vote on all chips It will be used to skip stores for discarded patches. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	742227c65c	radv,radeonsi: make TCS_OFFCHIP_LAYOUT_NUM_PATCHES not off by one We never use 128 anyway. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:39 +00:00
Marek Olšák	534b282573	ac/nir/tess: adjust memory layout of TCS outputs to have aligned store offsets There is a comment that explains it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780>	2025-06-07 16:29:38 +00:00
Marek Olšák	870d17012a	ac: adjust maximum HS workgroup size This has no effect on triangles because max 64 patches implied max 192 threads, but it improves performance for cases when the number of threads per patch is > 3. This improves the score for gfxbench5 "gl_tess_off" (offscreen) by 11% on Navi48. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>	2025-05-08 02:54:13 +00:00
Marek Olšák	b8d15fee3d	ac: minor cleanup of ac_compute_num_tess_patches No change in behavior. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Marek Olšák	a905a17f39	ac: use HS offchip wg size from radeon_info in ac_compute_num_tess_patches Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34544>	2025-04-19 22:55:00 -04:00
Samuel Pitoiset	e433a57650	ac,radeonsi: rework computing scratch wavesize and tmpring register To be re-used by RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>	2025-04-17 10:35:40 +00:00
Samuel Pitoiset	d94f8b4460	ac/gpu_info,radv: add scratch_wavesize_granularity info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34549>	2025-04-17 10:35:40 +00:00
Marek Olšák	8e8eda4089	radeonsi: fix PS prolog not counting used fragcoord VGPRs correctly Using the used component count is not enough. We need to consider the component mask because any component can be disabled. This might fix tests. This removes the component counting from ac_get_fs_input_vgpr_cnt and determines the component mask where it's needed. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32910>	2025-01-29 07:19:40 +00:00
Marek Olšák	4f63b21df0	ac/nir: drop 16x EQAA support from ac_get_ps_iter_mask We don't support 16x EQAA anymore. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33024>	2025-01-25 12:20:25 -05:00
Marek Olšák	d160252270	ac: use Z_EXPORT_FORMAT=32_AR for Z + Alpha mrtz exports This should be faster than 32_ABGR. Also, stencil exports are changed from UINT16_ABGR to 32_GR, which should have no effect on performance. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33046>	2025-01-16 02:58:03 +00:00
Timur Kristóf	fe9eda9969	ac: Stop including ac_nir.h from ac_shader_util.c Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:46:28 +01:00
Timur Kristóf	305fdfddb5	ac/nir: Move ac_set_nir_options to ac_nir.c And rename it to ac_nir_set_options to match other functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:45:34 +01:00
Timur Kristóf	855de0483f	ac/nir: Move ac_nir callback functions to ac_nir.c Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:45:32 +01:00
Timur Kristóf	cc0166462e	ac/nir: Move ac_nir_get_mem_access_flags to ac_nir.c And change its name to indicate that it is NIR specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:45:30 +01:00
Timur Kristóf	ad5c0b7103	ac/nir: Move ac_nir_lower_bit_size_callback to ac_nir.c ac_shader_util should not concern itself with NIR stuff. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:45:28 +01:00
Marek Olšák	7e21b48a2e	ac/nir: split ac_nir_lower_ps into 2 passes It's split into ac_nir_lower_ps_early and ac_nir_lower_ps_late. ac_nir_lower_ps_early doesn't generate any AMD specific intrinsics except some system values and is mainly an optimization pass with some lowering. The new change here is that it also eliminates output components not needed by spi_shader_col_format. ac_nir_lower_ps_late lowers output stores to exports and does the bc_optimize thing. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32966>	2025-01-14 13:45:25 +01:00
Marek Olšák	e640d5a9c3	amd: vectorize SMEM loads aggressively, allow overfetching for ACO If there is a 4-byte hole between 2 loads, they are vectorized. Example: load 4 + hole 4 + load 8 -> load 16 This helps GLSL uniform loads, which are often sparse. See the code for more info. RADV could get better code by vectorizing later. radeonsi+ACO - TOTALS FROM AFFECTED SHADERS (45482/58355) Spilled SGPRs: 841 -> 747 (-11.18 %) Code Size: 67552396 -> 65291092 (-3.35 %) bytes Max Waves: 714439 -> 714520 (0.01 %) This should have no effect on LLVM because ac_build_buffer_load scalarizes SMEM, but it's improved for some reason: radeonsi+LLVM - TOTALS FROM AFFECTED SHADERS (4673/58355) Spilled SGPRs: 1450 -> 1282 (-11.59 %) Spilled VGPRs: 106 -> 107 (0.94 %) Scratch size: 101 -> 102 (0.99 %) dwords per thread Code Size: 14994624 -> 14956316 (-0.26 %) bytes Max Waves: 66679 -> 66735 (0.08 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>	2025-01-09 22:01:54 +00:00
Marek Olšák	abd5216ae8	ac,radeonsi: scalarize overfetching loads There is nothing preventing ACO from generating loads with unused components. This happens often with GLSL uniforms. Some of those loads are partially re-vectorized after this. radeonsi+ACO: TOTALS FROM AFFECTED SHADERS (19564/58918) VGPRs: 732900 -> 728448 (-0.61 %) Spilled SGPRs: 429 -> 433 (0.93 %) Code Size: 38446004 -> 38485612 (0.10 %) bytes Max Waves: 305440 -> 305549 (0.04 %) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>	2025-01-09 22:01:54 +00:00
Timur Kristóf	652a0b48bc	amd: Set lower_layer_fs_input_to_sysval in common code, not in drivers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32641>	2025-01-02 14:07:51 +00:00
Marek Olšák	c21bc65ba7	nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads A negative hole size means the loads overlap. This will be used by drivers to handle overlapping loads in the callback easily. Reviewed-by: Mel Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>	2025-01-01 00:03:55 +00:00
Georg Lehmann	e112e2b047	nir,amd: optimize front_face ? a : -a Foz-DB Navi31: Totals from 3345 (4.21% of 79395) affected shaders: MaxWaves: 96182 -> 96174 (-0.01%) Instrs: 3135439 -> 3129508 (-0.19%); split: -0.24%, +0.05% CodeSize: 16776088 -> 16718048 (-0.35%); split: -0.38%, +0.03% VGPRs: 190884 -> 190848 (-0.02%); split: -0.03%, +0.01% Latency: 32624132 -> 32621734 (-0.01%); split: -0.16%, +0.16% InvThroughput: 5759987 -> 5749957 (-0.17%); split: -0.23%, +0.05% VClause: 51044 -> 51086 (+0.08%); split: -0.12%, +0.20% SClause: 103415 -> 103223 (-0.19%); split: -0.64%, +0.45% Copies: 170398 -> 170555 (+0.09%); split: -0.64%, +0.74% PreSGPRs: 135567 -> 133887 (-1.24%) PreVGPRs: 140569 -> 141317 (+0.53%) VALU: 1959144 -> 1953839 (-0.27%); split: -0.30%, +0.03% SALU: 217956 -> 217676 (-0.13%); split: -0.20%, +0.07% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>	2024-12-30 22:31:35 +00:00
Qiang Yu	21f888a3ed	ac,radv: move ac_nir_lower_bit_size_callback to common place To be used by radeonsi for OpenCL. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781>	2024-12-27 01:58:38 +00:00
Marek Olšák	c6fd69bd5e	ac: remove unused code Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780>	2024-12-26 10:12:43 +00:00
Marek Olšák	8c2f9f0665	radv: switch to the new TCS LDS/offchip size computation to use the same logic as radeonsi. This could be improved, see TODOs. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	3056bf1cb1	ac/nir: add new helpers for computing the TCS LDS/offchip size accurately This is based on how the HS lowering passes address TCS inputs and outputs. The new LDS size is lower in some cases. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Marek Olšák	f4eebb373c	ac/nir: reserve the first LDS vec4 for the HS tf0/1 group vote in TCS Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31673>	2024-12-18 11:07:59 +00:00
Samuel Pitoiset	c3a050da07	radv: fix alpha-to-coverage with alpha-to-one without MRTZ This injects a MRTZ export with only the alpha channel to select it with COVERAGE_TO_MASK_ENABLE for alpha-to-coverage. Co-Authored-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32583>	2024-12-12 10:07:25 +00:00
Georg Lehmann	239c0124df	radv: optimize sample mask comparisons Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32327>	2024-11-26 18:44:39 +00:00
Marek Olšák	25d4943481	nir: make use_interpolated_input_intrinsics a nir_lower_io parameter This will need to be set to true when the GLSL linker lowers IO, which can later be unlowered by st/mesa, and then drivers can lower it again without load_interpolated_input. Therefore, it can't be a global immutable option. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32229>	2024-11-20 02:45:37 +00:00
Marek Olšák	f9b03cf405	nir/opt_varyings: add nir_io_compaction_rotates_color_channels This was enabled by default in nir_opt_varyings, but vc4 can't handle when shader outputs write Y but not X. Add an option for it and enable it only for the driver that benefits from it. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>	2024-11-18 13:39:08 +00:00
Georg Lehmann	cba575f4df	nir: always emit ddx intrinsics Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Marek Olšák	02923e237d	nir: add hole_size parameter into the vectorize callback It will be used to allow merging loads with a hole between them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Qiang Yu	588a65f29a	ac: do not lower some ops in nir_lower_packing AMD does not implement nir_op_pack_32_4x8_split, others are implemented, so don't lower them. Fixes: `0f937426cc` ("radeonsi: lower subgroup ops after wave size is known") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11781 Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30885>	2024-08-30 05:46:51 +00:00
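
Several of the commits above (02923e237d, c21bc65ba7, e640d5a9c3) revolve around a signed "hole size" between two loads: a positive value is a gap, a negative value means the loads overlap, and the example given is "load 4 + hole 4 + load 8 -> load 16". The arithmetic can be sketched in C as follows. This is only an illustration: `struct load_range`, `hole_size()` and `merged_size()` are hypothetical names and are not the actual NIR vectorize callback or its signature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one load: byte offset and byte size. */
struct load_range {
   uint32_t offset;
   uint32_t size;
};

/* Signed gap in bytes between the end of `low` and the start of `high`.
 * > 0: there is a hole, == 0: the loads are adjacent, < 0: they overlap.
 * Assumes `low` starts at or before `high`. */
static int64_t
hole_size(struct load_range low, struct load_range high)
{
   assert(low.offset <= high.offset);
   return (int64_t)high.offset - ((int64_t)low.offset + (int64_t)low.size);
}

/* Size in bytes of one load that covers both ranges, including any hole. */
static uint32_t
merged_size(struct load_range low, struct load_range high)
{
   assert(low.offset <= high.offset);
   uint32_t low_end = low.offset + low.size;
   uint32_t high_end = high.offset + high.size;
   uint32_t end = low_end > high_end ? low_end : high_end;
   return end - low.offset;
}
```

With a 4-byte load at offset 0 and an 8-byte load at offset 8, `hole_size()` yields 4 and `merged_size()` yields 16, matching the example in e640d5a9c3. Two 8-byte loads at offsets 0 and 4 yield a hole size of -4, which is the overlapping case described in c21bc65ba7.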

115 commits