fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 00:18:09 +02:00

Author	SHA1	Message	Date
Emma Anholt	ed8676dc28	nir: Rename the unit_test_*_amd intrinics to be un-vendored. We'll reuse these from the nir_opt_algebraic_pattern_test. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>	2026-01-15 19:09:37 +00:00
Natalie Vock	cc81c7de23	nir,aco: Clean up useless lowering of sbt_base_amd Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>	2026-01-14 14:19:07 +00:00
Natalie Vock	0a1911b220	radv,aco: Use function call structure for RT programs Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>	2026-01-14 14:19:07 +00:00
Natalie Vock	06c2e90e35	aco: Note if a parameter needs to be explicitly preserved Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>	2026-01-14 14:19:05 +00:00
Rhys Perry	7a09e4a740	aco: use correct addition opcodes in gfx6-8 RT prolog Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `60dd9d797e` ("aco: Swizzle ray launch IDs in the RT prolog") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39232>	2026-01-14 11:23:42 +00:00
Rhys Perry	da728d5a1a	aco: micro-optimize ray launch ID swizzling Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39232>	2026-01-14 11:23:42 +00:00
Natalie Vock	0d93e8ce54	aco: Don't insert p_reload_preserved in loops This can't work. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157>	2026-01-12 21:46:50 +00:00
Konstantin Seurer	39d58a55a7	aco: Add support to f2f16 with rtpi/rtni Those rounding modes are needed when computing 16-bit bounding boxes since the bounding box must not get smaller. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>	2026-01-10 11:34:12 +01:00
Natalie Vock	60dd9d797e	aco: Swizzle ray launch IDs in the RT prolog This converts from 1D workgroups to 2D ray launch IDs entirely via shader ALU, including handling partial/cut-off workgroups optimally. Doing this entirely in-shader means it Just Works(TM) with indirect dispatches as well. Previous approaches manipulating various things on CPU depending on the dispatch size couldn't handle indirect dispatches. The swizzle implemented here also swizzles with a recursive Z-order pattern, which should be a little more optimal than arranging invocations linearly within the wave. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>	2026-01-08 19:49:55 +01:00
Natalie Vock	1f6ac3fa93	radv/rt,aco: Always dispatch 1D workgroups for RT We will swizzle the workgroups ourselves in the next commit. Removes the need for 1D dispatch workarounds. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>	2026-01-08 19:49:54 +01:00
Georg Lehmann	eb4737a1dd	nir: add nir_alu_instr_is_exact helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>	2026-01-07 09:40:57 +00:00
Daniel Schürmann	1e8d367537	amd: add and use ac_cu_info::has_vtx_format_alpha_adjust_bug Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>	2025-12-22 07:34:48 +00:00
Daniel Schürmann	addd4ea59f	aco: pass aco_compiler_options to init_program() Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>	2025-12-22 07:34:46 +00:00
Alyssa Rosenzweig	079e9ae606	treewide: use BITSET_*_COUNT Mix of Coccinelle patch, manual fix ups, sed, etc. Probably best to review the diff as-if hand written: Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38955>	2025-12-16 17:42:10 +00:00
Timur Kristóf	f001515c87	aco: Use only VGPR offset on buffer atomics on GFX6-7 SGPR offset is not included in the bounds check according to the ISA documentation of GFX6-7 and indeed it can trigger VM faults on OOB access. Note that ACO already doesn't use the SGPR offset on GFX6-7 for buffer loads and stores. This commit just does the same for buffer atomics. This commit mitigates a ton of VM faults that are exposed by: `24e75fea4b` Fossil DB stats on Hawaii (GFX7): Totals from 148 (0.24% of 61818) affected shaders: Instrs: 324004 -> 327352 (+1.03%) CodeSize: 1556468 -> 1514100 (-2.72%); split: -2.74%, +0.02% Latency: 1271480 -> 1276894 (+0.43%) InvThroughput: 396850 -> 397740 (+0.22%) VClause: 6861 -> 6858 (-0.04%) Copies: 34083 -> 37430 (+9.82%) PreVGPRs: 5705 -> 5706 (+0.02%) VALU: 147529 -> 150898 (+2.28%) SALU: 98194 -> 98172 (-0.02%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38958>	2025-12-15 21:03:19 +00:00
Georg Lehmann	a2b70ce4ec	aco/isel: remove uniform reduce/scan optimization This is now done in NIR, with the exception of exclusive min/max/and/or scans. But those are not really useful, and if we ever come across them we can optimize them in NIR using write_invocation_amd. No Foz-DB changes on Navi21. Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38902>	2025-12-15 12:22:32 +00:00
Georg Lehmann	072815e5cb	aco/gfx6: move mrtz writemask workaround to assembler and handle all mrt Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>	2025-12-12 17:00:51 +00:00
Georg Lehmann	ef246aaf72	aco/isel: emit register copies for workgroup ids This way, we don't overestimate SGPR pressure. Foz-DB Navi48: Totals from 1413 (1.45% of 97637) affected shaders: Instrs: 3468375 -> 3468585 (+0.01%); split: -0.01%, +0.02% CodeSize: 18643064 -> 18643520 (+0.00%); split: -0.01%, +0.01% VGPRs: 71776 -> 71788 (+0.02%) SpillSGPRs: 18575 -> 18561 (-0.08%) Latency: 23207643 -> 23207998 (+0.00%); split: -0.00%, +0.01% InvThroughput: 8116806 -> 8116541 (-0.00%); split: -0.01%, +0.00% VClause: 75250 -> 75252 (+0.00%); split: -0.00%, +0.00% SClause: 65274 -> 65283 (+0.01%); split: -0.02%, +0.04% Copies: 275750 -> 275942 (+0.07%); split: -0.03%, +0.10% PreSGPRs: 70246 -> 69072 (-1.67%) VALU: 1892111 -> 1892092 (-0.00%); split: -0.00%, +0.00% SALU: 523460 -> 523648 (+0.04%); split: -0.02%, +0.05% VOPD: 41097 -> 41102 (+0.01%) Sadly the RA noise is slightly negative for instruction count. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>	2025-12-11 08:06:59 +00:00
Georg Lehmann	911e1ce168	aco/isel: emit exec copy for ballot(true) Once copy propagated in the optimizer, this will allow using nir_opt_uniform_subgroup without too many regressions. Foz-DB Navi48: Totals from 405 (0.41% of 97637) affected shaders: Instrs: 3796716 -> 3796894 (+0.00%); split: -0.00%, +0.00% CodeSize: 20116136 -> 20116652 (+0.00%); split: -0.00%, +0.00% Latency: 18326661 -> 18327114 (+0.00%); split: -0.00%, +0.00% InvThroughput: 3353206 -> 3353268 (+0.00%); split: -0.00%, +0.00% Copies: 292307 -> 293830 (+0.52%) SALU: 507523 -> 507738 (+0.04%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38830>	2025-12-11 08:06:58 +00:00
Marek Olšák	308da55f1a	radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND this is slightly nicer Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>	2025-12-10 19:16:46 +00:00
Natalie Vock	8bc5fdef53	aco: Remove unused p_reload_preserved def Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38281>	2025-12-08 19:12:52 +00:00
Marek Olšák	2c9995a94f	ac/nir: move aco_nir_op_supports_packed_math_16bit here aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common because tests don't link with ACO, so linking would fail, but we want to move the nir_opt_vectorize callback here that uses it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>	2025-11-28 20:16:10 +00:00
Georg Lehmann	0f7a1ce23e	aco/optimizer: some more mul opts Foz-DB Navi48: Totals from 1650 (2.00% of 82419) affected shaders: Instrs: 975716 -> 970609 (-0.52%); split: -0.53%, +0.00% CodeSize: 4986260 -> 4982916 (-0.07%); split: -0.09%, +0.02% Latency: 2795394 -> 2793211 (-0.08%); split: -0.09%, +0.01% InvThroughput: 620892 -> 620914 (+0.00%); split: -0.00%, +0.01% VClause: 18773 -> 18729 (-0.23%) SClause: 13219 -> 13218 (-0.01%) Copies: 53619 -> 53620 (+0.00%); split: -0.01%, +0.01% VALU: 592094 -> 592096 (+0.00%); split: -0.00%, +0.00% SALU: 96586 -> 93532 (-3.16%); split: -3.17%, +0.00% Foz-DB Navi21: Totals from 1647 (2.00% of 82387) affected shaders: Instrs: 1104100 -> 1100149 (-0.36%); split: -0.36%, +0.00% CodeSize: 5631092 -> 5637668 (+0.12%); split: -0.00%, +0.12% Latency: 3503029 -> 3501621 (-0.04%); split: -0.05%, +0.01% InvThroughput: 1088494 -> 1088495 (+0.00%); split: -0.00%, +0.00% VClause: 20898 -> 20885 (-0.06%) Copies: 72641 -> 72635 (-0.01%); split: -0.02%, +0.01% VALU: 725593 -> 725592 (-0.00%); split: -0.00%, +0.00% SALU: 139046 -> 135175 (-2.78%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>	2025-11-25 11:49:17 +00:00
Georg Lehmann	3a175b54a4	aco,nir: support subdword v_permlane_b16 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>	2025-11-17 23:33:59 +00:00
Marek Olšák	e372365cf4	nir: rename nir_copy_prop -> nir_opt_copy_prop Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>	2025-11-15 02:16:38 +00:00
Konstantin Seurer	de32f9275f	treewide: add & use parent instr helpers We add a bunch of new helpers to avoid the need to touch >parent_instr, including the full set of: * nir_def_is_* * nir_def_as_*_or_null * nir_def_as_* [assumes the right instr type] * nir_src_is_* * nir_src_as_* * nir_scalar_is_* * nir_scalar_as_* Plus nir_def_instr() where there's no more suitable helper. Also an existing helper is renamed to unify all the names, while we're churning the tree: * nir_src_as_alu_instr -> nir_src_as_alu ..and then we port the tree to use the helpers as much as possible, using nir_def_instr() where that does not work. Acked-by: Marek Olšák <maraeo@gmail.com> --- To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm taking this opportunity to clean up a lot of NIR patterns. Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Daniel Schürmann	5682e39e6b	amd: enable load/store_shared2_amd for GFX6 Totals from 1509 (2.43% of 62200) affected shaders: (Pitcairn) MaxWaves: 8078 -> 8057 (-0.26%); split: +0.09%, -0.35% Instrs: 977182 -> 951746 (-2.60%); split: -2.62%, +0.02% CodeSize: 4951468 -> 4758192 (-3.90%); split: -3.92%, +0.01% SGPRs: 76704 -> 76696 (-0.01%) VGPRs: 81092 -> 81068 (-0.03%); split: -0.34%, +0.31% Latency: 11663237 -> 11526070 (-1.18%); split: -1.19%, +0.01% InvThroughput: 6198904 -> 6114851 (-1.36%); split: -1.43%, +0.07% VClause: 26656 -> 26655 (-0.00%); split: -0.05%, +0.05% SClause: 22304 -> 22307 (+0.01%); split: -0.03%, +0.04% Copies: 107503 -> 109564 (+1.92%); split: -0.23%, +2.15% Branches: 22917 -> 22918 (+0.00%) PreSGPRs: 42246 -> 42242 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 64561 -> 64761 (+0.31%); split: -0.01%, +0.32% VALU: 600285 -> 601139 (+0.14%); split: -0.26%, +0.40% SALU: 130622 -> 130851 (+0.18%); split: -0.16%, +0.33% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>	2025-11-11 17:12:17 +00:00
Natalie Vock	f0c613765c	aco: Add preload_preserved pseudo instruction These are helper instructions for the spill_preserved pass to insert reloads for registers that are preserved by the ABI, yet clobbered by the callee shader. There is one p_reload_preserved instruction at the end of each block. This allows us to insert reloads early, to alleviate the high latency of scratch reloads. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37381>	2025-11-06 12:09:39 +00:00
Samuel Pitoiset	a0d607bfdb	radv,aco: wait for all VMEM loads when the prolog loads large 64-bit attributes Not the most optimal solution but 64-bit vertex attributes are rarely used. Could still revisit if we find a real use case that matters. This fixes recent VKCTS coverage: dEQP-VK.pipeline.fast_linked_library.vertex_input.component_mismatch.r64g64b64._to_dvec2 dEQP-VK.pipeline.shader_object_..vertex_input.component_mismatch.r64g64b64.*_to_dvec2 Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14243 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38237>	2025-11-05 07:26:45 +00:00
Samuel Pitoiset	ba5bf81aa2	aco: fix reserving VGPRs for 64-bit attributes in VS prologs Otherwise the fetch index would be overwritten if the attribute format is 64-bit and more than 2 components are loaded. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14242 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38237>	2025-11-05 07:26:45 +00:00
Georg Lehmann	0f54136730	aco/isel: emit vop2 v_lshlrev_b64 for gfx12+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>	2025-10-31 08:31:03 +00:00
Georg Lehmann	7ac67e2711	aco/isel: emit vop2 v_max_f64 for gfx12+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>	2025-10-31 08:31:03 +00:00
Georg Lehmann	8397b91934	aco/isel: emit vop2 v_min_f64 for gfx12+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>	2025-10-31 08:31:02 +00:00
Georg Lehmann	2e120d4e26	aco/isel: emit vop2 v_mul_f64 for gfx12+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>	2025-10-31 08:31:01 +00:00
Georg Lehmann	86ea462f4d	aco/isel: emit vop2 v_fadd_f64 for gfx12+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38156>	2025-10-31 08:31:01 +00:00
Georg Lehmann	0c8b885e21	aco/isel: emit v_mul_f64 for fp64 fsat Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011>	2025-10-29 17:57:52 +00:00
Georg Lehmann	9ece74ce79	aco/isel: emit v_mul_f64 with modifiers for fneg/fabs Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38011>	2025-10-29 17:57:52 +00:00
Konstantin Seurer	47ffe2ecd4	aco: Fixup out_launch_size_y in the RT prolog for 1D dispatch launch_size_y is set to ACO_RT_CONVERTED_2D_LAUNCH_SIZE for 1D dispatches. The prolog needs to set it to 1 so that the app shader loads the correct value. cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37974>	2025-10-23 07:56:35 +00:00
Daniel Schürmann	eecd1c020d	amd: keep ac_shader_config::lds_size unaligned Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:09 +00:00
Daniel Schürmann	fe6ff6d1ef	aco: remove DeviceInfo::lds_encoding_granule and DeviceInfo::lds_alloc_granule Use utility functions instead. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:08 +00:00
Daniel Schürmann	11db02d5d9	radv: calculate LDS allocation requirements independently from the compiler Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:07 +00:00
Daniel Schürmann	b651234414	amd: change ac_shader_config::lds_size to bytes We still keep it aligned to allocation granularity. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:07 +00:00
Daniel Schürmann	d0b87a0d5f	ac/nir_flag_smem_for_loads: call divergence analysis internally Also don't flag more SMEM instructions (in ACO) after the last call to ac_nir_lower_mem_access_bit_sizes(). Totals from 75 (0.09% of 79839) affected shaders: (Navi48) Instrs: 191246 -> 189960 (-0.67%) CodeSize: 996840 -> 985976 (-1.09%) Latency: 3066184 -> 2945500 (-3.94%) InvThroughput: 355373 -> 353106 (-0.64%); split: -0.66%, +0.02% SClause: 4848 -> 4699 (-3.07%) Copies: 13827 -> 13925 (+0.71%); split: -0.07%, +0.78% Branches: 5176 -> 5003 (-3.34%) PreSGPRs: 6222 -> 6272 (+0.80%) VALU: 108934 -> 108993 (+0.05%); split: -0.00%, +0.06% SALU: 31679 -> 31210 (-1.48%); split: -1.51%, +0.03% SMEM: 7158 -> 6739 (-5.85%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>	2025-10-14 16:33:12 +00:00
Daniel Schürmann	8ff44f17ef	amd/lower_mem_access_bit_sizes: also use SMEM for subdword loads We can simply extract from the loaded dwords as per nir_lower_mem_access_bit_sizes() lowering. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>	2025-10-14 16:33:11 +00:00
Samuel Pitoiset	bc32286e5b	radv: declare a new user SGPR for dynamic descriptors To move them out of push constants. fossils-db (GFX1201): Totals from 20700 (25.99% of 79646) affected shaders: Instrs: 14375624 -> 14370051 (-0.04%); split: -0.07%, +0.03% CodeSize: 76746128 -> 76723772 (-0.03%); split: -0.05%, +0.02% Latency: 74103586 -> 74113651 (+0.01%); split: -0.01%, +0.02% InvThroughput: 11908817 -> 11908798 (-0.00%); split: -0.00%, +0.00% VClause: 249605 -> 249607 (+0.00%); split: -0.00%, +0.00% SClause: 337914 -> 337772 (-0.04%); split: -0.08%, +0.04% Copies: 843585 -> 839233 (-0.52%); split: -0.62%, +0.10% PreSGPRs: 836283 -> 837260 (+0.12%) SALU: 1790713 -> 1786374 (-0.24%); split: -0.29%, +0.05% Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37768>	2025-10-14 15:34:43 +00:00
Georg Lehmann	58163f65f0	aco/optimizer: rework packed fneg opt Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35272>	2025-10-14 08:33:40 +00:00
Georg Lehmann	6eac72088c	aco/gfx10+: only work around split execution of uniform LDS in WGP mode LDS instructions from one CU won't split the execution of other LDS instruction on the same CU. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31630>	2025-10-13 10:22:22 +00:00
Georg Lehmann	c13caa5e5f	aco: fix global_atomic_swap offset overflow check Fixes: `d7dcd81c77` ("aco/gfx6: allow both constant and gpr offset for global with sgpr address") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37821>	2025-10-13 09:41:41 +00:00
Marek Olšák	3fe651f607	nir: remove load_smem_amd replaced by load_global_amd + ACCESS_SMEM_AMD Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:54:11 +00:00
Rhys Perry	20af16b4d8	aco: use MTBUF for 64-bit atomic load/store A 64-bit atomic load/store should be considered entirely out-of-bounds if any part of it is out-of-bounds. Since we implemented these as 32-bit vec2 load/store, it would have been possible for the first half to be in-bounds while the second half is out-of-bounds. From 9.6.1. Robust Buffer Access of Vulkan 1.4.324 specification: > Any non-atomic access to a uniform, storage, uniform texel, or storage > texel buffer wider than 32-bits may be treated as multiple 32-bit > accesses that are separately bounds checked. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>	2025-10-07 17:41:31 +00:00


165 commits