fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 04:58:08 +02:00

Author	SHA1	Message	Date
Aitor Camacho	fcf53988c4	nir/opt_varyings: Support implementations that cannot compact 16-bits Add nir_io_compact_to_higher_16 flag so that the pass knows if it can compact 16-bit varyings into the higher 16 bits of a 32-bit varying. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38994>	2026-01-14 20:44:41 +00:00
Alyssa Rosenzweig	235e868ef7	ac/nir: use nir_is_shared_access Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39219>	2026-01-09 20:51:13 +00:00
Georg Lehmann	6d07a56c6a	ac/nir/lower_ps_late: preserve signed zero, inf, nan for exports Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	84ecac58a6	ac/nir/opt_pack_half: preserve fp_math_ctrl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	5241343ccb	ac/nir/lower_sin_cos: preserve fp_math_ctrl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	9331726157	ac/nir/lower_sin_cos: use nir_shader_alu_pass Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Marek Olšák	13cfd0176c	ac/gpu_info: add #define AMD_MEMCHANNEL_INTERLEAVE_BYTES radeon_info::pipe_interleave_bytes is renamed to r600_pipe_interleave_bytes where it can be 512 on some chips. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Marek Olšák	92133bb0ab	amd: demystify various optimizations we already have for memory channels Explain why we do what we do, and use the radeon_info field properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Rhys Perry	614437ead5	ac/nir: don't vectorize 16-bit shared loads to 8-bit Fixes an issue where a 2x16 load was vectorized with a 1x16 load into an 8x8 load. This became possible after `49d923078f` increased aligned_new_size from 6 bytes to 8 bytes. fossil-db (navi31): Totals from 5 (0.01% of 79825) affected shaders: Instrs: 6994 -> 6257 (-10.54%) CodeSize: 44000 -> 39464 (-10.31%) Latency: 90482 -> 89795 (-0.76%) InvThroughput: 202955 -> 201926 (-0.51%) VClause: 560 -> 565 (+0.89%) Copies: 1135 -> 1108 (-2.38%); split: -2.82%, +0.44% PreVGPRs: 201 -> 199 (-1.00%) VALU: 3882 -> 3201 (-17.54%) SALU: 493 -> 479 (-2.84%) VOPD: 262 -> 258 (-1.53%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `49d923078f` ("ac/nir: fix calculation of aligned_new_size") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14500 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39156>	2026-01-06 10:28:02 +00:00
Marek Olšák	3552028e87	ac/lower_ngg_mesh: fix a segfault accessing out_variables out of bounds component_addr_off (in bytes) was used to offset a component index (in dwords). Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39144>	2026-01-05 22:23:42 +00:00
Timur Kristóf	18b8543026	ac/nir: Add pass to fixup SMEM on GFX6-7 The pass implements two mitigations for the GFX6-7 SMEM bug: 1. To mitigate VM faults by NULL descriptors: Make sure that SMEM buffer loads always access a mapped BO. Use either the descriptor BO (or compute scratch BO), or otherwise use the zero-filled BO in their place. 2. To mitigate VM faults by OOB robust buffer access: Add an instruction to clamp the offset source to the num_records field of the descriptor. It will be still out of bounds, but the VM fault can be completely mitigated if the driver adds a padding to each memory allocation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>	2026-01-02 23:42:16 +00:00
Marek Olšák	f00f054087	ac,radeonsi: move lowering to load_color0/1 to ac_nir_lower_ps_early It's better to have these all in one pass. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38802>	2026-01-01 18:30:29 +00:00
Georg Lehmann	cbedced5e8	ac/nir/cull: do not reuse variables if subgroup ops are used Subgroup ops make divergence information useless for our purpose, we would need workgroup divergence. The game affected here has control flow dependent on vote_any, so it's possible that a wave only executes the code after culling/reordering invocations. That means we can't reuse the maybe undefined value from before culling. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14459 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39060>	2025-12-29 18:38:29 +00:00
Timur Kristóf	7dbabc6acc	ac/nir/lower_taskmesh_io_to_mem: Use AC_TASK_DRAW_ENTRY_BYTES Replace draw_entry_bytes with AC_TASK_DRAW_ENTRY_BYTES. This is 16 on all AMD HW that supports task/mesh shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:59 +00:00
Timur Kristóf	fc57fa4589	radv, radeonsi: Don't pass task ring info to mesh/task payload lowering The pass now uses the ring descriptors to figure these out. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:59 +00:00
Timur Kristóf	4d381c9136	ac/nir/lower_taskmesh_io_to_mem: Don't hardcode payload entry size in shaders Currently the task payload entry size is hardcoded in shaders as a constant. This isn't a good idea because it makes the code inflexible, e.g. doesn't allow us to change the number of entries dynamically. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:59 +00:00
Timur Kristóf	5348d953aa	ac/nir/lower_taskmesh_io_to_mem: Don't hardcode num_entries in shaders Currently the number of task shader ring entries is hardcoded in shaders as a constant. This isn't a good idea because it makes the code inflexible, e.g. it prevents us from using the same shader binary across some chips and doesn't allow us to change the number of entries dynamically. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:58 +00:00
Daniel Schürmann	f7c4aa48a0	ac/gpu_info: add some more flags to ac_cu_info Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>	2025-12-22 07:34:46 +00:00
Emma Anholt	059d301c79	nir: Drop the mode argument of nir_lower_vars_to_scratch(). It only makes sense for function temps, and that's the only way it's been used. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Georg Lehmann	da197c3d55	ac/nir/lower_ps_late: remove gfx6 mrtz writemask workaround This is now done in the backends. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>	2025-12-12 17:00:51 +00:00
Rhys Perry	b5cf3b1628	ac/nir: fix check for increasing size of non-descriptor loads In the previous version, "end" could have been zero, which would have allowed an increase of "mul" bytes, when it should not be increased at all. For example: - align_offset=4 - mul=4 - unaligned_new_size=96 - aligned_new_size=128 This would have loaded a dword which was not loaded previously. fossil-db (gfx1201): Totals from 115 (0.14% of 79839) affected shaders: Instrs: 286697 -> 287097 (+0.14%); split: -0.16%, +0.30% CodeSize: 1477728 -> 1481256 (+0.24%); split: -0.13%, +0.37% SpillSGPRs: 1662 -> 1658 (-0.24%); split: -0.42%, +0.18% Latency: 2288612 -> 2290248 (+0.07%); split: -0.04%, +0.11% InvThroughput: 467307 -> 467602 (+0.06%); split: -0.03%, +0.10% VClause: 3689 -> 3691 (+0.05%) SClause: 5052 -> 5064 (+0.24%); split: -0.20%, +0.44% Copies: 34837 -> 35103 (+0.76%); split: -0.80%, +1.56% Branches: 7402 -> 7401 (-0.01%) PreSGPRs: 9147 -> 9143 (-0.04%); split: -0.44%, +0.39% VALU: 159333 -> 159372 (+0.02%); split: -0.01%, +0.04% SALU: 52047 -> 52276 (+0.44%); split: -0.55%, +0.99% SMEM: 9556 -> 9697 (+1.48%) fossil-db (navi31): Totals from 238 (0.30% of 79825) affected shaders: Instrs: 484480 -> 485105 (+0.13%); split: -0.05%, +0.17% CodeSize: 2514012 -> 2517928 (+0.16%); split: -0.06%, +0.22% SpillSGPRs: 1064 -> 1059 (-0.47%) Latency: 3941121 -> 3944670 (+0.09%); split: -0.04%, +0.13% InvThroughput: 897483 -> 898090 (+0.07%); split: -0.04%, +0.11% VClause: 7101 -> 7098 (-0.04%) SClause: 9036 -> 9052 (+0.18%); split: -0.44%, +0.62% Copies: 42790 -> 43096 (+0.72%); split: -0.30%, +1.01% PreSGPRs: 14357 -> 14342 (-0.10%); split: -0.37%, +0.26% VALU: 298325 -> 298347 (+0.01%); split: -0.01%, +0.02% SALU: 57288 -> 57577 (+0.50%); split: -0.20%, +0.70% SMEM: 18768 -> 18967 (+1.06%); split: -0.01%, +1.07% fossil-db (navi21): Totals from 239 (0.30% of 79825) affected shaders: Instrs: 444783 -> 445177 (+0.09%); split: -0.07%, +0.15% CodeSize: 2371776 -> 2373136 (+0.06%); split: -0.13%, +0.19% Latency: 4226478 -> 4219221 (-0.17%); split: -0.24%, +0.07% InvThroughput: 1430962 -> 1428445 (-0.18%); split: -0.23%, +0.06% SClause: 9357 -> 9398 (+0.44%); split: -0.20%, +0.64% Copies: 42742 -> 42927 (+0.43%); split: -0.53%, +0.96% Branches: 12975 -> 12970 (-0.04%); split: -0.05%, +0.02% PreSGPRs: 14368 -> 14312 (-0.39%); split: -0.47%, +0.08% VALU: 306642 -> 306720 (+0.03%); split: -0.02%, +0.05% SALU: 63702 -> 63790 (+0.14%); split: -0.31%, +0.45% SMEM: 20030 -> 20231 (+1.00%); split: -0.00%, +1.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458 Backport-to: 25.3 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>	2025-12-12 13:58:42 +00:00
Rhys Perry	49d923078f	ac/nir: fix calculation of aligned_new_size This should consider nir_round_up_components(). fossil-db (gfx1201): Totals from 90 (0.11% of 79839) affected shaders: MaxWaves: 1829 -> 1901 (+3.94%) Instrs: 410780 -> 411825 (+0.25%); split: -0.02%, +0.27% CodeSize: 2227956 -> 2234464 (+0.29%); split: -0.02%, +0.31% VGPRs: 6952 -> 6760 (-2.76%); split: -3.11%, +0.35% Latency: 3071765 -> 3073960 (+0.07%); split: -0.00%, +0.07% InvThroughput: 766201 -> 767322 (+0.15%); split: -0.00%, +0.15% VClause: 7887 -> 7898 (+0.14%); split: -0.08%, +0.22% Copies: 48189 -> 48324 (+0.28%); split: -0.05%, +0.33% PreVGPRs: 6605 -> 6595 (-0.15%); split: -0.18%, +0.03% VALU: 237272 -> 238147 (+0.37%); split: -0.01%, +0.37% SALU: 48987 -> 49003 (+0.03%) VMEM: 15542 -> 15560 (+0.12%) VOPD: 188 -> 200 (+6.38%) fossil-db (navi31): Totals from 89 (0.11% of 79825) affected shaders: MaxWaves: 1811 -> 1883 (+3.98%) Instrs: 403695 -> 404691 (+0.25%); split: -0.01%, +0.26% CodeSize: 2150612 -> 2154860 (+0.20%); split: -0.03%, +0.23% VGPRs: 6892 -> 6676 (-3.13%) Latency: 3306107 -> 3310010 (+0.12%); split: -0.01%, +0.13% InvThroughput: 813092 -> 814382 (+0.16%); split: -0.00%, +0.16% VClause: 7999 -> 8010 (+0.14%); split: -0.06%, +0.20% Copies: 50089 -> 50210 (+0.24%); split: -0.05%, +0.29% PreVGPRs: 6596 -> 6586 (-0.15%); split: -0.18%, +0.03% VALU: 239617 -> 240392 (+0.32%); split: -0.01%, +0.33% SALU: 45349 -> 45363 (+0.03%) VMEM: 15762 -> 15780 (+0.11%) VOPD: 258 -> 262 (+1.55%) fossil-db (navi21): Totals from 89 (0.11% of 79825) affected shaders: Instrs: 345634 -> 346426 (+0.23%); split: -0.00%, +0.23% CodeSize: 1895616 -> 1900156 (+0.24%); split: -0.00%, +0.24% Latency: 3043334 -> 3046859 (+0.12%); split: -0.01%, +0.13% InvThroughput: 928236 -> 929626 (+0.15%); split: -0.01%, +0.16% VClause: 7894 -> 7905 (+0.14%); split: -0.06%, +0.20% Copies: 48694 -> 48785 (+0.19%); split: -0.03%, +0.22% PreVGPRs: 6580 -> 6570 (-0.15%); split: -0.18%, +0.03% VALU: 228323 -> 229072 (+0.33%); split: -0.01%, +0.33% SALU: 47202 -> 47216 (+0.03%) VMEM: 16546 -> 16564 (+0.11%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458 Backport-to: 25.3 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>	2025-12-12 13:58:42 +00:00
Emma Anholt	10ba7675c8	nir/uub: Use an optional max_samples from drivers for sample counts. This triggers some unrolling in Fallout 4, GTAV, and Rocky Planet in my shader-db. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38585>	2025-12-11 14:26:11 +00:00
Marek Olšák	308da55f1a	radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND this is slightly nicer Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>	2025-12-10 19:16:46 +00:00
Arcady Goldmints-Orlov	0df8aa940c	nir: Use nir_shader_intrinsics_pass in nir_lower_io_to_scalar Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38816>	2025-12-05 22:30:22 +00:00
Marek Olšák	e6499fa73e	nir/recompute_io_bases: move color input bases after all other inputs This is related to the FS prolog. It should have no effect on other drivers. v2: make it optional via io_options Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38599>	2025-11-29 05:00:40 +00:00
Marek Olšák	fa0bea5ff8	nir: remove nir_io_add_const_offset_to_base nir_opt_constant_folding does it now. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>	2025-11-29 00:16:38 +00:00
Marek Olšák	21cdbfa223	ac,radv: move opt_vectorize_callback to common code radeonsi will use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>	2025-11-28 20:16:10 +00:00
Marek Olšák	2c9995a94f	ac/nir: move aco_nir_op_supports_packed_math_16bit here aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common because tests don't link with ACO, so linking would fail, but we want to move the nir_opt_vectorize callback here that uses it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>	2025-11-28 20:16:10 +00:00
Marek Olšák	9e339f4b32	nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees This describes better what it does. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38471>	2025-11-20 05:42:11 +00:00
Marek Olšák	e372365cf4	nir: rename nir_copy_prop -> nir_opt_copy_prop Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>	2025-11-15 02:16:38 +00:00
Rhys Perry	00edddf542	ac/nir: add some tests for ac_nir_lower_mem_access_bit_sizes These test that nothing crashes for any possible input. With print=true, it can also be used to compare the behaviour of two different ac_nir_lower_mem_access_bit_sizes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37995>	2025-11-13 15:23:20 +00:00
Konstantin Seurer	de32f9275f	treewide: add & use parent instr helpers We add a bunch of new helpers to avoid the need to touch ->parent_instr, including the full set of: * nir_def_is_* * nir_def_as_*_or_null * nir_def_as_* [assumes the right instr type] * nir_src_is_* * nir_src_as_* * nir_scalar_is_* * nir_scalar_as_* Plus nir_def_instr() where there's no more suitable helper. Also an existing helper is renamed to unify all the names, while we're churning the tree: * nir_src_as_alu_instr -> nir_src_as_alu ..and then we port the tree to use the helpers as much as possible, using nir_def_instr() where that does not work. Acked-by: Marek Olšák <maraeo@gmail.com> --- To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm taking this opportunity to clean up a lot of NIR patterns. Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Timur Kristóf	7f5f8b3932	ac/nir/ngg: Use align() instead of ALIGN() Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>	2025-11-12 13:40:55 +00:00
Timur Kristóf	8f99d736d0	ac/nir/ngg: Fix scratch space for NGG GS streamout For GS streamout, we need the following LDS scratch space: - Repacking streamout vertices takes 1 dword per 4 waves per stream (max 16 bytes for Wave64, max 32 bytes for Wave32) - 1 dword per stream for buffer info (16 bytes) - 1 dword per buffer for buffer info (16 bytes) Previously, the space used for buffer info aliased with the space for repacking the output vertices in ngg_gs_finale(), and there was no barrier in between, which caused a race condition, resulting in random failure. Fix this by allocating a few more LDS dwords so that aliasing is not required, which also allows us to remove an extra workgroup barrier. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12705 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>	2025-11-12 13:40:55 +00:00
Marek Olšák	9125e34372	amd: lower get_ssbo_size in ac_nir_lower_resinfo The code for lowering get_ssbo_size will be different in future chips, so do it in common code to reduce duplication in the future. Lower get_ssbo_size to ssbo_descriptor_amd + nir_channel. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38097>	2025-11-02 01:42:07 +00:00
Marek Olšák	9def0a6e5b	ac/nir: set support_indirect_inputs/outputs in common code This fixes mesh shader performance of RADV for GravityMark by stopping the lowering of ClipDistance[64][4] indirect access for mesh shader outputs. The perf improvement is 14% on Navi48. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38155>	2025-10-31 00:57:46 +00:00
Marek Olšák	966cb36722	amd: constify struct radeon_surf Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38093>	2025-10-29 12:50:44 +00:00
Rhys Perry	f3ff2375ec	ac/nir: don't consider quads incomplete inside loops We move terminates to outside loops, so this doesn't matter anymore. fossil-db (gfx1201): Totals from 145 (0.18% of 79839) affected shaders: Instrs: 174693 -> 174389 (-0.17%); split: -0.18%, +0.01% CodeSize: 917068 -> 915692 (-0.15%); split: -0.16%, +0.01% VGPRs: 8340 -> 8184 (-1.87%) Latency: 2528888 -> 2521006 (-0.31%); split: -0.48%, +0.16% InvThroughput: 502383 -> 504082 (+0.34%); split: -0.44%, +0.78% Copies: 15968 -> 15632 (-2.10%); split: -2.14%, +0.04% PreVGPRs: 5918 -> 5858 (-1.01%) VALU: 92802 -> 92484 (-0.34%); split: -0.35%, +0.01% SALU: 29437 -> 29430 (-0.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:22:02 +00:00
Rhys Perry	9babec1366	radv,radeonsi: use optimize_txd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:22:01 +00:00
Rhys Perry	7d552d71e9	ac/nir: optimize txd(coord, ddx/ddy(coord)) This is done in ac_nir_lower_tex so that we can optimize derivative calculations with a different exec mask than the texture sample by using the nir_strict_wqm_coord_amd path. It's also more aware of divergence than nir_lower_tex is. fossil-db (gfx1201): Totals from 103 (0.13% of 79839) affected shaders: MaxWaves: 2610 -> 2620 (+0.38%) Instrs: 347283 -> 345912 (-0.39%); split: -0.40%, +0.00% CodeSize: 1892380 -> 1883824 (-0.45%); split: -0.46%, +0.00% VGPRs: 8028 -> 7824 (-2.54%) Latency: 3942575 -> 3939623 (-0.07%); split: -0.08%, +0.01% InvThroughput: 867147 -> 865281 (-0.22%); split: -0.24%, +0.02% VClause: 6230 -> 6221 (-0.14%); split: -0.19%, +0.05% SClause: 3910 -> 3914 (+0.10%); split: -0.26%, +0.36% Copies: 16091 -> 15721 (-2.30%); split: -2.74%, +0.44% PreSGPRs: 4651 -> 4658 (+0.15%) PreVGPRs: 6389 -> 6320 (-1.08%); split: -1.17%, +0.09% VALU: 228715 -> 227490 (-0.54%); split: -0.54%, +0.01% SALU: 32763 -> 32767 (+0.01%); split: -0.06%, +0.07% VMEM: 9027 -> 9024 (-0.03%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:22:00 +00:00
Rhys Perry	309ac1f0c0	ac/nir: refactor move_coords_from_divergent_cf a bit Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:21:59 +00:00
Rhys Perry	42bb81137e	ac/nir: stop using NIR_PASS in ac_nir_lower_ngg_nogs() When NIR_DEBUG=serialize or NIR_DEBUG=clone is used, NIR_PASS recreates nir_function_impl and nir_variable objects, causing use-after-free since ac_nir_lower_ngg_nogs() keeps pointers to those in local variables. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13946 Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>	2025-10-23 10:44:38 +00:00
Rhys Perry	b18421ae3d	amd/lower_mem_access_bit_sizes: fix shared access when bytes<bit_size/8 This can happen with (for example) 32x2 loads with align_mul=4,align_offset=2. This patch does bit_size=min(bit_size,bytes) to prevent num_components from being 0. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `52cd5f7e69` ("ac/nir_lower_mem_access_bit_sizes: Split unsupported shared memory instructions") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	e89b22280f	amd/lower_mem_access_bit_sizes: be more careful with 8/16-bit scratch load Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.3 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	8829fc3bd6	amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering Summary of changes: - handle unaligned 16-bit scalar loads when supported_dword=true - increase the size of 8/16/32/64-bit buffer loads which are not dword aligned, which can create fewer SMEM loads. - handle when "bytes" is less than "bit_size / 8" fossil-db (gfx1201): Totals from 26 (0.03% of 79839) affected shaders: Instrs: 12676 -> 12710 (+0.27%); split: -0.30%, +0.57% CodeSize: 67272 -> 67384 (+0.17%); split: -0.24%, +0.40% Latency: 44399 -> 44375 (-0.05%); split: -0.09%, +0.04% SClause: 352 -> 344 (-2.27%) SALU: 3972 -> 3992 (+0.50%) SMEM: 554 -> 528 (-4.69%) fossil-db (navi21): Totals from 6 (0.01% of 79825) affected shaders: Instrs: 2192 -> 2186 (-0.27%) CodeSize: 12188 -> 12140 (-0.39%) Latency: 10037 -> 10033 (-0.04%); split: -0.12%, +0.08% SMEM: 124 -> 118 (-4.84%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `fbf0399517` ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	79b2fa785d	amd/lower_mem_access_bit_sizes: don't create subdword UBO loads with LLVM These are unsupported. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14127 Fixes: `fbf0399517` ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:33 +00:00
Georg Lehmann	9e41a7c139	treewide: use nir_load_global alias of nir_build_load_global Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>	2025-10-21 12:37:58 +02:00
Timur Kristóf	d20049b430	ac/nir/ngg_mesh: Lower num_subgroups to constant Mesh shader workgroups always have the same number of subgroups. When the API workgroup size is the same as the real workgroup size, this is a small optimization (using a constant instead of a shader arg). When the API workgroup size is smaller than the real workgroup size (e.g. when the number of output vertices or primitives is greater than the API workgroup size on RDNA 2), this fixes a potential bug because num_subgroups would return the "real" workgroup size instead of the API one. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37947>	2025-10-20 14:05:40 +00:00
Daniel Schürmann	eecd1c020d	amd: keep ac_shader_config::lds_size unaligned Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:09 +00:00

283 commits