fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 15:58:06 +02:00

Author	SHA1	Message	Date
Rhys Perry	29f8237d30	amd: move various flags to ac_cu_info Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992>	2026-02-26 15:49:14 +00:00
Rhys Perry	78b3e07bed	ac/nir: remove ac_nir_lower_ps_late_options::family This is unused. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992>	2026-02-26 15:49:12 +00:00
Rhys Perry	6d31054d86	ac/nir: remove gfx_level parameter from ac_nir_lower_indirect_derefs This was unused. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39992>	2026-02-26 15:49:12 +00:00
Samuel Pitoiset	e8710152fb	ac/nir: stop passing radeon_info for addr->coord helpers Only for gb_addr_config. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40041>	2026-02-26 07:21:06 +00:00
Samuel Pitoiset	2eb9420061	ac/nir: fix writemask for dual source blending on GFX11+ This should definitely be an OR operation if MRT0 and MRT1 don't write the same channels. This also requires to set the writemask manually because when it's 0 (in case a dual-source output is missing), the intrinsic computes the mask itself with the number of components. No fossils-db changes on NAVI33. Fixes: `45d8cd037a` ("ac/nir: rewrite ac_nir_lower_ps epilog to fix dual src blending with mono PS") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14878 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39996>	2026-02-24 20:20:02 +00:00
Priya Hosur	0bfad39f15	ac/nir/ngg: re-enable use of known compile-time GS connectivity Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38075>	2026-02-18 01:29:37 +00:00
Marek Olšák	a2309edb6b	ac/nir/meta: properly align sparse buffer clears with 12-byte clear values Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>	2026-02-17 14:47:41 +00:00
Marek Olšák	62cce3abcd	ac/nir/meta: use the clear/copy compute shader if CP DMA doesn't support sparse ac_prepare_cs_clear_copy_buffer determines whether to use CP DMA, and the driver obeys that. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>	2026-02-17 14:47:41 +00:00
Marek Olšák	bbcfab9f4f	ac/nir/meta: don't scalarize sparse loads if the address is aligned to load size This should make copying sparse faster if we get aligned buffer bounds. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>	2026-02-17 14:47:41 +00:00
Rhys Perry	e4b8ade092	ac/nir,radv,radeonsi: flip branches to avoid waitcnts fossil-db (navi31): Totals from 5123 (6.42% of 79825) affected shaders: Instrs: 12712435 -> 12703672 (-0.07%); split: -0.12%, +0.05% CodeSize: 67068852 -> 67033244 (-0.05%); split: -0.10%, +0.05% VGPRs: 363896 -> 363956 (+0.02%) SpillSGPRs: 5035 -> 5074 (+0.77%); split: -0.83%, +1.61% Latency: 115048972 -> 111944013 (-2.70%); split: -2.89%, +0.19% InvThroughput: 19102126 -> 18696069 (-2.13%); split: -2.34%, +0.22% VClause: 258693 -> 258770 (+0.03%); split: -0.01%, +0.04% SClause: 346271 -> 346225 (-0.01%); split: -0.02%, +0.00% Copies: 1040815 -> 1042017 (+0.12%); split: -0.23%, +0.34% Branches: 332467 -> 332565 (+0.03%); split: -0.04%, +0.07% PreSGPRs: 304888 -> 304699 (-0.06%); split: -0.10%, +0.04% PreVGPRs: 296652 -> 296654 (+0.00%) VALU: 7591803 -> 7594601 (+0.04%); split: -0.01%, +0.05% SALU: 1454420 -> 1455764 (+0.09%); split: -0.24%, +0.33% VOPD: 1826 -> 1810 (-0.88%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>	2026-02-16 19:39:43 +00:00
Marek Olšák	a9df891bc6	nir: allow get_ssbo_size to return a 64-bit result to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>	2026-02-16 12:59:36 +00:00
Marek Olšák	d1e6a5c1c8	ac: lower load_num_workgroups in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>	2026-02-13 15:33:19 +00:00
Marek Olšák	1e11e83d1c	ac/nir: add ac_nir_lower_intrinsics_to_args_options structure Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>	2026-02-13 15:33:19 +00:00
Marek Olšák	a9e47751d2	ac: lower load_subgroup_id for ACO in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>	2026-02-13 15:33:19 +00:00
Marek Olšák	0a9bdcac79	ac: lower load_workgroup_ids for ACO in NIR Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>	2026-02-13 15:33:19 +00:00
Marek Olšák	85916c8af0	ac/nir: lower buffer image_load to load_buffer_amd in NIR Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:54 +00:00
Marek Olšák	ef3d43085a	ac/nir: lower buffer txf to load_buffer_amd in NIR This also: - removes the sparse flag (TFE) if it has no uses - removes trailing unused components (if not sparse) or all contiguous unused components before the sparse flag (if sparse) - lowers 64-bit formatted buffer loads to 32 bits Everything here could also be used by 64-bit non-buffer image loads and txf if needed. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:54 +00:00
Marek Olšák	30ee7044bc	ac/nir: rename ac_nir_lower_tex -> ac_nir_lower_image_tex It will lower txf and buffer image loads to load_buffer_amd. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:54 +00:00
Marek Olšák	61bfc298ba	ac: set missing dest_type for image_deref_load required for lowering to load_buffer_amd Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:53 +00:00
Marek Olšák	fbfac92738	ac,radeonsi: add AC_NIR_TEX_BACKEND_FLAG_IS_IMAGE image_load lowered to tex will use this (descriptor loads only for now) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:53 +00:00
Marek Olšák	44bc1e6bf4	nir: add dest_type to load_buffer_amd for lowering the result to 16 bits Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:52 +00:00
Georg Lehmann	809fb0fba3	ac/nir/lower_ps_late: emit scalar f2f16_rtz for when one half of a packed export is undef Foz-DB Navi48: Totals from 7200 (8.74% of 82405) affected shaders: Instrs: 9056391 -> 9048177 (-0.09%); split: -0.09%, +0.00% CodeSize: 48681288 -> 48640684 (-0.08%); split: -0.09%, +0.00% VGPRs: 413088 -> 413784 (+0.17%) Latency: 76340711 -> 76320080 (-0.03%); split: -0.03%, +0.00% InvThroughput: 12692959 -> 12684618 (-0.07%); split: -0.07%, +0.00% VClause: 148823 -> 148821 (-0.00%) Copies: 601739 -> 601874 (+0.02%); split: -0.01%, +0.03% VALU: 5213356 -> 5207253 (-0.12%); split: -0.12%, +0.00% SALU: 1160815 -> 1160817 (+0.00%); split: -0.00%, +0.00% VOPD: 79520 -> 79444 (-0.10%); split: +0.09%, -0.18% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:23 +00:00
Georg Lehmann	8c895c5c61	ac/nir/lower_ps_late: CSE partial packed exports Foz-DB Navi48: Totals from 425 (0.52% of 82405) affected shaders: Instrs: 1110029 -> 1109658 (-0.03%); split: -0.03%, +0.00% CodeSize: 6135272 -> 6133848 (-0.02%); split: -0.02%, +0.00% VGPRs: 29856 -> 29844 (-0.04%) Latency: 10258411 -> 10258043 (-0.00%); split: -0.00%, +0.00% InvThroughput: 1898177 -> 1897661 (-0.03%) Copies: 88221 -> 88173 (-0.05%) VALU: 575276 -> 574894 (-0.07%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:22 +00:00
Marek Olšák	ebeb904c95	ac,radeonsi: set optimal COMPUTE_DISPATCH_INTERLEAVE for buffer clears/copies Small buffer clears are a bit faster now. The numbers were tuned specifically for this compute shader. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>	2026-01-22 22:28:39 +00:00
Marek Olšák	a5e1d31dad	ac/nir/meta: tune 12B clear buffer performance for gfx12 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>	2026-01-22 22:28:39 +00:00
Marek Olšák	9257cf04a1	ac/nir/meta: tune image clear & copy performance for gfx12 Compute shaders are the fastest for all copies and some clears. Note that this is a very different compute shader than the one in RADV. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>	2026-01-22 22:28:38 +00:00
Samuel Pitoiset	de64c7238a	ac/nir: fix computing cube derivatives when the major axis is negative This corresponds to the face 1.0, 3.0 or 5.0. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39303>	2026-01-21 07:12:34 +00:00
Georg Lehmann	711598982a	ac/nir,radv: remove ac_nir_opt_pack_half Foz-DB Navi21: Totals from 2937 (3.01% of 97591) affected shaders: Instrs: 1908695 -> 1908291 (-0.02%); split: -0.02%, +0.00% CodeSize: 10232148 -> 10229224 (-0.03%); split: -0.03%, +0.01% VGPRs: 142168 -> 142080 (-0.06%) Latency: 8052895 -> 8052622 (-0.00%); split: -0.01%, +0.01% InvThroughput: 2550330 -> 2549602 (-0.03%); split: -0.03%, +0.01% VClause: 32601 -> 32603 (+0.01%); split: -0.01%, +0.02% Copies: 118570 -> 118587 (+0.01%); split: -0.04%, +0.05% PreVGPRs: 110090 -> 110082 (-0.01%) VALU: 1468422 -> 1468043 (-0.03%); split: -0.03%, +0.00% SALU: 173858 -> 173828 (-0.02%) Foz-DB Navi48: Totals from 4196 (4.30% of 97637) affected shaders: MaxWaves: 118678 -> 118680 (+0.00%); split: +0.01%, -0.01% Instrs: 3627604 -> 3624093 (-0.10%); split: -0.10%, +0.00% CodeSize: 18956684 -> 18939824 (-0.09%); split: -0.09%, +0.01% VGPRs: 225624 -> 225060 (-0.25%); split: -0.26%, +0.01% Latency: 11856204 -> 11857280 (+0.01%); split: -0.01%, +0.02% InvThroughput: 2388584 -> 2389178 (+0.02%); split: -0.01%, +0.03% VClause: 50409 -> 50410 (+0.00%) SClause: 64701 -> 64699 (-0.00%) Copies: 208353 -> 207522 (-0.40%); split: -0.43%, +0.03% PreVGPRs: 161314 -> 161306 (-0.00%) VALU: 2345604 -> 2345172 (-0.02%); split: -0.02%, +0.00% SALU: 391466 -> 388723 (-0.70%) VOPD: 1788 -> 1806 (+1.01%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>	2026-01-20 14:48:23 +00:00
Emma Anholt	ed8676dc28	nir: Rename the unit_test_*_amd intrinics to be un-vendored. We'll reuse these from the nir_opt_algebraic_pattern_test. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>	2026-01-15 19:09:37 +00:00
Aitor Camacho	fcf53988c4	nir/opt_varyings: Support implementations that cannot compact 16-bits Add nir_io_compact_to_higher_16 flag so that the pass knows if it can compact 16-bit varyings into the higher 16 bits of a 32-bit varying. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38994>	2026-01-14 20:44:41 +00:00
Alyssa Rosenzweig	235e868ef7	ac/nir: use nir_is_shared_access Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39219>	2026-01-09 20:51:13 +00:00
Georg Lehmann	6d07a56c6a	ac/nir/lower_ps_late: preserve signed zero, inf, nan for exports Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	84ecac58a6	ac/nir/opt_pack_half: preserve fp_math_ctrl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	5241343ccb	ac/nir/lower_sin_cos: preserve fp_math_ctrl Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Georg Lehmann	9331726157	ac/nir/lower_sin_cos: use nir_shader_alu_pass Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187>	2026-01-09 11:58:52 +00:00
Marek Olšák	13cfd0176c	ac/gpu_info: add #define AMD_MEMCHANNEL_INTERLEAVE_BYTES radeon_info::pipe_interleave_bytes is renamed to r600_pipe_interleave_bytes where it can be 512 on some chips. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Marek Olšák	92133bb0ab	amd: demystify various optimizations we already have for memory channels Explain why we do what we do, and use the radeon_info field properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39120>	2026-01-06 20:32:10 +00:00
Rhys Perry	614437ead5	ac/nir: don't vectorize 16-bit shared loads to 8-bit Fixes an issue where a 2x16 load was vectorized with a 1x16 load into a 8x8 load. This became possible after `49d923078f` increased aligned_new_size from 6 bytes to 8 bytes. fossil-db (navi31): Totals from 5 (0.01% of 79825) affected shaders: Instrs: 6994 -> 6257 (-10.54%) CodeSize: 44000 -> 39464 (-10.31%) Latency: 90482 -> 89795 (-0.76%) InvThroughput: 202955 -> 201926 (-0.51%) VClause: 560 -> 565 (+0.89%) Copies: 1135 -> 1108 (-2.38%); split: -2.82%, +0.44% PreVGPRs: 201 -> 199 (-1.00%) VALU: 3882 -> 3201 (-17.54%) SALU: 493 -> 479 (-2.84%) VOPD: 262 -> 258 (-1.53%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `49d923078f` ("ac/nir: fix calculation of aligned_new_size") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14500 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39156>	2026-01-06 10:28:02 +00:00
Marek Olšák	3552028e87	ac/lower_ngg_mesh: fix a segfault accessing out_variables out of bounds component_addr_off (in bytes) was used to offset a component index (in dwords). Cc: mesa-stable Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39144>	2026-01-05 22:23:42 +00:00
Timur Kristóf	18b8543026	ac/nir: Add pass to fixup SMEM on GFX6-7 The pass implements two mitigations for the GFX6-7 SMEM bug: 1. To mitigate VM faults by NULL descriptors: Make sure that SMEM buffer loads always access a mapped BO. Use either the descriptor BO (or compute scratch BO), or otherwise use the zero-filled BO in their place. 2. To mitigate VM faults by OOB robust buffer access: Add an instruction to clamp the offset source to the num_records field of the descriptor. It will be still out of bounds, but the VM fault can be completely mitigated if the driver adds a padding to each memory allocation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38769>	2026-01-02 23:42:16 +00:00
Marek Olšák	f00f054087	ac,radeonsi: move lowering to load_color0/1 to ac_nir_lower_ps_early It's better to have these all in one pass. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38802>	2026-01-01 18:30:29 +00:00
Georg Lehmann	cbedced5e8	ac/nir/cull: do not reuse variables if subgroup ops are used Subgroup ops make divergence information useless for our purpose, we would need workgroup divergence. The game affected here has control flow dependent on vote_any, so it's possible that a wave only executes the code after culling/reordering invocations. That means we can't reuse the maybe undefined value from before culling. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14459 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39060>	2025-12-29 18:38:29 +00:00
Timur Kristóf	7dbabc6acc	ac/nir/lower_taskmesh_io_to_mem: Use AC_TASK_DRAW_ENTRY_BYTES Replace draw_entry_bytes with AC_TASK_DRAW_ENTRY_BYTES. This is 16 on all AMD HW that supports task/mesh shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:59 +00:00
Timur Kristóf	fc57fa4589	radv, radeonsi: Don't pass task ring info to mesh/task payload lowering The pass now uses the ring descriptors to figure these out. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:59 +00:00
Timur Kristóf	4d381c9136	ac/nir/lower_taskmesh_io_to_mem: Don't hardcode payload entry size in shaders Currently the task payload entry size is hardcoded in shaders as a constant. This isn't a good idea because it makes the code inflexible, e.g. doesn't allow us to change the number of entries dynamically. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:59 +00:00
Timur Kristóf	5348d953aa	ac/nir/lower_taskmesh_io_to_mem: Don't hardcode num_entries in shaders Currently the number of task shader ring entries is hardcoded in shaders as a constant. This isn't a good idea because it makes the code inflexible, e.g. prevents us from using the same shader binary across some chips as well as doesn't allow us to change the number of entries dynamically. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39032>	2025-12-22 15:17:58 +00:00
Daniel Schürmann	f7c4aa48a0	ac/gpu_info: add some more flags to ac_cu_info Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38701>	2025-12-22 07:34:46 +00:00
Emma Anholt	059d301c79	nir: Drop the mode argument of nir_lower_vars_to_scratch(). It only makes sense for function temps, and that's the only way it's been used. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>	2025-12-17 19:50:28 +00:00
Georg Lehmann	da197c3d55	ac/nir/lower_ps_late: remove gfx6 mrtz writemask workaround This is now done in the backends. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38853>	2025-12-12 17:00:51 +00:00
Rhys Perry	b5cf3b1628	ac/nir: fix check for increasing size of non-descriptor loads In the previous version, "end" could have been zero, which would have allowed an increase of "mul" bytes, when it should not be increased at all. For example: - align_offset=4 - mul=4 - unaligned_new_size=96 - aligned_new_size=128 This would have loaded a dword which was not loaded previously. fossil-db (gfx1201): Totals from 115 (0.14% of 79839) affected shaders: Instrs: 286697 -> 287097 (+0.14%); split: -0.16%, +0.30% CodeSize: 1477728 -> 1481256 (+0.24%); split: -0.13%, +0.37% SpillSGPRs: 1662 -> 1658 (-0.24%); split: -0.42%, +0.18% Latency: 2288612 -> 2290248 (+0.07%); split: -0.04%, +0.11% InvThroughput: 467307 -> 467602 (+0.06%); split: -0.03%, +0.10% VClause: 3689 -> 3691 (+0.05%) SClause: 5052 -> 5064 (+0.24%); split: -0.20%, +0.44% Copies: 34837 -> 35103 (+0.76%); split: -0.80%, +1.56% Branches: 7402 -> 7401 (-0.01%) PreSGPRs: 9147 -> 9143 (-0.04%); split: -0.44%, +0.39% VALU: 159333 -> 159372 (+0.02%); split: -0.01%, +0.04% SALU: 52047 -> 52276 (+0.44%); split: -0.55%, +0.99% SMEM: 9556 -> 9697 (+1.48%) fossil-db (navi31): Totals from 238 (0.30% of 79825) affected shaders: Instrs: 484480 -> 485105 (+0.13%); split: -0.05%, +0.17% CodeSize: 2514012 -> 2517928 (+0.16%); split: -0.06%, +0.22% SpillSGPRs: 1064 -> 1059 (-0.47%) Latency: 3941121 -> 3944670 (+0.09%); split: -0.04%, +0.13% InvThroughput: 897483 -> 898090 (+0.07%); split: -0.04%, +0.11% VClause: 7101 -> 7098 (-0.04%) SClause: 9036 -> 9052 (+0.18%); split: -0.44%, +0.62% Copies: 42790 -> 43096 (+0.72%); split: -0.30%, +1.01% PreSGPRs: 14357 -> 14342 (-0.10%); split: -0.37%, +0.26% VALU: 298325 -> 298347 (+0.01%); split: -0.01%, +0.02% SALU: 57288 -> 57577 (+0.50%); split: -0.20%, +0.70% SMEM: 18768 -> 18967 (+1.06%); split: -0.01%, +1.07% fossil-db (navi21): Totals from 239 (0.30% of 79825) affected shaders: Instrs: 444783 -> 445177 (+0.09%); split: -0.07%, +0.15% CodeSize: 2371776 -> 2373136 (+0.06%); split: -0.13%, +0.19% Latency: 4226478 -> 4219221 (-0.17%); split: -0.24%, +0.07% InvThroughput: 1430962 -> 1428445 (-0.18%); split: -0.23%, +0.06% SClause: 9357 -> 9398 (+0.44%); split: -0.20%, +0.64% Copies: 42742 -> 42927 (+0.43%); split: -0.53%, +0.96% Branches: 12975 -> 12970 (-0.04%); split: -0.05%, +0.02% PreSGPRs: 14368 -> 14312 (-0.39%); split: -0.47%, +0.08% VALU: 306642 -> 306720 (+0.03%); split: -0.02%, +0.05% SALU: 63702 -> 63790 (+0.14%); split: -0.31%, +0.45% SMEM: 20030 -> 20231 (+1.00%); split: -0.00%, +1.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458 Backport-to: 25.3 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>	2025-12-12 13:58:42 +00:00


312 commits