fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 20:08:06 +02:00

Author	SHA1	Message	Date
Emma Anholt	10ba7675c8	nir/uub: Use an optional max_samples from drivers for sample counts. This triggers some unrolling in Fallout 4, GTAV, and Rocky Planet in my shader-db. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38585>	2025-12-11 14:26:11 +00:00
Marek Olšák	308da55f1a	radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND this is slightly nicer Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>	2025-12-10 19:16:46 +00:00
Arcady Goldmints-Orlov	0df8aa940c	nir: Use nir_shader_intrinsics_pass in nir_lower_io_to_scalar Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38816>	2025-12-05 22:30:22 +00:00
Marek Olšák	e6499fa73e	nir/recompute_io_bases: move color input bases after all other inputs This is related to the FS prolog. It should have no effect on other drivers. v2: make it optional via io_options Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38599>	2025-11-29 05:00:40 +00:00
Marek Olšák	fa0bea5ff8	nir: remove nir_io_add_const_offset_to_base nir_opt_constant_folding does it now. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>	2025-11-29 00:16:38 +00:00
Marek Olšák	21cdbfa223	ac,radv: move opt_vectorize_callback to common code radeonsi will use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>	2025-11-28 20:16:10 +00:00
Marek Olšák	2c9995a94f	ac/nir: move aco_nir_op_supports_packed_math_16bit here aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common because tests don't link with ACO, so linking would fail, but we want to move the nir_opt_vectorize callback here that uses it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>	2025-11-28 20:16:10 +00:00
Marek Olšák	9e339f4b32	nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees This describes better what it does. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38471>	2025-11-20 05:42:11 +00:00
Marek Olšák	e372365cf4	nir: rename nir_copy_prop -> nir_opt_copy_prop Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>	2025-11-15 02:16:38 +00:00
Rhys Perry	00edddf542	ac/nir: add some tests for ac_nir_lower_mem_access_bit_sizes These test that nothing crashes for any possible input. With print=true, it can also be used to compare the behaviour of two different ac_nir_lower_mem_access_bit_sizes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37995>	2025-11-13 15:23:20 +00:00
Konstantin Seurer	de32f9275f	treewide: add & use parent instr helpers We add a bunch of new helpers to avoid the need to touch >parent_instr, including the full set of: * nir_def_is_* * nir_def_as_*_or_null * nir_def_as_* [assumes the right instr type] * nir_src_is_* * nir_src_as_* * nir_scalar_is_* * nir_scalar_as_* Plus nir_def_instr() where there's no more suitable helper. Also an existing helper is renamed to unify all the names, while we're churning the tree: * nir_src_as_alu_instr -> nir_src_as_alu ..and then we port the tree to use the helpers as much as possible, using nir_def_instr() where that does not work. Acked-by: Marek Olšák <maraeo@gmail.com> --- To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm taking this opportunity to clean up a lot of NIR patterns. Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Timur Kristóf	7f5f8b3932	ac/nir/ngg: Use align() instead of ALIGN() Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>	2025-11-12 13:40:55 +00:00
Timur Kristóf	8f99d736d0	ac/nir/ngg: Fix scratch space for NGG GS streamout For GS streamout, we need the following LDS scratch space: - Repacking streamout vertices takes 1 dword per 4 waves per stream (max 16 bytes for Wave64, max 32 bytes for Wave32) - 1 dword per stream for buffer info (16 bytes) - 1 dword per buffer for buffer info (16 bytes) Previously, the space used for buffer info aliased with the space for repacking the output vertices in ngg_gs_finale(), and there was no barrier in between, which caused a race condition, resulting in random failure. Fix this by allocating a few more LDS dwords so that aliasing is not required, which also allows us to remove an extra workgroup barrier. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12705 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>	2025-11-12 13:40:55 +00:00
Marek Olšák	9125e34372	amd: lower get_ssbo_size in ac_nir_lower_resinfo The code for lowering get_ssbo_size will be different in future chips, so do it in common code to reduce duplication in the future. Lower get_ssbo_size to ssbo_descriptor_amd + nir_channel. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38097>	2025-11-02 01:42:07 +00:00
Marek Olšák	9def0a6e5b	ac/nir: set support_indirect_inputs/outputs in common code This fixes mesh shader performance of RADV for GravityMark by stopping the lowering of ClipDistance[64][4] indirect access for mesh shader outputs. The perf improvement is 14% on Navi48. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38155>	2025-10-31 00:57:46 +00:00
Marek Olšák	966cb36722	amd: constify struct radeon_surf Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38093>	2025-10-29 12:50:44 +00:00
Rhys Perry	f3ff2375ec	ac/nir: don't consider quads incomplete inside loops We move terminates to outside loops, so this doesn't matter anymore. fossil-db (gfx1201): Totals from 145 (0.18% of 79839) affected shaders: Instrs: 174693 -> 174389 (-0.17%); split: -0.18%, +0.01% CodeSize: 917068 -> 915692 (-0.15%); split: -0.16%, +0.01% VGPRs: 8340 -> 8184 (-1.87%) Latency: 2528888 -> 2521006 (-0.31%); split: -0.48%, +0.16% InvThroughput: 502383 -> 504082 (+0.34%); split: -0.44%, +0.78% Copies: 15968 -> 15632 (-2.10%); split: -2.14%, +0.04% PreVGPRs: 5918 -> 5858 (-1.01%) VALU: 92802 -> 92484 (-0.34%); split: -0.35%, +0.01% SALU: 29437 -> 29430 (-0.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:22:02 +00:00
Rhys Perry	9babec1366	radv,radeonsi: use optimize_txd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:22:01 +00:00
Rhys Perry	7d552d71e9	ac/nir: optimize txd(coord, ddx/ddy(coord)) This is done in ac_nir_lower_tex so that we can optimize derivative calculations with a different exec mask than the texture sample by using the nir_strict_wqm_coord_amd path. It's also more aware of divergence than nir_lower_tex is. fossil-db (gfx1201): Totals from 103 (0.13% of 79839) affected shaders: MaxWaves: 2610 -> 2620 (+0.38%) Instrs: 347283 -> 345912 (-0.39%); split: -0.40%, +0.00% CodeSize: 1892380 -> 1883824 (-0.45%); split: -0.46%, +0.00% VGPRs: 8028 -> 7824 (-2.54%) Latency: 3942575 -> 3939623 (-0.07%); split: -0.08%, +0.01% InvThroughput: 867147 -> 865281 (-0.22%); split: -0.24%, +0.02% VClause: 6230 -> 6221 (-0.14%); split: -0.19%, +0.05% SClause: 3910 -> 3914 (+0.10%); split: -0.26%, +0.36% Copies: 16091 -> 15721 (-2.30%); split: -2.74%, +0.44% PreSGPRs: 4651 -> 4658 (+0.15%) PreVGPRs: 6389 -> 6320 (-1.08%); split: -1.17%, +0.09% VALU: 228715 -> 227490 (-0.54%); split: -0.54%, +0.01% SALU: 32763 -> 32767 (+0.01%); split: -0.06%, +0.07% VMEM: 9027 -> 9024 (-0.03%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:22:00 +00:00
Rhys Perry	309ac1f0c0	ac/nir: refactor move_coords_from_divergent_cf a bit Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37561>	2025-10-23 11:21:59 +00:00
Rhys Perry	42bb81137e	ac/nir: stop using NIR_PASS in ac_nir_lower_ngg_nogs() When NIR_DEBUG=serialize or NIR_DEBUG=clone is used, NIR_PASS recreates nir_function_impl and nir_variable objects, causing use-after-free since ac_nir_lower_ngg_nogs() keeps pointers to those in local variables. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13946 Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>	2025-10-23 10:44:38 +00:00
Rhys Perry	b18421ae3d	amd/lower_mem_access_bit_sizes: fix shared access when bytes<bit_size/8 This can happen with (for example) 32x2 loads with align_mul=4,align_offset=2. This patch does bit_size=min(bit_size,bytes) to prevent num_components from being 0. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `52cd5f7e69` ("ac/nir_lower_mem_access_bit_sizes: Split unsupported shared memory instructions") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	e89b22280f	amd/lower_mem_access_bit_sizes: be more careful with 8/16-bit scratch load Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.3 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	8829fc3bd6	amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering Summary of changes: - handle unaligned 16-bit scalar loads when supported_dword=true - increases the size of 8/16/32/64-bit buffer loads which are not dword aligned, which can create less SMEM loads. - handles when "bytes" is less than "bit_size / 8" fossil-db (gfx1201): Totals from 26 (0.03% of 79839) affected shaders: Instrs: 12676 -> 12710 (+0.27%); split: -0.30%, +0.57% CodeSize: 67272 -> 67384 (+0.17%); split: -0.24%, +0.40% Latency: 44399 -> 44375 (-0.05%); split: -0.09%, +0.04% SClause: 352 -> 344 (-2.27%) SALU: 3972 -> 3992 (+0.50%) SMEM: 554 -> 528 (-4.69%) fossil-db (navi21): Totals from 6 (0.01% of 79825) affected shaders: Instrs: 2192 -> 2186 (-0.27%) CodeSize: 12188 -> 12140 (-0.39%) Latency: 10037 -> 10033 (-0.04%); split: -0.12%, +0.08% SMEM: 124 -> 118 (-4.84%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `fbf0399517` ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:34 +00:00
Rhys Perry	79b2fa785d	amd/lower_mem_access_bit_sizes: don't create subdword UBO loads with LLVM These are unsupported. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14127 Fixes: `fbf0399517` ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>	2025-10-21 22:10:33 +00:00
Georg Lehmann	9e41a7c139	treewide: use nir_load_global alias of nir_build_load_global Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>	2025-10-21 12:37:58 +02:00
Timur Kristóf	d20049b430	ac/nir/ngg_mesh: Lower num_subgroups to constant Mesh shader workgroups always have the same amount of subgroups. When the API workgroup size is the same as the real workgroup size, this is a small optimization (using a constant instead of a shader arg). When the API workgroup size is smaller than the real workgroup size (eg. when the number of output vertices or primitves is greater than the API workgroup size on RDNA 2), this fixes a potential bug because num_subgroups would return the "real" workgroup size instead of the API one. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37947>	2025-10-20 14:05:40 +00:00
Daniel Schürmann	eecd1c020d	amd: keep ac_shader_config::lds_size unaligned Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:09 +00:00
Daniel Schürmann	6fd5766620	amd: add and use utility functions for LDS size encoding Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:08 +00:00
Daniel Schürmann	b651234414	amd: change ac_shader_config::lds_size to bytes We still keep it aligned to allocation granularity. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>	2025-10-15 11:20:07 +00:00
Daniel Schürmann	d0b87a0d5f	ac/nir_flag_smem_for_loads: call divergence analysis internally Also don't flag more SMEM instructions (in ACO) after the last call to ac_nir_lower_mem_access_bit_sizes(). Totals from 75 (0.09% of 79839) affected shaders: (Navi48) Instrs: 191246 -> 189960 (-0.67%) CodeSize: 996840 -> 985976 (-1.09%) Latency: 3066184 -> 2945500 (-3.94%) InvThroughput: 355373 -> 353106 (-0.64%); split: -0.66%, +0.02% SClause: 4848 -> 4699 (-3.07%) Copies: 13827 -> 13925 (+0.71%); split: -0.07%, +0.78% Branches: 5176 -> 5003 (-3.34%) PreSGPRs: 6222 -> 6272 (+0.80%) VALU: 108934 -> 108993 (+0.05%); split: -0.00%, +0.06% SALU: 31679 -> 31210 (-1.48%); split: -1.51%, +0.03% SMEM: 7158 -> 6739 (-5.85%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>	2025-10-14 16:33:12 +00:00
Daniel Schürmann	9b1a635bb3	amd/common: merge radv_nir_opt_access_speculate() into ac_nir_flag_smem_for_loads() One shader is negatively affected, but we save 2 entire iterations over every shader. This effect is also mitigated with the next commits. Totals from 1 (0.00% of 79839) affected shaders: (Navi48) Instrs: 947 -> 958 (+1.16%) CodeSize: 4728 -> 4732 (+0.08%) Latency: 20678 -> 20723 (+0.22%) InvThroughput: 2697 -> 2698 (+0.04%) SClause: 26 -> 27 (+3.85%) Copies: 139 -> 145 (+4.32%) Branches: 46 -> 47 (+2.17%) VALU: 460 -> 463 (+0.65%) SALU: 201 -> 204 (+1.49%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>	2025-10-14 16:33:12 +00:00
Daniel Schürmann	8ff44f17ef	amd/lower_mem_access_bit_sizes: also use SMEM for subdword loads We can simply extract from the loaded dwords as per nir_lower_mem_access_bit_sizes() lowering. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>	2025-10-14 16:33:11 +00:00
Daniel Schürmann	fbf0399517	amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes This creates more SMEM instruction, mostly because vec3 64bit are being split instead of overfetched. Totals from 442 (0.55% of 79839) affected shaders: (Navi48) Instrs: 288998 -> 289469 (+0.16%); split: -0.04%, +0.21% CodeSize: 1538212 -> 1541460 (+0.21%); split: -0.03%, +0.24% Latency: 3010072 -> 3009373 (-0.02%); split: -0.04%, +0.01% InvThroughput: 885572 -> 885564 (-0.00%); split: -0.00%, +0.00% VClause: 6900 -> 6885 (-0.22%); split: -0.28%, +0.06% SClause: 4457 -> 4469 (+0.27%); split: -0.18%, +0.45% VALU: 162473 -> 162469 (-0.00%) SALU: 42871 -> 42855 (-0.04%); split: -0.05%, +0.01% SMEM: 6893 -> 7239 (+5.02%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>	2025-10-14 16:33:11 +00:00
Georg Lehmann	e26a8be7af	ac/nir: enable nir atomic load/store opts Foz-DB GFX1201: Totals from 4 (0.00% of 80287) affected shaders: Instrs: 2928 -> 2920 (-0.27%); split: -0.31%, +0.03% CodeSize: 15424 -> 15392 (-0.21%); split: -0.23%, +0.03% Latency: 835578 -> 823220 (-1.48%) InvThroughput: 3307941 -> 3258515 (-1.49%) Copies: 459 -> 447 (-2.61%) VALU: 1297 -> 1291 (-0.46%) SALU: 595 -> 589 (-1.01%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37822>	2025-10-14 06:24:17 +00:00
Marek Olšák	3fe651f607	nir: remove load_smem_amd replaced by load_global_amd + ACCESS_SMEM_AMD Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:54:11 +00:00
Daniel Schürmann	3ae2f12eb4	ac/nir: switch load_smem_amd to use load_global Totals from 24920 (31.21% of 79839) affected shaders: (Navi48) Instrs: 22044185 -> 22413945 (+1.68%); split: -0.01%, +1.68% CodeSize: 117211728 -> 118623656 (+1.20%); split: -0.01%, +1.21% VGPRs: 1199008 -> 1198948 (-0.01%) SpillSGPRs: 7421 -> 7365 (-0.75%); split: -0.78%, +0.03% SpillVGPRs: 2177 -> 2184 (+0.32%); split: -0.09%, +0.41% Scratch: 7037952 -> 7038208 (+0.00%) Latency: 155140452 -> 155530877 (+0.25%); split: -0.02%, +0.27% InvThroughput: 23601713 -> 23634131 (+0.14%); split: -0.01%, +0.15% VClause: 458456 -> 458575 (+0.03%); split: -0.09%, +0.11% SClause: 651928 -> 649405 (-0.39%); split: -1.26%, +0.87% Copies: 1681110 -> 1677057 (-0.24%); split: -0.42%, +0.17% Branches: 515419 -> 515322 (-0.02%); split: -0.02%, +0.00% PreSGPRs: 992903 -> 990545 (-0.24%); split: -0.24%, +0.00% VALU: 11971995 -> 11967962 (-0.03%); split: -0.04%, +0.00% SALU: 3247576 -> 3476720 (+7.06%); split: -0.03%, +7.08% VMEM: 821046 -> 821056 (+0.00%); split: -0.00%, +0.00% SMEM: 988476 -> 988779 (+0.03%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:54:11 +00:00
Daniel Schürmann	fdd6bdf03d	ac/nir_lower_global_access: don't assume pack_64_2x32 is the same as u2u64 It might also be the expanded base address. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:53:58 +00:00
Daniel Schürmann	0209065229	ac/nir_lower_global_access: require no_unsigned wrap when extracting from 32-bit additions Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:53:58 +00:00
Rhys Perry	20af16b4d8	aco: use MTBUF for 64-bit atomic load/store A 64-bit atomic load/store should be considered entirely out-of-bounds if any part of it is out-of-bounds. Since we implemented these as 32-bit vec2 load/store, it would have been possible for the first half to be in-bounds while the second half is out-of-bounds. From 9.6.1. Robust Buffer Access of Vulkan 1.4.324 specification: > Any non-atomic access to a uniform, storage, uniform texel, or storage > texel buffer wider than 32-bits may be treated as multiple 32-bit > accesses that are separately bounds checked. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>	2025-10-07 17:41:31 +00:00
Georg Lehmann	84f26ed117	nir: optimize atomic isub if supported Foz-DB Navi48: Totals from 1 (0.00% of 80287) affected shaders: Instrs: 1641 -> 1637 (-0.24%) CodeSize: 8472 -> 8456 (-0.19%) Latency: 19132 -> 19131 (-0.01%) InvThroughput: 9566 -> 9565 (-0.01%) Copies: 126 -> 125 (-0.79%) VALU: 565 -> 563 (-0.35%) SALU: 439 -> 438 (-0.23%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37702>	2025-10-07 14:07:56 +00:00
Daniel Schürmann	0e3bc3d8c0	nir/opt_offsets: call allow_offset_wrap() for try_fold_shared2() This prevents applying wrapping offsets on GFX6. Fixes: `e1a692f74b` ('nir/opt_offsets: allow for unsigned wraps when folding load/store_shared2_amd offsets') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37667>	2025-10-03 07:54:12 +00:00
Daniel Schürmann	93ce29c42e	amd: don't allow unsigned wraps for shared memory offsets on GFX6 Fixes: `10266e7b21` ('radv: allow for unsigned wraps for shared memory intrinsics in nir_opt_offsets') Fixes: `dd68825feb` ('radeonsi: allow for unsigned wraps for shared memory intrinsics in nir_opt_offsets') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37667>	2025-10-03 07:54:12 +00:00
Timur Kristóf	d3579190d6	ac/nir/ngg: Fix scalarized mesh primitive indices Take the write_mask into account when storing primitive indices, otherwise they will end up being stored in the wrong place. Fixes: `8e24d3426d` ("ac/nir/ngg: Refactor MS primitive indices for scalarized IO.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37610>	2025-09-29 08:07:54 +00:00
Timur Kristóf	3dc9c1a91e	ac/nir/ngg: Remove dead code for 64-bit mesh shader variables We already lower all 64-bit I/O to 32-bit before this pass, and the rest of the code here already asserts that I/O variables must be 32-bit or smaller. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37610>	2025-09-29 08:07:54 +00:00
Daniel Schürmann	10266e7b21	radv: allow for unsigned wraps for shared memory intrinsics in nir_opt_offsets Totals from 76 (0.10% of 79839) affected shaders: (Navi48) Instrs: 237450 -> 237323 (-0.05%); split: -0.05%, +0.00% CodeSize: 1276732 -> 1275824 (-0.07%); split: -0.07%, +0.00% Latency: 1123467 -> 1123387 (-0.01%); split: -0.01%, +0.01% InvThroughput: 364942 -> 364738 (-0.06%); split: -0.06%, +0.00% Copies: 20654 -> 20636 (-0.09%); split: -0.09%, +0.00% Branches: 7326 -> 7327 (+0.01%) PreSGPRs: 5197 -> 5195 (-0.04%) PreVGPRs: 3395 -> 3396 (+0.03%) VALU: 96134 -> 96034 (-0.10%) SALU: 48059 -> 48041 (-0.04%); split: -0.04%, +0.00% VOPD: 10 -> 8 (-20.00%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37453>	2025-09-24 14:28:24 +00:00
Rhys Perry	92a2ab8b64	ac/nir: fix progress reporting in ac_nir_lower_tex Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>	2025-09-24 08:20:27 +00:00
Georg Lehmann	714a149396	nir: remove unsigned upper bound config All config information is now either in nir->info or nir->options. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>	2025-09-16 09:24:04 +00:00
Georg Lehmann	76a502d75a	ac/nir: set subgroup size for gs copy shader Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>	2025-09-14 13:21:21 +00:00
Georg Lehmann	83326af899	nir/builder: add nir_inverse_ballot_imm Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178>	2025-09-04 14:03:56 +00:00

261 commits
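
The LDS scratch budget described in commit 8f99d736d0 (NGG GS streamout) can be checked with a little arithmetic. The sketch below is not Mesa code: the function names are hypothetical, and the defaults of 256 threads per workgroup, 4 streams, and 4 buffers are assumptions chosen because they reproduce the maxima quoted in the commit message (16 bytes of repacking space for Wave64, 32 bytes for Wave32, and 16 bytes for each kind of buffer info).

```python
def div_round_up(a: int, b: int) -> int:
    """Integer division rounding up."""
    return (a + b - 1) // b


def gs_streamout_scratch_bytes(wave_size: int,
                               workgroup_size: int = 256,  # assumed maximum
                               num_streams: int = 4,       # assumed maximum
                               num_buffers: int = 4) -> int:  # assumed maximum
    """Sum the three non-aliased LDS regions listed in the commit message."""
    num_waves = div_round_up(workgroup_size, wave_size)
    # Repacking streamout vertices: 1 dword per 4 waves, per stream.
    repack_dwords = div_round_up(num_waves, 4) * num_streams
    # Buffer info: 1 dword per stream, plus 1 dword per buffer.
    stream_info_dwords = num_streams
    buffer_info_dwords = num_buffers
    return 4 * (repack_dwords + stream_info_dwords + buffer_info_dwords)


if __name__ == "__main__":
    for wave_size in (64, 32):
        print(wave_size, gs_streamout_scratch_bytes(wave_size))
```

Under these assumptions the three regions sum to 48 bytes for Wave64 and 64 bytes for Wave32. Keeping the regions separate, rather than aliasing them, is what the commit says removes the race and the extra workgroup barrier.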