fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 04:58:08 +02:00

Author	SHA1	Message	Date
Marek Olšák	d7f03c649e	nir/lower_io_passes: only sort variables for nir_lower_io_vars_to_temporaries Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38470>	2025-11-20 12:17:31 -05:00
Marek Olšák	02148dc6bc	nir/lower_io_passes: fold bool lower_indirect_inputs Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38470>	2025-11-20 12:17:30 -05:00
Marek Olšák	9b4fc64324	nir/lower_io_passes: simplify conditions for when to lower IO to temps Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38470>	2025-11-20 12:17:28 -05:00
Marek Olšák	edfa3fdfbc	nir/lower_io_passes: lower indirect TCS outputs sooner and clarify the behavior We don't have to enter the lower-IO-to-temps block for TCS at all. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38470>	2025-11-20 12:17:26 -05:00
Marek Olšák	9e339f4b32	nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees This describes better what it does. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38471>	2025-11-20 05:42:11 +00:00
Marek Olšák	22871fb8bd	nir: for nir_shift_channels, fill undefined components with undef instead of .x This potentially results in better code because we don't add def uses where undef is allowed. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38468>	2025-11-20 04:26:55 +00:00
Boris Brezillon	98bd0850da	nir: Add a pass to downgrade inout PLS vars to {in,out} only ones Shaders might declare PLS vars as inout but might just use them as in or out but not both. This pass detects those cases and adjusts the variable/deref modes accordingly. This pass should be called before nir_lower_io_vars_to_temporaries(), otherwise the copy_derefs will be inserted, turning unused variables into used ones. This should ideally be called after DCE to make sure we don't leave PLS inout variables behind. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>	2025-11-18 20:25:43 +00:00
Boris Brezillon	2cc254d8cb	nir: Teach nir_lower_io_vars_to_temporaries() about PLS vars Pixel local storage variables are like fragment shader outputs that might be read, written or both. Teach nir_lower_io_vars_to_temporaries() about these variables so they can be lowered along with the regular fragment outputs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>	2025-11-18 20:25:43 +00:00
Boris Brezillon	ea4d4d2a77	nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering Rather than adding another boolean to optionally lower PLS vars, pass the types we want to lower through a nir_variable_mode bitmask. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>	2025-11-18 20:25:42 +00:00
Eric R. Smith	ab867cc3cd	nir: add intrinsics for pixel local storage The pixel local storage load and store instructions keep track of the format of the pixel local storage variables. This allows drivers to insert the appropriate conversions on load/store. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>	2025-11-18 20:25:42 +00:00
Ryan Mckeever	75263ce911	nir: add support for pixel_local_storage variables Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>	2025-11-18 20:25:42 +00:00
Anna Maniscalco	9a72696e02	nir/lower_tex: copy `is_sparse` when lowering txd Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38497>	2025-11-18 19:03:36 +00:00
Georg Lehmann	3a175b54a4	aco,nir: support subdword v_permlane_b16 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>	2025-11-17 23:33:59 +00:00
Dave Airlie	438245404c	nir: add support for cooperative matrix reduction operations. This adds some new call operations to handle various parts of the reductions. cmat_reduce: is the initial toplevel operation from SPIR-V this is used after lowering for row/col operation on single hw supported matrix sizes. The spir-v operation is lowered into multiple of these on flex dimensions, but also can be lowered into others. cmat_reduce_finish: after multiple reduction operations on a flexible dimension matrix, there is often subsequent operations on the output matrices to finish the operation. cmat_reduce_2x2: this takes 4 input matrices, and 1 dst to do a 2x2 reduction op. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>	2025-11-17 23:33:59 +00:00
Dave Airlie	9385d94bc9	nir: add a flag for functions that are used in cmat calls. With coopmat2 a bunch of functions need a lot of lowering passes to happen before they can be lowered, so mark them as to be lowered later. Drivers needing these should call the nir_remove_non_cmat_call_entrypoints where they remove entrypoints now, and call the original nir_remove_non_entrypoints after lowering coopmat2. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>	2025-11-17 23:33:58 +00:00
Dave Airlie	26eaba935d	nir: add a cmat call instruction type. This adds a new instruction type to handle cooperative matrix calls. This clones the call instr, drops callee, and adds a single metadata slot and a call operation (dummy only for now). (Not NACKed by Alyssa) Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>	2025-11-17 23:33:58 +00:00
Christoph Pillmayer	da3d8c8b4b	nir: Update progress info in nir_sort_unstructured_blocks Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38354>	2025-11-17 10:30:37 +00:00
Christoph Pillmayer	8db66767a9	nir: Fix preseved metadata in sort_unstructured_blocks Fixes: `c859ea5783` ("nir: Add a sort_unstructured_blocks() helper") Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38354>	2025-11-17 10:30:37 +00:00
Marek Olšák	f9341082a2	nir,glsl,zink: remove the option nir_io_separate_clip_cull_distance_arrays This calls nir_separate_merged_clip_cull_io in zink, which is better than having to handle separate clip & cull arrays in all passes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38452>	2025-11-15 03:30:10 +00:00
Marek Olšák	da52bc466f	nir: add nir_separate_merged_clip_cull_io Only needed by zink. This clip/cull distance separation pass is needed to remove nir_io_separate_clip_cull_distance_arrays, so that all shared GLSL code only uses merged clip+cull distance outputs. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38452>	2025-11-15 03:30:10 +00:00
Marek Olšák	1e0fe81b69	nir: document how nir_opt_cse works and suggest improvements not planning to work on the TODOs immediately Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38385>	2025-11-15 02:56:30 +00:00
Marek Olšák	9247a78925	nir: document how nir_opt_dce works Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38385>	2025-11-15 02:56:30 +00:00
Marek Olšák	e372365cf4	nir: rename nir_copy_prop -> nir_opt_copy_prop Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>	2025-11-15 02:16:38 +00:00
Marek Olšák	296839f489	nir/opt_copy_propagate: refactor for readability, describe missing stuff No functional change. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>	2025-11-15 02:16:38 +00:00
Marek Olšák	4e834b4321	nir: add NIR_PASS_ASSERT_NO_PROGRESS This aborts if a pass would make any progress. It can be used to assert that: - our minimalist pass invocation loops in drivers are sufficient and don't leave any unoptimized code in the shader - our lowering is sufficient and other passes don't add instructions that would cause lowering having to be repeated Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38406>	2025-11-14 21:39:12 +00:00
Connor Abbott	d30ff374a1	nir, glsl: Add support for softfloat32 Based on existing softfloat64 support and Berkeley SoftFloat. This is targeted at drivers that can't preserve denorms, so operations where denorm support is irrelevant like conversions to/from integers aren't handled. Because the existing mechanism used by Gallium for softfloat64 doesn't support includes, we unfortunately can't extract common code into a header. This can be done later if we switch Gallium to using glslang and spirv-to-nir. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37608>	2025-11-14 19:31:17 +00:00
Daniel Schürmann	11fb6c30b3	nir/lower_vars_to_ssa: return early if there is no local variables to lower Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38367>	2025-11-14 09:09:15 +00:00
Daniel Schürmann	7ee1932309	treewide: Never preserve nir_metadata_dominance without nir_metadata_block_index Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38367>	2025-11-14 09:09:14 +00:00
Daniel Schürmann	0d70716c8a	nir/opt_large_constants: Fix dead deref instructions accessing lowered variables It could happen that unused derefs weren't removed if DCE wasn't called prior to nir_opt_large_constants. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38367>	2025-11-14 09:09:14 +00:00
Alyssa Rosenzweig	65fcdf4c81	nir/sweep: fix use-after-free with dominance LCA Either we need to save this pointer or toss it. ==146166==ERROR: AddressSanitizer: heap-use-after-free on address 0x7bfe77013920 at pc 0x7b9e6fd5b978 bp 0x7ffc30ef18e0 sp 0x7ffc30ef18d8 READ of size 4 at 0x7bfe77013920 thread T0 #0 0x7b9e6fd5b977 in get_header ../src/util/ralloc.c:83 #1 0x7b9e6fd5b977 in ralloc_parent ../src/util/ralloc.c:382 #2 0x7b9e6fd5b977 in reralloc_size ../src/util/ralloc.c:198 #3 0x7b9e6fd5b977 in reralloc_array_size ../src/util/ralloc.c:241 #4 0x7b9e705f83c2 in range_minimum_query_table_resize ../src/util/range_minimum_query.c:21 #5 0x7b9e7018af1d in realloc_info ../src/compiler/nir/nir_dominance_lca.c:33 #6 0x7b9e7018af1d in nir_calc_dominance_lca_impl ../src/compiler/nir/nir_dominance_lca.c:126 #7 0x7b9e6ff9815c in nir_metadata_require ../src/compiler/nir/nir_metadata.c:42 #8 0x7b9e6ff998e4 in nir_metadata_require_most ../src/compiler/nir/nir_metadata.c:200 #9 0x7b9e6f8aab4d in st_finalize_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:735 #10 0x7b9e6f0afb14 in st_create_common_variant ../src/mesa/state_tracker/st_program.c:858 #11 0x7b9e6f0be2d3 in st_get_common_variant ../src/mesa/state_tracker/st_program.c:973 #12 0x7b9e6f0bf9cf in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1478 #13 0x7b9e6f0bf9cf in st_finalize_program ../src/mesa/state_tracker/st_program.c:1596 #14 0x7b9e6f8b0127 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:633 #15 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816 #16 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412 #17 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474 #18 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872 #19 0x7f9e7893dd65 in GOMP_parallel 
(/lib64/libgomp.so.1+0xdd65) (BuildId: 9cc501fdca53b5d4ab094f709486781c98573bc9) #20 0x000000400d6a in main /home/alyssa/shader-db/run.c:689 #21 0x7f9e78011574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317) #22 0x7f9e78011627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317) #23 0x000000401014 in _start (/home/alyssa/shader-db/run+0x401014) (BuildId: a83b8d830cc265be3f54ea3e7a21a0fb5156624b) 0x7bfe77013920 is located 0 bytes inside of 64-byte region [0x7bfe77013920,0x7bfe77013960) freed by thread T0 here: #0 0x7f9e782e5beb in free.part.0 (/usr/lib64/libasan.so.8+0xe5beb) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a) #1 0x7b9e6fd5bc39 in unsafe_free ../src/util/ralloc.c:319 #2 0x7b9e6fd5bc39 in ralloc_free ../src/util/ralloc.c:264 #3 0x7b9e70063d81 in nir_sweep ../src/compiler/nir/nir_sweep.c:219 #4 0x7b9e6f0bf499 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1585 #5 0x7b9e6f8b0127 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:633 #6 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816 #7 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412 #8 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474 #9 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872 previously allocated by thread T0 here: #0 0x7f9e782e5e4b in realloc.part.0 (/usr/lib64/libasan.so.8+0xe5e4b) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a) #1 0x7b9e6fd5a883 in resize ../src/util/ralloc.c:167 #2 0x7b9e705f83c2 in range_minimum_query_table_resize ../src/util/range_minimum_query.c:21 #3 0x7b9e7018af1d in realloc_info ../src/compiler/nir/nir_dominance_lca.c:33 #4 0x7b9e7018af1d in nir_calc_dominance_lca_impl ../src/compiler/nir/nir_dominance_lca.c:126 #5 0x7b9e6ff9815c in nir_metadata_require ../src/compiler/nir/nir_metadata.c:42 #6 0x7b9e6ff998e4 in 
nir_metadata_require_most ../src/compiler/nir/nir_metadata.c:200 #7 0x7b9e6f8b0ede in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:550 #8 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816 #9 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412 #10 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474 #11 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872 Fixes: `17876a00af` ("nir: Add a faster lowest common ancestor algorithm") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38412>	2025-11-13 20:17:22 +00:00
Yonggang Luo	ecb0ccf603	treewide: Replace calling to function ALIGN with align This is done by grep ALIGN( to align( docs,*.xml,blake3 is excluded Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38365>	2025-11-12 21:58:40 +00:00
Konstantin Seurer	b241b26d11	nir: Remove nir_def::parent_instr This reduces the footprint of nir_def by 8B on 64-bit systems. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Konstantin Seurer	de32f9275f	treewide: add & use parent instr helpers We add a bunch of new helpers to avoid the need to touch >parent_instr, including the full set of: * nir_def_is_* * nir_def_as__or_null nir_def_as_* [assumes the right instr type] * nir_src_is_* * nir_src_as_* * nir_scalar_is_* * nir_scalar_as_* Plus nir_def_instr() where there's no more suitable helper. Also an existing helper is renamed to unify all the names, while we're churning the tree: * nir_src_as_alu_instr -> nir_src_as_alu ..and then we port the tree to use the helpers as much as possible, using nir_def_instr() where that does not work. Acked-by: Marek Olšák <maraeo@gmail.com> --- To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm taking this opportunity to clean up a lot of NIR patterns. Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Yonggang Luo	34e7fa2fe6	nir: Disable gcc warning -Wstringop-overflow for nir_intrinsic_set_* for latter commit gcc has a false positive here, silenced with the pragmas, use separate commit for easily revert later once gcc fixed it. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Konstantin Seurer	e231aec0c9	nir: Move nir_def directly after nir_instr This way, all instruction types have the nir_def at the same offset. Acked-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>	2025-11-12 21:22:13 +00:00
Faith Ekstrand	6ee4ea5ea3	nir: Add a type parameter to nir_lower_point_size() On Mali, we need not only clamp but also convert to float16 on Valhall+. We could have a separate pass for this but it fits in nicely with the rest of nir_lower_point_size() so we might as well put it there. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>	2025-11-12 01:34:36 +00:00
Faith Ekstrand	0e9fcb33c3	nir: Add a couple panfrost sysvals to divergence analysis Fixes: `2af6e4beeb` ("pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex}") Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>	2025-11-11 17:38:36 +00:00
Daniel Schürmann	9abbcbc00e	nir/opt_load_store_vectorize: don't add negative offsets to load/store_shared2_amd By hoisting the low address instead, we can make use of these instructions on GFX6. Totals from 3 (0.00% of 79839) affected shaders: (Navi48) Instrs: 3768 -> 3776 (+0.21%); split: -0.03%, +0.24% CodeSize: 20024 -> 20048 (+0.12%); split: -0.04%, +0.16% Latency: 16093 -> 16198 (+0.65%) InvThroughput: 3868 -> 3864 (-0.10%) VClause: 97 -> 93 (-4.12%) VALU: 2333 -> 2331 (-0.09%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>	2025-11-11 17:12:15 +00:00
Kenneth Graunke	6151eb4372	nir: Drop writemask from all Intel memory store intrinsics The backend has been fully ignoring all writemasks for a long time, so it really doesn't make sense to have them on our custom intrinsics. I'm not sure they even make sense for some of the block intrinsics. Also, the store_ssbo -> store_ssbo_intel pass was not setting writemask at all, leaving it at the default value of 0 (aka write nothing, if it had been respected...) Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>	2025-11-11 10:55:41 +00:00
Georg Lehmann	9ef0c96f26	nir/opt_algebraic: optimize open coded pack_32_2x16 Foz-DB Navi48: Totals from 4 (0.00% of 80287) affected shaders: Instrs: 6231 -> 6101 (-2.09%) CodeSize: 35916 -> 35156 (-2.12%) Latency: 72190 -> 71317 (-1.21%) InvThroughput: 20817 -> 19962 (-4.11%) VALU: 3145 -> 3029 (-3.69%) VOPD: 310 -> 312 (+0.65%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37937>	2025-11-10 19:04:32 +00:00
Ian Romanick	d9bed33c11	nir/opt_if: Both parts of logic-joined conditions can be evaluated For cases like 'if (X && Y)', both X and Y must be true in the then branch. Their values are unknown in the else branch. Similarly, 'if (X \|\| Y)' must have both X and Y false in the else branch. The shader-db results are pretty bad, especially on Skylake. Ouch. The fossil-db results are good enough that they make up for it. v2: s/alu/alu_src/ in nir_src_parent_instr(use_src) != &alu_src->instr. Noticed by Rhys. shader-db: Lunar Lake total instructions in shared programs: 17203905 -> 17196251 (-0.04%) instructions in affected programs: 668828 -> 661174 (-1.14%) helped: 352 / HURT: 2 total cycles in shared programs: 879896264 -> 888462774 (0.97%) cycles in affected programs: 330523984 -> 339090494 (2.59%) helped: 187 / HURT: 167 total spills in shared programs: 3318 -> 3329 (0.33%) spills in affected programs: 4 -> 15 (275.00%) helped: 0 / HURT: 4 total fills in shared programs: 1903 -> 1917 (0.74%) fills in affected programs: 7 -> 21 (200.00%) helped: 0 / HURT: 4 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19969129 -> 19961439 (-0.04%) instructions in affected programs: 665860 -> 658170 (-1.15%) helped: 354 / HURT: 0 total cycles in shared programs: 884509249 -> 887353784 (0.32%) cycles in affected programs: 323242817 -> 326087352 (0.88%) helped: 208 / HURT: 146 total spills in shared programs: 4801 -> 4808 (0.15%) spills in affected programs: 14 -> 21 (50.00%) helped: 0 / HURT: 6 total fills in shared programs: 4454 -> 4467 (0.29%) fills in affected programs: 17 -> 30 (76.47%) helped: 0 / HURT: 6 Tiger Lake and Ice Lake had similar results. 
(Tiger Lake shown) total instructions in shared programs: 19913774 -> 19906147 (-0.04%) instructions in affected programs: 667348 -> 659721 (-1.14%) helped: 351 / HURT: 3 total cycles in shared programs: 861253468 -> 864535803 (0.38%) cycles in affected programs: 325577148 -> 328859483 (1.01%) helped: 180 / HURT: 174 total spills in shared programs: 3440 -> 3455 (0.44%) spills in affected programs: 18 -> 33 (83.33%) helped: 0 / HURT: 8 total fills in shared programs: 1946 -> 1961 (0.77%) fills in affected programs: 18 -> 33 (83.33%) helped: 0 / HURT: 8 Skylake total instructions in shared programs: 19031768 -> 19023604 (-0.04%) instructions in affected programs: 671633 -> 663469 (-1.22%) helped: 347 / HURT: 7 total cycles in shared programs: 868474831 -> 868132073 (-0.04%) cycles in affected programs: 320499758 -> 320157000 (-0.11%) helped: 246 / HURT: 108 total spills in shared programs: 4024 -> 4063 (0.97%) spills in affected programs: 28 -> 67 (139.29%) helped: 0 / HURT: 18 total fills in shared programs: 3722 -> 3746 (0.64%) fills in affected programs: 34 -> 58 (70.59%) helped: 0 / HURT: 18 fossil-db: Lunar Lake Totals: Instrs: 928574038 -> 928568364 (-0.00%); split: -0.00%, +0.00% Subgroup size: 40916656 -> 40916672 (+0.00%) Send messages: 41467974 -> 41467909 (-0.00%); split: -0.00%, +0.00% Loop count: 970202 -> 970191 (-0.00%) Cycle count: 106297789925 -> 106301305901 (+0.00%); split: -0.00%, +0.01% Spill count: 3424464 -> 3424452 (-0.00%); split: -0.00%, +0.00% Fill count: 6525458 -> 6525119 (-0.01%); split: -0.01%, +0.00% Max live registers: 193525368 -> 193524886 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 232027347 -> 232026610 (-0.00%); split: -0.00%, +0.00% Totals from 1130 (0.06% of 2018793) affected shaders: Instrs: 2662692 -> 2657018 (-0.21%); split: -0.27%, +0.06% Subgroup size: 16 -> 32 (+100.00%) Send messages: 112689 -> 112624 (-0.06%); split: -0.07%, +0.01% Loop count: 5723 -> 5712 (-0.19%) Cycle count: 1176696438 -> 1180212414 
(+0.30%); split: -0.33%, +0.63% Spill count: 9895 -> 9883 (-0.12%); split: -0.13%, +0.01% Fill count: 26892 -> 26553 (-1.26%); split: -1.26%, +0.00% Max live registers: 215462 -> 214980 (-0.22%); split: -0.30%, +0.08% Non SSA regs after NIR: 398940 -> 398203 (-0.18%); split: -0.21%, +0.03% Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown) Totals: Instrs: 1000318839 -> 1000314218 (-0.00%); split: -0.00%, +0.00% Send messages: 45548952 -> 45548887 (-0.00%); split: -0.00%, +0.00% Loop count: 1026441 -> 1026430 (-0.00%) Cycle count: 92411461807 -> 92395024225 (-0.02%); split: -0.02%, +0.00% Spill count: 3665265 -> 3665221 (-0.00%); split: -0.00%, +0.00% Fill count: 6504830 -> 6504801 (-0.00%); split: -0.00%, +0.00% Max live registers: 121790079 -> 121789811 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 38062488 -> 38062648 (+0.00%) Non SSA regs after NIR: 256900770 -> 256900038 (-0.00%); split: -0.00%, +0.00% Totals from 1124 (0.05% of 2284852) affected shaders: Instrs: 2724110 -> 2719489 (-0.17%); split: -0.24%, +0.07% Send messages: 112096 -> 112031 (-0.06%); split: -0.07%, +0.01% Loop count: 5697 -> 5686 (-0.19%) Cycle count: 960659254 -> 944221672 (-1.71%); split: -1.91%, +0.20% Spill count: 13791 -> 13747 (-0.32%); split: -0.40%, +0.08% Fill count: 43216 -> 43187 (-0.07%); split: -0.14%, +0.08% Max live registers: 114877 -> 114609 (-0.23%); split: -0.31%, +0.07% Max dispatch width: 12768 -> 12928 (+1.25%) Non SSA regs after NIR: 412320 -> 411588 (-0.18%); split: -0.20%, +0.03% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>	2025-11-10 18:30:42 +00:00
Ian Romanick	3e0c9ad316	nir/opt_if: Conditionally do not propagate constants through bcsel In some cases propagating through a bcsel may be harmful. If the bcsel uses are unlikely to be eliminated in both branch of an if statement, propagating through it may result in extra moves for phi instructions and extended live ranges. v2: Fix missing parameter in call. Noticed by Rhys. I fixed this on the test machine, but I must have forgotten to propagate the change back to my dev machine. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>	2025-11-10 18:30:41 +00:00
Ian Romanick	a3b6d05a3b	nir/opt_if: Specify which branches are valid for evaluate_if_condition Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>	2025-11-10 18:30:41 +00:00
Marek Olšák	0216f09e45	nir/lower_interpolation: check IO location correctly Vangogh timed out. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38337>	2025-11-10 16:44:36 +00:00
Lars-Ivar Hesselberg Simonsen	b3b6fba548	nir: Add pan intrinsics for texel buffer access Will be used by panfrost to access texel buffers. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37007>	2025-11-07 17:03:53 +00:00
Faith Ekstrand	35cdddf632	nir: Simplify assign_io_var_locations() The size and stage parameters are left-overs from history. Originally, the function acted on a list and so it needed an explicit stage and size output. Now that it takes a NIR shader and a mode, we can just take the stage from the shader and set num_(in\|out)puts. The one caller that actually used the explicit output parameter was turnip. However, given that the helper sorts and re-numbers all the I/O variables, it's not like changing num_(in\|out)puts instead of writing it to some other location is that big of a deal. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38297>	2025-11-07 16:29:56 +00:00
Job Noorman	3908a228bd	nir: add opt_uub pass Add a pass that uses nir_unsigned_upper_bound to simplify some ALU operations: - iand src, mask: if mask is constant with N least significant bits set and uub(src) < 2^N, the iand does nothing and can be removed. - ult src, const: if uub(src) < cmp -> true - uge src, const: if uub(src) < cmp -> false - ilt src, const: if uub(src) >= 0 && cmp < 0 -> false - if uub(src) >= 0 && cmp >= 0 -> ult src, const - ige src, const: if uub(src) >= 0 && cmp < 0 -> true - if uub(src) >= 0 && cmp >= 0 -> uge src, const - umin src, const: if uub(src) <= const -> src - umax src, const: if uub(src) <= const -> const - imin src, const: if uub(src) >= 0 && const < 0 -> const - if uub(src) >= 0 && const >= 0 -> umin src, const - imax src, const: if uub(src) >= 0 && const < 0 -> src - if uub(src) >= 0 && const >= 0 -> umax src, const - imul src0, src1: if uub(srci) < UINT16_MAX -> umul_16x16 src0, src1 - imul src0, src1: if uub(srci) < UINT24_MAX -> umul24 src0, src1 - imul src0, src1: if uub(srci) < UINT23_MAX -> imul24 src0, src1 The imul optimization needs to be explicitly enabled using a pass option. This is useful since 1) most backends don't support umul_16x16, and 2) some passes (e.g., nir_opt_load_store_vectorize) need to analyze imuls so lowering them before running such a pass makes their job more difficult. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37869>	2025-11-07 10:23:29 +00:00
Job Noorman	0b348fb375	nir: add has_umul_16x16 option Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37869>	2025-11-07 10:23:29 +00:00
Natalie Vock	0cb1fca8fa	nir: Use sparse bitset for liveness information Some shaders, especially RTPSO shaders that have parts of the PSO inlined, can become absolutely huge. Using a sparse bitset avoids quadratic complexity in memory consumption for the liveness information. This reduces peak memory usage in worst-case tests (hammering compilation of many huge RTPSOs on 32 threads concurrently) by ~60%, from 43GB to 18GB. CPU time (seconds) differences for a workload with mostly small shaders: Difference at 95.0% confidence -5.27 +/- 1.08963 -0.88811% +/- 0.183626% (Student's t, pooled s = 0.629735) Peak resident set usage for the mostly-small workload: Difference at 95.0% confidence 30809 +/- 13394.3 1.59276% +/- 0.69246% (Student's t, pooled s = 7741.09) CPU time for the heavy workload did not show any difference. Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37908>	2025-11-06 21:34:33 +00:00
Faith Ekstrand	0ccadf7a86	nir: Check the deref mode in lower_point_size() This is more robust because it ensures that we only ever check the location on something that we know is an output. Also, if it's an output then we know (thanks, validation!) that it's a variable. Reviewed-by: Olivia Lee <olivia.lee@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38265>	2025-11-06 14:57:31 +00:00
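
The first three rules listed in the opt_uub commit (3908a228bd) can be illustrated with a small standalone sketch. This is not the Mesa implementation: every function and type name below is hypothetical, and `src_uub` simply stands for whatever bound nir_unsigned_upper_bound would report for the source, assumed here to be a 32-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tri-state result of trying to fold a comparison. */
typedef enum { FOLD_UNKNOWN, FOLD_FALSE, FOLD_TRUE } fold_result;

/* True if mask consists of N contiguous least significant set bits
 * (N may be 0 or 32), i.e. mask == 2^N - 1. */
bool is_low_bit_mask(uint32_t mask)
{
   return (mask & (mask + 1)) == 0;
}

/* iand src, mask: with mask == 2^N - 1 and uub(src) < 2^N, the iand
 * cannot clear any bit of src. uub(src) < 2^N is the same condition as
 * uub(src) <= mask. */
bool iand_is_noop(uint32_t src_uub, uint32_t mask)
{
   return is_low_bit_mask(mask) && src_uub <= mask;
}

/* ult src, cmp: if uub(src) < cmp, the comparison is always true. */
fold_result fold_ult(uint32_t src_uub, uint32_t cmp)
{
   return src_uub < cmp ? FOLD_TRUE : FOLD_UNKNOWN;
}

/* uge src, cmp: if uub(src) < cmp, the comparison is always false. */
fold_result fold_uge(uint32_t src_uub, uint32_t cmp)
{
   return src_uub < cmp ? FOLD_FALSE : FOLD_UNKNOWN;
}

/* Brute-force sanity check of the iand rule: whenever the rule fires,
 * every value in [0, src_uub] must be unchanged by the mask. */
void check_iand_rule(uint32_t src_uub, uint32_t mask)
{
   if (!iand_is_noop(src_uub, mask))
      return;

   for (uint32_t src = 0;; src++) {
      assert((src & mask) == src);
      if (src == src_uub)
         break;
   }
}
```

For example, with an upper bound of 200 and a mask of 0xFF the iand is removable, while a mask of 0x6 is rejected because it is not a run of low bits.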

6810 commits