fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-28 06:30:40 +02:00

Author	SHA1	Message	Date
Pierre-Eric Pelloux-Prayer	e92638b6bf	nir/opt_varyings: fix build with PRINT_RELOCATE_SLOT Fixes: `e3d122ed7b` ("nir/opt_varyings: completely exclude mediump from type changes") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36411>	2025-08-23 14:44:29 +00:00
Jesse Natalie	5b3756f231	nir: Add missing #include for c99_alloca.h Fixes: `3dd9a978` ("nir: add new pass nir_lower_io_indirect_loads") Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36940>	2025-08-22 22:33:50 +00:00
Rhys Perry	2d597b6919	nir/load_store_vectorize: use nir_def_num_lsb_zero in calc_alignment fossil-db (gfx1201): Totals from 20 (0.03% of 79839) affected shaders: Instrs: 15370 -> 15251 (-0.77%) CodeSize: 89764 -> 88952 (-0.90%) Latency: 150295 -> 149963 (-0.22%) InvThroughput: 210291 -> 210105 (-0.09%) Copies: 1337 -> 1320 (-1.27%) PreVGPRs: 589 -> 590 (+0.17%) VALU: 7519 -> 7466 (-0.70%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	b03eeb12a9	nir/load_store_vectorize: use nir_def_num_lsb_zero in check_for_robustness fossil-db (gfx1201): Totals from 499 (0.63% of 79839) affected shaders: MaxWaves: 14276 -> 14234 (-0.29%) Instrs: 520883 -> 508159 (-2.44%); split: -2.45%, +0.01% CodeSize: 2831220 -> 2731080 (-3.54%); split: -3.54%, +0.00% VGPRs: 27156 -> 27348 (+0.71%) SpillSGPRs: 360 -> 390 (+8.33%) Latency: 4473898 -> 4414552 (-1.33%); split: -1.54%, +0.21% InvThroughput: 494468 -> 493508 (-0.19%); split: -0.62%, +0.43% VClause: 14211 -> 14060 (-1.06%); split: -1.16%, +0.10% SClause: 14653 -> 14354 (-2.04%); split: -2.39%, +0.35% Copies: 36772 -> 37056 (+0.77%); split: -0.65%, +1.42% Branches: 11502 -> 11486 (-0.14%) PreSGPRs: 22605 -> 22848 (+1.07%); split: -0.39%, +1.47% PreVGPRs: 20571 -> 20833 (+1.27%) VALU: 242982 -> 243151 (+0.07%); split: -0.08%, +0.14% SALU: 91332 -> 88069 (-3.57%); split: -3.71%, +0.14% VMEM: 32275 -> 29137 (-9.72%) SMEM: 26239 -> 22400 (-14.63%) VOPD: 345 -> 330 (-4.35%) SClause: 14646 -> 14347 (-2.04%); split: -2.39%, +0.35% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	46da666205	nir/algebraic: allow non-const for iand(iadd()) -> iadd(iand()) fossil-db (gfx1201): Totals from 596 (0.75% of 79839) affected shaders: Instrs: 691926 -> 691819 (-0.02%); split: -0.11%, +0.09% CodeSize: 3675216 -> 3675180 (-0.00%); split: -0.08%, +0.08% VGPRs: 37464 -> 37452 (-0.03%) Latency: 8566849 -> 8563162 (-0.04%); split: -0.09%, +0.05% InvThroughput: 1068038 -> 1063279 (-0.45%); split: -0.46%, +0.01% VClause: 17859 -> 17897 (+0.21%); split: -0.01%, +0.22% SClause: 16704 -> 16735 (+0.19%); split: -0.07%, +0.26% Copies: 45422 -> 45395 (-0.06%); split: -0.15%, +0.09% PreSGPRs: 24345 -> 24351 (+0.02%) PreVGPRs: 29121 -> 29128 (+0.02%) VALU: 349959 -> 348117 (-0.53%); split: -0.54%, +0.01% SALU: 105926 -> 107576 (+1.56%); split: -0.02%, +1.58% VOPD: 252 -> 234 (-7.14%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	4f83059ac5	nir/algebraic: improve is_unsigned_multiple_of_4 and use it more fossil-db (gfx1201): Totals from 160 (0.20% of 79839) affected shaders: MaxWaves: 4008 -> 3952 (-1.40%) Instrs: 390073 -> 379834 (-2.62%); split: -2.63%, +0.00% CodeSize: 2126020 -> 2053740 (-3.40%); split: -3.40%, +0.00% VGPRs: 9492 -> 9612 (+1.26%) Latency: 6746019 -> 6723893 (-0.33%); split: -0.33%, +0.00% InvThroughput: 849571 -> 848942 (-0.07%); split: -0.42%, +0.35% VClause: 11977 -> 11983 (+0.05%); split: -0.20%, +0.25% SClause: 11828 -> 11824 (-0.03%); split: -0.14%, +0.11% Copies: 30003 -> 30938 (+3.12%); split: -0.09%, +3.20% PreSGPRs: 8914 -> 8938 (+0.27%) PreVGPRs: 7352 -> 7514 (+2.20%); split: -0.04%, +2.24% VALU: 171829 -> 168829 (-1.75%); split: -1.76%, +0.01% SALU: 66503 -> 66543 (+0.06%); split: -0.01%, +0.07% VMEM: 29365 -> 25327 (-13.75%) VOPD: 864 -> 1013 (+17.25%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	09ab7ff01e	nir: add nir_def_num_lsb_zero Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	51dd513789	nir/search: reorder match_value to check constants first Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	84fe10f939	nir/search: don't clear empty hash tables _mesa_hash_table_clear() memsets the entries, even if it's already empty. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	2a12624532	nir/search: add nir_search_state A future commit will add another hash table. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Georg Lehmann	996c07353b	nir/shrink_vec_array_vars: use range analysis for non constant indices Foz-DB Navi21: Totals from 84 (0.10% of 80255) affected shaders: MaxWaves: 1700 -> 1806 (+6.24%); split: +6.59%, -0.35% Instrs: 90479 -> 91278 (+0.88%); split: -0.15%, +1.04% CodeSize: 499644 -> 504572 (+0.99%); split: -0.10%, +1.08% VGPRs: 5400 -> 4912 (-9.04%); split: -9.93%, +0.89% LDS: 292864 -> 152064 (-48.08%) Latency: 2001405 -> 2002335 (+0.05%); split: -0.01%, +0.06% InvThroughput: 545293 -> 543073 (-0.41%); split: -0.52%, +0.11% VClause: 1510 -> 1508 (-0.13%) SClause: 2096 -> 2097 (+0.05%); split: -0.05%, +0.10% Copies: 6373 -> 6431 (+0.91%); split: -0.64%, +1.55% Branches: 1648 -> 1686 (+2.31%); split: -0.36%, +2.67% PreVGPRs: 3918 -> 3960 (+1.07%); split: -0.03%, +1.10% VALU: 67591 -> 68107 (+0.76%); split: -0.14%, +0.90% SALU: 8352 -> 8490 (+1.65%); split: -0.25%, +1.90% VMEM: 2685 -> 2683 (-0.07%) Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388>	2025-08-22 13:47:47 +00:00
Georg Lehmann	c7df3b4f64	nir/shrink_vec_array_vars: allow nir_var_mem_shared This should just work. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388>	2025-08-22 13:47:47 +00:00
Rhys Perry	2b5681f257	nir/opt_load_skip_helpers: always require helpers for handles Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>	2025-08-22 13:15:05 +00:00
Rhys Perry	81dd60df95	nir/opt_load_skip_helpers: move divergence check earlier This should fix a hypothetical issue such as: address = load_global() value = load_global(address, access=uses-smem) where divergence analysis can't prove that 'address' is uniform, but can prove that 'value' is uniform. We might then add both load_global to the load_worklist, but only disable helpers for the first because the second is uniform, making 'address' divergent for real and potentially incorrect when used with v_readfirstlane_b32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>	2025-08-22 13:15:05 +00:00
Rhys Perry	354df09c88	nir: add global_amd to nir_get_io_offset_src/nir_get_io_index_src This is needed for nir_opt_load_skip_helpers. fossil-db (gfx1201): Totals from 5 (0.01% of 79839) affected shaders: Instrs: 2288 -> 2286 (-0.09%); split: -0.13%, +0.04% CodeSize: 12372 -> 12364 (-0.06%); split: -0.10%, +0.03% Latency: 18378 -> 20044 (+9.07%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `883b1ca364` ("aco: disable wqm for tex loads when not needed") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>	2025-08-22 13:15:04 +00:00
Qiang Yu	dbbb46aa38	nir: compute io base for fragment shader inputs which maybe per primitive Some inputs is per vertex while vertex pipeline, and per primitive when mesh pipeline. Put these inputs after other inputs to share the same fragment shader code for two pipelines. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36749>	2025-08-22 02:42:57 +00:00
Qiang Yu	7c3f7e1046	nir: lower io support task and mesh shader mesh shader does not have input, and we skip task shader IO lowering like compute shader. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36749>	2025-08-22 02:42:57 +00:00
Yonggang Luo	a34756bbed	Revert "nir: Temporarily disable optimizations for MSVC ARM64" This reverts commit `55d153b9f5`. The msvc bug is https://developercommunity.visualstudio.com/t/Stack-overflow-compiling-C-code-to-ARM64/916235 and Fixed In: Visual Studio 2022 version 17.7 Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36767>	2025-08-22 01:28:23 +00:00
Georg Lehmann	e24db36f20	nir/uub: handle bit_count Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Georg Lehmann	aff391bc77	nir/uub: handle more reduction ops Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Georg Lehmann	773ee60e48	nir/uub: decrease default max subgroup size to 128 128 is the maximum all apis allow. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Georg Lehmann	a2e48d2ede	nir/uub: fix exclusive scans Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Calder Young	a3ecdf33a3	nir/builder: Add helper for building uvec8 immediates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>	2025-08-21 09:04:54 +00:00
Marek Olšák	33a076789c	nir/gather_info: don't allocate the ralloc context It's only used by the set, so we can just free the set directly. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:49 +00:00
Marek Olšák	390631e30a	nir/opt_dead_write_vars: don't use ralloc context, share dynarray among blocks Instead of allocating the ralloc context, which is useless because it's only used by the dynarray, we can free the dynarray directly. Also share the same dynarray among all blocks instead of allocating a new one for every block. This eliminates realloc invocations. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:49 +00:00
Marek Olšák	c601308615	nir: convert nir_instr_worklist to init/fini semantics w/out allocation This removes the malloc overhead. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:49 +00:00
Marek Olšák	4efdf247ab	nir/opt_load_store_vectorize: don't allocate 0-sized offset_defs It still allocates the ralloc header, which is wasteful. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	604584383d	nir/serialize: don't allocate the hash tables Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	8d2acfdeee	nir/split_vars: don't allocate the hash tables Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	ba56b7940b	nir/opt_find_array_copies: don't allocate the hash tables Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	316dc7b163	nir/lower_vars_to_ssa: don't ralloc the hash table Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	639c0106bd	nir/opt_copy_prop_vars: don't allocate copies::ht hash table Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	f131efbe92	nir/opt_copy_prop_vars: don't allocate vars_written_map hash table Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	0ebe788203	nir/opt_copy_prop_vars: don't allocate vars_written::derefs hash table Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	d87bde4abf	nir/search: don't ralloc the hash table Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	9c118c9936	nir/gather_info: don't ralloc the set Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	0e0cc12de6	nir/opt_vectorize: don't ralloc the set Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	f7ca848ad5	nir/remove_dead_variables: don't ralloc the set Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	68b80e4d25	nir/instr_set: don't ralloc the set Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	c1ae58d479	nir/lower_vars_to_ssa: don't ralloc sets reducing ralloc overhead Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	3aadae22ad	nir: make nir_block::predecessors & dom_frontier sets non-malloc'd We can just place the set structures inside nir_block. This reduces the number of ralloc calls by 6.7% when compiling Heaven shaders with radeonsi+ACO using a release build (i.e. not including nir_validate set allocations, which are also removed). Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	81cb571642	nir/dominance: eliminate ralloc overhead for allocating dom_children This is only 1% of all ralloc calls of Unigine Heaven with the gallium noop driver, but it's an easy one to get rid of. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Marek Olšák	aeed2cc19d	nir/dominance: don't allocate 0-sized dom_children 86% of all ralloc calls for dom_children in Unigine Heaven + Superposition had size == 0. It was only allocating the ralloc header. It was 6.1% of all ralloc calls with the gallium noop driver. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:48 +00:00
Job Noorman	30716cc524	nir/lower_explicit_io: add support for offset_shift The goal here is to generate addresses that are a right-shifted version of the actual byte address and record the shift amount in the offset_shift index. While we could just insert a ushr at the end of deref chains, this will prevent the shift to be optimized away in many cases. Instead, we try to extract the shift from the array strides and struct offsets that make up the deref chain, and only insert a ushr when absolutely necessary (i.e., for casts). This means we have to walk the entire deref chain at once for accesses that support offset_shift and we don't use the standard algorithm of replacing each deref one at a time. To be able to legally right-shift casts, we use the alignment information and never shift more than what the alignment could support. It should also be noted that casts generally have two sources: something provided by the driver (e.g., a Vulkan resource index) or a variable pointer coming from a phi/bcsel. For the latter, the entire access chain consists of multiple parts that are ended by either a phi/bcsel or an access. Only the part the ends in an access is handled by this new algorithm; the other parts are handled as usual. This is necessary because we have no way to encode the offset shift or to even know how much we would be able to shift without knowing how it is accessed. This commit adds the general implementation for lowering accesses using offset_shift and adds a compiler option for drivers to enable it for SSBO accesses. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	1406eafbcd	nir/lower_explicit_io: add alignment parameters to address builder We will need this when building shifted addresses. Since adding these parameters has a lot of code churn which would distract from the main changes, it is split-off in a separate commit. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	553a439b54	nir/lower_explicit_io: use nir_io_offset to pass around addresses We will add support for shifted addresses; this commit makes sure the APIs of the functions already support passing shifts. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	4c9afbd01d	nir/lower_explicit_io: add helper to build address The helper is used to build the address passed to build_explicit_io_load/store. For now, it simply takes care of adding the component offset when scalarizing. In the future, this can be used to do more complex address manipulations, like calculating the full deref chain address. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	1fffba12a0	nir/lower_explicit_io: make offset calculation reusable nir_explicit_io_address_from_deref implicitly builds the offset but only makes the full address available. Split-out the offset calculation in a separate function so we can reuse it elsewhere. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	b0bc97cb43	nir/opt_load_store_vectorize: fix wrap check for scaled offsets Hardware will typically do bounds checking on the final scaled address so the wrap check should do the same. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
Job Noorman	cb773dec8c	nir/opt_load_store_vectorize: add support for offset_shift Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>	2025-08-20 07:51:30 +00:00
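
Commit 30716cc524 describes taking the right shift out of the array strides and struct offsets of a deref chain, bounded by the known alignment, instead of appending a ushr to the finished byte address. The arithmetic behind that idea can be sketched in standalone C as below. This is an illustration only, not Mesa's implementation: every function name here is hypothetical, the "chain" is reduced to one constant offset plus one array index, and wraparound of 32-bit arithmetic is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a Mesa API: number of trailing zero bits of a
 * nonzero value, i.e. the largest s such that value is a multiple of 1 << s. */
static unsigned
trailing_zero_bits(uint32_t value)
{
   assert(value != 0);
   unsigned n = 0;
   while ((value & 1u) == 0) {
      value >>= 1;
      n++;
   }
   return n;
}

/* Largest shift that divides both the constant offset and the stride, capped
 * at max_shift (standing in for what the access alignment would allow).
 * A zero term is a multiple of everything, so it places no limit. */
static unsigned
foldable_shift(uint32_t base_offset, uint32_t stride, unsigned max_shift)
{
   unsigned shift = max_shift;
   if (base_offset != 0 && trailing_zero_bits(base_offset) < shift)
      shift = trailing_zero_bits(base_offset);
   if (stride != 0 && trailing_zero_bits(stride) < shift)
      shift = trailing_zero_bits(stride);
   return shift;
}

/* Shifted offset built from pre-shifted constants. When shift comes from
 * foldable_shift(), this equals (base_offset + index * stride) >> shift
 * without emitting a shift on the final sum. */
static uint32_t
shifted_offset(uint32_t base_offset, uint32_t stride, uint32_t index,
               unsigned shift)
{
   return (base_offset >> shift) + index * (stride >> shift);
}
```

With a struct offset of 16, an array stride of 8, and a cap of 2, the whole shift of 2 folds into the constants, so element 5 yields 4 + 5 * 2 = 14, the same as (16 + 40) >> 2. With an offset of 6 and a stride of 4, only a shift of 1 can be folded.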


6554 commits