fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 06:58:16 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	beb4b78fe7	intel: Rename intel_msaa_flags to intel_fs_config This started out as dynamic configuration for MSAA related state, but has since expanded to cover many dynamic fragment shader options. We rename it to intel_fs_config, similar to intel_tess_config, to better indicate its purpose. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>	2026-02-06 20:51:43 -08:00
Daniel Schürmann	f71a38e9de	nir/opt_load_store_vectorize: don't use shared2 vectorization across blocks Besides the undesirable combinations this can produce, it would also require updating the last_entry in every previous block. Totals from 99 (0.12% of 84383) affected shaders: (Navi48) Instrs: 288989 -> 289727 (+0.26%); split: -0.02%, +0.28% CodeSize: 1542572 -> 1546616 (+0.26%); split: -0.02%, +0.28% SpillSGPRs: 17 -> 16 (-5.88%) Latency: 2104020 -> 2103286 (-0.03%); split: -0.17%, +0.13% InvThroughput: 472380 -> 472265 (-0.02%); split: -0.08%, +0.05% VClause: 9778 -> 9779 (+0.01%) Copies: 24937 -> 25173 (+0.95%); split: -0.05%, +0.99% Branches: 10124 -> 10156 (+0.32%); split: -0.01%, +0.33% PreSGPRs: 6112 -> 6091 (-0.34%) PreVGPRs: 4079 -> 4069 (-0.25%); split: -0.39%, +0.15% VALU: 120208 -> 120421 (+0.18%); split: -0.03%, +0.21% SALU: 56338 -> 56312 (-0.05%); split: -0.09%, +0.04% VOPD: 34 -> 37 (+8.82%) Fixes: `4ca7ee7bd7` ('nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39733>	2026-02-06 16:34:15 +00:00
Daniel Schürmann	5e86cfac8e	nir/opt_load_store_vectorize: Vectorize speculatable instructions across blocks This should always be safe. Totals from 446 (0.53% of 84383) affected shaders: (Navi48) Instrs: 995942 -> 994416 (-0.15%); split: -0.17%, +0.02% CodeSize: 5500372 -> 5489900 (-0.19%); split: -0.20%, +0.01% SpillSGPRs: 197 -> 195 (-1.02%) Latency: 14872922 -> 14851646 (-0.14%); split: -0.15%, +0.00% InvThroughput: 2395050 -> 2391537 (-0.15%); split: -0.15%, +0.00% VClause: 20207 -> 20195 (-0.06%); split: -0.07%, +0.01% SClause: 27090 -> 26427 (-2.45%); split: -2.51%, +0.07% Copies: 84182 -> 84228 (+0.05%); split: -0.08%, +0.13% Branches: 22927 -> 22928 (+0.00%) PreSGPRs: 27275 -> 27524 (+0.91%); split: -0.02%, +0.93% PreVGPRs: 29116 -> 29131 (+0.05%) VALU: 545565 -> 545549 (-0.00%); split: -0.01%, +0.00% SALU: 124275 -> 124329 (+0.04%); split: -0.05%, +0.09% VMEM: 39044 -> 39030 (-0.04%) SMEM: 44052 -> 43205 (-1.92%) VOPD: 32354 -> 32337 (-0.05%); split: +0.02%, -0.07% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>	2026-02-06 10:16:50 +00:00
Daniel Schürmann	4ca7ee7bd7	nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks The idea is to initialize the vectorization table with one entry from the previous blocks if it's the same for all predecessors. In order to not speculatively load out-of-bounds, backends need to set a new bounds_checked_modes option indicating variable modes for which per-component bounds checks are supported. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>	2026-02-06 10:16:50 +00:00
Daniel Schürmann	0a07ea20e6	nir/opt_load_store_vectorize: create add_entry_to_hash_table() helper Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>	2026-02-06 10:16:50 +00:00
Daniel Schürmann	e5bd9cbf90	nir/opt_load_store_vectorize: use linear allocator instead of ralloc Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>	2026-02-06 10:16:49 +00:00
Georg Lehmann	5e2f28e723	nir: remove split unpack_half opcodes Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	81e3162cf8	microsoft/compiler: switch to a backend specific unpack half opcode Sadly, just f2f32 isn't enough for dxil. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	45cb1d3b6f	nir/opt_algebraic: remove unpack_half_2x16_split Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	5a2ef27f7d	nir/format_convert: use f2f32 instead of unpack_half Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	a3bd2ae465	nir/opt_16bit_tex_image: remove unpack_half support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	6f7d4cd75b	nir/lower_tex: use f2f32 instead of unpack_half Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	609c46cf23	nir/lower_alu_width: emit f2f32 for unpack_half_2x16 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Georg Lehmann	b18d9c1b33	nir/opt_algebraic: optimize unpack_32_2x16 of extract Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>	2026-02-06 06:12:36 +00:00
Timothy Arceri	da6c3ad237	nir: speedup nir_find_inlinable_uniforms() Here we speed up nir_find_inlinable_uniforms() by making sure we only check whether a src is inlinable once. If we have a bunch of nested if-statements where the conditions keep building on the alu chains of previous conditions we can end up with exponential processing times due to repeatedly processing the same srcs over and over. A big cause of the exponential growth seems to be instructions like `ffma %594, %594, %599` or `fmul %600, %600` where each essentially causes us to process the entire previous part of the chain twice. Shaders such as that in issue #14663 previously took multiple minutes to compile, calling collect_src_uniforms billions of times, and now compile within a second with this change. Closes: mesa/mesa#14663 Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>	2026-02-05 23:19:29 +00:00
Timothy Arceri	aaea962808	nir: update asserts in inline uniforms collect_src_uniforms() is now only called internally and uni_offsets should never be NULL. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>	2026-02-05 23:19:29 +00:00
Timothy Arceri	0410377b63	nir: make nir_add_inlinable_uniforms() private Hasn't been used externally since `e93592dc62` Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>	2026-02-05 23:19:28 +00:00
Timothy Arceri	257875034d	nir: make nir_collect_src_uniforms() private Hasn't been used externally since `e93592dc62` Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>	2026-02-05 23:19:28 +00:00
Karol Herbst	e5bf1f5aff	nir/opt_offsets: support nvidias intrinsics Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>	2026-02-03 22:23:51 +00:00
Karol Herbst	cb60e4d14f	nir/opt_offsets: support negative offsets and 64 bit sources Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>	2026-02-03 22:23:51 +00:00
Karol Herbst	4add3959e9	nir: add BASE to nvidia memory intrinsics Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>	2026-02-03 22:23:50 +00:00
Karol Herbst	e779538ad2	nir: add nvidia IO intrinsics Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>	2026-02-03 22:23:50 +00:00
Marek Olšák	a3f022d0a2	nir: reassociate a $op (b ? #c : #d) for div, mod, rem This eliminates expensive div, mod, rem opcodes with non-constant src1 being constant src1 hiding behind bcsel. gcc and LLVM are missing this. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39560>	2026-02-02 21:34:48 +00:00
Marek Olšák	30e9f0bdf3	nir/opt_16bit_tex_image: lower dst of load_buffer_amd Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:52 +00:00
Marek Olšák	44bc1e6bf4	nir: add dest_type to load_buffer_amd for lowering the result to 16 bits Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:52 +00:00
Marek Olšák	9eaaf9e525	nir: add ACCESS_SPARSE trying to reduce the combinatorial explosion of intrinsics Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:52 +00:00
Marek Olšák	3350bca3eb	nir/print: fix a crash due to unhandled GLSL_SAMPLER_DIM_EXTERNAL Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>	2026-02-02 17:56:52 +00:00
Georg Lehmann	bdc084aae5	nir/algebraic: make subexpression inexact on creation Removes the runtime code for this, and means we propagate the signed zero/inf/nan checks to subexpressions too, not just exact. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>	2026-01-31 15:30:25 +00:00
Georg Lehmann	293d2e3b0d	nir/algebraic: remove ability to create Value from Expression Not used, and it would break in the future. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>	2026-01-31 15:30:25 +00:00
Georg Lehmann	ad6f8291bf	nir/opt_algebraic: rework ignore_exact to work like other internal conditions Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>	2026-01-31 15:30:25 +00:00
Georg Lehmann	a879b9a5d5	nir/search: preserve nan/inf/sz if any alu in a replaced expression did Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>	2026-01-31 15:30:25 +00:00
Georg Lehmann	575affaf48	nir/search: gather union of all fp_math_ctrl Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>	2026-01-31 15:30:25 +00:00
Karol Herbst	24d20df3d6	nir: fix nir_fixup_is_exported for LLVM-22 Starting with LLVM-22 we won't see the kernel wrapper anymore, and this is a trivial fix to get around this. See: `5458eb2511` Cc: mesa-stable Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>	2026-01-30 16:06:25 +00:00
Georg Lehmann	70f0e75262	nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit Foz-DB Navi48: Totals from 177 (0.21% of 82405) affected shaders: Instrs: 326628 -> 325955 (-0.21%); split: -0.21%, +0.00% CodeSize: 1726720 -> 1722500 (-0.24%); split: -0.24%, +0.00% Latency: 5076631 -> 5075700 (-0.02%); split: -0.02%, +0.00% InvThroughput: 596010 -> 595598 (-0.07%); split: -0.07%, +0.00% VClause: 3613 -> 3616 (+0.08%) Copies: 24427 -> 24501 (+0.30%); split: -0.06%, +0.36% VALU: 182468 -> 182029 (-0.24%); split: -0.24%, +0.00% SALU: 55449 -> 55452 (+0.01%); split: -0.01%, +0.01% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39531>	2026-01-29 14:44:37 +00:00
Georg Lehmann	c3e12429c5	nir/opt_algebaric: improve a < 0.0 ? 0.0 : sqrt(a) pattern Fix the NaN correctness of the original pattern, and add more variants. Foz-DB Navi48: Totals from 372 (0.45% of 82405) affected shaders: Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03% CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02% VGPRs: 19452 -> 19104 (-1.79%) Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05% InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19% VClause: 3695 -> 3704 (+0.24%) Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19% VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05% VOPD: 380 -> 372 (-2.11%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>	2026-01-29 11:29:48 +00:00
Georg Lehmann	f872c13707	nir/opt_algebraic: use contract instead of inexact for more patterns These use more precise operations, so contract is enough. Foz-DB Navi48: Totals from 248 (0.30% of 82405) affected shaders: Instrs: 284686 -> 284318 (-0.13%); split: -0.14%, +0.01% CodeSize: 1528856 -> 1527520 (-0.09%); split: -0.10%, +0.01% Latency: 2368390 -> 2367345 (-0.04%); split: -0.06%, +0.01% InvThroughput: 346623 -> 346335 (-0.08%); split: -0.09%, +0.01% SClause: 6752 -> 6756 (+0.06%); split: -0.12%, +0.18% Copies: 14685 -> 14694 (+0.06%); split: -0.01%, +0.07% VALU: 179922 -> 179727 (-0.11%); split: -0.11%, +0.01% SALU: 28706 -> 28707 (+0.00%) VOPD: 1196 -> 1198 (+0.17%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>	2026-01-29 11:29:48 +00:00
Georg Lehmann	f472bbf017	nir/algebraic: remove manual opcode validation The properly terminated regex automatically detects this case now. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>	2026-01-28 18:46:23 +00:00
Georg Lehmann	a5f55be021	nir/algebraic: terminate opcode regex Instead of silently dropping the unmatched rest. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>	2026-01-28 18:46:23 +00:00
Georg Lehmann	d8ef28671d	nir/opt_algebraic: use correct syntax to create exact fsat Fixes: `3b06824e4c` ("nir/opt_algebraic: optimize some post peephole select patterns") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>	2026-01-28 18:46:22 +00:00
Iván Briano	5b48805b42	brw: fix local_invocation_index with quad derivaties on mesh/task shaders For mesh/task shaders, the thread payload provides a local invocation index, but it's always linear so it doesn't give the correct value when quad derivatives are in use. The lowering pass where all of this is done correctly for compute shaders assumes load_local_invocation_index will be lowered in the backend for mesh/task, calculates the values for the quads correctly, but then avoids replacing the original intrinsic, so we are left with the wrong results. Add an Intel-specific intrinsic and always lower the generic one to that (or whatever else was calculated) to avoid ambiguities and fix the value for quad derivatives. Fixes future CTS tests using mesh/task shaders under: dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.* Fixes: `d89bfb1ff7` ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>	2026-01-27 22:28:19 +00:00
Emma Anholt	eb990cd81e	nir: Bump test timeouts. nir_opt_algebraic_tests has been pushing our qemu-ed tests over the line. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39563>	2026-01-27 21:31:14 +00:00
Eric Engestrom	d12e3454e6	nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests Fixes: `4c30c44b75` ("nir: Generate unit tests for nir_opt_algebraic") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39550>	2026-01-27 20:03:16 +00:00
Kenneth Graunke	b844082017	nir: Add a round_up_components callback to load/store vectorization By default, load/store vectorization uses nir_round_up_components() to round up loads and possibly writemasked stores to the next valid NIR vector width. However, some backends may not support load/stores at all sizes. For example, older Intel supports only power-of-two vector widths. Newer Intel also supports vec2 and vec3, but not vec5/6/7. By providing a callback, backends can request promotion to their next supported memory load/store vector width. The existing "should we vectorize?" callback should continue to return false for unsupported vector widths (i.e. beyond the maximum supported). With this new callback, they do not need to say "no" to vectorization that would normally produce an unsupported count (e.g. vec5/6/7) but instead request that the component count be rounded up appropriately. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	e23a83b786	nir: Add load/store vectorizer option for rounding up masked stores This adds a new option, round_up_store_components, which rounds up the number of components for stores that support writemasking to the next valid vector size. For example, vec4+vec2 stores would round up from 6 components (which wouldn't be supported) to a full supportable vec8 store, relying on writemasking to ensure the correct pieces are written. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	37f3c59b2c	nir: Teach opt_load_store_vectorize how to handle Intel URB intrinsics URB intrinsics are simply memory load/stores to a special memory region, so it's pretty reasonable to handle these in the memory vectorizer. We treat emit_vertex_* intrinsics as a barrier for shader outputs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	c2f03ba12f	nir: Add memory modes to URB load intrinsics This makes it easier for NIR passes to distinguish between inputs and outputs without having to reason about which URB handle source was passed to the intrinsic. It probably also makes it a bit easier for humans to read the NIR too. v2: Don't add memory mode to store intrinsics. It's always output. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Emma Anholt	e922c2cabc	nir,spirv: Add support for SPV_QCOM_image_processing. Initial work was done by Mark Collins, which I significantly rewrote. Signed-off-by: Mark Collins <mark@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38559>	2026-01-27 02:00:40 +00:00
Dave Airlie	6d53931cf4	nir: add cmat call to propogate invariants This just adds this as lavapipe uses this pass. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38964>	2026-01-26 22:39:40 +00:00
Daniel Schürmann	6313e9f549	nir/opt_loop: Relax restrictions on opt_loop_peel_initial_break() for more loops In addition to loops where the break condition can be constant-folded, we also allow to peel the initial break from loops which have at least one phi with a constant loop-carried source, effectively removing that phi from the loop. Totals from 172 (0.22% of 79377) affected shaders: (Navi31) Instrs: 372798 -> 369181 (-0.97%); split: -1.07%, +0.10% CodeSize: 1907312 -> 1891948 (-0.81%); split: -0.89%, +0.09% VGPRs: 8436 -> 8460 (+0.28%) Latency: 3646016 -> 3396657 (-6.84%) InvThroughput: 434848 -> 389079 (-10.53%) Copies: 28436 -> 27118 (-4.63%); split: -4.79%, +0.15% Branches: 26504 -> 25344 (-4.38%); split: -4.44%, +0.06% PreSGPRs: 8585 -> 8603 (+0.21%) VALU: 148291 -> 148355 (+0.04%); split: -0.01%, +0.06% SALU: 95625 -> 92649 (-3.11%); split: -3.22%, +0.11% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33666>	2026-01-26 12:02:49 +00:00
Georg Lehmann	b2d9615000	nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>	2026-01-26 10:54:20 +00:00
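
Note on b844082017 and e23a83b786: the rounding behaviour those two messages describe can be sketched in a few lines. This is only an illustration in Python, not Mesa's C implementation. The helper names (`round_up_components`, `writemask`) and the width tables are hypothetical. The default table assumes NIR accepts vector widths 1 to 5, 8 and 16, and the backend tables are made up to mirror the "power-of-two only" and "vec2/vec3 but not vec5/6/7" cases from the commit message.

```python
# Illustrative sketch only; names and tables are assumptions, not Mesa API.

# Assumed default set of valid NIR vector widths.
NIR_WIDTHS = (1, 2, 3, 4, 5, 8, 16)

# Hypothetical backend tables modelled on the commit message.
POW2_ONLY = (1, 2, 4, 8, 16)      # "older Intel": power-of-two widths only
NO_VEC5_6_7 = (1, 2, 3, 4, 8)     # "newer Intel": vec2/vec3 ok, no vec5/6/7


def round_up_components(count, supported=NIR_WIDTHS):
    """Return the smallest supported width >= count, or None if too wide.

    Returning None stands in for the separate "should we vectorize?"
    callback saying no to widths beyond the backend maximum.
    """
    for width in sorted(supported):
        if width >= count:
            return width
    return None


def writemask(count):
    """Bitmask with the low `count` components enabled."""
    return (1 << count) - 1


# vec4 + vec2 store: 6 live components, promoted to a vec8 store whose
# writemask still covers only the 6 components that carry data.
live = 4 + 2
store_width = round_up_components(live)
store_mask = writemask(live)
```

With these tables, a 3-component access on the power-of-two backend is promoted to 4 components, and a 5-component access on the second backend is promoted to 8 instead of being rejected.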

7069 commits
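
Note on da6c3ad237: the exponential blow-up described in that message, and why checking each src only once removes it, can be reproduced with a toy model. The sketch below is Python rather than the C used in Mesa, and `Node`, `build_chain`, `visits_naive` and `visits_once` are hypothetical names, not functions from the NIR code. Each node uses the previous node twice, like `fmul %600, %600`.

```python
# Toy model only; it does not reproduce nir_find_inlinable_uniforms().

class Node:
    """A value in an ALU chain; srcs are the values it reads."""

    def __init__(self, srcs=()):
        self.srcs = tuple(srcs)


def build_chain(depth):
    """Build a chain where every node reads the previous node twice."""
    node = Node()
    for _ in range(depth):
        node = Node((node, node))
    return node


def visits_naive(node):
    """Count visits when every src is re-walked each time it appears."""
    return 1 + sum(visits_naive(src) for src in node.srcs)


def visits_once(node, seen=None):
    """Count visits when each src is processed at most once."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(visits_once(src, seen) for src in node.srcs)


chain = build_chain(10)
naive = visits_naive(chain)    # 2**11 - 1 visits
once = visits_once(chain)      # 11 visits, one per node
```

The naive walk doubles with every extra level of the chain, while the walk with a `seen` set grows linearly, which is consistent with the drop from minutes to under a second reported in the commit.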