Commit graph

7069 commits

Author SHA1 Message Date
Kenneth Graunke
beb4b78fe7 intel: Rename intel_msaa_flags to intel_fs_config
This started out as dynamic configuration for MSAA related state, but
has since expanded to cover many dynamic fragment shader options.

We rename it to intel_fs_config, similar to intel_tess_config, to
better indicate its purpose.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
2026-02-06 20:51:43 -08:00
Daniel Schürmann
f71a38e9de nir/opt_load_store_vectorize: don't use shared2 vectorization across blocks
Besides the undesirable combinations this can produce,
it would also require updating the last_entry in every
previous block.

Totals from 99 (0.12% of 84383) affected shaders: (Navi48)
Instrs: 288989 -> 289727 (+0.26%); split: -0.02%, +0.28%
CodeSize: 1542572 -> 1546616 (+0.26%); split: -0.02%, +0.28%
SpillSGPRs: 17 -> 16 (-5.88%)
Latency: 2104020 -> 2103286 (-0.03%); split: -0.17%, +0.13%
InvThroughput: 472380 -> 472265 (-0.02%); split: -0.08%, +0.05%
VClause: 9778 -> 9779 (+0.01%)
Copies: 24937 -> 25173 (+0.95%); split: -0.05%, +0.99%
Branches: 10124 -> 10156 (+0.32%); split: -0.01%, +0.33%
PreSGPRs: 6112 -> 6091 (-0.34%)
PreVGPRs: 4079 -> 4069 (-0.25%); split: -0.39%, +0.15%
VALU: 120208 -> 120421 (+0.18%); split: -0.03%, +0.21%
SALU: 56338 -> 56312 (-0.05%); split: -0.09%, +0.04%
VOPD: 34 -> 37 (+8.82%)
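
The percentages in the report above can be re-derived from the before/after totals. The following is a minimal sketch of that arithmetic, not the actual reporting tool; the helper names `pct_change` and `report_line` are made up for illustration. The "split" figures cannot be recomputed this way, since they appear to depend on per-shader data that the totals do not carry.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def report_line(name, before, after):
    """Format one statistic in the same shape as the report lines above."""
    return f"{name}: {before} -> {after} ({pct_change(before, after):+.2f}%)"


# Totals copied from the commit message above.
print(report_line("Instrs", 288989, 289727))
print(report_line("SpillSGPRs", 17, 16))
print(report_line("VOPD", 34, 37))
```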

Fixes: 4ca7ee7bd7 ('nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39733>
2026-02-06 16:34:15 +00:00
Daniel Schürmann
5e86cfac8e nir/opt_load_store_vectorize: Vectorize speculatable instructions across blocks
This should always be safe.

Totals from 446 (0.53% of 84383) affected shaders: (Navi48)
Instrs: 995942 -> 994416 (-0.15%); split: -0.17%, +0.02%
CodeSize: 5500372 -> 5489900 (-0.19%); split: -0.20%, +0.01%
SpillSGPRs: 197 -> 195 (-1.02%)
Latency: 14872922 -> 14851646 (-0.14%); split: -0.15%, +0.00%
InvThroughput: 2395050 -> 2391537 (-0.15%); split: -0.15%, +0.00%
VClause: 20207 -> 20195 (-0.06%); split: -0.07%, +0.01%
SClause: 27090 -> 26427 (-2.45%); split: -2.51%, +0.07%
Copies: 84182 -> 84228 (+0.05%); split: -0.08%, +0.13%
Branches: 22927 -> 22928 (+0.00%)
PreSGPRs: 27275 -> 27524 (+0.91%); split: -0.02%, +0.93%
PreVGPRs: 29116 -> 29131 (+0.05%)
VALU: 545565 -> 545549 (-0.00%); split: -0.01%, +0.00%
SALU: 124275 -> 124329 (+0.04%); split: -0.05%, +0.09%
VMEM: 39044 -> 39030 (-0.04%)
SMEM: 44052 -> 43205 (-1.92%)
VOPD: 32354 -> 32337 (-0.05%); split: +0.02%, -0.07%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
4ca7ee7bd7 nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks
The idea is to initialize the vectorization table with one
entry from the previous blocks if it's the same for all predecessors.
In order to not speculatively load out-of-bounds, backends need to
set a new bounds_checked_modes option indicating variable modes
for which per-component bounds checks are supported.
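
One way to read "the same for all predecessors" is as an intersection over the predecessors' last entries. The sketch below illustrates only that idea and is not the pass's real data structure: the dictionary layout, the entry "kind" keys, and the name `entries_to_inherit` are all assumptions, and the bounds_checked_modes check is not modeled.

```python
def entries_to_inherit(pred_last_entries):
    """Pick the entries a block may start its vectorization table with.

    `pred_last_entries` holds one dict per predecessor block, mapping a
    hypothetical entry kind to the last entry of that kind in the block.
    An entry is inherited only if every predecessor agrees on it, which
    also yields at most one entry per kind.
    """
    if not pred_last_entries:
        return {}
    first, *rest = pred_last_entries
    return {
        kind: entry
        for kind, entry in first.items()
        if all(pred.get(kind) == entry for pred in rest)
    }


# Two predecessors agree on the ssbo entry but not on the shared one.
preds = [
    {"ssbo": "load@16", "shared": "load@0"},
    {"ssbo": "load@16", "shared": "load@8"},
]
print(entries_to_inherit(preds))
```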

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
0a07ea20e6 nir/opt_load_store_vectorize: create add_entry_to_hash_table() helper
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
e5bd9cbf90 nir/opt_load_store_vectorize: use linear allocator instead of ralloc
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:49 +00:00
Georg Lehmann
5e2f28e723 nir: remove split unpack_half opcodes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
81e3162cf8 microsoft/compiler: switch to a backend specific unpack half opcode
Sadly, just f2f32 isn't enough for dxil.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
45cb1d3b6f nir/opt_algebraic: remove unpack_half_2x16_split
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
5a2ef27f7d nir/format_convert: use f2f32 instead of unpack_half
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
a3bd2ae465 nir/opt_16bit_tex_image: remove unpack_half support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
6f7d4cd75b nir/lower_tex: use f2f32 instead of unpack_half
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
609c46cf23 nir/lower_alu_width: emit f2f32 for unpack_half_2x16
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Georg Lehmann
b18d9c1b33 nir/opt_algebraic: optimize unpack_32_2x16 of extract
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39511>
2026-02-06 06:12:36 +00:00
Timothy Arceri
da6c3ad237 nir: speedup nir_find_inlinable_uniforms()
Here we speed up nir_find_inlinable_uniforms() by making sure we only
check whether a src is inlinable once.

If we have a bunch of nested if-statements where the conditions keep
building on the alu chains of previous conditions we can end up
with exponential processing times due to repeatedly processing the
same srcs over and over.

A big cause of the exponential growth seems to be instructions like
`ffma %594, %594, %599` or `fmul %600, %600` where each essentially
causes us to process the entire previous part of the chain
twice.

Shaders such as that in issue #14663 took multiple minutes to
compile previously, calling collect_src_uniforms billions of times
and now compile within a second with this change.
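
The blow-up described above can be reproduced with a toy model. This is a sketch under stated assumptions rather than the NIR code: the graph is a plain dictionary, and `build_chain`, `visits_naive`, and `visits_memoized` are hypothetical names. Each node uses the previous node twice, as `fmul %600, %600` does, so a walk that never remembers what it has seen doubles its work at every level.

```python
def build_chain(depth):
    """Node i has two sources, both node i - 1; node 0 has none."""
    return {i: ([i - 1, i - 1] if i > 0 else []) for i in range(depth + 1)}


def visits_naive(graph, node):
    """Count visits when every use of a source is walked again."""
    return 1 + sum(visits_naive(graph, src) for src in graph[node])


def visits_memoized(graph, node, seen=None):
    """Count visits when each source is checked only once."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(visits_memoized(graph, src, seen) for src in graph[node])


chain = build_chain(10)
print(visits_naive(chain, 10))     # grows as 2**(depth + 1) - 1
print(visits_memoized(chain, 10))  # grows as depth + 1
```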

Closes: mesa/mesa#14663

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>
2026-02-05 23:19:29 +00:00
Timothy Arceri
aaea962808 nir: update asserts in inline uniforms
collect_src_uniforms() is now only called internally and uni_offsets
should never be NULL.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>
2026-02-05 23:19:29 +00:00
Timothy Arceri
0410377b63 nir: make nir_add_inlinable_uniforms() private
Hasn't been used externally since e93592dc62

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>
2026-02-05 23:19:28 +00:00
Timothy Arceri
257875034d nir: make nir_collect_src_uniforms() private
Hasn't been used externally since e93592dc62

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39664>
2026-02-05 23:19:28 +00:00
Karol Herbst
e5bf1f5aff nir/opt_offsets: support nvidias intrinsics
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>
2026-02-03 22:23:51 +00:00
Karol Herbst
cb60e4d14f nir/opt_offsets: support negative offsets and 64 bit sources
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>
2026-02-03 22:23:51 +00:00
Karol Herbst
4add3959e9 nir: add BASE to nvidia memory intrinsics
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>
2026-02-03 22:23:50 +00:00
Karol Herbst
e779538ad2 nir: add nvidia IO intrinsics
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39525>
2026-02-03 22:23:50 +00:00
Marek Olšák
a3f022d0a2 nir: reassociate a $op (b ? #c : #d) for div, mod, rem
This eliminates expensive div, mod, and rem opcodes whose src1 looks
non-constant but is really a pair of constants hiding behind a bcsel.

gcc and LLVM are missing this.
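
The rewrite in the commit title, `a $op (b ? #c : #d)` to `b ? (a $op #c) : (a $op #d)`, can be checked on ordinary integers. The sketch below uses Python's floor division on non-negative values as a stand-in for the NIR opcode; the function names are hypothetical. After the rewrite each branch divides by a known constant, which a compiler can usually lower to cheaper operations.

```python
def before(a, b, c, d):
    """Divide by a selected constant: the divisor is not known statically."""
    return a // (c if b else d)


def after(a, b, c, d):
    """Select between two divisions, each by a constant."""
    return (a // c) if b else (a // d)


# The two forms agree for every input tried here.
samples = [(a, b) for a in range(0, 200, 7) for b in (False, True)]
print(all(before(a, b, 3, 10) == after(a, b, 3, 10) for a, b in samples))
```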

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39560>
2026-02-02 21:34:48 +00:00
Marek Olšák
30e9f0bdf3 nir/opt_16bit_tex_image: lower dst of load_buffer_amd
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:52 +00:00
Marek Olšák
44bc1e6bf4 nir: add dest_type to load_buffer_amd
for lowering the result to 16 bits

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:52 +00:00
Marek Olšák
9eaaf9e525 nir: add ACCESS_SPARSE
trying to reduce the combinatorial explosion of intrinsics

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:52 +00:00
Marek Olšák
3350bca3eb nir/print: fix a crash due to unhandled GLSL_SAMPLER_DIM_EXTERNAL
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:52 +00:00
Georg Lehmann
bdc084aae5 nir/algebraic: make subexpression inexact on creation
Removes the runtime code for this, and means we propagate the
signed zero/inf/nan checks to subexpressions too, not just exact.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Georg Lehmann
293d2e3b0d nir/algebraic: remove ability to create Value from Expression
Not used, and it would break in the future.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Georg Lehmann
ad6f8291bf nir/opt_algebraic: rework ignore_exact to work like other internal conditions
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Georg Lehmann
a879b9a5d5 nir/search: preserve nan/inf/sz if any alu in a replaced expression did
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Georg Lehmann
575affaf48 nir/search: gather union of all fp_math_ctrl
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39616>
2026-01-31 15:30:25 +00:00
Karol Herbst
24d20df3d6 nir: fix nir_fixup_is_exported for LLVM-22
Starting with LLVM-22 we won't see the kernel wrapper anymore, and this
is a trivial fix to get around this.

See: 5458eb2511

Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>
2026-01-30 16:06:25 +00:00
Georg Lehmann
70f0e75262 nir/opt_algebraic: optimize pack_half_2x16_rtz of float converted from 16bit
Foz-DB Navi48:
Totals from 177 (0.21% of 82405) affected shaders:
Instrs: 326628 -> 325955 (-0.21%); split: -0.21%, +0.00%
CodeSize: 1726720 -> 1722500 (-0.24%); split: -0.24%, +0.00%
Latency: 5076631 -> 5075700 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 596010 -> 595598 (-0.07%); split: -0.07%, +0.00%
VClause: 3613 -> 3616 (+0.08%)
Copies: 24427 -> 24501 (+0.30%); split: -0.06%, +0.36%
VALU: 182468 -> 182029 (-0.24%); split: -0.24%, +0.00%
SALU: 55449 -> 55452 (+0.01%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39531>
2026-01-29 14:44:37 +00:00
Georg Lehmann
c3e12429c5 nir/opt_algebraic: improve a < 0.0 ? 0.0 : sqrt(a) pattern
Fix the NaN correctness of the original pattern, and add more variants.

Foz-DB Navi48:
Totals from 372 (0.45% of 82405) affected shaders:
Instrs: 208946 -> 207522 (-0.68%); split: -0.71%, +0.03%
CodeSize: 1116436 -> 1109804 (-0.59%); split: -0.61%, +0.02%
VGPRs: 19452 -> 19104 (-1.79%)
Latency: 1121222 -> 1120423 (-0.07%); split: -0.13%, +0.05%
InvThroughput: 158228 -> 157567 (-0.42%); split: -0.61%, +0.19%
VClause: 3695 -> 3704 (+0.24%)
Copies: 9516 -> 9606 (+0.95%); split: -0.24%, +1.19%
VALU: 118696 -> 118031 (-0.56%); split: -0.61%, +0.05%
VOPD: 380 -> 372 (-2.11%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
Georg Lehmann
f872c13707 nir/opt_algebraic: use contract instead of inexact for more patterns
These use more precise operations, so contract is enough.

Foz-DB Navi48:
Totals from 248 (0.30% of 82405) affected shaders:
Instrs: 284686 -> 284318 (-0.13%); split: -0.14%, +0.01%
CodeSize: 1528856 -> 1527520 (-0.09%); split: -0.10%, +0.01%
Latency: 2368390 -> 2367345 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 346623 -> 346335 (-0.08%); split: -0.09%, +0.01%
SClause: 6752 -> 6756 (+0.06%); split: -0.12%, +0.18%
Copies: 14685 -> 14694 (+0.06%); split: -0.01%, +0.07%
VALU: 179922 -> 179727 (-0.11%); split: -0.11%, +0.01%
SALU: 28706 -> 28707 (+0.00%)
VOPD: 1196 -> 1198 (+0.17%)

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39507>
2026-01-29 11:29:48 +00:00
Georg Lehmann
f472bbf017 nir/algebraic: remove manual opcode validation
The properly terminated regex automatically detects this case now.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
2026-01-28 18:46:23 +00:00
Georg Lehmann
a5f55be021 nir/algebraic: terminate opcode regex
Instead of silently dropping the unmatched rest.
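
The difference a terminated pattern makes can be shown with Python's `re` module. The pattern below is a simplified, hypothetical opcode syntax (a name plus an optional `@bits` suffix), not the regex actually used by the algebraic generator.

```python
import re

# Hypothetical, simplified opcode syntax: name plus optional "@bits".
UNTERMINATED = re.compile(r"(?P<opcode>[a-z0-9_]+)(?:@(?P<bits>\d+))?")
TERMINATED = re.compile(r"(?P<opcode>[a-z0-9_]+)(?:@(?P<bits>\d+))?$")

text = "fadd@32garbage"

# Without the trailing "$", match() succeeds on a prefix and the
# unmatched rest ("garbage") is silently ignored.
loose = UNTERMINATED.match(text)
print(loose.group("opcode"), loose.group("bits"))

# With the trailing "$", the same input is rejected outright.
print(TERMINATED.match(text))

# Well-formed input is accepted by both patterns.
print(TERMINATED.match("fadd@32").group("bits"))
```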

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
2026-01-28 18:46:23 +00:00
Georg Lehmann
d8ef28671d nir/opt_algebraic: use correct syntax to create exact fsat
Fixes: 3b06824e4c ("nir/opt_algebraic: optimize some post peephole select patterns")

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
2026-01-28 18:46:22 +00:00
Iván Briano
5b48805b42 brw: fix local_invocation_index with quad derivatives on mesh/task shaders
For mesh/task shaders, the thread payload provides a local invocation
index, but it's always linear so it doesn't give the correct value when
quad derivatives are in use.
The lowering pass that handles all of this correctly for compute
shaders assumes load_local_invocation_index will be lowered in the
backend for mesh/task. It calculates the values for the quads correctly
but then avoids replacing the original intrinsic, so we are left with
the wrong results.

Add an Intel-specific intrinsic and always lower the generic one to that
(or whatever else was calculated) to avoid ambiguities and fix the value
for quad derivatives.

Fixes future CTS tests using mesh/task shaders under:
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.*

Fixes: d89bfb1ff7 ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>
2026-01-27 22:28:19 +00:00
Emma Anholt
eb990cd81e nir: Bump test timeouts.
nir_opt_algebraic_tests has been pushing our qemu-ed tests over the line.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39563>
2026-01-27 21:31:14 +00:00
Eric Engestrom
d12e3454e6 nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests
Fixes: 4c30c44b75 ("nir: Generate unit tests for nir_opt_algebraic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39550>
2026-01-27 20:03:16 +00:00
Kenneth Graunke
b844082017 nir: Add a round_up_components callback to load/store vectorization
By default, load/store vectorization uses nir_round_up_components()
to round up loads and possibly writemasked stores to the next valid
NIR vector width.  However, some backends may not support load/stores
at all sizes.  For example, older Intel supports only power-of-two
vector widths.  Newer Intel also supports vec2 and vec3, but not
vec5/6/7.  By providing a callback, backends can request promotion
to their next supported memory load/store vector width.

The existing "should we vectorize?" callback should continue to return
false for unsupported vector widths (i.e. beyond the maximum supported).
With this new callback, they do not need to say "no" to vectorization
that would normally produce an unsupported count (e.g. vec5/6/7) but
instead request that the component count be rounded up appropriately.
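
A rough model of the rounding behaviour described above follows. It is a sketch, not the NIR API: the width tables are assumptions made for illustration (the default set reflects the vector sizes NIR is understood to accept, and the two backend sets are hypothetical readings of the commit message), and `round_up` stands in for whatever callback a backend would supply.

```python
# Assumed width tables, for illustration only.
DEFAULT_WIDTHS = (1, 2, 3, 4, 5, 8, 16)   # assumed valid NIR vector sizes
POW2_WIDTHS = (1, 2, 4, 8, 16)            # hypothetical older backend
NO_VEC5_WIDTHS = (1, 2, 3, 4, 8, 16)      # hypothetical newer backend


def round_up(count, supported):
    """Return the smallest supported width that holds `count` components.

    Returns None when `count` exceeds the largest supported width; in
    that case the separate "should we vectorize?" check is expected to
    have rejected the combination already.
    """
    for width in sorted(supported):
        if width >= count:
            return width
    return None


print(round_up(5, DEFAULT_WIDTHS))   # default rounding keeps vec5
print(round_up(5, NO_VEC5_WIDTHS))   # backend without vec5 promotes it
print(round_up(3, POW2_WIDTHS))      # power-of-two backend promotes vec3
```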

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
2026-01-27 16:08:36 +00:00
Kenneth Graunke
e23a83b786 nir: Add load/store vectorizer option for rounding up masked stores
This adds a new option, round_up_store_components, which rounds up the
number of components for stores that support writemasking to the next
valid vector size.  For example, vec4+vec2 stores would round up from
6 components (which wouldn't be supported) to a full supportable vec8
store, relying on writemasking to ensure the correct pieces are written.
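
The vec4+vec2 case above can be sketched as follows. This is an illustration only: `round_up_store` is a hypothetical helper, the list of valid widths is an assumption, and `None` marks padding components that the writemask keeps from being written.

```python
def round_up_store(values, valid_widths=(1, 2, 3, 4, 5, 8, 16)):
    """Pad a combined store to the next valid width and build its writemask.

    Assumes len(values) does not exceed the largest valid width.
    Bit i of the writemask is set when component i must be written.
    """
    count = len(values)
    width = next(w for w in valid_widths if w >= count)
    writemask = (1 << count) - 1
    padded = list(values) + [None] * (width - count)
    return padded, writemask


# A vec4 store followed by an adjacent vec2 store: 6 live components.
padded, mask = round_up_store(["x0", "y0", "z0", "w0", "x1", "y1"])
print(len(padded), bin(mask))
```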

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
2026-01-27 16:08:36 +00:00
Kenneth Graunke
37f3c59b2c nir: Teach opt_load_store_vectorize how to handle Intel URB intrinsics
URB intrinsics are simply memory load/stores to a special memory region,
so it's pretty reasonable to handle these in the memory vectorizer.  We
treat emit_vertex_* intrinsics as a barrier for shader outputs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
2026-01-27 16:08:36 +00:00
Kenneth Graunke
c2f03ba12f nir: Add memory modes to URB load intrinsics
This makes it easier for NIR passes to distinguish between inputs and
outputs without having to reason about which URB handle source was
passed to the intrinsic.  It probably also makes the NIR a bit easier
for humans to read.

v2: Don't add memory mode to store intrinsics.  It's always output.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
2026-01-27 16:08:36 +00:00
Emma Anholt
e922c2cabc nir,spirv: Add support for SPV_QCOM_image_processing.
Initial work was done by Mark Collins, which I significantly rewrote.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38559>
2026-01-27 02:00:40 +00:00
Dave Airlie
6d53931cf4 nir: add cmat call to propagate invariants
This just adds this as lavapipe uses this pass.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38964>
2026-01-26 22:39:40 +00:00
Daniel Schürmann
6313e9f549 nir/opt_loop: Relax restrictions on opt_loop_peel_initial_break() for more loops
In addition to loops where the break condition can be constant-folded,
we also allow peeling the initial break from loops that have at least
one phi with a constant loop-carried source, effectively removing that
phi from the loop.

Totals from 172 (0.22% of 79377) affected shaders: (Navi31)
Instrs: 372798 -> 369181 (-0.97%); split: -1.07%, +0.10%
CodeSize: 1907312 -> 1891948 (-0.81%); split: -0.89%, +0.09%
VGPRs: 8436 -> 8460 (+0.28%)
Latency: 3646016 -> 3396657 (-6.84%)
InvThroughput: 434848 -> 389079 (-10.53%)
Copies: 28436 -> 27118 (-4.63%); split: -4.79%, +0.15%
Branches: 26504 -> 25344 (-4.38%); split: -4.44%, +0.06%
PreSGPRs: 8585 -> 8603 (+0.21%)
VALU: 148291 -> 148355 (+0.04%); split: -0.01%, +0.06%
SALU: 95625 -> 92649 (-3.11%); split: -3.22%, +0.11%

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33666>
2026-01-26 12:02:49 +00:00
Georg Lehmann
b2d9615000 nir/opt_algebraic: optimize bcsel to hi 16bits with undef lo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:20 +00:00