Commit graph

11900 commits

Author SHA1 Message Date
Marek Olšák
a965ada6ee Inline mesa_sha1, SHA1_CTX
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
0da88d237a Inline SHA1_DIGEST_STRING_LENGTH
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
110632f702 Inline SHA1_DIGEST_LENGTH
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
2283244975 nir: change export_amd intrinsics to use target instead of base
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Marek Olšák
b75a3112fd nir: change export_amd intrinsics to use enabled_channels instead of write_mask
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415>
2026-03-23 06:10:49 +00:00
Marek Olšák
f9a10c46fa nir/inline_uniforms: track visited state per component
This prevents an instruction from being marked inlinable or non-inlinable
when only a subset of components meet that condition.

This might only be relevant for non-scalar ALU.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:40 +00:00
Marek Olšák
d9a2fac925 nir/inline_uniforms: update comments
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:40 +00:00
Marek Olšák
3b004ec60b nir/inline_uniforms: rename new_num -> new_num_uniforms
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:39 +00:00
Marek Olšák
727d663f79 nir/inline_uniforms: rename num_offsets -> num_uniforms
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413>
2026-03-21 17:55:39 +00:00
Timothy Arceri
06fc27b5a4 nir: test loop analyze sets exact trip flags correctly
Introduces a new test helper to create a loop with multiple terminators
and tests some scenarios to make sure exact trip flags are set
correctly.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>
2026-03-21 11:46:14 +00:00
Timothy Arceri
82b474c3fb nir: remove is_only_uniform_src() restriction
Loop analysis seems to have assumed we needed a const here for the loop
to be useful. This isn't true, so drop the restriction.

This allows the optimisation from 6ca81adffc to become more powerful.

Shader-db results radeonsi:

TOTALS FROM AFFECTED SHADERS (19/168079)
  SGPRS: 904.00 -> 848.00 (-6.19 %)
  VGPRS: 712.00 -> 684.00 (-3.93 %)
  Spilled SGPRs: 0.00 -> 0.00 (0.00 %)
  Spilled VGPRs: 0.00 -> 0.00 (0.00 %)
  Private memory VGPRs: 0.00 -> 0.00 (0.00 %)
  Scratch size: 0.00 -> 0.00 (0.00 %) dwords per thread
  Code Size: 80340.00 -> 92980.00 (15.73 %) bytes
  Max Waves: 236.00 -> 238.00 (0.85 %)
  Outputs: 0.00 -> 0.00 (0.00 %)
  Patch Outputs: 0.00 -> 0.00 (0.00 %)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473>
2026-03-21 11:46:14 +00:00
Daniel Schürmann
4ca0eb9f54 nir: validate that loop continue statements always link to continue constructs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
94f959972d nir: ensure that loop continue statements always link to continue constructs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
0089d81fb3 nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue
We are going to disallow continue statements without
loop continue constructs.

Replaced with a test that checks that the optimization is not
applied in the absence of actual work after the conditional break.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
ff8c8858dc nir/lower_goto_ifs: Add and lower loop continue constructs
We are going to disallow continue statements without
loop continue constructs.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
f159669cf3 nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
Daniel Schürmann
31af989270 nir/lower_continue_constructs: Simplify loops before lowering continue constructs
The idea is inspired by LLVM's LoopSimplify pass. Before
lowering continue constructs, the pass now also lowers
all continue statements, leaving only the trivial continue.
This ensures that loops will always only have one back-edge.

Totals from 396 (0.47% of 84383) affected shaders: (Navi48)
Instrs: 900330 -> 899850 (-0.05%); split: -0.17%, +0.12%
CodeSize: 4727216 -> 4727508 (+0.01%); split: -0.13%, +0.13%
Latency: 7276816 -> 7097199 (-2.47%); split: -2.53%, +0.06%
InvThroughput: 1580718 -> 1558646 (-1.40%); split: -1.42%, +0.03%
VClause: 12872 -> 12879 (+0.05%); split: -0.01%, +0.06%
SClause: 22237 -> 22240 (+0.01%); split: -0.00%, +0.02%
Copies: 67359 -> 65723 (-2.43%); split: -2.56%, +0.14%
Branches: 24252 -> 24163 (-0.37%); split: -0.52%, +0.15%
PreSGPRs: 34371 -> 34399 (+0.08%)
PreVGPRs: 25268 -> 25280 (+0.05%); split: -0.00%, +0.05%
VALU: 512493 -> 511580 (-0.18%); split: -0.33%, +0.15%
SALU: 122767 -> 122993 (+0.18%); split: -0.13%, +0.32%
VMEM: 22181 -> 22213 (+0.14%)
SMEM: 41370 -> 41376 (+0.01%)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942>
2026-03-21 07:42:55 +00:00
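The single back-edge property described in 31af989270 can be sketched on a plain successor-list CFG. This is only an illustration of the idea, not NIR's structured control flow: the block names, the `merge_back_edges` and `count_back_edges` helpers, and the `latch` block are all hypothetical.

```python
def count_back_edges(cfg, header, loop_blocks):
    """Count edges from blocks inside the loop back to the loop header."""
    return sum(succ == header for block in loop_blocks for succ in cfg[block])


def merge_back_edges(cfg, header, loop_blocks, latch="latch"):
    """Route every back-edge through one new latch block (hypothetical helper)."""
    out = {}
    for block, succs in cfg.items():
        if block in loop_blocks:
            # Any jump to the header from inside the loop acts as a "continue".
            out[block] = [latch if s == header else s for s in succs]
        else:
            out[block] = list(succs)
    out[latch] = [header]  # the single remaining back-edge
    return out


cfg = {
    "pre": ["head"],
    "head": ["a", "exit"],
    "a": ["head", "b"],  # early continue
    "b": ["head"],       # trivial continue at the end of the body
    "exit": [],
}
loop = {"head", "a", "b"}
simplified = merge_back_edges(cfg, "head", loop)
```

After the rewrite, both former continues jump to `latch`, and only `latch` jumps back to `head`.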
Mary Guillemard
c6d8f7ce0c nir/dead_cf: Add missing load_global_nv handling
This was missing when this intrinsic was added.
Fixes some issues with FSI lowering and probably more.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e779538ad2 ("nir: add nvidia IO intrinsics")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
2026-03-20 20:19:35 +00:00
Mary Guillemard
bb6fc8cc20 nir/dead_cf: Add missing load_global_bounded handling
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: caa0854da8 ("nir: plumb load_global_bounded")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
2026-03-20 20:19:34 +00:00
Mary Guillemard
6013667d61 nir/dead_cf: Add missing load_ssbo_ir3 handling
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 0092edfec0 ("nir/dead_cf: Do not remove loops with loads that can't be reordered")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543>
2026-03-20 20:19:34 +00:00
Connor Abbott
ec37fed52b tu, ir3, nir: Plumb through driver param for alpha-to-coverage
We will need this when alpha-to-coverage is dynamic and we need to
emulate it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
2026-03-20 18:09:49 +00:00
Connor Abbott
22a061fb91 nir: Use better calculation for alpha-to-coverage mask
The old calculation depended on the sample count, and gave subpar
results for 8x MSAA with standard sample locations. The new calculation
is based on the Intel pass, with some changing of the constants so that
the sample count is always proportional to alpha for 2xMSAA and 4xMSAA
and the addition of rotating the sample mask based on the pixel.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
2026-03-20 18:09:48 +00:00
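The two properties named in 22a061fb91 (covered sample count proportional to alpha, and a per-pixel rotation of the sample mask) can be sketched as follows. This is not the calculation from the commit: the rounding, the `(x + y) % samples` rotation, and the `coverage_mask` name are assumptions chosen for illustration.

```python
def coverage_mask(alpha, samples, x, y):
    """Return a sample mask with about alpha * samples bits set (illustrative only)."""
    alpha = min(max(alpha, 0.0), 1.0)
    covered = int(alpha * samples + 0.5)  # number of samples to enable
    base = (1 << covered) - 1             # lowest `covered` bits set
    rot = (x + y) % samples               # hypothetical per-pixel rotation
    full = (1 << samples) - 1
    # Rotate `base` left by `rot` within a `samples`-bit field.
    return ((base << rot) | (base >> (samples - rot))) & full


def popcount(mask):
    return bin(mask).count("1")
```

With 4x MSAA and alpha 0.5, every pixel gets two covered samples, but neighbouring pixels pick different ones, which spreads the dither pattern across the image.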
Georg Lehmann
643dd510d4 nir/opt_algebraic: optimize b2f(a) * b
When the multiplication is only used by fadd, it's not a clear win
because of potential fma fusion.

Totals from 8015 (6.99% of 114655) affected shaders:
MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01%
Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04%
CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06%
VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01%
SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50%
Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04%
InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04%
VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09%
SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01%
Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15%
Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07%
PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02%
VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02%
SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16%
VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>
2026-03-20 08:50:41 +00:00
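The log does not show what `b2f(a) * b` is rewritten to in 643dd510d4. Assuming the replacement is a select on `a` (an assumption, as are the helper names below), a short sketch shows why such a rewrite is not bit-exact around signed zeros and infinities:

```python
import math


def b2f(cond):
    """Boolean to float: 1.0 for true, 0.0 for false."""
    return 1.0 if cond else 0.0


def mul_form(a, b):
    """The original expression: b2f(a) * b."""
    return b2f(a) * b


def select_form(a, b):
    """Hypothetical rewritten expression: select b or +0.0."""
    return b if a else 0.0


# Same value for finite inputs, but the multiply keeps the sign of b on zero.
neg_zero = mul_form(False, -2.0)     # -0.0
pos_zero = select_form(False, -2.0)  # +0.0
```

The two forms compare equal for finite `b`, yet differ in the sign of zero, and the multiply yields NaN for an infinite `b` when `a` is false.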
Georg Lehmann
d2b37b667e nir/opt_algebraic: optimize more fmulz(1.0, a) remains
If dxvk's open-coded fmulz gets partially constant folded,
it leaves this mess behind.

It's important to do this before the more general fmul+b2f patterns added
in the next commit, because they change the signed zero behavior in a way
that can't be optimized back.

Foz-DB Navi48:
Totals from 36 (0.03% of 114655) affected shaders:

Instrs: 16513 -> 15706 (-4.89%)
CodeSize: 99756 -> 95760 (-4.01%)
Latency: 45165 -> 44151 (-2.25%)
InvThroughput: 8344 -> 7886 (-5.49%)
VClause: 395 -> 401 (+1.52%)
Copies: 639 -> 634 (-0.78%)
PreSGPRs: 1158 -> 1154 (-0.35%)
PreVGPRs: 1227 -> 1225 (-0.16%)
VALU: 11310 -> 10769 (-4.78%)
SALU: 813 -> 809 (-0.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>
2026-03-20 08:50:41 +00:00
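For readers unfamiliar with `fmulz` (see d2b37b667e above), here is a minimal model of its semantics, assuming the usual NIR definition in which zero times anything, including infinity or NaN, is +0.0. The Python function is a stand-in for illustration, not Mesa code.

```python
import math


def fmulz(a, b):
    """Multiply where a zero operand always forces a +0.0 result."""
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b
```

Under this model `fmulz(1.0, a)` is not simply `a`: a negative zero input comes out as positive zero, so any cleanup of such leftovers has to respect signed-zero behavior.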
Georg Lehmann
3ad142d4d7 nir/search: never insert movs for alu uses
This means we respect the pattern order better because
simple replacements like bcsel(False, a, b) -> b no longer
insert movs that can block more specialized patterns.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>
2026-03-20 08:50:41 +00:00
Georg Lehmann
1626df7a90 nir: rework nir_alu_src_is_trivial_ssa to take an alu src
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>
2026-03-20 08:50:41 +00:00
Georg Lehmann
b96c42c916 nir/opt_algebraic: optimize more near useless bcsel
Foz-DB Navi48:
Totals from 327 (0.29% of 114655) affected shaders:
Instrs: 732971 -> 731642 (-0.18%); split: -0.19%, +0.01%
CodeSize: 3696020 -> 3689824 (-0.17%); split: -0.17%, +0.00%
Latency: 4405319 -> 4403413 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 650209 -> 649659 (-0.08%); split: -0.10%, +0.01%
Copies: 53872 -> 53736 (-0.25%); split: -0.27%, +0.02%
Branches: 15598 -> 15571 (-0.17%)
VALU: 262391 -> 261969 (-0.16%)
SALU: 268112 -> 267699 (-0.15%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399>
2026-03-20 08:50:41 +00:00
Georg Lehmann
6cfe6eaa79 nir/opt_algebraic: create ldexp from exp2
ldexp uses the full width VALU path, exp2 the transcendental SIMD8.

Foz-DB Navi21:
Totals from 729 (0.64% of 114627) affected shaders:
MaxWaves: 20071 -> 20103 (+0.16%); split: +0.18%, -0.02%
Instrs: 869129 -> 867654 (-0.17%); split: -0.17%, +0.00%
CodeSize: 4709000 -> 4708460 (-0.01%); split: -0.02%, +0.00%
VGPRs: 31184 -> 31128 (-0.18%); split: -0.23%, +0.05%
Latency: 7610726 -> 7597238 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 1822323 -> 1819815 (-0.14%); split: -0.14%, +0.00%
VClause: 22494 -> 22493 (-0.00%); split: -0.03%, +0.02%
SClause: 20520 -> 20509 (-0.05%)
Copies: 72025 -> 72024 (-0.00%); split: -0.01%, +0.01%
Branches: 22028 -> 22029 (+0.00%)
PreVGPRs: 21601 -> 21602 (+0.00%)
VALU: 604821 -> 603339 (-0.25%); split: -0.25%, +0.00%
SALU: 114258 -> 114262 (+0.00%); split: -0.00%, +0.01%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>
2026-03-20 08:15:08 +00:00
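The identity behind 6cfe6eaa79 is that multiplying by `exp2(n)` for an integer `n` equals `ldexp(a, n)`. The sketch below checks this numerically in Python; the exact patterns and conditions used in nir_opt_algebraic are not shown in the log, so the function names and the tested range are assumptions.

```python
import math


def scale_with_exp2(a, n):
    """a * 2^n computed with a power function (the exp2 form)."""
    return a * (2.0 ** n)


def scale_with_ldexp(a, n):
    """a * 2^n computed by adjusting the exponent directly."""
    return math.ldexp(a, n)


# Compare both forms over a range of integer exponents.
values = (1.5, -3.25, 0.1)
matches = all(
    scale_with_exp2(a, n) == scale_with_ldexp(a, n)
    for a in values
    for n in range(-20, 21)
)
```

Scaling by a power of two is exact as long as the result stays in the normal range, which is why the two forms agree here.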
Georg Lehmann
ec331cc48a nir: replace lower_ldexp with has_ldexp
I can't be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>
2026-03-20 08:15:08 +00:00
Faith Ekstrand
3418525a82 pan/bi: Lower VS outputs in NIR
Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:32 +00:00
Lorenzo Rossi
43ffcf06f4 pan/bi,nir: Divide memory_access from segments
Valhall removed Bifrost's memory segments and added in its place memory
access.  Those were bolted on reserved bits as "pseudo-segments" and the
emitter would catch these and emit the right memory access.  This commit
cleans it up a bit by making memory_access available directly and
exposing it to NIR (this will be useful later).

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Lorenzo Rossi
c730e41ed5 pan/bi: Add is_psiz_store flag in bi_instr
This removes the previous hack that searched for the psiz write by looking
for 16-bit stores with the correct pseudo segment.  We also add a new
intrinsic that mimics global stores but tags psiz writes; this will be
used later in the series.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:30 +00:00
Faith Ekstrand
de338dc908 pan,nir: Rework converted_mem_pan intrinsics
First, rename them to make them a bit more clear.  They act on global
memory so they should be _global and they map to ld/st_cvt so _cvt is
nice and obvious.  Second, they don't need IO semantics as they're not
IO.  But they do need ACCESS so that we can better control things like
CAN_REORDER.  Third, add a src_type to store_global_cvt even though it
won't be used just yet because we'll want it for lowering VS stores.

Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:29 +00:00
Faith Ekstrand
d2f430bea9 pan/bi: Add new FS input load intrinsics
Unlike load[_interpolated]_input, which has to deal with all sorts of
ABI nonsense between driver and compiler, these new intrinsics are
dumber than bricks.  They're literally just the HW ops as NIR
intrinsics.  These will allow us to do the lowering in NIR and put the
driver in total control over what goes down what path.  Among other
things, a driver could choose to lower some things to ld_var and others
to ld_var_buf.

Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
2026-03-19 11:25:28 +00:00
Georg Lehmann
57c05f72f9 nir/opt_large_constants: only use 16bit float alu when supported
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
5f37788ae9 nir/opt_large_constants: handle floating point power of two fractions
Foz-DB Navi48:
Totals from 365 (0.32% of 114655) affected shaders:
MaxWaves: 10020 -> 10016 (-0.04%)
Instrs: 486252 -> 486097 (-0.03%); split: -0.21%, +0.18%
CodeSize: 2629536 -> 2628452 (-0.04%); split: -0.19%, +0.14%
VGPRs: 19884 -> 19896 (+0.06%); split: -0.06%, +0.12%
SpillSGPRs: 210 -> 212 (+0.95%)
Latency: 3818610 -> 3765549 (-1.39%); split: -1.50%, +0.11%
InvThroughput: 598445 -> 596281 (-0.36%); split: -0.58%, +0.22%
VClause: 10053 -> 9698 (-3.53%); split: -3.54%, +0.01%
SClause: 17548 -> 17334 (-1.22%); split: -1.24%, +0.02%
Copies: 43196 -> 42249 (-2.19%); split: -2.34%, +0.14%
Branches: 16695 -> 16628 (-0.40%); split: -0.47%, +0.07%
PreSGPRs: 17988 -> 17971 (-0.09%)
PreVGPRs: 13552 -> 13520 (-0.24%)
VALU: 244842 -> 246611 (+0.72%); split: -0.02%, +0.74%
SALU: 79163 -> 77778 (-1.75%); split: -2.05%, +0.30%
VMEM: 13468 -> 13084 (-2.85%)
SMEM: 23571 -> 23393 (-0.76%)
VOPD: 8384 -> 8372 (-0.14%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
372c1a23dc nir/opt_large_constants: support negative small constants
Foz-DB Navi48:
Totals from 511 (0.45% of 114655) affected shaders:
MaxWaves: 14554 -> 14552 (-0.01%)
Instrs: 767577 -> 768334 (+0.10%); split: -0.17%, +0.27%
CodeSize: 4171036 -> 4181400 (+0.25%); split: -0.10%, +0.35%
VGPRs: 27676 -> 27724 (+0.17%)
SpillSGPRs: 144 -> 183 (+27.08%)
Latency: 4053919 -> 4027092 (-0.66%); split: -0.88%, +0.22%
InvThroughput: 817990 -> 819490 (+0.18%); split: -0.21%, +0.39%
VClause: 11573 -> 11172 (-3.46%); split: -3.47%, +0.01%
SClause: 14418 -> 14579 (+1.12%); split: -0.46%, +1.57%
Copies: 71638 -> 71365 (-0.38%); split: -1.54%, +1.16%
Branches: 20212 -> 20425 (+1.05%); split: -0.39%, +1.44%
PreSGPRs: 21765 -> 21743 (-0.10%); split: -0.23%, +0.12%
PreVGPRs: 19475 -> 19307 (-0.86%); split: -0.91%, +0.05%
VALU: 411365 -> 413642 (+0.55%); split: -0.02%, +0.57%
SALU: 126940 -> 125411 (-1.20%); split: -1.53%, +0.32%
VMEM: 20574 -> 20062 (-2.49%)
SMEM: 23724 -> 23677 (-0.20%); split: -0.25%, +0.05%
VOPD: 19838 -> 19847 (+0.05%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
a9f3efcae0 nir/opt_large_constants: optimize small vector constant arrays
Foz-DB Navi48:
Totals from 2956 (2.58% of 114655) affected shaders:
MaxWaves: 85080 -> 85110 (+0.04%)
Instrs: 5167735 -> 5170572 (+0.05%); split: -0.12%, +0.17%
CodeSize: 28882716 -> 28867340 (-0.05%); split: -0.14%, +0.08%
VGPRs: 164484 -> 164616 (+0.08%); split: -0.09%, +0.18%
SpillSGPRs: 612 -> 611 (-0.16%)
Latency: 35017837 -> 34391146 (-1.79%); split: -1.80%, +0.01%
InvThroughput: 6336245 -> 6323807 (-0.20%); split: -0.49%, +0.29%
VClause: 112504 -> 111117 (-1.23%); split: -1.32%, +0.09%
SClause: 121125 -> 117618 (-2.90%); split: -3.04%, +0.15%
Copies: 392203 -> 384977 (-1.84%); split: -1.88%, +0.04%
Branches: 155578 -> 155376 (-0.13%); split: -0.13%, +0.01%
PreSGPRs: 127654 -> 127205 (-0.35%); split: -0.39%, +0.04%
PreVGPRs: 112486 -> 112449 (-0.03%); split: -0.04%, +0.00%
VALU: 2577362 -> 2586379 (+0.35%); split: -0.00%, +0.35%
SALU: 889569 -> 888472 (-0.12%); split: -1.01%, +0.89%
VMEM: 167203 -> 165750 (-0.87%)
SMEM: 190438 -> 187313 (-1.64%)
VOPD: 194411 -> 194344 (-0.03%); split: +0.01%, -0.04%

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:18 +00:00
Georg Lehmann
f782524c36 nir/opt_large_constants: enable small constant optimization for non trivial strides
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:17 +00:00
Georg Lehmann
568b96f8b2 nir/opt_large_constants: set fp_math_ctrl for bit exact results
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:17 +00:00
Georg Lehmann
e810382a1e nir/opt_large_constants: don't add constants implemented with ALU to the constant data
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:16 +00:00
Konstantin Seurer
581df90a89 nir/tests: Test nir_opt_large_constants
Tests a whole bunch of cases that can be turned into literals.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002>
2026-03-19 06:59:16 +00:00
Timothy Arceri
87ae5cab94 mesa: add force_explicit_uniform_loc_zero workaround
Allows a uniform name to be passed to force_explicit_uniform_loc_zero
allowing us to set that uniform to an explicit location of zero.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40448>
2026-03-18 07:28:07 +00:00
Caio Oliveira
f07138f244 spirv: Lower ShuffleUpINTEL and ShuffleDownINTEL to intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>
2026-03-17 17:21:52 +00:00
Caio Oliveira
a2cbdfbde3 nir: Add intrinsics for ShuffleUpINTEL and ShuffleDownINTEL
Move lowering to nir_lower_subgroups.  At some point Intel
backend might want to skip that and lower at the backend IR
boundary, but for now lowering always applies.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>
2026-03-17 17:21:52 +00:00
Caio Oliveira
b494faa12d spirv: Remove dead code in subgroup instruction handling
This codepath had a bug (always setting `elems[0]`) since it was last
reworked, but there's no subgroup instruction that uses this helper and
supports Composites, so it can be replaced with an assert.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40356>
2026-03-17 15:32:36 +00:00
Erik Faye-Lund
5127568b98 compiler/nir: use common ycbcr math
Let's use the common code, so we have a single place to update in case
we want to add features etc.

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40175>
2026-03-17 15:00:54 +00:00
Connor Abbott
c13bdaaa40 vtn: Fix vtn_mediump_upconvert_value() with transposed matrices
We can produce a transposed value sometimes, and we have to make sure
that val->transposed is also updated when that happens.

Noticed by inspection after the previous commit.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40017>
2026-03-16 18:33:54 +00:00
Connor Abbott
048d2a0c68 vtn: Fix vtn_mediump_downconvert_value() for transposed matrices
We forgot to set the actual value. This meant that whenever we actually
needed to use the transposed matrix we would immediately segfault.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40017>
2026-03-16 18:33:54 +00:00
Mike Blumenkrantz
fbf3305c1b nir/print: print per_vertex for variables
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40412>
2026-03-16 14:42:11 +00:00