Marek Olšák
fa5175023b
Final rename of sha1 names to blake3
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:28 +00:00
Marek Olšák
ae9ea27e0d
Rename *_sha1 names to *_blake3
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:28 +00:00
Marek Olšák
353fe94c0e
Rename SHA1 words to BLAKE3
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:28 +00:00
Marek Olšák
102d41799b
Rename more sha and sha1 names to blake3
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:28 +00:00
Marek Olšák
282bd2e6db
Rename sha words to blake3
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:28 +00:00
Marek Olšák
d4831aaf5f
Rename sha1_* and sha_* names to blake3_*
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:28 +00:00
Marek Olšák
c0ac992a2a
Remove mesa-sha1.h
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:27 +00:00
Marek Olšák
53c64973e8
Inline _mesa_sha1_compute/format, remove the other unused ones
...
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:27 +00:00
Marek Olšák
699f9d7066
Inline _mesa_sha1_init/update/final functions
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:27 +00:00
Marek Olšák
a965ada6ee
Inline mesa_sha1, SHA1_CTX
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:27 +00:00
Marek Olšák
0da88d237a
Inline SHA1_DIGEST_STRING_LENGTH
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:27 +00:00
Marek Olšák
110632f702
Inline SHA1_DIGEST_LENGTH
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383 >
2026-03-23 07:03:27 +00:00
Marek Olšák
2283244975
nir: change export_amd intrinsics to use target instead of base
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415 >
2026-03-23 06:10:49 +00:00
Marek Olšák
b75a3112fd
nir: change export_amd intrinsics to use enabled_channels instead of write_mask
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40415 >
2026-03-23 06:10:49 +00:00
Marek Olšák
f9a10c46fa
nir/inline_uniforms: track visited state per component
...
This prevents an instruction from being marked inlinable or non-inlinable
when only a subset of components meet that condition.
This might only be relevant for non-scalar ALU.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413 >
2026-03-21 17:55:40 +00:00
Marek Olšák
d9a2fac925
nir/inline_uniforms: update comments
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413 >
2026-03-21 17:55:40 +00:00
Marek Olšák
3b004ec60b
nir/inline_uniforms: rename new_num -> new_num_uniforms
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413 >
2026-03-21 17:55:39 +00:00
Marek Olšák
727d663f79
nir/inline_uniforms: rename num_offsets -> num_uniforms
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40413 >
2026-03-21 17:55:39 +00:00
Timothy Arceri
06fc27b5a4
nir: test loop analyze sets exact trip flags correctly
...
Introduces a new test helper to create a loop with multiple terminators
and tests some scenarios to make sure exact trip flags are set
correctly.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473 >
2026-03-21 11:46:14 +00:00
Timothy Arceri
82b474c3fb
nir: remove is_only_uniform_src() restriction
...
Loop analysis seems to have assumed we needed a const here for the loop
to be useful. This isn't true, so drop the restriction.
This allows the optimisation from 6ca81adffc to become more powerful.
Shader-db results radeonsi:
TOTALS FROM AFFECTED SHADERS (19/168079)
SGPRS: 904.00 -> 848.00 (-6.19 %)
VGPRS: 712.00 -> 684.00 (-3.93 %)
Spilled SGPRs: 0.00 -> 0.00 (0.00 %)
Spilled VGPRs: 0.00 -> 0.00 (0.00 %)
Private memory VGPRs: 0.00 -> 0.00 (0.00 %)
Scratch size: 0.00 -> 0.00 (0.00 %) dwords per thread
Code Size: 80340.00 -> 92980.00 (15.73 %) bytes
Max Waves: 236.00 -> 238.00 (0.85 %)
Outputs: 0.00 -> 0.00 (0.00 %)
Patch Outputs: 0.00 -> 0.00 (0.00 %)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32473 >
2026-03-21 11:46:14 +00:00
Daniel Schürmann
4ca0eb9f54
nir: validate that loop continue statements always link to continue constructs
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Daniel Schürmann
94f959972d
nir: ensure that loop continue statements always link to continue constructs
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Daniel Schürmann
0089d81fb3
nir/tests: change opt_loop_peel_initial_break test to not use nir_jump_continue
...
We are going to disallow continue statements without
loop continue constructs.
Replaced with a test that checks that the optimization is not
applied in the absence of actual work after the conditional break.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Daniel Schürmann
ff8c8858dc
nir/lower_goto_ifs: Add and lower loop continue constructs
...
We are going to disallow continue statements without
loop continue constructs.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Daniel Schürmann
f159669cf3
nir/lower_continue_constructs: Remove unnecessary handling of multiple continue statements
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Daniel Schürmann
31af989270
nir/lower_continue_constructs: Simplify loops before lowering continue constructs
...
The idea is inspired by LLVM's LoopSimplify pass. Before
lowering continue constructs, the pass now also lowers
all continue statements, leaving only the trivial continue.
This ensures that loops will always only have one back-edge.
Totals from 396 (0.47% of 84383) affected shaders: (Navi48)
Instrs: 900330 -> 899850 (-0.05%); split: -0.17%, +0.12%
CodeSize: 4727216 -> 4727508 (+0.01%); split: -0.13%, +0.13%
Latency: 7276816 -> 7097199 (-2.47%); split: -2.53%, +0.06%
InvThroughput: 1580718 -> 1558646 (-1.40%); split: -1.42%, +0.03%
VClause: 12872 -> 12879 (+0.05%); split: -0.01%, +0.06%
SClause: 22237 -> 22240 (+0.01%); split: -0.00%, +0.02%
Copies: 67359 -> 65723 (-2.43%); split: -2.56%, +0.14%
Branches: 24252 -> 24163 (-0.37%); split: -0.52%, +0.15%
PreSGPRs: 34371 -> 34399 (+0.08%)
PreVGPRs: 25268 -> 25280 (+0.05%); split: -0.00%, +0.05%
VALU: 512493 -> 511580 (-0.18%); split: -0.33%, +0.15%
SALU: 122767 -> 122993 (+0.18%); split: -0.13%, +0.32%
VMEM: 22181 -> 22213 (+0.14%)
SMEM: 41370 -> 41376 (+0.01%)
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39942 >
2026-03-21 07:42:55 +00:00
Mary Guillemard
c6d8f7ce0c
nir/dead_cf: Add missing load_global_nv handling
...
The handling was missed when this intrinsic was added.
Fixes some issues with FSI lowering and probably more.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e779538ad2 ("nir: add nvidia IO intrinsics")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543 >
2026-03-20 20:19:35 +00:00
Mary Guillemard
bb6fc8cc20
nir/dead_cf: Add missing load_global_bounded handling
...
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: caa0854da8 ("nir: plumb load_global_bounded")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543 >
2026-03-20 20:19:34 +00:00
Mary Guillemard
6013667d61
nir/dead_cf: Add missing load_ssbo_ir3 handling
...
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 0092edfec0 ("nir/dead_cf: Do not remove loops with loads that can't be reordered")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40543 >
2026-03-20 20:19:34 +00:00
Connor Abbott
ec37fed52b
tu, ir3, nir: Plumb through driver param for alpha-to-coverage
...
We will need this when alpha-to-coverage is dynamic and we need to
emulate it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335 >
2026-03-20 18:09:49 +00:00
Connor Abbott
22a061fb91
nir: Use better calculation for alpha-to-coverage mask
...
The old calculation depended on the sample count, and gave subpar
results for 8x MSAA with standard sample locations. The new calculation
is based on the Intel pass, with the constants changed so that
the sample count is always proportional to alpha for 2x MSAA and 4x MSAA,
plus a rotation of the sample mask based on the pixel.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335 >
2026-03-20 18:09:48 +00:00
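The two properties named in the 22a061fb91 message (covered sample count proportional to alpha, and a per-pixel rotation of the sample mask) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration: the function name `coverage_mask`, the rounding rule, and the rotation amount `(px + 2 * py) % num_samples` are hypothetical and are not the constants or the formula used by the actual NIR pass.

```python
def coverage_mask(alpha: float, num_samples: int, px: int, py: int) -> int:
    """Hypothetical alpha-to-coverage mask: not the real NIR calculation."""
    alpha = min(max(alpha, 0.0), 1.0)           # clamp alpha to [0, 1]
    covered = int(alpha * num_samples + 0.5)    # assumed rounding rule
    mask = (1 << covered) - 1                   # lowest `covered` bits set
    full = (1 << num_samples) - 1
    rot = (px + 2 * py) % num_samples           # assumed per-pixel rotation
    # Rotate left by `rot` within a num_samples-bit field.
    return ((mask << rot) | (mask >> (num_samples - rot))) & full
```

With these assumptions, the number of set bits depends only on alpha, while neighbouring pixels set different samples, so the coverage pattern does not repeat identically across the screen.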
Georg Lehmann
643dd510d4
nir/opt_algebraic: optimize b2f(a) * b
...
When the multiplication is only used by fadd, it's not a clear win
because of potential fma fusion.
Totals from 8015 (6.99% of 114655) affected shaders:
MaxWaves: 199394 -> 199466 (+0.04%); split: +0.04%, -0.01%
Instrs: 17461518 -> 17451076 (-0.06%); split: -0.10%, +0.04%
CodeSize: 94779552 -> 94769828 (-0.01%); split: -0.07%, +0.06%
VGPRs: 526012 -> 525532 (-0.09%); split: -0.10%, +0.01%
SpillSGPRs: 12466 -> 12517 (+0.41%); split: -0.09%, +0.50%
Latency: 191274766 -> 191297394 (+0.01%); split: -0.03%, +0.04%
InvThroughput: 31465968 -> 31456785 (-0.03%); split: -0.07%, +0.04%
VClause: 312081 -> 312073 (-0.00%); split: -0.10%, +0.09%
SClause: 366914 -> 366906 (-0.00%); split: -0.02%, +0.01%
Copies: 1222482 -> 1221933 (-0.04%); split: -0.20%, +0.15%
Branches: 376651 -> 376577 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 442974 -> 443240 (+0.06%); split: -0.01%, +0.07%
PreVGPRs: 415964 -> 415668 (-0.07%); split: -0.09%, +0.02%
VALU: 9403517 -> 9393916 (-0.10%); split: -0.12%, +0.02%
SALU: 2799420 -> 2800430 (+0.04%); split: -0.13%, +0.16%
VOPD: 472826 -> 472347 (-0.10%); split: +0.09%, -0.19%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
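As a rough illustration of the `b2f(a) * b` entry above: multiplying by a boolean converted to float behaves like a select between `b` and zero. The Python sketch below assumes the replacement has a select shape; the 643dd510d4 message does not spell out the exact pattern, and `b2f`, `bcsel`, `mul_form` and `select_form` are hypothetical stand-ins modelled on the NIR opcode names.

```python
import math

def b2f(a: bool) -> float:
    """Stand-in for NIR b2f: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0

def bcsel(cond: bool, x: float, y: float) -> float:
    """Stand-in for NIR bcsel: pick x when cond is true, else y."""
    return x if cond else y

def mul_form(a: bool, b: float) -> float:
    # Original expression: convert the boolean, then multiply.
    return b2f(a) * b

def select_form(a: bool, b: float) -> float:
    # Assumed replacement shape; not taken from the commit itself.
    return bcsel(a, b, 0.0)

# The two forms compare equal for finite b, but the sign of zero differs:
# mul_form(False, -3.0) is -0.0, while select_form(False, -3.0) is +0.0.
neg_zero = math.copysign(1.0, mul_form(False, -3.0))
pos_zero = math.copysign(1.0, select_form(False, -3.0))
```

The two forms also diverge for infinities, where `0.0 * inf` is NaN but the select yields zero. The signed zero difference is the kind of behavior change the d2b37b667e message refers to when it says the fmul+b2f patterns cannot be optimized back.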
Georg Lehmann
d2b37b667e
nir/opt_algebraic: optimize more fmulz(1.0, a) remains
...
If dxvk's open-coded fmulz gets partially constant folded,
it leaves this mess behind.
It's important to do this before the more general fmul+b2f patterns added
in the next commit, because they change the signed zero behavior in a way
that can't be optimized back.
Foz-DB Navi48:
Totals from 36 (0.03% of 114655) affected shaders:
Instrs: 16513 -> 15706 (-4.89%)
CodeSize: 99756 -> 95760 (-4.01%)
Latency: 45165 -> 44151 (-2.25%)
InvThroughput: 8344 -> 7886 (-5.49%)
VClause: 395 -> 401 (+1.52%)
Copies: 639 -> 634 (-0.78%)
PreSGPRs: 1158 -> 1154 (-0.35%)
PreVGPRs: 1227 -> 1225 (-0.16%)
VALU: 11310 -> 10769 (-4.78%)
SALU: 813 -> 809 (-0.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
3ad142d4d7
nir/search: never insert movs for alu uses
...
This means we respect the pattern order better because
simple replacements like bcsel(False, a, b) -> b no longer
insert movs that can block more specialized patterns.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
1626df7a90
nir: rework nir_alu_src_is_trivial_ssa to take an alu src
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
b96c42c916
nir/opt_algebraic: optimize more near useless bcsel
...
Foz-DB Navi48:
Totals from 327 (0.29% of 114655) affected shaders:
Instrs: 732971 -> 731642 (-0.18%); split: -0.19%, +0.01%
CodeSize: 3696020 -> 3689824 (-0.17%); split: -0.17%, +0.00%
Latency: 4405319 -> 4403413 (-0.04%); split: -0.06%, +0.01%
InvThroughput: 650209 -> 649659 (-0.08%); split: -0.10%, +0.01%
Copies: 53872 -> 53736 (-0.25%); split: -0.27%, +0.02%
Branches: 15598 -> 15571 (-0.17%)
VALU: 262391 -> 261969 (-0.16%)
SALU: 268112 -> 267699 (-0.15%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40399 >
2026-03-20 08:50:41 +00:00
Georg Lehmann
6cfe6eaa79
nir/opt_algebraic: create ldexp from exp2
...
ldexp uses the full width VALU path, exp2 the transcendental SIMD8.
Foz-DB Navi21:
Totals from 729 (0.64% of 114627) affected shaders:
MaxWaves: 20071 -> 20103 (+0.16%); split: +0.18%, -0.02%
Instrs: 869129 -> 867654 (-0.17%); split: -0.17%, +0.00%
CodeSize: 4709000 -> 4708460 (-0.01%); split: -0.02%, +0.00%
VGPRs: 31184 -> 31128 (-0.18%); split: -0.23%, +0.05%
Latency: 7610726 -> 7597238 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 1822323 -> 1819815 (-0.14%); split: -0.14%, +0.00%
VClause: 22494 -> 22493 (-0.00%); split: -0.03%, +0.02%
SClause: 20520 -> 20509 (-0.05%)
Copies: 72025 -> 72024 (-0.00%); split: -0.01%, +0.01%
Branches: 22028 -> 22029 (+0.00%)
PreVGPRs: 21601 -> 21602 (+0.00%)
VALU: 604821 -> 603339 (-0.25%); split: -0.25%, +0.00%
SALU: 114258 -> 114262 (+0.00%); split: -0.00%, +0.01%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900 >
2026-03-20 08:15:08 +00:00
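The arithmetic identity behind the 6cfe6eaa79 entry is that scaling by a power of two with an integer exponent can be written either as a multiply by `exp2(n)` or as `ldexp(x, n)`. A minimal Python sketch of that identity follows; the exact NIR pattern being matched is not given in the commit message, so the function names and the shape of the expression are assumptions.

```python
import math

def scale_via_exp2(x: float, n: int) -> float:
    # Transcendental-style form: compute 2**n, then multiply.
    return x * 2.0 ** n

def scale_via_ldexp(x: float, n: int) -> float:
    # ldexp form: adjust the exponent of x directly.
    return math.ldexp(x, n)
```

For exponents that keep the result in the normal floating point range, both forms give bit-identical results, since multiplying by a power of two only changes the exponent. Overflow and denormal cases are outside the scope of this sketch.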
Georg Lehmann
ec331cc48a
nir: replace lower_ldexp with has_ldexp
...
I can't be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900 >
2026-03-20 08:15:08 +00:00
Faith Ekstrand
3418525a82
pan/bi: Lower VS outputs in NIR
...
Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391 >
2026-03-19 11:25:32 +00:00
Lorenzo Rossi
43ffcf06f4
pan/bi,nir: Divide memory_access from segments
...
Valhall removed Bifrost's memory segments and added in its place memory
access. Those were bolted on reserved bits as "pseudo-segments" and the
emitter would catch these and emit the right memory access. This commit
cleans it up a bit by making memory_access available directly and
exposing it to NIR (this will be useful later).
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391 >
2026-03-19 11:25:30 +00:00
Lorenzo Rossi
c730e41ed5
pan/bi: Add is_psiz_store flag in bi_instr
...
This removes the previous hack that searched for the psiz write by looking
for 16-bit stores with the correct pseudo segment. We also add a new
intrinsic that mimics global stores but tags psiz writes. This will be
used later in the series.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391 >
2026-03-19 11:25:30 +00:00
Faith Ekstrand
de338dc908
pan,nir: Rework converted_mem_pan intrinsics
...
First, rename them to make them a bit more clear. They act on global
memory so they should be _global, and they map to ld/st_cvt so _cvt is
nice and obvious. Second, they don't need IO semantics as they're not
IO. But they do need ACCESS so that we can better control things like
CAN_REORDER. Third, add a src_type to store_global_cvt even though it
won't be used just yet because we'll want it for lowering VS stores.
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391 >
2026-03-19 11:25:29 +00:00
Faith Ekstrand
d2f430bea9
pan/bi: Add new FS input load intrinsics
...
Unlike load[_interpolated]_input, which has to deal with all sorts of
ABI nonsense between driver and compiler, these new intrinsics are
dumber than bricks. They're literally just the HW ops as NIR
intrinsics. These will allow us to do the lowering in NIR and put the
driver in total control over what goes down what path. Among other
things, a driver could choose to lower some things to ld_var and others
to ld_var_buf.
Co-authored-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391 >
2026-03-19 11:25:28 +00:00
Georg Lehmann
57c05f72f9
nir/opt_large_constants: only use 16bit float alu when supported
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:18 +00:00
Georg Lehmann
5f37788ae9
nir/opt_large_constants: handle floating point power of two fractions
...
Foz-DB Navi48:
Totals from 365 (0.32% of 114655) affected shaders:
MaxWaves: 10020 -> 10016 (-0.04%)
Instrs: 486252 -> 486097 (-0.03%); split: -0.21%, +0.18%
CodeSize: 2629536 -> 2628452 (-0.04%); split: -0.19%, +0.14%
VGPRs: 19884 -> 19896 (+0.06%); split: -0.06%, +0.12%
SpillSGPRs: 210 -> 212 (+0.95%)
Latency: 3818610 -> 3765549 (-1.39%); split: -1.50%, +0.11%
InvThroughput: 598445 -> 596281 (-0.36%); split: -0.58%, +0.22%
VClause: 10053 -> 9698 (-3.53%); split: -3.54%, +0.01%
SClause: 17548 -> 17334 (-1.22%); split: -1.24%, +0.02%
Copies: 43196 -> 42249 (-2.19%); split: -2.34%, +0.14%
Branches: 16695 -> 16628 (-0.40%); split: -0.47%, +0.07%
PreSGPRs: 17988 -> 17971 (-0.09%)
PreVGPRs: 13552 -> 13520 (-0.24%)
VALU: 244842 -> 246611 (+0.72%); split: -0.02%, +0.74%
SALU: 79163 -> 77778 (-1.75%); split: -2.05%, +0.30%
VMEM: 13468 -> 13084 (-2.85%)
SMEM: 23571 -> 23393 (-0.76%)
VOPD: 8384 -> 8372 (-0.14%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:18 +00:00
Georg Lehmann
372c1a23dc
nir/opt_large_constants: support negative small constants
...
Foz-DB Navi48:
Totals from 511 (0.45% of 114655) affected shaders:
MaxWaves: 14554 -> 14552 (-0.01%)
Instrs: 767577 -> 768334 (+0.10%); split: -0.17%, +0.27%
CodeSize: 4171036 -> 4181400 (+0.25%); split: -0.10%, +0.35%
VGPRs: 27676 -> 27724 (+0.17%)
SpillSGPRs: 144 -> 183 (+27.08%)
Latency: 4053919 -> 4027092 (-0.66%); split: -0.88%, +0.22%
InvThroughput: 817990 -> 819490 (+0.18%); split: -0.21%, +0.39%
VClause: 11573 -> 11172 (-3.46%); split: -3.47%, +0.01%
SClause: 14418 -> 14579 (+1.12%); split: -0.46%, +1.57%
Copies: 71638 -> 71365 (-0.38%); split: -1.54%, +1.16%
Branches: 20212 -> 20425 (+1.05%); split: -0.39%, +1.44%
PreSGPRs: 21765 -> 21743 (-0.10%); split: -0.23%, +0.12%
PreVGPRs: 19475 -> 19307 (-0.86%); split: -0.91%, +0.05%
VALU: 411365 -> 413642 (+0.55%); split: -0.02%, +0.57%
SALU: 126940 -> 125411 (-1.20%); split: -1.53%, +0.32%
VMEM: 20574 -> 20062 (-2.49%)
SMEM: 23724 -> 23677 (-0.20%); split: -0.25%, +0.05%
VOPD: 19838 -> 19847 (+0.05%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:18 +00:00
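For context on the 372c1a23dc entry: a small constant array can be packed into a single integer and indexed with a shift and mask instead of a memory load. The Python sketch below shows one way negative entries could be handled, by sign-extending each extracted field. This is an assumption for illustration; the commit message does not describe the mechanism, and `pack_small_constants` and `load_small_constant` are hypothetical names.

```python
def pack_small_constants(values, bits):
    """Pack signed integers into one integer, `bits` bits per element."""
    mask = (1 << bits) - 1
    packed = 0
    for i, v in enumerate(values):
        packed |= (v & mask) << (i * bits)   # two's complement field
    return packed

def load_small_constant(packed, index, bits):
    """Extract element `index` and sign-extend it from `bits` bits."""
    raw = (packed >> (index * bits)) & ((1 << bits) - 1)
    sign = 1 << (bits - 1)
    return (raw ^ sign) - sign               # branch-free sign extension

values = [-3, 0, 2, -1]
packed = pack_small_constants(values, bits=4)
unpacked = [load_small_constant(packed, i, 4) for i in range(len(values))]
```

Here four 4-bit entries fit in 16 bits, and each lookup costs a shift, a mask, and two ALU operations for the sign extension. That trade of memory loads for ALU work is consistent with the VMEM decrease and VALU increase in the statistics above.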
Georg Lehmann
a9f3efcae0
nir/opt_large_constants: optimize small vector constant arrays
...
Foz-DB Navi48:
Totals from 2956 (2.58% of 114655) affected shaders:
MaxWaves: 85080 -> 85110 (+0.04%)
Instrs: 5167735 -> 5170572 (+0.05%); split: -0.12%, +0.17%
CodeSize: 28882716 -> 28867340 (-0.05%); split: -0.14%, +0.08%
VGPRs: 164484 -> 164616 (+0.08%); split: -0.09%, +0.18%
SpillSGPRs: 612 -> 611 (-0.16%)
Latency: 35017837 -> 34391146 (-1.79%); split: -1.80%, +0.01%
InvThroughput: 6336245 -> 6323807 (-0.20%); split: -0.49%, +0.29%
VClause: 112504 -> 111117 (-1.23%); split: -1.32%, +0.09%
SClause: 121125 -> 117618 (-2.90%); split: -3.04%, +0.15%
Copies: 392203 -> 384977 (-1.84%); split: -1.88%, +0.04%
Branches: 155578 -> 155376 (-0.13%); split: -0.13%, +0.01%
PreSGPRs: 127654 -> 127205 (-0.35%); split: -0.39%, +0.04%
PreVGPRs: 112486 -> 112449 (-0.03%); split: -0.04%, +0.00%
VALU: 2577362 -> 2586379 (+0.35%); split: -0.00%, +0.35%
SALU: 889569 -> 888472 (-0.12%); split: -1.01%, +0.89%
VMEM: 167203 -> 165750 (-0.87%)
SMEM: 190438 -> 187313 (-1.64%)
VOPD: 194411 -> 194344 (-0.03%); split: +0.01%, -0.04%
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:18 +00:00
Georg Lehmann
f782524c36
nir/opt_large_constants: enable small constant optimization for non trivial strides
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:17 +00:00
Georg Lehmann
568b96f8b2
nir/opt_large_constants: set fp_math_ctrl for bit exact results
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:17 +00:00
Georg Lehmann
e810382a1e
nir/opt_large_constants: don't add constants implemented with ALU to the constant data
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33002 >
2026-03-19 06:59:16 +00:00