Commit graph

212980 commits

Author SHA1 Message Date
Vitaliy Triang3l Kuzmin
fe165f4e2a radeonsi: Disable TC-compatible HTILE when bug workarounds conflict
GFX1013 has bugs that need mutually exclusive workarounds.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33962>
2025-10-02 08:29:49 +00:00
Vitaliy Triang3l Kuzmin
4e3a5f60e1 radv,ac: Split has_tc_compat_zrange_bug into Z and ZS, document it
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33962>
2025-10-02 08:29:49 +00:00
Vitaliy Triang3l Kuzmin
5243f292ef radv,ac: GFX10 depth/stencil HTILE mipmap bug info variable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33962>
2025-10-02 08:29:48 +00:00
Icenowy Zheng
1c27ddefd0 gallivm: orcjit: put object cache under the protect of lookup_mutex
Different threads calling gallivm share the same LPJIT (and the
underlying LLJIT) instance, which can only be bound to a single cache
object at a time.

Pass the object cache when looking up the symbol and protect it with
lookup_mutex to prevent access to the wrong cache.
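
A minimal C analogue of this locking pattern, assuming hypothetical types and a hypothetical `jit_lookup()` helper (the real code is C++ on top of LLVM ORC and is not shown here). The point is only that binding the cache and performing the lookup happen under one lock, so another thread cannot rebind the cache in between:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-caller object cache. */
struct object_cache {
   int id;
};

/* Hypothetical stand-in for the shared JIT instance. */
struct shared_jit {
   pthread_mutex_t lookup_mutex;
   struct object_cache *bound_cache; /* only one cache can be bound at a time */
};

/* Binds the caller's cache and performs the lookup while holding the mutex.
 * Returns the cache that was bound during the lookup, so a caller can check
 * that it was its own. */
static struct object_cache *
jit_lookup(struct shared_jit *jit, struct object_cache *cache,
           const char *symbol)
{
   struct object_cache *used;

   pthread_mutex_lock(&jit->lookup_mutex);
   jit->bound_cache = cache;
   (void)symbol; /* a real implementation would resolve the symbol here */
   used = jit->bound_cache;
   jit->bound_cache = NULL;
   pthread_mutex_unlock(&jit->lookup_mutex);

   return used;
}
```

Passing the cache as an argument, rather than binding it in a separate earlier step, is what keeps the bind and the lookup inside the same critical section.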

This seems to fix some MissingSymbolDefinitions errors when running
Plasma Shell with llvmpipe on RISC-V.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37497>
2025-10-02 14:28:28 +08:00
Erik Faye-Lund
84db809e0a pvr: kill off pvr_private.h
All useful bits have been moved elsewhere, so let's remove this
mega-header and replace it with more targeted includes instead.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:09 +00:00
Erik Faye-Lund
34e9cb59e3 pvr: avoid including pvr_private.h from headers
This header contains a lot of includes, and gets everywhere. Let's make
sure we don't include it from headers, which makes this much easier to
manage.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:09 +00:00
Erik Faye-Lund
93d00bdbc1 pvr: break out macros to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:09 +00:00
Erik Faye-Lund
a68d22b6ad pvr: break out wsi to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:09 +00:00
Erik Faye-Lund
73a50e12cd pvr: break out descriptor sets to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:08 +00:00
Erik Faye-Lund
bedb90a67e pvr: break out pipelines to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:08 +00:00
Erik Faye-Lund
b51fac6212 pvr: break out queries to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:07 +00:00
Erik Faye-Lund
9d2478d353 pvr: break out cmd-buffer to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:07 +00:00
Erik Faye-Lund
8c043e651d pvr: break out render-pass to separate headers
To avoid some circular dependencies due to pvr_private.h, split out
pvr_framebuffer etc into their own header.

We often only need to peek into the framebuffer, so this seems like a
good idea anyway. We can reconsider this once dynamic rendering has
landed, and we know how much remains here.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:06 +00:00
Erik Faye-Lund
87193fc6ce pvr: break out buffer to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:06 +00:00
Erik Faye-Lund
e0d9effa7a pvr: break out image to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:05 +00:00
Erik Faye-Lund
0cf8839a3d pvr: break out instance/device to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:05 +00:00
Erik Faye-Lund
af431e7495 pvr: break out queue to separate header
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:04 +00:00
Erik Faye-Lund
dd296e0543 pvr: move pvr_pds_upload to pvr_common.h
This is used in a lot of places in the driver, and doesn't naturally
belong in any of the smaller modules that we're about to introduce.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37432>
2025-10-02 05:34:03 +00:00
Jianxun Zhang
e02a1bb173 iris: Enable Xe2 modifiers on all newer platforms
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35776>
2025-10-01 14:51:53 -07:00
Jianxun Zhang
42c3585ea1 isl: Reuse Xe2 modifers on newer platforms
We will reuse LNL and BMG modifiers on newer platforms until
new modifiers show up.

Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35776>
2025-10-01 14:51:53 -07:00
Georg Lehmann
9533e7cdae aco/optimizer: fix incorrect operand order assumption for neg(mul) opt
The code that labels instructions doesn't care about the order either.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14013
Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37643>
2025-10-01 20:52:12 +00:00
Sil Vilerino
700ccea319 mediafoundation: Implement video encode spatial adaptive quantization interface
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:31 -04:00
Sil Vilerino
3ba07819aa mediafoundation: Remove Agility v717 guards for features now available in v618
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:28 -04:00
Sil Vilerino
b06b2fbaba d3d12: Remove Agility v717 guards for features now available in v618
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:24 -04:00
Sil Vilerino
0e73c6470e d3d12: Implement video encode spatial adaptive quantization interface
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:21 -04:00
Sil Vilerino
ce7c4e14ef pipe: Add video encode spatial adaptive quantization interface
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:18 -04:00
Sil Vilerino
0556fa09f0 ci: Bump DirectX-Headers and Agility SDK dependencies to 1.618.1
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:15 -04:00
Silvio Vilerino
b5e856c6af d3d12: Video encode - Check driver caps to determine which output stats are supported
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:11 -04:00
Rohit Athavale
ddcc6baad9 mediafoundation: Lock QP Map Buffer when in use, unlock after
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:08 -04:00
Rohit Athavale
cd6a83a637 d3d12: Update d3d12 back to use pipe_enc_qpmap_input_info
Currently, the CPU map is not being used. That will come in a separate
PR. Attempting to map existing functionality as-is.

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:05 -04:00
Rohit Athavale
1636620eab pipe: Add pipe_enc_qpmap_input_info to contain GPU & CPU QP Maps
Add a new pipe struct to contain
- GPU QP Map Handle
- CPU QP Map (8-bit and 16-bit)

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37581>
2025-10-01 14:46:00 -04:00
Natalie Vock
52c7b0d20c radv/bvh: Encode empty AS bounds as NaN
If there are no leaves, the root node bounds still span -inf/inf.
Making empty BLASs infinite-sized guarantees ray traversal needs to
enter the BLAS (and immediately exit because it's empty). Remove the
BLAS from the BVH entirely by marking its bounds as NaN. As a bonus,
this works around RADV encountering issues in Silent Hill 2 on RDNA4 due
to infinite-sized BVHs.
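
A CPU-side sketch of why NaN bounds can never be hit, assuming a simple slab test written with ordered comparisons (the `aabb` struct and `ray_hits_aabb()` are hypothetical and do not reproduce the hardware traversal). Every ordered IEEE 754 comparison involving NaN is false, so the overlap check fails on the first axis, while an infinite-sized box is always entered:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical axis-aligned bounding box. */
struct aabb {
   float min[3];
   float max[3];
};

/* Slab test for a ray with origin o and reciprocal direction inv_d,
 * restricted to the parametric range [0, t_max]. */
static bool
ray_hits_aabb(const struct aabb *box, const float o[3],
              const float inv_d[3], float t_max)
{
   float t_near = 0.0f;
   float t_far = t_max;

   for (int i = 0; i < 3; i++) {
      float t0 = (box->min[i] - o[i]) * inv_d[i];
      float t1 = (box->max[i] - o[i]) * inv_d[i];
      float lo = t0 < t1 ? t0 : t1;
      float hi = t0 < t1 ? t1 : t0;

      /* With NaN bounds, lo and hi are NaN and both comparisons are false,
       * so the box is rejected here. */
      if (!(lo <= t_far && hi >= t_near))
         return false;

      if (lo > t_near)
         t_near = lo;
      if (hi < t_far)
         t_far = hi;
   }

   return true;
}
```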

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37492>
2025-10-01 14:27:15 +00:00
Natalie Vock
33099040a3 vulkan/bvh: Mark instances with NAN AABBs as inactive
They can never be hit, remove them from the BVH.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37492>
2025-10-01 14:27:15 +00:00
TellowKrinkle
e14adc5cb2 hk: Add non-cached memory type
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37655>
2025-10-01 13:26:51 +00:00
TellowKrinkle
05b927ac7e hk: Enable caching on memory marked with HOST_CACHED_BIT
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37655>
2025-10-01 13:26:51 +00:00
Charles Giessen
2b70575b9d docs: Use correct ICD path in install.rst
Path was missing the `.d`.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37611>
2025-10-01 11:42:23 +00:00
Eric Engestrom
ddedac739f docs: add sha sum for 25.2.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37654>
2025-10-01 12:47:49 +02:00
Eric Engestrom
ddc344ac67 docs: add release notes for 25.2.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37654>
2025-10-01 12:47:49 +02:00
Eric Engestrom
77f0ae594a docs: update calendar for 25.2.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37654>
2025-10-01 12:47:48 +02:00
Martin Roukala (né Peres)
9bb74929bc ci,crnm: remove unused imports
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37641>
2025-10-01 09:38:02 +00:00
Martin Roukala (né Peres)
c2a2d7215a ci,crnm: remove unsupported arguments by console.print
Fixes: 51c3f56aa (ci,crnm: migrate colorama to rich)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37641>
2025-10-01 09:38:02 +00:00
Fafa Kitten
42a78a1aae meson: detect memfd_create() and getrandom() from headers, not system libraries
When compiling Mesa on Android for an Android target, `memfd_create()` and `getrandom()` are under some conditions detected as available when they are not. They should not be assumed to be available unless Meson detects them in the appropriate header, passed in the `prefix` argument to Meson's `has_function()` method.

`memfd_create()` is not available unless the target Android API level is 30 (Android 11) or higher and the define `_GNU_SOURCE` (which in turn sets the define `__USE_GNU`) is set. Mesa does set `_GNU_SOURCE`, so this PR sets `args: pre_args` in the appropriate call to `has_function()`, which causes `-D_GNU_SOURCE` to be added to the arguments used by Meson to check the header only if it was detected as necessary for `pre_args` in the earlier condition. After this PR, `memfd_create()` is correctly detected as available only when targeting Android 11 or newer (and other operating systems that support `memfd_create()`).

`getrandom()` is not available unless the target Android API level is 28 (Android 9) or higher, so after this PR `getrandom()` is detected as available only when targeting Android 9 or newer (and other operating systems that support `getrandom()`).

Related information:

https://android.googlesource.com/platform/bionic/+/refs/heads/android15-release/libc/include/sys/mman.h#186

https://android.googlesource.com/platform/bionic/+/refs/heads/android15-release/libc/include/sys/random.h#55

https://android.googlesource.com/platform/bionic/+/refs/heads/android15-release/libc/include/sys/cdefs.h#182

927f65caf3/meson.build (L1074)

cab3b67cfe/docs/markdown/Compiler-properties.md (does-a-function-exist)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13566
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37630>
2025-09-30 21:21:53 -05:00
Aleksi Sapon
927f65caf3 vk: Fix MSVC warning C4189
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37638>
2025-09-30 22:40:28 +00:00
Marek Vasut
5eafa246ab etnaviv: Turn ETNA_CORE_ into ETNA_FEATURE_CORE_
The Vivante GC8000 Nano Ultra VIP r6205 present in the ST STM32MP25xx
is a single device that combines a GPU and an NPU. The either/or
ETNA_CORE_GPU or ETNA_CORE_NPU behavior does not apply to this device.
Instead of adding a new combined ETNA_CORE_GPU_AND_NPU variant, convert
ETNA_CORE_GPU and ETNA_CORE_NPU into ETNA_FEATURE_CORE_GPU and
ETNA_FEATURE_CORE_NPU, so they can be tested as flags. This allows
handling of such combined devices.
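
A small sketch of the flag-style test, assuming hypothetical bit values and a hypothetical `core_info` container (etnaviv stores its feature list differently; only the two ETNA_FEATURE_CORE_* names come from this commit):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
enum {
   ETNA_FEATURE_CORE_GPU = 1u << 0,
   ETNA_FEATURE_CORE_NPU = 1u << 1,
};

/* Hypothetical container for the feature bits of one core. */
struct core_info {
   uint32_t features;
};

/* Tests a single feature flag; several flags can be set on the same core. */
static bool
core_has_feature(const struct core_info *info, uint32_t feature)
{
   return (info->features & feature) != 0;
}
```

With an either/or core type, a combined device would need its own enum value and every comparison against the type would need updating. With flags, a combined device simply has both bits set and each existing check keeps working.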

Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Marek Vasut <marek.vasut@mailbox.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37488>
2025-09-30 20:45:17 +00:00
Mary Guillemard
2873f47aeb asahi: Add base expectation on VKCTS main
Seems we have new failures.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37632>
2025-09-30 20:31:27 +00:00
Caio Oliveira
d16d7ac470 intel/executor: Destroy syncobjs after using them
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37645>
2025-09-30 20:17:01 +00:00
Kenneth Graunke
937fa18bb9 iris/ci: Update trace checksums
The difference here was 1-2 pixels.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>
2025-09-30 19:44:03 +00:00
Kenneth Graunke
3af4e63061 brw: Skip compilation of larger SIMDs when pressure is too high
This allows us to skip the entire backend compilation process for
large SIMD widths when register pressure is high enough that we'd
likely decide to prefer a smaller one in the end anyway.  The hope
is to make the same decisions as before, but with less CPU overhead.
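
A sketch of the early-out idea, assuming a hypothetical per-width pressure estimate and a hypothetical register budget (the actual thresholds and heuristics in brw are not stated here). Widths whose lower-bound pressure already exceeds the budget are never handed to the backend:

```c
/* Returns a bitmask of SIMD widths worth compiling:
 * bit 0 = SIMD8, bit 1 = SIMD16, bit 2 = SIMD32.
 *
 * estimate[i] is an assumed lower bound on register pressure for width i.
 * Keeping the smallest width unconditionally is an assumption of this
 * sketch, so that there is always something to compile. */
static unsigned
select_simd_widths(const unsigned estimate[3], unsigned register_budget)
{
   unsigned mask = 1u << 0;

   for (unsigned i = 1; i < 3; i++) {
      if (estimate[i] <= register_budget)
         mask |= 1u << i;
   }

   return mask;
}
```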

We are making mostly the same decisions as before:

   | API / Platform | Total Shaders | Changed | % Identical |
   ----------------------------------------------------------
   | VK / Arc A770  |       905,525 |   1,157 |     99.872% |
   | VK / Arc B580  |       788,127 |      53 |     99.993% |
   | VK / Panther   |       786,333 |      13 |     99.998% |
   | GL / Arc A770  |       308,618 |     269 |     99.913% |
   | GL / Arc B580  |       264,066 |      13 |     99.995% |
   | GL / Panther   |       273,212 |       0 |    100.000% |

Improves compile times on my i7-12700K:

   | Game                      | Arc B580 | Arc A770 |
   ---------------------------------------------------
   | Assassins Creed: Odyssey  |  -13.47% |  -10.98% |
   | Borderlands 3 (DX12)      |  -10.05% |  -11.31% |
   | Dark Souls 3              |  -21.06% |  -21.08% |
   | Oblivion Remastered       |  -11.10% |   -9.82% |
   | Phasmophobia              |  -32.73% |  -31.00% |
   | Red Dead Redemption 2     |  -20.10% |  -14.38% |
   | Total War: Warhammer III  |  -10.11% |  -14.44% |
   | Wolfenstein Youngblood    |  -15.91% |  -13.47% |
   | Shadow of the Tomb Raider |  -30.23% |  -25.86% |

It seems to have nearly no effect on compile times on Xe3 unfortunately,
as only 1,014 shaders in fossil-db even fail SIMD32 compilation in the
first place, and we want to let most of the "might succeed" cases
through to the backend for throughput analysis.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>
2025-09-30 19:44:03 +00:00
Kenneth Graunke
248050b6d0 brw: Add a quick NIR-based register pressure estimate pass
This tries to calculate an underestimate (lower bound) for the register
pressure at various SIMD widths, by counting live values in the NIR
shader.  This fundamentally won't be accurate, but it can give us an
idea of whether it's even worth trying a certain SIMD-width compile.

Doing this at the NIR level means we:
- Can use SSA structure rather than fuzzy liveness intervals
- Can avoid the backend scheduler aggressively trying to hide latency,
  presenting an overinflated view of the register pressure
- Have divergence information on-hand, making it easier to "scale up"
- Can skip cloning and optimizing NIR for compute shader SIMD widths
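
A rough sketch of counting live values to get a pressure estimate, assuming straight-line code and a hypothetical cost model in which a uniform value costs one unit and a divergent value costs `simd_width / 8` units. The `ssa_value` struct and both functions are illustrative and do not reflect the actual pass:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an SSA value: one definition point and one
 * last-use point, both given as instruction indices. */
struct ssa_value {
   unsigned def_ip;
   unsigned last_use_ip;
   bool divergent;
};

/* Assumed cost model: divergent values scale with the SIMD width,
 * uniform values do not. */
static unsigned
value_cost(const struct ssa_value *v, unsigned simd_width)
{
   return v->divergent ? simd_width / 8 : 1;
}

/* Returns the maximum, over all instructions, of the summed cost of the
 * values that are live after that instruction. */
static unsigned
estimate_max_pressure(const struct ssa_value *vals, size_t count,
                      unsigned num_ips, unsigned simd_width)
{
   unsigned max_pressure = 0;

   for (unsigned ip = 0; ip < num_ips; ip++) {
      unsigned live = 0;

      for (size_t i = 0; i < count; i++) {
         /* A value is live from its definition until just before its
          * last use. */
         if (vals[i].def_ip <= ip && ip < vals[i].last_use_ip)
            live += value_cost(&vals[i], simd_width);
      }

      if (live > max_pressure)
         max_pressure = live;
   }

   return max_pressure;
}
```

Because each SSA value has exactly one definition, the divergence flag is enough to rescale the same walk for a wider SIMD width without recompiling anything.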

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>
2025-09-30 19:44:03 +00:00
Kenneth Graunke
5ebd766156 brw: Do most of NIR postprocessing before cloning for SIMD variants
We were doing a lot of NIR work repeatedly for each SIMD variant of
compute and mesh shaders.  Instead, do it once before cloning, and
just do one final optimization loop and out-of-SSA for each.

fossil-db results on Arc B580:

   Totals:
   Instrs: 233771096 -> 233794024 (+0.01%); split: -0.01%, +0.02%
   Subgroup size: 15922768 -> 15922736 (-0.00%); split: +0.00%, -0.00%
   Send messages: 12095619 -> 12098234 (+0.02%); split: -0.00%, +0.02%
   Loop count: 137562 -> 137523 (-0.03%)
   Cycle count: 32600323744 -> 32667411252 (+0.21%); split: -0.06%, +0.27%
   Spill count: 540908 -> 542027 (+0.21%); split: -0.07%, +0.28%
   Fill count: 700938 -> 698983 (-0.28%); split: -0.73%, +0.45%
   Scratch Memory Size: 37266432 -> 37304320 (+0.10%); split: -0.10%, +0.20%
   Max live registers: 72691728 -> 72692987 (+0.00%); split: -0.00%, +0.00%
   Non SSA regs after NIR: 67690309 -> 67688352 (-0.00%); split: -0.01%, +0.00%

   Totals from 3576 (0.45% of 789301) affected shaders:
   Instrs: 6932956 -> 6955884 (+0.33%); split: -0.41%, +0.74%
   Subgroup size: 88816 -> 88784 (-0.04%); split: +0.09%, -0.13%
   Send messages: 329168 -> 331783 (+0.79%); split: -0.02%, +0.81%
   Loop count: 8753 -> 8714 (-0.45%)
   Cycle count: 15153678820 -> 15220766328 (+0.44%); split: -0.14%, +0.58%
   Spill count: 213751 -> 214870 (+0.52%); split: -0.18%, +0.71%
   Fill count: 282616 -> 280661 (-0.69%); split: -1.82%, +1.13%
   Scratch Memory Size: 13056000 -> 13093888 (+0.29%); split: -0.27%, +0.56%
   Max live registers: 834757 -> 836016 (+0.15%); split: -0.11%, +0.26%
   Non SSA regs after NIR: 995033 -> 993076 (-0.20%); split: -0.48%, +0.28%

Looking at a few of the shaders with substantial instruction count
increases, it appears that it is largely due to more loops being
unrolled, which is probably actually a good thing.

The compile time impact of this patch appears to be negligible.
However, doing postprocessing before SIMD cloning allows us to
examine the postprocessed SSA-form NIR for improvements in an
upcoming patch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>
2025-09-30 19:44:02 +00:00