Commit graph

218691 commits

Author SHA1 Message Date
Lars-Ivar Hesselberg Simonsen
2d9be41706 panvk/v13: Support HSR Prepass
Add an option to enable HSR Prepass.

It is currently disabled by default as it might cause performance
regressions for content that:

- Has very simple fragment work.
- Already does a ZS prepass.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615>
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
3d6c7cf8b7 panvk/v13: Set HSR flags
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615>
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
b10555ea63 pan/compiler: Add pass to collect HSR info
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615>
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
6e88d9cbe3 pan/genxml/v13: Add HSR operation enums
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615>
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
71500a32fa pan/genxml/v13: Fix HSR Prepass typo
Fixes: ece01443e1 ("pan/genxml: Add v13 definition")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615>
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
75242b1862 panvk: Fix dcd_flags1 dirty bit
dcd_flags1 was not counted as dirty in case the color attachment map was
updated. This could lead to an outdated value for render_target_mask.

Fixes: a4670a67e0 ("panvk/csf: Set the correct DCD_FLAGS_1.render_rarget_mask")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615>
2026-02-16 12:25:13 +00:00
Pavel Ondračka
0763fb947a r300: align macro-tiled stride-addressed textures in X
Odd macro-tile counts in X trigger flaky rendering/readback in
parallel stress runs with macro-tiled NPOT textures (for example
piglit draw-pixel-with-texture -auto -fbo).

When a texture is macro-tiled and uses stride addressing, align the
width to two macro tiles. This keeps the stride at an even number of
macro tiles in X and avoids the corruption without disabling
macrotiling.

I was not able to find anything about this in the docs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39882>
2026-02-16 13:04:56 +01:00
Pavel Ondračka
7ae9262dc3 r300: split unaligned 3D texsubimage uploads by layer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
TexSubImage3D failures were caused by tiled multi-slice uploads for
unaligned XY box. Falls back to per-layer uploads when the 3D box
depth is > 1 and XY box is unaligned.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39897>
2026-02-16 11:26:41 +00:00
Hyunjun Ko
eedbe136ea anv/video: remove unsupported feautres for encoders
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884>
2026-02-16 10:58:40 +00:00
Hyunjun Ko
1185bbe18d anv/video: set Sad Qp Lambda values properly for H265 encoder.
This is taken from media-driver(Intel VA-API)

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884>
2026-02-16 10:58:40 +00:00
Hyunjun Ko
1cb4fe5ef5 anv/video: Handle GPB(Generalized P and B frames) properly for H265 enc.
The previous code was copying RefPicList0 to RefPicList1 but not updating
num_ref_idx_l1_active_minus1, leaving it potentially uninitialized or zero.
This caused the hardware to see an inconsistent L1 list state.

Accordingly it sets num_ref_idx_active_override_flag if necessary.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884>
2026-02-16 10:58:40 +00:00
Hyunjun Ko
4d4a5e4a42 anv/video: set Qp passed from apps for h265 encoder
Instead of 26 by default.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884>
2026-02-16 10:58:40 +00:00
Hyunjun Ko
6efbb80c98 anv/video: set transform skip numbers according to qp
Instead of hardcode.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884>
2026-02-16 10:58:40 +00:00
Juan A. Suarez Romero
f641eb4fad broadcom/ci: update expected results
Add new flakes found in the nightly runs.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39912>
2026-02-16 11:21:23 +01:00
Juan A. Suarez Romero
154b5ccc9e broadcom/ci: update available devices
These are the devices used for pre-merges.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39912>
2026-02-16 11:21:15 +01:00
Job Noorman
65362a9c38 ir3: don't use predication for large blocks
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Even when the branch condition is divergent, it may still be uniform
within a wave. When this happens for very large blocks, predication
causes a large overhead because the instructions in the false block are
not simply jumped over. Therefore, we fall back to normal branches for
large blocks.

Measuring renderpass time on some OpenGL traces:
average fps +0.5% (+/- 0.1%)
max fps +0.5% (+/- 0.1%)

Totals from 18522 (10.51% of 176279) affected shaders:
MaxWaves: 203126 -> 203156 (+0.01%)
Instrs: 23999194 -> 23521729 (-1.99%); split: -1.99%, +0.00%
CodeSize: 45462360 -> 45250224 (-0.47%); split: -0.47%, +0.00%
NOPs: 5078652 -> 4647917 (-8.48%); split: -8.48%, +0.00%
MOVs: 813450 -> 812615 (-0.10%); split: -0.26%, +0.15%
COVs: 296638 -> 296620 (-0.01%); split: -0.01%, +0.00%
Full: 334991 -> 334923 (-0.02%)
(ss): 636625 -> 636682 (+0.01%); split: -0.12%, +0.13%
(sy): 283395 -> 283429 (+0.01%); split: -0.10%, +0.11%
(ss)-stall: 2652246 -> 2651590 (-0.02%); split: -0.18%, +0.16%
(sy)-stall: 7862615 -> 7881590 (+0.24%); split: -0.13%, +0.37%
STPs: 15994 -> 15992 (-0.01%)
LDPs: 23360 -> 23356 (-0.02%)
Subgroup size: 896 -> 1792 (+100.00%)
Cat0: 5572855 -> 5097380 (-8.53%); split: -8.53%, +0.00%
Cat1: 1146050 -> 1145189 (-0.08%); split: -0.18%, +0.11%
Cat2: 8975537 -> 8974390 (-0.01%); split: -0.01%, +0.00%
Cat6: 196837 -> 196831 (-0.00%)
Cat7: 795866 -> 795890 (+0.00%); split: -0.06%, +0.06%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39734>
2026-02-16 09:00:14 +00:00
Job Noorman
e7c3834a27 ir3: add block_can_be_predicated helper
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39734>
2026-02-16 09:00:13 +00:00
Samuel Pitoiset
47841c1142 radv/meta: remove useless DCC decompressions for image<->buffer
It's not needed to decompress DCC when formats are compatible each
other, this basically removes all decompressions on GFX11-GFX11.5.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39888>
2026-02-16 07:40:13 +00:00
Yiwei Zhang
b0397b967d venus: workaround a gcc-15 dead store elimination (DSE) bug
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
No issue with clang or gcc-14.x (or earlier versions). The issue only
shows up since gcc-15.1. The compiler somehow fails to consider those
cs helpers dereferencing the pointer from the pNext chain for reads,
and thus has falsely optimized away the pNext store. This change works
around this with a no-op memory clobber.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13242
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39906>
2026-02-16 01:39:10 +00:00
Vinson Lee
7239b5288f freedreno/decode: replace lua_pushunsigned with lua_pushinteger
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
lua_pushunsigned was introduced in Lua 5.2, deprecated in 5.3,
and removed in 5.4. Replace it with lua_pushinteger which has
been available since Lua 5.0 and handles the uint32_t value
safely via implicit widening to the 64-bit lua_Integer type.

This fixes the build with Lua 5.5:

../src/freedreno/decode/script.c: In function 'pushdecval':
../src/freedreno/decode/script.c:182:7: error: implicit declaration of function 'lua_pushunsigned'; did you mean 'lua_pushinteger'? [-Wimplicit-function-declaration]
  182 |       lua_pushunsigned(L, val.u);
      |       ^~~~~~~~~~~~~~~~
      |       lua_pushinteger

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39901>
2026-02-14 22:45:45 -08:00
Yiwei Zhang
a8baedef29 venus: expose VK_EXT_descriptor_heap behind a debug option
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
There's TODO for optimizing descriptor allocations, so currently we
expose the extension behind VN_DEBUG=desc_heap

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:30 +00:00
Yiwei Zhang
6265dad4f2 venus: fill descriptor heap feats and props
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:30 +00:00
Yiwei Zhang
dea6221a65 venus: take care of combined image sampler descriptor for ycbcr
We'd have to query it by reconstructing the sampler ycbcr conversion
image format props query. It's straightforward except having to consider
the modifier tiling case.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:30 +00:00
Yiwei Zhang
4f475789d5 venus: ensure descriptor writes invariance
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:30 +00:00
Yiwei Zhang
1d779f5af1 venus: cache descriptor size query
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:29 +00:00
Yiwei Zhang
dfc5d76205 venus: rename format_update_mutex for general purpose
Not worth separate locks for physical device level queries.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:29 +00:00
Yiwei Zhang
be52338399 venus: add vn_descriptor.h to be shared between different desc systems
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:28 +00:00
Yiwei Zhang
990b5fca37 venus: skip image cache for VkOpaqueCaptureDataCreateInfoEXT
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:28 +00:00
Yiwei Zhang
526788a097 venus: pipeline layout is now optional
If descriptor heap is used, there's no pipeline layout created. So we
have to patch compute and RT pipelines to allow it. Graphics pipeline
doesn't need the change because of GPL.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:28 +00:00
Yiwei Zhang
95331f3bd0 venus: cmd inheritance info fix to consider descriptor heap
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:28 +00:00
Yiwei Zhang
485b2b501c venus: implement all descriptor heap commands
There're potential optimizations available for below:
- vkWriteSamplerDescriptorsEXT
- vkWriteResourceDescriptorsEXT
- vkGetPhysicalDeviceDescriptorSizeEXT
- vkRegisterCustomBorderColorEXT

...and we can revisit if there's perf hit from above for real apps.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:28 +00:00
Yiwei Zhang
04c0142aaa venus: sync latest protocol for VK_EXT_descriptor_heap support
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762>
2026-02-15 04:32:28 +00:00
Adam Jackson
5ab41818c4 zink: use VK_EXT_pci_bus_info for PCI address
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39894>
2026-02-15 03:02:08 +00:00
aerith
0c17a59ab5 zink: fix codegen for extensions with non-standard struct names
name_in_camel_case() converts "pci" to "Pci" but Vulkan uses "PCI"
in VkPhysicalDevicePCIBusInfoPropertiesEXT. use the actual struct
name from vk.xml when the constructed name doesn't match.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39894>
2026-02-15 03:02:08 +00:00
Konstantin Seurer
9a82e4ba81 vulkan/cmd_queue: Do not zero initialize vk_cmd_queue_entry
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Most of the struct will be initialized already. Make sure to initialize
everything so linear_alloc_child can be used.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881>
2026-02-14 20:11:40 +00:00
Konstantin Seurer
be5ab80de1 vulkan/cmd_queue: Fixup stride for multi draws
Copying the draw infos packs them so the stride needs to be set to the
struct size.

cc: mesa-stable

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881>
2026-02-14 20:11:40 +00:00
Konstantin Seurer
bf61736aa5 vulkan/cmd_queue: Remove get_array_member_copy
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881>
2026-02-14 20:11:40 +00:00
Konstantin Seurer
d2ea8b3d14 vulkan: Remove vk_cmd_queue_entry::driver_data
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881>
2026-02-14 20:11:39 +00:00
Konstantin Seurer
017b3b73bb lavapipe: Extend vk_cmd_queue_entry_base for internal commands
lavapipe is the only driver that is using driver_data.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881>
2026-02-14 20:11:39 +00:00
Collabora's Gfx CI Team
1dc39405ce Uprev Piglit to 0d79fb4a59c7d213ff144afa4c73e3b32ebe6500
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
62d499d63d...0d79fb4a59

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39575>
2026-02-14 14:43:20 +01:00
Thomas H.P. Andersen
331af5e746 nvk: add app workaround layer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This adopts the device internal app workaround layer from radv

The layer allows to fix up game input in the layer instead of
adding workarounds within the driver.

Initially this only includes the workaround for Metro exodus as
I have verified that it fixes a crash on NVK. Follow up commits
can add the other relevant workarounds when the fixes are verified
to be needed for NVK.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870>
2026-02-14 08:33:11 +00:00
Thomas H.P. Andersen
0a6509e94c nvk: prepare for driver internal layers
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870>
2026-02-14 08:33:11 +00:00
Timothy Arceri
a6fcc2835e st/glsl_to_nir: make sure the variant has the correct locations set
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
For drivers that set allow_st_finalize_nir_twice locations are set
when the variable is created. But for variants here we update the
locations in case parameter opt pass or something else changed the
location.

Fixes: 891d46f517 ("st/glsl_to_nir: dont add duplicate state tokens")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14837

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39867>
2026-02-14 06:37:10 +00:00
Timothy Arceri
c3aae0714c mesa: add _mesa_lookup_state_param_idx() helper
This will be used in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39867>
2026-02-14 06:37:10 +00:00
Ian Romanick
df704bd38e elk: Call nir_opt_algebraic_late in elk_postprocess_nir
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Make sure that lowering undone in elk_nir_optimize are reapplied.

No shader-db or fossil-db changes on any Intel platform. This is most
likely to impact either Gfx8 on ANV or Gfx7.5 on HASVK. I don't
fossil-db test either of those platforms.

I tried doing a similar thing here as is done in BRW (previous commit),
but that caused a couple Haswell shaders to fall off a performance
cliff:

total spills in shared programs: 8247 -> 8311 (0.78%)
spills in affected programs: 6 -> 70 (1066.67%)
helped: 0 / HURT: 2

total fills in shared programs: 8558 -> 8910 (4.11%)
fills in affected programs: 6 -> 358 (5866.67%)
helped: 0 / HURT: 2

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>
2026-02-14 02:06:59 +00:00
Ian Romanick
11b96a84b0 brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts
Move the call to nir_opt_algebraic_late after the last time
brw_nir_optimize might be called. nir_opt_algebraic_distribute_src_mods
works together with the late algebraic optimizations, so move it also.

shader-db:

Lunar Lake
total instructions in shared programs: 17081222 -> 17080842 (<.01%)
instructions in affected programs: 419931 -> 419551 (-0.09%)
helped: 545 / HURT: 826

total cycles in shared programs: 878437752 -> 879236226 (0.09%)
cycles in affected programs: 506003142 -> 506801616 (0.16%)
helped: 3091 / HURT: 3189

LOST:   18
GAINED: 16

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19994270 -> 19993231 (<.01%)
instructions in affected programs: 490499 -> 489460 (-0.21%)
helped: 660 / HURT: 800

total cycles in shared programs: 882498776 -> 882834186 (0.04%)
cycles in affected programs: 477858602 -> 478194012 (0.07%)
helped: 3458 / HURT: 3564

total fills in shared programs: 4371 -> 4370 (-0.02%)
fills in affected programs: 7 -> 6 (-14.29%)
helped: 1 / HURT: 0

LOST:   28
GAINED: 10

Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
total instructions in shared programs: 19943849 -> 19942782 (<.01%)
instructions in affected programs: 467384 -> 466317 (-0.23%)
helped: 655 / HURT: 796

total cycles in shared programs: 860085674 -> 861410289 (0.15%)
cycles in affected programs: 426900998 -> 428225613 (0.31%)
helped: 3250 / HURT: 3441

LOST:   19
GAINED: 14

fossil-db:

Lunar Lake
Totals:
Instrs: 926472091 -> 926204838 (-0.03%); split: -0.04%, +0.01%
CodeSize: 14845921056 -> 14842776112 (-0.02%); split: -0.10%, +0.08%
Send messages: 41459570 -> 41459574 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104481085069 -> 104583692712 (+0.10%); split: -0.14%, +0.24%
Spill count: 3454651 -> 3457340 (+0.08%); split: -0.15%, +0.23%
Fill count: 4958779 -> 4958487 (-0.01%); split: -0.46%, +0.45%
Max live registers: 193805970 -> 193839002 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 49114416 -> 49113776 (-0.00%); split: +0.01%, -0.01%
Non SSA regs after NIR: 142953905 -> 142800740 (-0.11%); split: -0.12%, +0.01%

Totals from 420256 (20.80% of 2020128) affected shaders:
Instrs: 448571327 -> 448304074 (-0.06%); split: -0.09%, +0.03%
CodeSize: 7312002800 -> 7308857856 (-0.04%); split: -0.21%, +0.17%
Send messages: 17716494 -> 17716498 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52178854998 -> 52281462641 (+0.20%); split: -0.28%, +0.48%
Spill count: 2945654 -> 2948343 (+0.09%); split: -0.17%, +0.26%
Fill count: 4404768 -> 4404476 (-0.01%); split: -0.51%, +0.51%
Max live registers: 60875448 -> 60908480 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 9455280 -> 9454640 (-0.01%); split: +0.04%, -0.04%
Non SSA regs after NIR: 60542740 -> 60389575 (-0.25%); split: -0.28%, +0.02%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 1000081384 -> 999726726 (-0.04%); split: -0.05%, +0.01%
CodeSize: 16764458080 -> 16761624256 (-0.02%); split: -0.09%, +0.07%
Subgroup size: 27599528 -> 27599544 (+0.00%)
Send messages: 45538933 -> 45538951 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93303830912 -> 93370118192 (+0.07%); split: -0.19%, +0.26%
Spill count: 3739306 -> 3739719 (+0.01%); split: -0.22%, +0.23%
Fill count: 5089719 -> 5083626 (-0.12%); split: -0.56%, +0.44%
Max live registers: 122041364 -> 122055848 (+0.01%); split: -0.00%, +0.01%
Max dispatch width: 38117296 -> 38127200 (+0.03%); split: +0.06%, -0.03%
Non SSA regs after NIR: 164296197 -> 164299306 (+0.00%); split: -0.01%, +0.01%

Totals from 338754 (14.82% of 2285730) affected shaders:
Instrs: 452723479 -> 452368821 (-0.08%); split: -0.10%, +0.03%
CodeSize: 7861878032 -> 7859044208 (-0.04%); split: -0.19%, +0.16%
Subgroup size: 16 -> 32 (+100.00%)
Send messages: 17050010 -> 17050028 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52881801997 -> 52948089277 (+0.13%); split: -0.33%, +0.46%
Spill count: 3271458 -> 3271871 (+0.01%); split: -0.25%, +0.26%
Fill count: 4628422 -> 4622329 (-0.13%); split: -0.61%, +0.48%
Max live registers: 30738902 -> 30753386 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 4787264 -> 4797168 (+0.21%); split: +0.47%, -0.26%
Non SSA regs after NIR: 61748026 -> 61751135 (+0.01%); split: -0.03%, +0.03%

Tiger Lake
Totals:
Instrs: 1011068379 -> 1010977290 (-0.01%); split: -0.03%, +0.02%
CodeSize: 14197751744 -> 14197683040 (-0.00%); split: -0.07%, +0.07%
Send messages: 46431228 -> 46431220 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85066526419 -> 85085088071 (+0.02%); split: -0.16%, +0.18%
Spill count: 3853750 -> 3855185 (+0.04%); split: -0.15%, +0.19%
Fill count: 6716746 -> 6719594 (+0.04%); split: -0.25%, +0.29%
Max live registers: 122307387 -> 122326083 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 38009632 -> 38003280 (-0.02%); split: +0.03%, -0.05%
Non SSA regs after NIR: 158403572 -> 158415390 (+0.01%); split: -0.01%, +0.02%

Totals from 277728 (12.17% of 2281577) affected shaders:
Instrs: 349206856 -> 349115767 (-0.03%); split: -0.07%, +0.05%
CodeSize: 5042621104 -> 5042552400 (-0.00%); split: -0.20%, +0.20%
Send messages: 13132243 -> 13132235 (-0.00%); split: -0.00%, +0.00%
Cycle count: 36183327716 -> 36201889368 (+0.05%); split: -0.38%, +0.43%
Spill count: 2210072 -> 2211507 (+0.06%); split: -0.26%, +0.33%
Fill count: 4188439 -> 4191287 (+0.07%); split: -0.39%, +0.46%
Max live registers: 24956695 -> 24975391 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3948832 -> 3942480 (-0.16%); split: +0.32%, -0.48%
Non SSA regs after NIR: 45616425 -> 45628243 (+0.03%); split: -0.04%, +0.06%

Ice Lake
Totals:
Instrs: 1009584306 -> 1009411757 (-0.02%); split: -0.02%, +0.01%
CodeSize: 12593466880 -> 12592958096 (-0.00%); split: -0.01%, +0.01%
Send messages: 47274203 -> 47274171 (-0.00%); split: -0.00%, +0.00%
Cycle count: 84920281455 -> 84914027301 (-0.01%); split: -0.05%, +0.04%
Spill count: 2988523 -> 2986191 (-0.08%); split: -0.14%, +0.07%
Fill count: 5296078 -> 5288737 (-0.14%); split: -0.21%, +0.07%
Max live registers: 125429384 -> 125444786 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 41269072 -> 41267312 (-0.00%); split: +0.03%, -0.03%
Non SSA regs after NIR: 163223895 -> 163236623 (+0.01%); split: -0.01%, +0.02%

Totals from 243818 (10.45% of 2334244) affected shaders:
Instrs: 296953759 -> 296781210 (-0.06%); split: -0.08%, +0.02%
CodeSize: 3643224480 -> 3642715696 (-0.01%); split: -0.04%, +0.03%
Send messages: 11518671 -> 11518639 (-0.00%); split: -0.00%, +0.00%
Cycle count: 33065548412 -> 33059294258 (-0.02%); split: -0.13%, +0.11%
Spill count: 1346515 -> 1344183 (-0.17%); split: -0.32%, +0.15%
Fill count: 2537906 -> 2530565 (-0.29%); split: -0.43%, +0.14%
Max live registers: 21476776 -> 21492178 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3727288 -> 3725528 (-0.05%); split: +0.31%, -0.35%
Non SSA regs after NIR: 41050474 -> 41063202 (+0.03%); split: -0.04%, +0.07%

Skylake
Totals:
Instrs: 513573157 -> 513462971 (-0.02%); split: -0.02%, +0.00%
CodeSize: 5950280672 -> 5950001392 (-0.00%); split: -0.01%, +0.00%
Send messages: 24909757 -> 24909758 (+0.00%); split: -0.00%, +0.00%
Cycle count: 57636102242 -> 57634726342 (-0.00%); split: -0.03%, +0.03%
Spill count: 627286 -> 627241 (-0.01%); split: -0.01%, +0.00%
Fill count: 837888 -> 837804 (-0.01%); split: -0.01%, +0.00%
Max live registers: 87272271 -> 87284192 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 32278832 -> 32271800 (-0.02%); split: +0.02%, -0.04%
Non SSA regs after NIR: 87387713 -> 87387614 (-0.00%); split: -0.00%, +0.00%

Totals from 177432 (10.30% of 1722906) affected shaders:
Instrs: 127170648 -> 127060462 (-0.09%); split: -0.10%, +0.01%
CodeSize: 1443406368 -> 1443127088 (-0.02%); split: -0.03%, +0.01%
Send messages: 5444220 -> 5444221 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15423028495 -> 15421652595 (-0.01%); split: -0.10%, +0.10%
Spill count: 235844 -> 235799 (-0.02%); split: -0.03%, +0.01%
Fill count: 333783 -> 333699 (-0.03%); split: -0.03%, +0.01%
Max live registers: 13765573 -> 13777494 (+0.09%); split: -0.01%, +0.10%
Max dispatch width: 3086880 -> 3079848 (-0.23%); split: +0.24%, -0.47%
Non SSA regs after NIR: 17623772 -> 17623673 (-0.00%); split: -0.00%, +0.00%

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>
2026-02-14 02:06:59 +00:00
Ian Romanick
5af0b8bd09 brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline
Make sure that lowering undone in brw_nir_optimize are reapplied.

No shader-db changes on any Intel platform.

Why are there fossil-db changes on platforms that don't support ray tracing?

Lunar Lake
Totals:
Instrs: 926636441 -> 926636313 (-0.00%); split: -0.00%, +0.00%
Send messages: 41510729 -> 41510723 (-0.00%); split: -0.00%, +0.00%
Cycle count: 104509492613 -> 104509490569 (-0.00%); split: -0.00%, +0.00%
Max live registers: 193792922 -> 193792890 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 150091934 -> 150092170 (+0.00%); split: -0.00%, +0.00%

Totals from 10 (0.00% of 2020428) affected shaders:
Instrs: 8142 -> 8014 (-1.57%); split: -3.14%, +1.57%
Send messages: 192 -> 186 (-3.12%); split: -7.29%, +4.17%
Cycle count: 131892 -> 129848 (-1.55%); split: -6.93%, +5.38%
Max live registers: 1442 -> 1410 (-2.22%); split: -3.05%, +0.83%
Non SSA regs after NIR: 950 -> 1186 (+24.84%); split: -26.95%, +51.79%

Meteor Lake
Totals:
Instrs: 1000805547 -> 1000805543 (-0.00%); split: -0.00%, +0.00%
Cycle count: 93131592265 -> 93131619619 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122081268 -> 122081244 (-0.00%); split: -0.00%, +0.00%

Totals from 16 (0.00% of 2286241) affected shaders:
Instrs: 18652 -> 18648 (-0.02%); split: -1.39%, +1.37%
Cycle count: 369520 -> 396874 (+7.40%); split: -2.94%, +10.34%
Max live registers: 1350 -> 1326 (-1.78%); split: -4.15%, +2.37%

DG2
Totals:
Instrs: 999834626 -> 999834651 (+0.00%); split: -0.00%, +0.00%
Send messages: 45719398 -> 45719403 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93118238139 -> 93118269557 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122098944 -> 122098936 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 169413734 -> 169413661 (-0.00%); split: -0.00%, +0.00%

Totals from 13 (0.00% of 2286795) affected shaders:
Instrs: 18799 -> 18824 (+0.13%); split: -1.04%, +1.18%
Send messages: 492 -> 497 (+1.02%); split: -2.44%, +3.46%
Cycle count: 352838 -> 384256 (+8.90%); split: -1.08%, +9.98%
Max live registers: 1237 -> 1229 (-0.65%); split: -2.91%, +2.26%
Non SSA regs after NIR: 2191 -> 2118 (-3.33%); split: -20.86%, +17.53%

Tiger Lake
Totals:
Instrs: 1011816778 -> 1011816714 (-0.00%); split: -0.00%, +0.00%
Send messages: 46515289 -> 46515285 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85148902406 -> 85148894668 (-0.00%); split: -0.00%, +0.00%
Max live registers: 122362180 -> 122362172 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 38036160 -> 38036176 (+0.00%)
Non SSA regs after NIR: 160317521 -> 160317649 (+0.00%); split: -0.00%, +0.00%

Totals from 6 (0.00% of 2282318) affected shaders:
Instrs: 9204 -> 9140 (-0.70%); split: -1.43%, +0.74%
Send messages: 258 -> 254 (-1.55%); split: -3.10%, +1.55%
Cycle count: 287652 -> 279914 (-2.69%); split: -3.29%, +0.60%
Max live registers: 552 -> 544 (-1.45%); split: -2.90%, +1.45%
Max dispatch width: 48 -> 64 (+33.33%)
Non SSA regs after NIR: 914 -> 1042 (+14.00%); split: -14.00%, +28.01%

Ice Lake
Totals:
Instrs: 1012203285 -> 1012203249 (-0.00%); split: -0.00%, +0.00%
Send messages: 47358859 -> 47358858 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85112165276 -> 85112171905 (+0.00%); split: -0.00%, +0.00%
Max live registers: 125545002 -> 125544992 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 41335696 -> 41335656 (-0.00%)
Non SSA regs after NIR: 166448597 -> 166448602 (+0.00%); split: -0.00%, +0.00%

Totals from 13 (0.00% of 2335519) affected shaders:
Instrs: 16486 -> 16450 (-0.22%); split: -1.67%, +1.46%
Send messages: 368 -> 367 (-0.27%); split: -4.89%, +4.62%
Cycle count: 347643 -> 354272 (+1.91%); split: -1.34%, +3.25%
Max live registers: 1104 -> 1094 (-0.91%); split: -3.80%, +2.90%
Max dispatch width: 192 -> 152 (-20.83%)
Non SSA regs after NIR: 2100 -> 2105 (+0.24%); split: -21.76%, +22.00%

Skylake
Totals:
Instrs: 504548665 -> 504548057 (-0.00%); split: -0.00%, +0.00%
Send messages: 24479148 -> 24479118 (-0.00%); split: -0.00%, +0.00%
Cycle count: 57575198140 -> 57575179256 (-0.00%); split: -0.00%, +0.00%
Max live registers: 85570671 -> 85570575 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 85097646 -> 85098486 (+0.00%); split: -0.00%, +0.00%

Totals from 22 (0.00% of 1703671) affected shaders:
Instrs: 19866 -> 19258 (-3.06%); split: -3.72%, +0.66%
Send messages: 464 -> 434 (-6.47%); split: -8.19%, +1.72%
Cycle count: 250854 -> 231970 (-7.53%); split: -9.23%, +1.70%
Max live registers: 2024 -> 1928 (-4.74%); split: -5.53%, +0.79%
Non SSA regs after NIR: 2498 -> 3338 (+33.63%); split: -8.33%, +41.95%

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>
2026-02-14 02:06:59 +00:00
Ian Romanick
fd29183901 elk: Use F16TO32 for nir_op_f2f32 of float16 source
This matches the behavior of nir_op_unpack_half_2x16_split_x. Gfx7
uses a special opcode for this conversion. Fixes numerous assertion
failures in shader-db on Ivy Bridge and Haswell.

I am not sure why this was never encountered previously.

Fixes: 609c46cf23 ("nir/lower_alu_width: emit f2f32 for unpack_half_2x16")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567>
2026-02-14 02:06:59 +00:00
Ian Romanick
9017d37e84 nir: Use STACK_ARRAY instead of NIR_VLA
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.

Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866>
2026-02-14 01:19:27 +00:00
Ian Romanick
3da828d2dd spirv: Use STACK_ARRAY instead of NIR_VLA
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.

Fixes: 2a023f30a6 ("nir/spirv: Add basic support for types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866>
2026-02-14 01:19:27 +00:00