Lars-Ivar Hesselberg Simonsen
2d9be41706
panvk/v13: Support HSR Prepass
...
Add an option to enable HSR Prepass.
It is currently disabled by default as it might cause performance
regressions for content that:
- Has very simple fragment work.
- Already does a ZS prepass.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615 >
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
3d6c7cf8b7
panvk/v13: Set HSR flags
...
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615 >
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
b10555ea63
pan/compiler: Add pass to collect HSR info
...
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615 >
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
6e88d9cbe3
pan/genxml/v13: Add HSR operation enums
...
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615 >
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
71500a32fa
pan/genxml/v13: Fix HSR Prepass typo
...
Fixes: ece01443e1 ("pan/genxml: Add v13 definition")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615 >
2026-02-16 12:25:14 +00:00
Lars-Ivar Hesselberg Simonsen
75242b1862
panvk: Fix dcd_flags1 dirty bit
...
dcd_flags1 was not counted as dirty in case the color attachment map was
updated. This could lead to an outdated value for render_target_mask.
Fixes: a4670a67e0 ("panvk/csf: Set the correct DCD_FLAGS_1.render_rarget_mask")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39615 >
2026-02-16 12:25:13 +00:00
Pavel Ondračka
0763fb947a
r300: align macro-tiled stride-addressed textures in X
...
Odd macro-tile counts in X trigger flaky rendering/readback in
parallel stress runs with macro-tiled NPOT textures (for example
piglit draw-pixel-with-texture -auto -fbo).
When a texture is macro-tiled and uses stride addressing, align the
width to two macro tiles. This keeps the stride at an even number of
macro tiles in X and avoids the corruption without disabling
macrotiling.
I was not able to find anything about this in the docs.
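The alignment rule described above can be sketched roughly as follows. This is a minimal illustration, not the r300 code: the helper names are hypothetical, and the macro-tile width is passed in by the caller because the commit message does not state it.

```c
#include <assert.h>
#include <stdbool.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: macro_tile_w is the macro-tile width in pixels,
 * supplied by the caller; it is not a value taken from the driver. */
static unsigned aligned_width(unsigned width, unsigned macro_tile_w,
                              bool macro_tiled, bool stride_addressed)
{
   unsigned alignment = macro_tile_w;

   /* Stride-addressed macro-tiled textures get an even tile count in X. */
   if (macro_tiled && stride_addressed)
      alignment *= 2;

   return macro_tiled ? align_up(width, alignment) : width;
}
```

With an assumed macro-tile width of 64, a 300-pixel-wide stride-addressed texture would be padded to 384 (six macro tiles) rather than 320 (five).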
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39882 >
2026-02-16 13:04:56 +01:00
Pavel Ondračka
7ae9262dc3
r300: split unaligned 3D texsubimage uploads by layer
...
TexSubImage3D failures were caused by tiled multi-slice uploads with an
unaligned XY box. Fall back to per-layer uploads when the 3D box
depth is > 1 and the XY box is unaligned.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39897 >
2026-02-16 11:26:41 +00:00
Hyunjun Ko
eedbe136ea
anv/video: remove unsupported features for encoders
...
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884 >
2026-02-16 10:58:40 +00:00
Hyunjun Ko
1185bbe18d
anv/video: set Sad Qp Lambda values properly for H265 encoder.
...
This is taken from media-driver (Intel VA-API).
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884 >
2026-02-16 10:58:40 +00:00
Hyunjun Ko
1cb4fe5ef5
anv/video: Handle GPB(Generalized P and B frames) properly for H265 enc.
...
The previous code was copying RefPicList0 to RefPicList1 but not updating
num_ref_idx_l1_active_minus1, leaving it potentially uninitialized or zero.
This caused the hardware to see an inconsistent L1 list state.
Accordingly it sets num_ref_idx_active_override_flag if necessary.
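The fix described above might look roughly like the sketch below. The structure and function are hypothetical and heavily simplified; only the field names mirror the H.265 syntax elements mentioned in the message, and the PPS default is passed in as an assumed parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_REFS 15

/* Hypothetical, simplified slice state; not the driver's real structure. */
struct slice_refs {
   uint8_t ref_pic_list0[MAX_REFS];
   uint8_t ref_pic_list1[MAX_REFS];
   uint8_t num_ref_idx_l0_active_minus1;
   uint8_t num_ref_idx_l1_active_minus1;
   uint8_t num_ref_idx_active_override_flag;
};

/* Mirror L0 into L1 for a generalized P/B frame and keep the active
 * count consistent with the copied list. */
static void mirror_l0_into_l1(struct slice_refs *s,
                              uint8_t pps_default_l1_active_minus1)
{
   assert(s->num_ref_idx_l0_active_minus1 < MAX_REFS);

   memcpy(s->ref_pic_list1, s->ref_pic_list0, sizeof(s->ref_pic_list1));
   s->num_ref_idx_l1_active_minus1 = s->num_ref_idx_l0_active_minus1;

   /* The slice has to override the PPS default when the counts differ. */
   if (s->num_ref_idx_l1_active_minus1 != pps_default_l1_active_minus1)
      s->num_ref_idx_active_override_flag = 1;
}
```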
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884 >
2026-02-16 10:58:40 +00:00
Hyunjun Ko
4d4a5e4a42
anv/video: set Qp passed from apps for h265 encoder
...
Instead of 26 by default.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884 >
2026-02-16 10:58:40 +00:00
Hyunjun Ko
6efbb80c98
anv/video: set transform skip numbers according to qp
...
Instead of hardcoding them.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39884 >
2026-02-16 10:58:40 +00:00
Juan A. Suarez Romero
f641eb4fad
broadcom/ci: update expected results
...
Add new flakes found in the nightly runs.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39912 >
2026-02-16 11:21:23 +01:00
Juan A. Suarez Romero
154b5ccc9e
broadcom/ci: update available devices
...
These are the devices used for pre-merges.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39912 >
2026-02-16 11:21:15 +01:00
Job Noorman
65362a9c38
ir3: don't use predication for large blocks
...
Even when the branch condition is divergent, it may still be uniform
within a wave. When this happens for very large blocks, predication
causes a large overhead because the instructions in the false block are
not simply jumped over. Therefore, we fall back to normal branches for
large blocks.
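The decision described above can be pictured with a small sketch. The function, the enum and the size limit are illustrative assumptions; the commit message does not state the threshold ir3 uses.

```c
#include <assert.h>
#include <stdbool.h>

enum branch_kind { BRANCH_PREDICATED, BRANCH_NORMAL };

/* Hypothetical heuristic: max_predicated_instrs is an illustrative
 * parameter, not the limit ir3 actually uses. */
static enum branch_kind choose_branch_kind(unsigned block_instrs,
                                           bool can_be_predicated,
                                           unsigned max_predicated_instrs)
{
   assert(max_predicated_instrs > 0);

   /* With predication, a wave whose condition is uniformly false still
    * steps through every instruction of the block, so only small blocks
    * are worth predicating. */
   if (can_be_predicated && block_instrs <= max_predicated_instrs)
      return BRANCH_PREDICATED;

   /* A real branch lets such a wave jump over the whole block. */
   return BRANCH_NORMAL;
}
```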
Measuring renderpass time on some OpenGL traces:
average fps +0.5% (+/- 0.1%)
max fps +0.5% (+/- 0.1%)
Totals from 18522 (10.51% of 176279) affected shaders:
MaxWaves: 203126 -> 203156 (+0.01%)
Instrs: 23999194 -> 23521729 (-1.99%); split: -1.99%, +0.00%
CodeSize: 45462360 -> 45250224 (-0.47%); split: -0.47%, +0.00%
NOPs: 5078652 -> 4647917 (-8.48%); split: -8.48%, +0.00%
MOVs: 813450 -> 812615 (-0.10%); split: -0.26%, +0.15%
COVs: 296638 -> 296620 (-0.01%); split: -0.01%, +0.00%
Full: 334991 -> 334923 (-0.02%)
(ss): 636625 -> 636682 (+0.01%); split: -0.12%, +0.13%
(sy): 283395 -> 283429 (+0.01%); split: -0.10%, +0.11%
(ss)-stall: 2652246 -> 2651590 (-0.02%); split: -0.18%, +0.16%
(sy)-stall: 7862615 -> 7881590 (+0.24%); split: -0.13%, +0.37%
STPs: 15994 -> 15992 (-0.01%)
LDPs: 23360 -> 23356 (-0.02%)
Subgroup size: 896 -> 1792 (+100.00%)
Cat0: 5572855 -> 5097380 (-8.53%); split: -8.53%, +0.00%
Cat1: 1146050 -> 1145189 (-0.08%); split: -0.18%, +0.11%
Cat2: 8975537 -> 8974390 (-0.01%); split: -0.01%, +0.00%
Cat6: 196837 -> 196831 (-0.00%)
Cat7: 795866 -> 795890 (+0.00%); split: -0.06%, +0.06%
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39734 >
2026-02-16 09:00:14 +00:00
Job Noorman
e7c3834a27
ir3: add block_can_be_predicated helper
...
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39734 >
2026-02-16 09:00:13 +00:00
Samuel Pitoiset
47841c1142
radv/meta: remove useless DCC decompressions for image<->buffer
...
DCC does not need to be decompressed when the formats are compatible with
each other. This removes essentially all decompressions on GFX11-GFX11.5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39888 >
2026-02-16 07:40:13 +00:00
Yiwei Zhang
b0397b967d
venus: workaround a gcc-15 dead store elimination (DSE) bug
...
No issue with clang or gcc-14.x (or earlier versions). The issue only
shows up since gcc-15.1. The compiler somehow fails to consider those
cs helpers dereferencing the pointer from the pNext chain for reads,
and thus has falsely optimized away the pNext store. This change works
around this with a no-op memory clobber.
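For readers unfamiliar with the idiom, a no-op memory clobber in GCC/Clang extended asm looks like the sketch below. The struct and helpers are hypothetical stand-ins for a pNext chain and its readers, not venus code, and the sketch does not reproduce the compiler bug itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Vulkan-style chained struct. */
struct base_in {
   int sType;
   const void *pNext;
};

/* Compiler-only barrier: emits no instructions, but tells GCC and Clang
 * that memory may be read or written here, so earlier stores are kept. */
static inline void compiler_memory_clobber(void)
{
   __asm__ volatile("" ::: "memory");
}

/* Counts the nodes of a chain; stands in for a helper that reads
 * through pNext. */
static size_t chain_length(const struct base_in *head)
{
   size_t n = 0;
   for (const struct base_in *p = head; p != NULL; p = p->pNext)
      n++;
   return n;
}

static size_t link_and_count(struct base_in *head, const struct base_in *ext)
{
   assert(head != NULL);
   head->pNext = ext;          /* the store that must not be eliminated */
   compiler_memory_clobber();
   return chain_length(head);
}
```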
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13242
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39906 >
2026-02-16 01:39:10 +00:00
Vinson Lee
7239b5288f
freedreno/decode: replace lua_pushunsigned with lua_pushinteger
...
lua_pushunsigned was introduced in Lua 5.2, deprecated in 5.3,
and removed in 5.4. Replace it with lua_pushinteger which has
been available since Lua 5.0 and handles the uint32_t value
safely via implicit widening to the 64-bit lua_Integer type.
This fixes the build with Lua 5.5:
  ../src/freedreno/decode/script.c: In function 'pushdecval':
  ../src/freedreno/decode/script.c:182:7: error: implicit declaration of function 'lua_pushunsigned'; did you mean 'lua_pushinteger'? [-Wimplicit-function-declaration]
    182 |       lua_pushunsigned(L, val.u);
        |       ^~~~~~~~~~~~~~~~
        |       lua_pushinteger
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39901 >
2026-02-14 22:45:45 -08:00
Yiwei Zhang
a8baedef29
venus: expose VK_EXT_descriptor_heap behind a debug option
...
There is still a TODO for optimizing descriptor allocations, so for now we
expose the extension behind VN_DEBUG=desc_heap.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:30 +00:00
Yiwei Zhang
6265dad4f2
venus: fill descriptor heap feats and props
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:30 +00:00
Yiwei Zhang
dea6221a65
venus: take care of combined image sampler descriptor for ycbcr
...
We have to query it by reconstructing the sampler ycbcr conversion
image format props query. It's straightforward, except that the modifier
tiling case has to be considered.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:30 +00:00
Yiwei Zhang
4f475789d5
venus: ensure descriptor writes invariance
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:30 +00:00
Yiwei Zhang
1d779f5af1
venus: cache descriptor size query
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:29 +00:00
Yiwei Zhang
dfc5d76205
venus: rename format_update_mutex for general purpose
...
Not worth separate locks for physical device level queries.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:29 +00:00
Yiwei Zhang
be52338399
venus: add vn_descriptor.h to be shared between different desc systems
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:28 +00:00
Yiwei Zhang
990b5fca37
venus: skip image cache for VkOpaqueCaptureDataCreateInfoEXT
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:28 +00:00
Yiwei Zhang
526788a097
venus: pipeline layout is now optional
...
If a descriptor heap is used, no pipeline layout is created, so we
have to patch compute and RT pipelines to allow that. Graphics pipelines
don't need the change because of GPL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:28 +00:00
Yiwei Zhang
95331f3bd0
venus: cmd inheritance info fix to consider descriptor heap
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:28 +00:00
Yiwei Zhang
485b2b501c
venus: implement all descriptor heap commands
...
There are potential optimizations available for the following:
- vkWriteSamplerDescriptorsEXT
- vkWriteResourceDescriptorsEXT
- vkGetPhysicalDeviceDescriptorSizeEXT
- vkRegisterCustomBorderColorEXT
We can revisit these if they turn out to cause a perf hit in real apps.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:28 +00:00
Yiwei Zhang
04c0142aaa
venus: sync latest protocol for VK_EXT_descriptor_heap support
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39762 >
2026-02-15 04:32:28 +00:00
Adam Jackson
5ab41818c4
zink: use VK_EXT_pci_bus_info for PCI address
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39894 >
2026-02-15 03:02:08 +00:00
aerith
0c17a59ab5
zink: fix codegen for extensions with non-standard struct names
...
name_in_camel_case() converts "pci" to "Pci", but Vulkan uses "PCI"
in VkPhysicalDevicePCIBusInfoPropertiesEXT. Use the actual struct
name from vk.xml when the constructed name doesn't match.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39894 >
2026-02-15 03:02:08 +00:00
Konstantin Seurer
9a82e4ba81
vulkan/cmd_queue: Do not zero initialize vk_cmd_queue_entry
...
Most of the struct will be initialized already. Make sure to initialize
everything so linear_alloc_child can be used.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881 >
2026-02-14 20:11:40 +00:00
Konstantin Seurer
be5ab80de1
vulkan/cmd_queue: Fixup stride for multi draws
...
Copying the draw infos packs them so the stride needs to be set to the
struct size.
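A minimal sketch of why the stride has to change: once strided records are copied into a tightly packed array, the only stride that describes the copy is the struct size. The struct and function here are hypothetical stand-ins, not the vk_cmd_queue code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a per-draw record such as VkMultiDrawInfoEXT. */
struct draw_info {
   uint32_t first_vertex;
   uint32_t vertex_count;
};

/* Copy draw_count records laid out src_stride bytes apart into a tightly
 * packed array and return the stride that describes the copy. */
static uint32_t copy_draws_packed(struct draw_info *dst, const void *src,
                                  uint32_t draw_count, uint32_t src_stride)
{
   assert(src_stride >= sizeof(struct draw_info));

   const uint8_t *bytes = src;
   for (uint32_t i = 0; i < draw_count; i++)
      memcpy(&dst[i], bytes + (size_t)i * src_stride, sizeof(dst[i]));

   /* The copy is packed, so the caller's stride no longer applies. */
   return (uint32_t)sizeof(struct draw_info);
}
```

Replaying the copied draws with the original `src_stride` would read past the first record into the wrong data, which is the kind of mismatch the commit message describes.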
cc: mesa-stable
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881 >
2026-02-14 20:11:40 +00:00
Konstantin Seurer
bf61736aa5
vulkan/cmd_queue: Remove get_array_member_copy
...
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881 >
2026-02-14 20:11:40 +00:00
Konstantin Seurer
d2ea8b3d14
vulkan: Remove vk_cmd_queue_entry::driver_data
...
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881 >
2026-02-14 20:11:39 +00:00
Konstantin Seurer
017b3b73bb
lavapipe: Extend vk_cmd_queue_entry_base for internal commands
...
lavapipe is the only driver that is using driver_data.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39881 >
2026-02-14 20:11:39 +00:00
Collabora's Gfx CI Team
1dc39405ce
Uprev Piglit to 0d79fb4a59c7d213ff144afa4c73e3b32ebe6500
...
62d499d63d...0d79fb4a59
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39575 >
2026-02-14 14:43:20 +01:00
Thomas H.P. Andersen
331af5e746
nvk: add app workaround layer
...
This adopts the device internal app workaround layer from radv.
The layer allows game input to be fixed up in the layer instead of
adding workarounds within the driver.
Initially this only includes the workaround for Metro Exodus, as
I have verified that it fixes a crash on NVK. Follow-up commits
can add the other relevant workarounds once the fixes are verified
to be needed for NVK.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870 >
2026-02-14 08:33:11 +00:00
Thomas H.P. Andersen
0a6509e94c
nvk: prepare for driver internal layers
...
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39870 >
2026-02-14 08:33:11 +00:00
Timothy Arceri
a6fcc2835e
st/glsl_to_nir: make sure the variant has the correct locations set
...
For drivers that set allow_st_finalize_nir_twice, locations are set
when the variable is created. For variants, however, we update the
locations here in case the parameter opt pass or something else changed
them.
Fixes: 891d46f517 ("st/glsl_to_nir: dont add duplicate state tokens")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14837
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39867 >
2026-02-14 06:37:10 +00:00
Timothy Arceri
c3aae0714c
mesa: add _mesa_lookup_state_param_idx() helper
...
This will be used in the following patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39867 >
2026-02-14 06:37:10 +00:00
Ian Romanick
df704bd38e
elk: Call nir_opt_algebraic_late in elk_postprocess_nir
...
Make sure that lowerings undone in elk_nir_optimize are reapplied.
No shader-db or fossil-db changes on any Intel platform. This is most
likely to impact either Gfx8 on ANV or Gfx7.5 on HASVK. I don't
fossil-db test either of those platforms.
I tried doing a similar thing here as is done in BRW (previous commit),
but that caused a couple Haswell shaders to fall off a performance
cliff:
total spills in shared programs: 8247 -> 8311 (0.78%)
spills in affected programs: 6 -> 70 (1066.67%)
helped: 0 / HURT: 2
total fills in shared programs: 8558 -> 8910 (4.11%)
fills in affected programs: 6 -> 358 (5866.67%)
helped: 0 / HURT: 2
Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567 >
2026-02-14 02:06:59 +00:00
Ian Romanick
11b96a84b0
brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts
...
Move the call to nir_opt_algebraic_late after the last time
brw_nir_optimize might be called. nir_opt_algebraic_distribute_src_mods
works together with the late algebraic optimizations, so move it also.
shader-db:
Lunar Lake
total instructions in shared programs: 17081222 -> 17080842 (<.01%)
instructions in affected programs: 419931 -> 419551 (-0.09%)
helped: 545 / HURT: 826
total cycles in shared programs: 878437752 -> 879236226 (0.09%)
cycles in affected programs: 506003142 -> 506801616 (0.16%)
helped: 3091 / HURT: 3189
LOST: 18
GAINED: 16
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19994270 -> 19993231 (<.01%)
instructions in affected programs: 490499 -> 489460 (-0.21%)
helped: 660 / HURT: 800
total cycles in shared programs: 882498776 -> 882834186 (0.04%)
cycles in affected programs: 477858602 -> 478194012 (0.07%)
helped: 3458 / HURT: 3564
total fills in shared programs: 4371 -> 4370 (-0.02%)
fills in affected programs: 7 -> 6 (-14.29%)
helped: 1 / HURT: 0
LOST: 28
GAINED: 10
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
total instructions in shared programs: 19943849 -> 19942782 (<.01%)
instructions in affected programs: 467384 -> 466317 (-0.23%)
helped: 655 / HURT: 796
total cycles in shared programs: 860085674 -> 861410289 (0.15%)
cycles in affected programs: 426900998 -> 428225613 (0.31%)
helped: 3250 / HURT: 3441
LOST: 19
GAINED: 14
fossil-db:
Lunar Lake
Totals:
Instrs: 926472091 -> 926204838 (-0.03%); split: -0.04%, +0.01%
CodeSize: 14845921056 -> 14842776112 (-0.02%); split: -0.10%, +0.08%
Send messages: 41459570 -> 41459574 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104481085069 -> 104583692712 (+0.10%); split: -0.14%, +0.24%
Spill count: 3454651 -> 3457340 (+0.08%); split: -0.15%, +0.23%
Fill count: 4958779 -> 4958487 (-0.01%); split: -0.46%, +0.45%
Max live registers: 193805970 -> 193839002 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 49114416 -> 49113776 (-0.00%); split: +0.01%, -0.01%
Non SSA regs after NIR: 142953905 -> 142800740 (-0.11%); split: -0.12%, +0.01%
Totals from 420256 (20.80% of 2020128) affected shaders:
Instrs: 448571327 -> 448304074 (-0.06%); split: -0.09%, +0.03%
CodeSize: 7312002800 -> 7308857856 (-0.04%); split: -0.21%, +0.17%
Send messages: 17716494 -> 17716498 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52178854998 -> 52281462641 (+0.20%); split: -0.28%, +0.48%
Spill count: 2945654 -> 2948343 (+0.09%); split: -0.17%, +0.26%
Fill count: 4404768 -> 4404476 (-0.01%); split: -0.51%, +0.51%
Max live registers: 60875448 -> 60908480 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 9455280 -> 9454640 (-0.01%); split: +0.04%, -0.04%
Non SSA regs after NIR: 60542740 -> 60389575 (-0.25%); split: -0.28%, +0.02%
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 1000081384 -> 999726726 (-0.04%); split: -0.05%, +0.01%
CodeSize: 16764458080 -> 16761624256 (-0.02%); split: -0.09%, +0.07%
Subgroup size: 27599528 -> 27599544 (+0.00%)
Send messages: 45538933 -> 45538951 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93303830912 -> 93370118192 (+0.07%); split: -0.19%, +0.26%
Spill count: 3739306 -> 3739719 (+0.01%); split: -0.22%, +0.23%
Fill count: 5089719 -> 5083626 (-0.12%); split: -0.56%, +0.44%
Max live registers: 122041364 -> 122055848 (+0.01%); split: -0.00%, +0.01%
Max dispatch width: 38117296 -> 38127200 (+0.03%); split: +0.06%, -0.03%
Non SSA regs after NIR: 164296197 -> 164299306 (+0.00%); split: -0.01%, +0.01%
Totals from 338754 (14.82% of 2285730) affected shaders:
Instrs: 452723479 -> 452368821 (-0.08%); split: -0.10%, +0.03%
CodeSize: 7861878032 -> 7859044208 (-0.04%); split: -0.19%, +0.16%
Subgroup size: 16 -> 32 (+100.00%)
Send messages: 17050010 -> 17050028 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52881801997 -> 52948089277 (+0.13%); split: -0.33%, +0.46%
Spill count: 3271458 -> 3271871 (+0.01%); split: -0.25%, +0.26%
Fill count: 4628422 -> 4622329 (-0.13%); split: -0.61%, +0.48%
Max live registers: 30738902 -> 30753386 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 4787264 -> 4797168 (+0.21%); split: +0.47%, -0.26%
Non SSA regs after NIR: 61748026 -> 61751135 (+0.01%); split: -0.03%, +0.03%
Tiger Lake
Totals:
Instrs: 1011068379 -> 1010977290 (-0.01%); split: -0.03%, +0.02%
CodeSize: 14197751744 -> 14197683040 (-0.00%); split: -0.07%, +0.07%
Send messages: 46431228 -> 46431220 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85066526419 -> 85085088071 (+0.02%); split: -0.16%, +0.18%
Spill count: 3853750 -> 3855185 (+0.04%); split: -0.15%, +0.19%
Fill count: 6716746 -> 6719594 (+0.04%); split: -0.25%, +0.29%
Max live registers: 122307387 -> 122326083 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 38009632 -> 38003280 (-0.02%); split: +0.03%, -0.05%
Non SSA regs after NIR: 158403572 -> 158415390 (+0.01%); split: -0.01%, +0.02%
Totals from 277728 (12.17% of 2281577) affected shaders:
Instrs: 349206856 -> 349115767 (-0.03%); split: -0.07%, +0.05%
CodeSize: 5042621104 -> 5042552400 (-0.00%); split: -0.20%, +0.20%
Send messages: 13132243 -> 13132235 (-0.00%); split: -0.00%, +0.00%
Cycle count: 36183327716 -> 36201889368 (+0.05%); split: -0.38%, +0.43%
Spill count: 2210072 -> 2211507 (+0.06%); split: -0.26%, +0.33%
Fill count: 4188439 -> 4191287 (+0.07%); split: -0.39%, +0.46%
Max live registers: 24956695 -> 24975391 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3948832 -> 3942480 (-0.16%); split: +0.32%, -0.48%
Non SSA regs after NIR: 45616425 -> 45628243 (+0.03%); split: -0.04%, +0.06%
Ice Lake
Totals:
Instrs: 1009584306 -> 1009411757 (-0.02%); split: -0.02%, +0.01%
CodeSize: 12593466880 -> 12592958096 (-0.00%); split: -0.01%, +0.01%
Send messages: 47274203 -> 47274171 (-0.00%); split: -0.00%, +0.00%
Cycle count: 84920281455 -> 84914027301 (-0.01%); split: -0.05%, +0.04%
Spill count: 2988523 -> 2986191 (-0.08%); split: -0.14%, +0.07%
Fill count: 5296078 -> 5288737 (-0.14%); split: -0.21%, +0.07%
Max live registers: 125429384 -> 125444786 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 41269072 -> 41267312 (-0.00%); split: +0.03%, -0.03%
Non SSA regs after NIR: 163223895 -> 163236623 (+0.01%); split: -0.01%, +0.02%
Totals from 243818 (10.45% of 2334244) affected shaders:
Instrs: 296953759 -> 296781210 (-0.06%); split: -0.08%, +0.02%
CodeSize: 3643224480 -> 3642715696 (-0.01%); split: -0.04%, +0.03%
Send messages: 11518671 -> 11518639 (-0.00%); split: -0.00%, +0.00%
Cycle count: 33065548412 -> 33059294258 (-0.02%); split: -0.13%, +0.11%
Spill count: 1346515 -> 1344183 (-0.17%); split: -0.32%, +0.15%
Fill count: 2537906 -> 2530565 (-0.29%); split: -0.43%, +0.14%
Max live registers: 21476776 -> 21492178 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3727288 -> 3725528 (-0.05%); split: +0.31%, -0.35%
Non SSA regs after NIR: 41050474 -> 41063202 (+0.03%); split: -0.04%, +0.07%
Skylake
Totals:
Instrs: 513573157 -> 513462971 (-0.02%); split: -0.02%, +0.00%
CodeSize: 5950280672 -> 5950001392 (-0.00%); split: -0.01%, +0.00%
Send messages: 24909757 -> 24909758 (+0.00%); split: -0.00%, +0.00%
Cycle count: 57636102242 -> 57634726342 (-0.00%); split: -0.03%, +0.03%
Spill count: 627286 -> 627241 (-0.01%); split: -0.01%, +0.00%
Fill count: 837888 -> 837804 (-0.01%); split: -0.01%, +0.00%
Max live registers: 87272271 -> 87284192 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 32278832 -> 32271800 (-0.02%); split: +0.02%, -0.04%
Non SSA regs after NIR: 87387713 -> 87387614 (-0.00%); split: -0.00%, +0.00%
Totals from 177432 (10.30% of 1722906) affected shaders:
Instrs: 127170648 -> 127060462 (-0.09%); split: -0.10%, +0.01%
CodeSize: 1443406368 -> 1443127088 (-0.02%); split: -0.03%, +0.01%
Send messages: 5444220 -> 5444221 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15423028495 -> 15421652595 (-0.01%); split: -0.10%, +0.10%
Spill count: 235844 -> 235799 (-0.02%); split: -0.03%, +0.01%
Fill count: 333783 -> 333699 (-0.03%); split: -0.03%, +0.01%
Max live registers: 13765573 -> 13777494 (+0.09%); split: -0.01%, +0.10%
Max dispatch width: 3086880 -> 3079848 (-0.23%); split: +0.24%, -0.47%
Non SSA regs after NIR: 17623772 -> 17623673 (-0.00%); split: -0.00%, +0.00%
Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567 >
2026-02-14 02:06:59 +00:00
Ian Romanick
5af0b8bd09
brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline
...
Make sure that lowerings undone in brw_nir_optimize are reapplied.
No shader-db changes on any Intel platform.
Why are there fossil-db changes on platforms that don't support ray tracing?
Lunar Lake
Totals:
Instrs: 926636441 -> 926636313 (-0.00%); split: -0.00%, +0.00%
Send messages: 41510729 -> 41510723 (-0.00%); split: -0.00%, +0.00%
Cycle count: 104509492613 -> 104509490569 (-0.00%); split: -0.00%, +0.00%
Max live registers: 193792922 -> 193792890 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 150091934 -> 150092170 (+0.00%); split: -0.00%, +0.00%
Totals from 10 (0.00% of 2020428) affected shaders:
Instrs: 8142 -> 8014 (-1.57%); split: -3.14%, +1.57%
Send messages: 192 -> 186 (-3.12%); split: -7.29%, +4.17%
Cycle count: 131892 -> 129848 (-1.55%); split: -6.93%, +5.38%
Max live registers: 1442 -> 1410 (-2.22%); split: -3.05%, +0.83%
Non SSA regs after NIR: 950 -> 1186 (+24.84%); split: -26.95%, +51.79%
Meteor Lake
Totals:
Instrs: 1000805547 -> 1000805543 (-0.00%); split: -0.00%, +0.00%
Cycle count: 93131592265 -> 93131619619 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122081268 -> 122081244 (-0.00%); split: -0.00%, +0.00%
Totals from 16 (0.00% of 2286241) affected shaders:
Instrs: 18652 -> 18648 (-0.02%); split: -1.39%, +1.37%
Cycle count: 369520 -> 396874 (+7.40%); split: -2.94%, +10.34%
Max live registers: 1350 -> 1326 (-1.78%); split: -4.15%, +2.37%
DG2
Totals:
Instrs: 999834626 -> 999834651 (+0.00%); split: -0.00%, +0.00%
Send messages: 45719398 -> 45719403 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93118238139 -> 93118269557 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122098944 -> 122098936 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 169413734 -> 169413661 (-0.00%); split: -0.00%, +0.00%
Totals from 13 (0.00% of 2286795) affected shaders:
Instrs: 18799 -> 18824 (+0.13%); split: -1.04%, +1.18%
Send messages: 492 -> 497 (+1.02%); split: -2.44%, +3.46%
Cycle count: 352838 -> 384256 (+8.90%); split: -1.08%, +9.98%
Max live registers: 1237 -> 1229 (-0.65%); split: -2.91%, +2.26%
Non SSA regs after NIR: 2191 -> 2118 (-3.33%); split: -20.86%, +17.53%
Tiger Lake
Totals:
Instrs: 1011816778 -> 1011816714 (-0.00%); split: -0.00%, +0.00%
Send messages: 46515289 -> 46515285 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85148902406 -> 85148894668 (-0.00%); split: -0.00%, +0.00%
Max live registers: 122362180 -> 122362172 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 38036160 -> 38036176 (+0.00%)
Non SSA regs after NIR: 160317521 -> 160317649 (+0.00%); split: -0.00%, +0.00%
Totals from 6 (0.00% of 2282318) affected shaders:
Instrs: 9204 -> 9140 (-0.70%); split: -1.43%, +0.74%
Send messages: 258 -> 254 (-1.55%); split: -3.10%, +1.55%
Cycle count: 287652 -> 279914 (-2.69%); split: -3.29%, +0.60%
Max live registers: 552 -> 544 (-1.45%); split: -2.90%, +1.45%
Max dispatch width: 48 -> 64 (+33.33%)
Non SSA regs after NIR: 914 -> 1042 (+14.00%); split: -14.00%, +28.01%
Ice Lake
Totals:
Instrs: 1012203285 -> 1012203249 (-0.00%); split: -0.00%, +0.00%
Send messages: 47358859 -> 47358858 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85112165276 -> 85112171905 (+0.00%); split: -0.00%, +0.00%
Max live registers: 125545002 -> 125544992 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 41335696 -> 41335656 (-0.00%)
Non SSA regs after NIR: 166448597 -> 166448602 (+0.00%); split: -0.00%, +0.00%
Totals from 13 (0.00% of 2335519) affected shaders:
Instrs: 16486 -> 16450 (-0.22%); split: -1.67%, +1.46%
Send messages: 368 -> 367 (-0.27%); split: -4.89%, +4.62%
Cycle count: 347643 -> 354272 (+1.91%); split: -1.34%, +3.25%
Max live registers: 1104 -> 1094 (-0.91%); split: -3.80%, +2.90%
Max dispatch width: 192 -> 152 (-20.83%)
Non SSA regs after NIR: 2100 -> 2105 (+0.24%); split: -21.76%, +22.00%
Skylake
Totals:
Instrs: 504548665 -> 504548057 (-0.00%); split: -0.00%, +0.00%
Send messages: 24479148 -> 24479118 (-0.00%); split: -0.00%, +0.00%
Cycle count: 57575198140 -> 57575179256 (-0.00%); split: -0.00%, +0.00%
Max live registers: 85570671 -> 85570575 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 85097646 -> 85098486 (+0.00%); split: -0.00%, +0.00%
Totals from 22 (0.00% of 1703671) affected shaders:
Instrs: 19866 -> 19258 (-3.06%); split: -3.72%, +0.66%
Send messages: 464 -> 434 (-6.47%); split: -8.19%, +1.72%
Cycle count: 250854 -> 231970 (-7.53%); split: -9.23%, +1.70%
Max live registers: 2024 -> 1928 (-4.74%); split: -5.53%, +0.79%
Non SSA regs after NIR: 2498 -> 3338 (+33.63%); split: -8.33%, +41.95%
Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567 >
2026-02-14 02:06:59 +00:00
Ian Romanick
fd29183901
elk: Use F16TO32 for nir_op_f2f32 of float16 source
...
This matches the behavior of nir_op_unpack_half_2x16_split_x. Gfx7
uses a special opcode for this conversion. Fixes numerous assertion
failures in shader-db on Ivy Bridge and Haswell.
I am not sure why this was never encountered previously.
Fixes: 609c46cf23 ("nir/lower_alu_width: emit f2f32 for unpack_half_2x16")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39567 >
2026-02-14 02:06:59 +00:00
Ian Romanick
9017d37e84
nir: Use STACK_ARRAY instead of NIR_VLA
...
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
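The general pattern behind this kind of change is to use inline storage for small counts and fall back to the heap for large ones. The sketch below illustrates that pattern under stated assumptions; the names and the inline capacity are hypothetical, and it is not Mesa's STACK_ARRAY macro.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SMALL_COUNT 8 /* hypothetical inline capacity */

struct scratch {
   unsigned inline_storage[SMALL_COUNT];
   unsigned *data; /* points at inline_storage or at a heap block */
};

/* Returns 1 on success, 0 if the heap allocation is impossible or fails. */
static int scratch_init(struct scratch *s, size_t count)
{
   assert(s != NULL);

   if (count <= SMALL_COUNT) {
      s->data = s->inline_storage; /* small case: no allocation at all */
      return 1;
   }
   if (count > SIZE_MAX / sizeof(*s->data))
      return 0; /* size computation would overflow */

   s->data = malloc(count * sizeof(*s->data));
   return s->data != NULL;
}

static void scratch_finish(struct scratch *s)
{
   assert(s != NULL);
   if (s->data != s->inline_storage)
      free(s->data); /* free(NULL) is a no-op */
   s->data = NULL;
}
```

Unlike a VLA or alloca, a shader-controlled count can only grow the heap allocation here, and a failed allocation is reported instead of overflowing the stack.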
Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866 >
2026-02-14 01:19:27 +00:00
Ian Romanick
3da828d2dd
spirv: Use STACK_ARRAY instead of NIR_VLA
...
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
Fixes: 2a023f30a6 ("nir/spirv: Add basic support for types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39866 >
2026-02-14 01:19:27 +00:00