mesa/src
Ian Romanick 464955bbdd nir/algebraic: Optimize some open-coded extract_i8
These were initially observed in Hogwarts Legacy while working on
something else entirely. Two compute shaders in that app are helped
for spills and fills. On Skylake, one of the shaders benefits from
this change, and the other is hurt pretty significantly.

About 40 vertex shaders in Shadow of the Tomb Raider were helped for
instructions.

v2: Use ~0xff instead of 0xffffff00 to ensure the patterns will work
properly with all bit sizes. Noticed by Georg.
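
To illustrate the v2 note, here is a minimal sketch in plain Python (not Mesa code) of why a hard-coded 0xffffff00 only works for 32-bit values while ~0xff adapts to the operand's width. The open-coded form shown is a hypothetical example and may not match the patterns this commit actually handles; the helper names mask, extract_i8, open_coded, and not_0xff are invented for the illustration.

```python
def mask(bits):
    """All-ones value for an integer of the given width."""
    return (1 << bits) - 1


def extract_i8(x, bits):
    """Reference behaviour: sign-extend byte 0 of x to `bits` bits."""
    b = x & 0xff
    if b & 0x80:
        b -= 0x100
    return b & mask(bits)


def open_coded(x, bits, high_mask):
    """Hypothetical open-coded form: test bit 7, then either set the
    upper bits with high_mask or clear everything above the low byte."""
    if x & 0x80:
        return (x | high_mask) & mask(bits)
    return x & 0xff


def not_0xff(bits):
    """~0xff evaluated at the operand's own bit size."""
    return ~0xff & mask(bits)


if __name__ == "__main__":
    for bits in (8, 16, 32, 64):
        for x in (0x00, 0x7f, 0x80, 0xff):
            assert open_coded(x, bits, not_0xff(bits)) == extract_i8(x, bits)
    # The 32-bit literal leaves bits 32..63 clear for a 64-bit operand.
    print(hex(open_coded(0x80, 64, 0xffffff00)))
    print(hex(extract_i8(0x80, 64)))
```

With the width-aware mask, the two forms agree at 8, 16, 32, and 64 bits; with the fixed 32-bit literal, the 64-bit case is missing its upper 32 sign bits.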

v3: No, really, fix the various errors to ensure the patterns will work
properly with all bit sizes. Noticed by Georg.

No shader-db changes on any Intel platform.

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 210566294 -> 210561118 (-0.00%)
Cycle count: 31582309052 -> 31576352808 (-0.02%); split: -0.02%, +0.00%
Spill count: 519300 -> 519280 (-0.00%)
Fill count: 625181 -> 625161 (-0.00%)
Scratch Memory Size: 36289536 -> 36281344 (-0.02%)
Max live registers: 66068413 -> 66068161 (-0.00%)
Non SSA regs after NIR: 60230773 -> 60230775 (+0.00%)

Totals from 1662 (0.24% of 707082) affected shaders:
Instrs: 635064 -> 629888 (-0.82%)
Cycle count: 36549632 -> 30593388 (-16.30%); split: -16.43%, +0.14%
Spill count: 246 -> 226 (-8.13%)
Fill count: 280 -> 260 (-7.14%)
Scratch Memory Size: 16384 -> 8192 (-50.00%)
Max live registers: 178491 -> 178239 (-0.14%)
Non SSA regs after NIR: 169552 -> 169554 (+0.00%)

Tiger Lake
Totals:
Instrs: 238544730 -> 238539407 (-0.00%)
Cycle count: 23679446097 -> 23673238578 (-0.03%); split: -0.03%, +0.00%
Max live registers: 42494925 -> 42494799 (-0.00%)
Non SSA regs after NIR: 63639071 -> 63639074 (+0.00%)

Totals from 1662 (0.21% of 802704) affected shaders:
Instrs: 626604 -> 621281 (-0.85%)
Cycle count: 26444363 -> 20236844 (-23.47%); split: -23.50%, +0.02%
Max live registers: 95405 -> 95279 (-0.13%)
Non SSA regs after NIR: 181150 -> 181153 (+0.00%)

Ice Lake
Totals:
Instrs: 238855310 -> 238826534 (-0.01%)
Cycle count: 24952257277 -> 24944589398 (-0.03%); split: -0.03%, +0.00%
Spill count: 575510 -> 575117 (-0.07%)
Fill count: 713007 -> 708632 (-0.61%)
Max live registers: 42499556 -> 42499432 (-0.00%)
Non SSA regs after NIR: 64388747 -> 64388750 (+0.00%)

Totals from 1662 (0.21% of 805149) affected shaders:
Instrs: 926887 -> 898111 (-3.10%)
Cycle count: 67025583 -> 59357704 (-11.44%); split: -11.45%, +0.01%
Spill count: 5168 -> 4775 (-7.60%)
Fill count: 32883 -> 28508 (-13.30%)
Max live registers: 95614 -> 95490 (-0.13%)
Non SSA regs after NIR: 181150 -> 181153 (+0.00%)

Skylake
Totals:
Instrs: 161904416 -> 161895239 (-0.01%); split: -0.01%, +0.00%
Cycle count: 20098067714 -> 20090767583 (-0.04%); split: -0.04%, +0.00%
Spill count: 525546 -> 525789 (+0.05%); split: -0.04%, +0.09%
Fill count: 603369 -> 602276 (-0.18%); split: -0.28%, +0.10%
Max live registers: 33895714 -> 33895590 (-0.00%)
Non SSA regs after NIR: 57348729 -> 57348730 (+0.00%)

Totals from 1655 (0.25% of 653734) affected shaders:
Instrs: 769979 -> 760802 (-1.19%); split: -1.83%, +0.64%
Cycle count: 51365416 -> 44065285 (-14.21%); split: -14.22%, +0.01%
Spill count: 4186 -> 4429 (+5.81%); split: -4.90%, +10.70%
Fill count: 16356 -> 15263 (-6.68%); split: -10.50%, +3.82%
Max live registers: 95115 -> 94991 (-0.13%)
Non SSA regs after NIR: 180797 -> 180798 (+0.00%)

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34905>
2025-05-16 14:49:05 -07:00
amd aco/isel: move visit_intrinsic() into separate file 2025-05-16 11:01:19 +00:00
android_stub
asahi hk: Implement VK_EXT_map_memory_placed 2025-05-06 18:41:08 +00:00
broadcom v3d/compiler: Fix ub when using memcmp for texture comparisons. 2025-05-16 16:05:21 +00:00
c11 c11: use SPDX-License-Identifier header 2025-01-08 20:37:51 +00:00
compiler nir/algebraic: Optimize some open-coded extract_i8 2025-05-16 14:49:05 -07:00
drm-shim
egl egl: fix sw fallback rejection in non-sw EGL_PLATFORM=device 2025-04-30 19:09:44 +00:00
etnaviv Uprev Piglit to 1767af745ed96f77b16c0c205015366d1fbbdb22 2025-05-16 17:25:05 +00:00
freedreno Uprev Piglit to 1767af745ed96f77b16c0c205015366d1fbbdb22 2025-05-16 17:25:05 +00:00
gallium rusticl: replace unnecessary Vec references with slice refs 2025-05-16 19:58:31 +00:00
gbm meson: support building with system libgbm 2025-04-09 12:15:33 +00:00
getopt
gfxstream gfxstream: make sure by default descriptor is negative 2025-05-08 18:29:03 +00:00
glx Get rid of 5 remaining references to glapitable.h 2025-04-23 20:18:25 +00:00
gtest
imagination treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
imgui
intel anv: Use the new nir_opt_acquire_release_barriers pass 2025-05-16 00:29:13 +00:00
loader loader: Use RTLD_LOCAL not RTLD_GLOBAL 2025-04-18 07:14:56 +00:00
mapi Get rid of 5 remaining references to glapitable.h 2025-04-23 20:18:25 +00:00
mesa gallium/aux: move util_pipe_tex_to_tgsi_tex to u_blitter.c 2025-05-14 16:33:12 +00:00
microsoft d3d12: Do not build microsoft/compiler when graphics, gl or vk disabled 2025-05-08 14:17:22 +00:00
nouveau nak/dce: Use BitSet for live phis and SSA values 2025-05-15 22:28:31 -04:00
panfrost panvk: Expose support for multiview on v7 2025-05-15 14:04:29 +00:00
tool perfetto/android: align datasource names with tooling expectations 2025-04-08 18:29:10 +00:00
util util/u_printf: fix memory leak in u_printf_singleton_add_serialized 2025-05-16 14:28:50 +00:00
virtio venus: filter out venus incapable physical devices 2025-05-16 14:12:36 +00:00
vulkan vulkan: fix random tabs to spaces 2025-05-16 03:57:31 +00:00
x11 glx/egl/x11: fix x11_dri3_check_multibuffer 2025-02-17 02:50:15 +00:00
.clang-format radv: Add radv_foreach_stage to ForEachMacros again. 2025-04-11 18:01:47 +00:00
meson.build meson: support building with system libgbm 2025-04-09 12:15:33 +00:00