mesa/src
Lionel Landwerlin a25f96c00c intel/fs: switch from SIMD 1 to 8 instructions surface/sampler rematerialization
SIMD1 instructions are problematic because they are considered partial
writes. This increases the liveness of the destination register
written by those instructions. To work around this, we use UNDEF
instructions to bound the liveness of the register. But this causes
other issues, as in this case :

  undef(1) vgrf2
  mov(1)   vgrf2, u4.0
  add(1)   vgrf3, vgrf2.0, 64UD

In this case the copy propagation pass is unable to see that vgrf2 in
the add() instruction can be replaced with the uniform u4.0.

To fix this problem, we switch to NoMask SIMD8 instructions that cover
the entire register. We can drop the UNDEF instructions and now copy
propagation can do its job.
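
The effect described above can be sketched with a toy model. This is not
Mesa's actual copy propagation pass: the `Inst` class, the `REG_WIDTH`
constant, and the `copy_propagate` function are hypothetical names made up
for illustration, and the model assumes that a MOV is only recorded as a
propagatable copy when it writes the whole register.

```python
from dataclasses import dataclass

# Assumption for this toy model: one register holds 8 channels, so any
# instruction narrower than SIMD8 counts as a partial write.
REG_WIDTH = 8


@dataclass(frozen=True)
class Inst:
    op: str        # "undef", "mov", "add", ...
    width: int     # execution width (1 for SIMD1, 8 for SIMD8)
    dst: str       # destination register name
    srcs: tuple    # source operand names


def is_partial_write(inst):
    """A write is partial when it does not cover the whole register."""
    return inst.width < REG_WIDTH


def copy_propagate(insts):
    """Toy local copy propagation over a straight-line instruction list."""
    copies = {}  # destination register -> source of a full-register MOV
    out = []
    for inst in insts:
        if inst.op == "undef":
            # UNDEF only bounds liveness here; it carries no copy information.
            copies.pop(inst.dst, None)
            out.append(inst)
            continue

        # Rewrite sources that are known full-register copies.
        new_srcs = tuple(copies.get(s, s) for s in inst.srcs)
        inst = Inst(inst.op, inst.width, inst.dst, new_srcs)

        # Writing dst invalidates any copy recorded for it or reading from it.
        copies.pop(inst.dst, None)
        copies = {d: s for d, s in copies.items() if s != inst.dst}

        # Only a MOV covering the entire register is safe to record.
        if inst.op == "mov" and not is_partial_write(inst):
            copies[inst.dst] = inst.srcs[0]
        out.append(inst)
    return out


# SIMD1 sequence from the commit message: the MOV is a partial write.
simd1 = [
    Inst("undef", 1, "vgrf2", ()),
    Inst("mov", 1, "vgrf2", ("u4.0",)),
    Inst("add", 1, "vgrf3", ("vgrf2", "64UD")),
]

# NoMask SIMD8 variant: the MOV covers the register, no UNDEF needed.
simd8 = [
    Inst("mov", 8, "vgrf2", ("u4.0",)),
    Inst("add", 8, "vgrf3", ("vgrf2", "64UD")),
]

simd1_result = copy_propagate(simd1)
simd8_result = copy_propagate(simd8)
```

In this sketch the SIMD1 add() keeps reading vgrf2, while the SIMD8 add()
ends up reading u4.0 directly, which is the behaviour the change is
after.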

Good results on 2 apps :

Cyberpunk 2077 :

  Totals from 7258 (68.80% of 10549) affected shaders:
  Instrs: 6332210 -> 6073833 (-4.08%); split: -4.11%, +0.03%
  Cycles: 130667501 -> 127351268 (-2.54%); split: -3.12%, +0.58%
  Subgroup size: 90320 -> 90400 (+0.09%)
  Spill count: 90 -> 68 (-24.44%)
  Fill count: 82 -> 64 (-21.95%)
  Scratch Memory Size: 8192 -> 6144 (-25.00%)
  Max live registers: 385464 -> 375152 (-2.68%)
  Max dispatch width: 64336 -> 64424 (+0.14%); split: +0.96%, -0.82%

  Gaining 60 SIMD16/SIMD32 shaders, losing 33

Strange Brigade :

  Totals from 2137 (53.12% of 4023) affected shaders:
  Instrs: 1544031 -> 1457544 (-5.60%); split: -5.60%, +0.00%
  Cycles: 22292564 -> 21868978 (-1.90%); split: -2.43%, +0.53%
  Subgroup size: 25328 -> 25344 (+0.06%)
  Max live registers: 113716 -> 111214 (-2.20%)
  Max dispatch width: 17232 -> 18608 (+7.99%); split: +8.36%, -0.37%

  Gaining 138 SIMD16/SIMD32 shaders, losing 4

One app slightly negatively affected :

Dota2 :

  Totals from 232 (14.73% of 1575) affected shaders:
  Instrs: 30029 -> 28194 (-6.11%)
  Cycles: 385155 -> 371422 (-3.57%); split: -3.59%, +0.02%
  Max live registers: 6792 -> 6780 (-0.18%)
  Max dispatch width: 2256 -> 2160 (-4.26%)

  Losing 6 SIMD32 shaders
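
The overall percentages in the totals above are the relative change
between the "before" and "after" values. A minimal sketch of that
arithmetic follows, using a hypothetical helper `pct_change` that is not
part of any Mesa or shader-db tooling. It does not attempt to reproduce
the "split" figures.

```python
def pct_change(before, after):
    """Relative change from before to after, as a signed percentage string."""
    return f"{(after - before) / before * 100:+.2f}%"


# Values taken from the totals quoted in the commit message.
cyberpunk_instrs = pct_change(6332210, 6073833)
cyberpunk_spills = pct_change(90, 68)
dota2_dispatch = pct_change(2256, 2160)
```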

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24554>
2023-09-29 10:46:47 +00:00
..
amd radv: emit missing PA_{SC,SU}_LINE_STIPPLE_xxx regs in gfx preamble 2023-09-29 07:50:46 +00:00
android_stub
asahi treewide: Drop nir_ssa_for_src users 2023-09-18 10:25:17 -04:00
broadcom v3dv: allow headless device without display device 2023-09-27 18:46:31 +00:00
c11 util/meson: Getting mesa util core to be self contained 2023-08-02 03:41:24 +00:00
compiler compiler/types: Add void parameter to ensure these are valid C prototypes 2023-09-28 22:43:45 +00:00
drm-shim drm-shim: Avoid assertion fail if someone does close(-1). 2023-06-01 01:50:41 +00:00
egl egl/wayland: enable WL_bind_wayland_display for zink 2023-09-19 02:47:31 +00:00
etnaviv ci/etnaviv: Drop some gc2k flakes that I think are resolved. 2023-09-28 16:34:51 +00:00
freedreno tu/kgsl: Fix bitfield of DITHER_MODE_MRT6 2023-09-27 21:45:40 +00:00
gallium rusticl/mesa: create COMPUTE_ONLY contexts 2023-09-28 23:02:24 +02:00
gbm egl/drm: Assume modern DRI interface versions 2023-07-28 12:25:19 +00:00
getopt
glx glx: XFree visual info 2023-09-28 12:17:49 +00:00
gtest gtest: backport ansi color fix 2023-08-18 21:33:14 +00:00
imagination pvr: Force compile error on GNU void pointer arithmetic 2023-09-27 15:25:32 +00:00
imgui
intel intel/fs: switch from SIMD 1 to 8 instructions surface/sampler rematerialization 2023-09-29 10:46:47 +00:00
loader loader: add DRI_PRIME_DEBUG env var 2023-09-18 07:45:27 +00:00
mapi glthread: sync for VDPAU sync functions 2023-08-17 04:53:37 +00:00
mesa compiler/types: Add support for Cooperative Matrix types 2023-09-28 07:35:02 +00:00
microsoft microsoft/compiler: Fix printf formatting string issues 2023-09-22 10:47:33 -07:00
nouveau nvk: Init pipelineCacheUUID 2023-09-28 15:31:21 +00:00
panfrost panfrost: advertise YUV formats for valhall 2023-09-26 21:13:51 +00:00
tool pps-producer: add ability to select device with DRI_PRIME 2023-09-07 10:44:51 +00:00
util util: Fix bookkeeping of linear node sizes 2023-09-26 02:53:27 +00:00
virtio llvmpipe/fs: fix regression in sample mask handling from tgsi removal. 2023-09-26 20:15:22 +00:00
vulkan vulkan: Handle vkSetDebugUtilsObjectNameEXT on WSI objects 2023-09-28 01:23:55 +00:00
.clang-format nir: Add nir_foreach_block_in_cf_node_reverse 2023-09-22 10:05:58 +00:00
meson.build nvk: add vulkan skeleton 2023-08-04 21:31:52 +00:00