mesa/src at 4d73988f6fef39e9263ec0bb49cd5efff68393bc - fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-02-24 16:00:29 +01:00

Francisco Jerez 4d73988f6f	2020-07-23 01:40:06 +00:00

intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence.

This avoids some performance regressions on Gen12 platforms caused by SIMD32 fragment shaders reported in titles like Dota2, TF2, Xonotic, and GFXBench5 Car Chase and Aztec Ruins.

The most obvious pattern in the regressing shaders I identified among these workloads is that they all had non-uniform discard statements, which are handled rather optimistically by the current IR analysis pass: No penalty is currently applied to the SIMD32 variant of the shader in the form of differing branching weights like we do for other control flow instructions in order to account for the greater likelihood of divergence of a SIMD32 shader.

Simply changing that by giving the same treatment to discard statements as we give to other branching instructions seemed to hurt more than it helped on platforms earlier than Gen12, since it reversed most of the improvement obtained from SIMD32 fragment shaders in Manhattan for no measurable benefit in other workloads (Manhattan has a handful of shaders with statically non-uniform discard statements which actually perform better in SIMD32 mode due to their approximate dynamic uniformity). For that reason this change is applied to Gen12+ platforms only.

I've been running a number of tests trying to understand the difference in behavior between Gen12 and earlier platforms, and most of the evidence I've gathered seems to point at EU fusion being the culprit: Unlike previous generations, on Gen12 EUs are arranged in pairs which execute instructions in lockstep, giving an effective warp size of 64 threads in SIMD32 mode, which seems to increase the likelihood for control flow divergence in some of the affected shaders significantly.

Fixes: `188a3659ae` "intel/ir: Import shader performance analysis pass."
Reported-by: Caleb Callaway <caleb.callaway@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5910>
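
The commit message's argument, that a wider effective execution group is more likely to diverge on a non-uniform discard, can be illustrated with a small probability sketch. This is not code from Mesa and not the heuristic used by the IR analysis pass. The function name `divergence_probability` is hypothetical, and the model assumes every lane discards independently with the same probability. Real shaders are often much more uniform than that, as the Manhattan case in the message shows.

```python
def divergence_probability(p_discard: float, lanes: int) -> float:
    """Chance that a group of `lanes` invocations disagrees on a discard.

    Assumption (illustrative only): each lane discards independently
    with probability `p_discard`. The group is uniform when either all
    lanes discard or none do; otherwise it has diverged.
    """
    if not 0.0 <= p_discard <= 1.0:
        raise ValueError("p_discard must be in [0, 1]")
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    all_discard = p_discard ** lanes
    none_discard = (1.0 - p_discard) ** lanes
    return 1.0 - all_discard - none_discard


if __name__ == "__main__":
    # Compare dispatch widths, including the effective 64-wide group the
    # commit message attributes to fused EU pairs running SIMD32.
    for lanes in (8, 16, 32, 64):
        print(lanes, divergence_probability(0.01, lanes))
```

Under this simplified model the divergence probability grows with group width for any discard probability strictly between 0 and 1, which matches the direction of the effect described above; it says nothing about its measured magnitude on real hardware.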
amd	radeonsi: enable preemption if the kernel enabled it	2020-07-22 12:08:33 -04:00
broadcom	nir: Add a face_sysval argument to nir_lower_two_sided_color	2020-07-17 14:50:26 +00:00
compiler	nir/lower_io: Add support for global scratch addressing	2020-07-22 23:43:35 +00:00
drm-shim	meson: use gnu_symbol_visibility argument	2020-06-01 18:59:18 +00:00
egl	egl/dri2: try to bind old context if bindContext failed	2020-07-21 18:42:03 +00:00
etnaviv	etnaviv: replace all dup() with os_dupfd_cloexec()	2020-06-18 02:09:56 +00:00
freedreno	turnip: disable tiling for NV12/IYUV formats	2020-07-21 20:08:07 +00:00
gallium	softpipe: Enable PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS;	2020-07-23 00:24:26 +00:00
gbm	gbm: document that gbm_bo_map exposes a linear view	2020-06-03 10:09:52 +00:00
getopt	meson: build getopt when using msvc	2019-09-10 20:36:47 +00:00
glx	glx: Fix build and warnings with -Dglx=dri -Dglx-direct=false	2020-07-23 01:23:12 +00:00
gtest	gtest: Update to 1.10.0	2020-04-20 11:57:11 +00:00
hgl	scons: Prune out unnecessary targets.	2020-03-30 13:38:01 +00:00
imgui	meson: drop `intel_` prefix on imgui_core	2019-12-10 15:16:02 +00:00
intel	intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence.	2020-07-23 01:40:06 +00:00
loader	Revert "loader/dri3: Check for window destruction in dri3_wait_for_event_locked"	2020-07-03 09:55:50 +00:00
mapi	glx: Fix build and warnings with -Dglx=dri -Dglx-direct=false	2020-07-23 01:23:12 +00:00
mesa	mesa/program: fix shadow property for samplers	2020-07-22 12:51:51 +00:00
panfrost	pan/mdg: Use the blend RT for blend shader framebuffer fetches	2020-07-20 14:15:49 +00:00
util	driconf: allowlist/denylist	2020-07-16 21:56:08 +00:00
vulkan	meson: Add mising git_sha1.h dependency.	2020-07-22 00:02:26 +00:00
meson.build	meson: use gnu_symbol_visibility argument	2020-06-01 18:59:18 +00:00
SConscript	driconf: drop now unused translation facility	2020-06-22 21:50:12 +00:00