mesa/src
Yevhenii Kolesnikov 9427aaeab7 nir/loop_analyze: Determine iteration counts for more kinds of loops
If the loop iterator is updated with something other than regular
addition, calculating the number of iterations analytically is more
error prone. What we can do instead is try to emulate the loop and
determine the number of iterations empirically.

These operations are covered:
 - imul
 - fmul
 - ishl
 - ishr
 - ushr
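
The emulation idea can be illustrated with a minimal sketch. This is not
the code from nir_loop_analyze.c: the enums, function names, and the
fixed 32-bit width are all hypothetical, and the iteration cap is an
assumed parameter. It starts from the initial value, checks the exit
condition, applies the update, and gives up once the cap is exceeded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the update and compare opcodes. */
enum update_op { OP_IMUL, OP_ISHL, OP_ISHR, OP_USHR };
enum cmp_op { CMP_ILT, CMP_IGE, CMP_IEQ, CMP_INE };

/* Apply one induction variable update with 32-bit wrapping semantics.
 * Out-of-range unsigned-to-signed conversion and right shifts of
 * negative values are implementation-defined in C; common compilers
 * use two's complement wrapping and arithmetic shifts.
 */
int32_t apply_update(enum update_op op, int32_t iv, int32_t step)
{
   switch (op) {
   case OP_IMUL: return (int32_t)((uint32_t)iv * (uint32_t)step);
   case OP_ISHL: return (int32_t)((uint32_t)iv << (step & 31));
   case OP_ISHR: return iv >> (step & 31);
   case OP_USHR: return (int32_t)((uint32_t)iv >> (step & 31));
   }
   return iv;
}

/* True when the loop would exit for the current value. */
bool exit_condition(enum cmp_op op, int32_t iv, int32_t limit)
{
   switch (op) {
   case CMP_ILT: return iv < limit;
   case CMP_IGE: return iv >= limit;
   case CMP_IEQ: return iv == limit;
   case CMP_INE: return iv != limit;
   }
   return false;
}

/* Count how many times the body runs before the exit condition holds.
 * Returns -1 if the loop does not exit within max_iterations, meaning
 * the count is treated as unknown.
 */
int count_iterations_empirically(int32_t init, enum update_op op,
                                 int32_t step, enum cmp_op cmp,
                                 int32_t limit, unsigned max_iterations)
{
   int32_t iv = init;

   for (unsigned i = 0; i <= max_iterations; i++) {
      if (exit_condition(cmp, iv, limit))
         return (int)i;

      iv = apply_update(op, iv, step);
   }

   return -1;
}
```

For example, a loop that starts at 1, doubles with imul, and exits once
the value is at least 16 is counted as 4 iterations, while the same loop
starting at 0 never exits and is reported as unknown.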

Also add unit tests for loop unrolling.

Improves performance of Aztec Ruins (sixonix
gfxbench5.aztec_ruins_vk_high) by -1.28042% +/- 0.498555% (N=5) on Intel
Arc A770.

v2 (idr): Rebase on 3 years. :( Use nir_phi_instr_add_src in the test
cases.

v3 (idr): Use try_eval_const_alu to evaluate the loop termination
condition in get_iteration_empirical. Also restructure the loop
slightly. This fixed off-by-one iteration errors in "inverted" loop
tests (e.g., nir_loop_analyze_test.ushr_ieq_known_count_invert_31).

v4 (idr): Use try_eval_const_alu to evaluate the induction variable
update in get_iteration_empirical. This fixes non-commutative update
operations (e.g., shifts) when the induction variable is not the first
source. This fixes the unit test
nir_loop_analyze_test.ishl_rev_ieq_infinite_loop_unknown_count.
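
To illustrate why operand order matters for a shift update, here is a
small sketch. The helper name and its iv_index parameter are
hypothetical and do not reflect how try_eval_const_alu is called: the
induction variable is placed in the source slot it actually occupies
before the operation is evaluated.

```c
#include <stdint.h>

/* Evaluate "src0 << src1" where the induction variable sits in source
 * slot iv_index (0 or 1) and the constant fills the other slot.
 */
uint32_t eval_ishl_update(uint32_t iv, uint32_t constant, int iv_index)
{
   uint32_t src[2];

   src[iv_index] = iv;
   src[1 - iv_index] = constant;

   return src[0] << (src[1] & 31);
}
```

With iv = 3 and a constant of 1, slot 0 gives 3 << 1 = 6, while slot 1
gives 1 << 3 = 8, so assuming the induction variable is always the first
source would emulate the wrong loop.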

v5 (idr): Fix _type parameter for fadd and fadd_rev loop unroll
tests. Hopefully that fixes the failure on s390x. Temporarily disable
fmul. This works around the problem revealed in
glsl-fs-loop-unroll-mul-fp64, and there were no shader-db or fossil-db
changes.

v6 (idr): Plumb max_unroll_iterations into get_iteration_empirical. I
was going to do this, but I forgot. Suggested by Tim.

v7 (idr): Disable fadd tests on s390x. They fail because S390 is weird.

Almost all of the shaders affected (OpenGL or Vulkan) are from gfxbench
or geekbench. A couple shaders in Deus Ex (OpenGL), Dirt Rally (OpenGL),
Octopath Traveler (Vulkan), and Rise of the Tomb Raider (Vulkan) are
helped.

The lost / gained shaders in OpenGL are an Aztec Ruins shader that goes
from SIMD16 to SIMD8. The spills / fills affected are in a single Aztec
Ruins (Vulkan) compute shader.

shader-db results:

Skylake, Ice Lake, and Tiger Lake had similar results. (Tiger Lake shown)
total loops in shared programs: 5514 -> 5470 (-0.80%)
loops in affected programs: 62 -> 18 (-70.97%)
helped: 37 / HURT: 0

LOST:   2
GAINED: 2

Haswell and Broadwell had similar results. (Broadwell shown)
total loops in shared programs: 5346 -> 5298 (-0.90%)
loops in affected programs: 66 -> 18 (-72.73%)
helped: 39 / HURT: 0

fossil-db results:

Skylake, Ice Lake, and Tiger Lake had similar results. (Tiger Lake shown)
Instructions in all programs: 157374679 -> 157397421 (+0.0%)
Instructions hurt: 28

SENDs in all programs: 7463800 -> 7467639 (+0.1%)
SENDs hurt: 28

Loops in all programs: 38980 -> 38950 (-0.1%)
Loops helped: 28

Cycles in all programs: 7559486451 -> 7557455384 (-0.0%)
Cycles helped: 28

Spills in all programs: 11405 -> 11403 (-0.0%)
Spills helped: 1

Fills in all programs: 19578 -> 19588 (+0.1%)
Fills hurt: 1

Lost: 1

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
2023-04-06 23:50:27 +00:00
..
amd amd/registers: use gfx9 packet definitions for gfx940 2023-04-06 15:00:54 +00:00
android_stub util/log: improve logger_android 2023-02-22 17:55:40 +00:00
asahi agx: Enable nir_lower_frexp. 2023-04-06 02:32:01 +00:00
broadcom Revert "broadcom/ci: run gl jobs on arm64, just like vk" 2023-04-06 14:34:06 +00:00
c11 c11: Remove _MTX_INITIALIZER_NP for windows 2022-11-09 04:38:28 +00:00
compiler nir/loop_analyze: Determine iteration counts for more kinds of loops 2023-04-06 23:50:27 +00:00
drm-shim drm-shim: Use hide_drm_device_path() to hide other drm devices 2022-12-30 15:51:11 -08:00
egl dri2/android: Bypass throttling 2023-03-30 18:46:04 +00:00
etnaviv ci/etnaviv: Polish the gc2000 xfails a bit. 2023-03-29 07:52:45 +00:00
freedreno ci/zink: Disable a630 portal-2-v2 due to kernel OOMs. 2023-04-06 02:32:01 +00:00
gallium radeonsi/vcn: set bitstream buffer size to encoded bitstream size 2023-04-06 22:55:59 +00:00
gbm gbm: drop unnecessary vulkan dependency 2023-02-23 18:31:22 +00:00
getopt
glx glx: Fix error handling yet again in CreateContextAttribs 2023-04-06 21:29:54 +00:00
gtest
imagination pvr: Mark all normalized formats as supporting with_packed_usc_channel 2023-03-29 13:00:37 +00:00
imgui
intel intel/fs: White space fixes 2023-04-06 19:07:50 +00:00
loader loader: Use libdrm shim 2023-03-05 16:31:51 +00:00
mapi mapi: add InternalInvalidateFramebufferAncillaryMESA 2023-03-30 05:06:47 +00:00
mesa mesa: add _mesa_is_api_gles2() helper 2023-04-06 08:07:35 +00:00
microsoft dzn: Fix bindless descriptor sets with multiple dynamic buffers that need custom descriptors 2023-04-06 22:08:28 +00:00
nouveau nouveau: Enable frexp lowering in the backend. 2023-04-06 02:32:01 +00:00
panfrost panfrost/midgard: Enable nir_lower_frexp. 2023-04-06 02:32:01 +00:00
tool pps: Fix build errors. 2023-03-13 01:22:46 +00:00
util dzn: Add a driconf option for enabling subgroup ops in VS/GS 2023-04-06 22:08:28 +00:00
virtio venus/ci: Only run one crosvm instance 2023-03-31 12:39:49 +00:00
vulkan vulkan/wsi/display: set pDisplay to NULL on error 2023-04-05 06:21:26 +00:00
meson.build hgl: remove 2023-02-18 00:44:43 +00:00