mesa/src
Ian Romanick ecc9ffa778 nir/algebraic: Replace a-fract(a) with floor(a)
I noticed this while looking at a shader that was affected by Tim's
"more loop unrolling" series.

In review, Tim Arceri asked:
> Why the hurt on Gen6+ is this something that should be in the late
> optimisations pass?

As far as I can tell, it's just because our scheduler is terrible.  In
all the fragment shaders that I looked at (some hurt shaders were from
other stages), only one of the SIMD8 or SIMD16 version would be hurt.
In many of those cases, the other SIMD width is improved (e.g.,
shaders/closed/steam/brutal-legend/3990.shader_test).

Often it looks like the scheduler decides to schedule a SEND that occurs
somewhere early in the shader differently.  Once that happens, everything
is different.

I looked at one vertex shader that was hurt (from Goat Simulator).  In
that case, both the floor and fract are used.  The optimization
eliminates the add, and it should allow better scheduling.  In the area
of the FRC and RNDD instructions, the scheduler does the right thing.
However, later in the shader a MAD and an ADD get scheduled
differently, and that makes it slightly worse.
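
As a rough illustration of the rewrite named in the subject line, here is a minimal Python sketch. It is not the code from this commit: the tuple layout only loosely imitates how algebraic patterns tend to be written, and the helpers `fract` and `simplify` are hypothetical names introduced for this example. It relies on the identity fract(a) = a - floor(a), so a - fract(a) collapses to floor(a) and the subtraction disappears.

```python
import math

def fract(a):
    # Fractional part, defined as a - floor(a) (always in [0, 1)).
    return a - math.floor(a)

def simplify(expr):
    """Rewrite ('fsub', x, ('ffract', x)) to ('ffloor', x), bottom-up.

    Expressions are nested tuples of the form (opcode, operand, ...);
    leaves are plain strings naming a value. This layout is an
    assumption made for the sketch.
    """
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [simplify(arg) for arg in args]
    # Match a - fract(a), where both operands are the same expression.
    if op == 'fsub' and len(args) == 2 and args[1] == ('ffract', args[0]):
        return ('ffloor', args[0])
    return (op, *args)

# Numeric spot check on exactly representable inputs.
for a in (2.75, -2.75, 0.5, -0.5, 3.0):
    assert a - fract(a) == math.floor(a)

print(simplify(('fmul', ('fsub', 'a', ('ffract', 'a')), 'b')))
```

Note that when fract(a) is also needed elsewhere, as in the Goat Simulator shader above, the fract itself remains and only the subtraction is removed.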

In light of this, I tried adding some "is_used_once" mark-up, but that
did not fix all the cycle regressions.  It also did a lot more harm
than good on SKL (helped 82 vs. hurt 241).

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 15437001 -> 15435259 (-0.01%)
instructions in affected programs: 213651 -> 211909 (-0.82%)
helped: 988
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 1.76 x̃: 1
helped stats (rel) min: 0.15% max: 11.54% x̄: 1.14% x̃: 0.59%
95% mean confidence interval for instructions value: -1.89 -1.63
95% mean confidence interval for instructions %-change: -1.23% -1.05%
Instructions are helped.

total cycles in shared programs: 383007378 -> 382997063 (<.01%)
cycles in affected programs: 1650825 -> 1640510 (-0.62%)
helped: 679
HURT: 302
helped stats (abs) min: 1 max: 348 x̄: 23.39 x̃: 14
helped stats (rel) min: 0.04% max: 28.77% x̄: 1.61% x̃: 0.98%
HURT stats (abs)   min: 1 max: 250 x̄: 18.43 x̃: 7
HURT stats (rel)   min: 0.04% max: 25.86% x̄: 1.41% x̃: 0.53%
95% mean confidence interval for cycles value: -13.05 -7.98
95% mean confidence interval for cycles %-change: -0.86% -0.50%
Cycles are helped.
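
For readers checking the percentages above, the relative change is (after - before) / before. The short Python sketch below recomputes a few of the Skylake figures from the totals quoted in this message. The helper name `pct_change` is made up for this example and is not part of any reporting tool.

```python
def pct_change(before, after):
    # Relative change in percent; negative means the count went down.
    return (after - before) / before * 100.0

# Skylake totals copied from the report above.
print(round(pct_change(15437001, 15435259), 2))  # instructions, all shared programs
print(round(pct_change(213651, 211909), 2))      # instructions, affected programs
print(round(pct_change(1650825, 1640510), 2))    # cycles, affected programs
```

The total cycle change (383007378 -> 382997063) is about -0.003%, which is why the report prints it as "<.01%".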

Iron Lake and GM45 had similar results. (GM45 shown)
total instructions in shared programs: 5043616 -> 5043010 (-0.01%)
instructions in affected programs: 119691 -> 119085 (-0.51%)
helped: 432
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.10% max: 8.11% x̄: 0.66% x̃: 0.39%
95% mean confidence interval for instructions value: -1.58 -1.23
95% mean confidence interval for instructions %-change: -0.72% -0.59%
Instructions are helped.

total cycles in shared programs: 128139812 -> 128135762 (<.01%)
cycles in affected programs: 3829724 -> 3825674 (-0.11%)
helped: 602
HURT: 0
helped stats (abs) min: 2 max: 486 x̄: 6.73 x̃: 6
helped stats (rel) min: 0.02% max: 4.85% x̄: 0.19% x̃: 0.10%
95% mean confidence interval for cycles value: -8.40 -5.05
95% mean confidence interval for cycles %-change: -0.22% -0.16%
Cycles are helped.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-03-01 12:43:25 -08:00
..
amd radv: Interpolate less aggressively. 2019-02-26 18:51:35 +00:00
broadcom v3d: Rematerialize MOVs of uniforms instead of spilling them. 2019-02-25 21:33:47 -08:00
compiler nir/algebraic: Replace a-fract(a) with floor(a) 2019-03-01 12:43:25 -08:00
egl egl/sl: use kms_swrast with vgem instead of a random GPU 2019-02-28 12:05:03 +00:00
freedreno freedreno: Fix a couple of warnings 2019-02-28 10:43:53 -08:00
gallium Revert "swr/rast: Archrast codegen updates" 2019-03-01 16:46:32 +00:00
gbm gbm: drop duplicate #defines 2019-02-14 11:20:00 +00:00
getopt
glx glx: fix shared memory leak in X11 2019-02-28 14:23:02 +10:00
gtest meson: hide warnings from external project gtest 2018-10-31 18:20:25 +00:00
hgl meson: Add Haiku platform support v4 2018-02-16 16:56:34 -06:00
imgui imgui: update memory editor 2019-02-26 12:49:07 +00:00
intel intel/fs: Generate if instructions with inverted conditions 2019-03-01 12:42:14 -08:00
loader loader: use loader_open_device() to handle O_CLOEXEC 2019-02-26 11:07:23 +00:00
mapi mapi: work around GCC LTO dropping assembly-defined functions 2019-02-13 14:20:51 +00:00
mesa st/nir: count num_uniforms for FS bultin shader 2019-02-27 22:18:24 -08:00
util driconf: add DTD to allow the drirc xml (00-mesa-defaults.conf) to be validated 2019-02-28 17:30:44 +00:00
vulkan vulkan: use VkBase{In,Out}Structure instead of a custom struct 2019-02-28 16:25:59 +00:00
Makefile.am build: move imgui out of src/intel/tools to be reused 2019-02-21 18:06:05 +00:00
meson.build iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs. 2019-02-21 10:26:04 -08:00
SConscript scons: Remove gles option. 2018-10-19 16:50:26 +01:00