mesa/src
Alyssa Rosenzweig 54d7907c27 nir: Propagate *2*16 conversions into vectors
If we have code like:

   ('f2f16', ('vec2', ('f2f32', 'a@16'), '#b@32'))

We would like to eliminate the conversions, but the existing rules can't
see into the (heterogeneous) vector. So instead of trying to
eliminate in one pass, we add opts to propagate the f2f16 into the
vector. Even if nothing further happens, this is often a win since then
the created vector is smaller (half2 instead of float2). Hence the above
gets transformed to

   ('vec2', ('f2f16', ('f2f32', 'a@16')), ('f2f16', '#b@32'))

Then the existing f2f16(f2f32) rule will kick in for the first component
and constant folding will handle the second, leaving us with

   ('vec2', 'a@16', '#b@16')

...eliminating all conversions.
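
The sequence above can be sketched with a toy tuple rewriter. This is only
an illustration, not Mesa's nir_opt_algebraic machinery: the helper names
(bit_size, propagate, simplify) are hypothetical, and "constant folding" is
modelled by relabelling the constant's bit size rather than converting a
real value.

```python
# Toy model of the rewrite sequence; NOT Mesa's actual implementation.
# Expressions are tuples (op, *operands); leaves are strings such as
# 'a@16' or '#b@32' ('#' marks a constant, '@N' gives the bit size).

def bit_size(leaf):
    """Return the bit size encoded after '@' in a leaf name."""
    return int(leaf.split('@')[1])

def propagate(expr):
    """f2f16(vecN(x, y, ...)) -> vecN(f2f16(x), f2f16(y), ...)"""
    if isinstance(expr, tuple) and expr[0] == 'f2f16':
        src = expr[1]
        if isinstance(src, tuple) and src[0].startswith('vec'):
            return (src[0],) + tuple(('f2f16', c) for c in src[1:])
    return expr

def simplify(expr):
    """Per-component cleanup, applied bottom-up."""
    if not isinstance(expr, tuple):
        return expr
    expr = (expr[0],) + tuple(simplify(e) for e in expr[1:])
    if expr[0] == 'f2f16':
        src = expr[1]
        # f2f16(f2f32(a@16)) -> a@16
        if (isinstance(src, tuple) and src[0] == 'f2f32'
                and isinstance(src[1], str) and bit_size(src[1]) == 16):
            return src[1]
        # Stand-in for constant folding: relabel the constant as 16-bit.
        if isinstance(src, str) and src.startswith('#'):
            return src.split('@')[0] + '@16'
    return expr

before = ('f2f16', ('vec2', ('f2f32', 'a@16'), '#b@32'))
step1 = propagate(before)   # conversion pushed into each component
step2 = simplify(step1)     # existing rules clean up each component
print(step1)
print(step2)
```

Here step1 corresponds to the intermediate vec2 of f2f16 conversions and
step2 to the final ('vec2', 'a@16', '#b@16') form.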

v2: Predicate on !options->vectorize_vec2_16bit. As discussed, this
optimization helps greatly on true vector architectures (like Midgard)
but wreaks havoc on more modern SIMD-within-a-register architectures
(like Bifrost and modern AMD). So let's predicate on that.

v3: Extend for integers as well and add a comment explaining the
transforms.
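
As a rough sketch of what v2 and v3 describe, the rules could be generated
per vector size and per type, each carrying the predicate as a condition
string. The function name make_rules, the choice of vec2 through vec4, the
f/i/u conversion set and the '@32' source sizes are all assumptions made
for illustration; the actual patch may differ.

```python
# Hypothetical sketch: build (search, replace, condition) triples in the
# style of the tuples shown above. Not copied from the actual patch.
PREDICATE = '!options->vectorize_vec2_16bit'

def make_rules():
    rules = []
    names = ['a', 'b', 'c', 'd']
    for n in range(2, 5):              # assumed: vec2, vec3, vec4
        for t in ('f', 'i', 'u'):      # assumed: f2f16, i2i16, u2u16
            conv = '{0}2{0}16'.format(t)
            vec = 'vec{}'.format(n)
            srcs = tuple(name + '@32' for name in names[:n])
            # Search: conversion applied to the whole vector.
            search = (conv, (vec,) + srcs)
            # Replace: vector of per-component conversions.
            replace = (vec,) + tuple((conv, name) for name in names[:n])
            rules.append((search, replace, PREDICATE))
    return rules

rules = make_rules()
for rule in rules[:3]:
    print(rule)
```

Attaching the condition to every generated rule keeps the transform off for
drivers that set vectorize_vec2_16bit, which is the behaviour v2 asks for.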

Results on Midgard (unfortunately a true SIMD architecture):

total instructions in shared programs: 51359 -> 50963 (-0.77%)
instructions in affected programs: 4523 -> 4127 (-8.76%)
helped: 53
HURT: 0
helped stats (abs) min: 1 max: 86 x̄: 7.47 x̃: 6
helped stats (rel) min: 1.71% max: 28.00% x̄: 9.66% x̃: 7.34%
95% mean confidence interval for instructions value: -10.58 -4.36
95% mean confidence interval for instructions %-change: -11.45% -7.88%
Instructions are helped.

total bundles in shared programs: 25825 -> 25670 (-0.60%)
bundles in affected programs: 2057 -> 1902 (-7.54%)
helped: 53
HURT: 0
helped stats (abs) min: 1 max: 26 x̄: 2.92 x̃: 2
helped stats (rel) min: 2.86% max: 30.00% x̄: 8.64% x̃: 8.33%
95% mean confidence interval for bundles value: -3.93 -1.92
95% mean confidence interval for bundles %-change: -10.69% -6.59%
Bundles are helped.

total quadwords in shared programs: 41359 -> 41055 (-0.74%)
quadwords in affected programs: 3801 -> 3497 (-8.00%)
helped: 57
HURT: 0
helped stats (abs) min: 1 max: 57 x̄: 5.33 x̃: 4
helped stats (rel) min: 1.92% max: 21.05% x̄: 8.22% x̃: 6.67%
95% mean confidence interval for quadwords value: -7.35 -3.32
95% mean confidence interval for quadwords %-change: -9.54% -6.90%
Quadwords are helped.

total registers in shared programs: 3849 -> 3807 (-1.09%)
registers in affected programs: 167 -> 125 (-25.15%)
helped: 32
HURT: 1
helped stats (abs) min: 1 max: 3 x̄: 1.34 x̃: 1
helped stats (rel) min: 20.00% max: 50.00% x̄: 26.35% x̃: 20.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 16.67% max: 16.67% x̄: 16.67% x̃: 16.67%
95% mean confidence interval for registers value: -1.54 -1.00
95% mean confidence interval for registers %-change: -29.41% -20.69%
Registers are helped.

total threads in shared programs: 2471 -> 2520 (1.98%)
threads in affected programs: 49 -> 98 (100.00%)
helped: 25
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.96 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.88 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].

total spills in shared programs: 168 -> 168 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 186 -> 186 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4999>
2020-06-30 16:21:33 +00:00
amd          ac/gpu_info: fix num_physical_sgprs_per_simd for gfx10  2020-06-30 10:56:41 +00:00
broadcom     v3d: Enable PIPE_CAP_TGSI_TEXCOORD.  2020-06-29 09:07:21 -07:00
compiler     nir: Propagate *2*16 conversions into vectors  2020-06-30 16:21:33 +00:00
drm-shim     meson: use gnu_symbol_visibility argument  2020-06-01 18:59:18 +00:00
egl          util: rename xmlpool.h to driconf.h  2020-06-22 21:50:12 +00:00
etnaviv      etnaviv: replace all dup() with os_dupfd_cloexec()  2020-06-18 02:09:56 +00:00
freedreno    turnip: enable depthBiasClamp  2020-06-29 13:08:51 +00:00
gallium      panfrost: Do fine-grained flushing for occlusion query results  2020-06-30 15:14:05 +00:00
gbm          gbm: document that gbm_bo_map exposes a linear view  2020-06-03 10:09:52 +00:00
getopt
glx          util: rename xmlpool.h to driconf.h  2020-06-22 21:50:12 +00:00
gtest        gtest: Update to 1.10.0  2020-04-20 11:57:11 +00:00
hgl          scons: Prune out unnecessary targets.  2020-03-30 13:38:01 +00:00
imgui        meson: drop intel_ prefix on imgui_core  2019-12-10 15:16:02 +00:00
intel        anv: Align "used" attribute to 64 bits.  2020-06-25 22:11:36 -07:00
loader       loader/dri3: Check for window destruction in dri3_wait_for_event_locked  2020-06-29 17:05:52 +00:00
mapi         mapi: x86: Fix dynamic entries in x86 tsd stubs.  2020-06-26 18:28:01 +00:00
mesa         st/glsl_to_nir: disable st_nir_lower_builtin() when packing supported  2020-06-30 01:29:43 +00:00
panfrost     panfrost: Add PAN_MESA_DEBUG=gl3 flag  2020-06-26 10:30:03 +00:00
util         driconf: add workarounds for SPECviewperf13  2020-06-23 09:25:24 +00:00
vulkan       vulkan/overlay: fix crash on destroying NULL swapchain  2020-06-25 10:31:50 +00:00
meson.build  meson: use gnu_symbol_visibility argument  2020-06-01 18:59:18 +00:00
SConscript   driconf: drop now unused translation facility  2020-06-22 21:50:12 +00:00