mesa/src
Iago Toral Quiroga 58767f0fec i965/vec4: add a SIMD lowering pass
Generally, instructions in Align16 mode only ever write to a single
register and don't need any form of SIMD splitting, which is why we
have never had a SIMD splitting pass in the vec4 backend. However,
double-precision instructions typically write 2 registers and in
some cases they run into certain hardware bugs and limitations
that we need to work around by splitting the instructions so we only
write to 1 register at a time. This patch implements a SIMD splitting
pass similar to the one in the scalar backend.

Because we only use double-precision instructions in Align16 mode
on gen7 (gen8+ is fully scalar and gens < 7 do not implement fp64),
the pass should be a no-op on any other generation.

For now the pass only handles the gen7 restriction where any
instruction that writes 2 registers also needs to read 2 registers.
This affects double-precision instructions reading uniforms, for
example. Later patches will extend the lowering pass to cover a few
more cases.
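
The gen7 restriction described above can be illustrated with a minimal,
self-contained sketch. This is not the code from the patch: ToyInst,
kRegSize and lowered_simd_width() are hypothetical stand-ins for the
backend's real instruction type and helpers, and the 32-byte register
size is an assumption made for the example.

```cpp
#include <algorithm>
#include <array>

// Hypothetical stand-ins for illustration only, not Mesa's real types.
constexpr unsigned kRegSize = 32;  // assumed size of one register, in bytes

struct ToyInst {
    unsigned exec_size;                 // channels executed, e.g. 8
    unsigned size_written;              // bytes written to the destination
    std::array<unsigned, 3> size_read;  // bytes read per source, 0 = unused
};

// Returns the execution width the instruction should be split into.
// A result equal to exec_size means no splitting is needed.
unsigned lowered_simd_width(int gen, const ToyInst &inst)
{
    unsigned width = std::min(16u, inst.exec_size);

    // Only gen7 uses fp64 in Align16, so other generations are left alone.
    if (gen == 7 && inst.size_written > kRegSize) {
        for (unsigned bytes : inst.size_read) {
            if (bytes == 0)
                continue;  // unused source slot
            // The destination spans two registers but this source does
            // not: split so that each piece writes a single register.
            if (bytes <= kRegSize)
                width = std::min(width, 4u);
        }
    }
    return width;
}
```

In this toy model, an 8-wide instruction that writes 64 bytes while
reading a 32-byte source (such as a uniform) is lowered to width 4 on
gen7, and is left untouched on every other generation.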

v2:
 - Move the SIMD lowering pass after the main optimization loop and
   run copy-propagation and DCE if it reports progress (Curro)
 - Compute number of registers written instead of fixing it to 1 (Iago)
 - Use group from backend_instruction (Iago)
 - Drop assertion that checked that we only split 8-wide instructions
   into 4-wide. (Curro)
 - Don't assume that instructions can only be 8-wide, we might want
   to use 16-wide instructions in the future too (Curro)
 - Wrap gen7 workarounds in a conditional to ease adding workarounds
   for other gens in the future (Curro)
 - Handle dst/src overlap hazard (Curro)
 - Use the horiz_offset() helper to simplify the implementation (Curro)
 - Drop the assertion that checks that each split instruction writes
   exactly one register (Curro)
 - Use the copy constructor to generate split instructions with all
   the relevant fields initialized to the values in the original
   instruction instead of copying only a handful of them manually (Curro)

v3 (Iago):
 - When copying to a temporary, allocate the number of registers required
   for the copy based on the size written by the lowered instruction
   instead of assuming that all lowered instructions produce
   single-register writes
 - Adapt to changes in offset()
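
The splitting step and the temporary sizing mentioned in the v2 and v3
notes can be sketched as follows. This is a toy model under stated
assumptions, not the patch itself: SplitPiece, split_instruction() and
the 32-byte kRegSize are hypothetical names and values, and the
dst/src overlap check is reduced to a boolean supplied by the caller.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration only, not Mesa's real types.
constexpr unsigned kRegSize = 32;  // assumed size of one register, in bytes

struct SplitPiece {
    unsigned group;         // first channel covered by this piece
    unsigned exec_size;     // channels executed by this piece
    unsigned size_written;  // bytes written by this piece
    unsigned temp_regs;     // registers for a temporary, 0 if none needed
};

// Splits an instruction of exec_size channels into pieces of
// lowered_width channels. Returns an empty list if no split is needed.
std::vector<SplitPiece> split_instruction(unsigned exec_size,
                                          unsigned size_written,
                                          unsigned lowered_width,
                                          bool dst_overlaps_src)
{
    std::vector<SplitPiece> pieces;
    if (lowered_width == 0 || lowered_width >= exec_size)
        return pieces;

    const unsigned count = exec_size / lowered_width;
    const unsigned piece_bytes = size_written / count;
    for (unsigned i = 0; i < count; i++) {
        SplitPiece p{i * lowered_width, lowered_width, piece_bytes, 0};
        // On a dst/src overlap, each piece writes to a temporary that is
        // copied back afterwards. Size it from the bytes the piece
        // writes rather than assuming a single register.
        if (dst_overlaps_src)
            p.temp_regs = (piece_bytes + kRegSize - 1) / kRegSize;
        pieces.push_back(p);
    }
    return pieces;
}
```

With these assumptions, a 16-wide instruction writing 128 bytes and
lowered to width 8 yields two pieces of 64 bytes each, so a temporary
for either piece needs two registers, not one.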

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-03 11:26:51 +01:00
amd anv,radv: disable StorageImageWriteWithoutFormat for now 2016-12-31 16:38:00 -05:00
compiler glsl: Update ES 3.2 shader output restrictions. 2017-01-02 14:10:50 -08:00
egl egl: Emit correct error when robust context creation fails 2016-12-27 10:21:29 -08:00
gallium gallium/hud: fix the windows build by disabling file dumping 2017-01-02 23:18:28 +01:00
gbm gbm: request correct version of the DRI2_FENCE extension 2016-11-22 15:56:44 +00:00
getopt Introduce .editorconfig 2016-08-31 17:06:54 -07:00
glx dri: make use of loader_get_extensions_name(..) helper 2016-11-15 18:15:16 +00:00
gtest Introduce .editorconfig 2016-08-31 17:06:54 -07:00
hgl glapi/hgl: remove the final user of _glapi_check_table() 2016-10-06 15:03:46 +01:00
intel anv,radv: disable StorageImageWriteWithoutFormat for now 2016-12-31 16:38:00 -05:00
loader loader: automake: whitespace cleanup 2016-11-21 14:46:40 +00:00
mapi glapi: add missing INTEL_conservative_rasterization 2016-12-13 16:27:56 +00:00
mesa i965/vec4: add a SIMD lowering pass 2017-01-03 11:26:51 +01:00
util util: import CRC32 implementation from gallium 2016-11-22 18:05:51 +01:00
vulkan/wsi vulkan/wsi/x11: don't crash on null wsi x11 connection 2016-12-22 14:09:46 -08:00
Makefile.am amd: flatten amd/common makefile structure 2016-11-15 20:04:37 +00:00
SConscript scons: put the generated git_sha1.h file in top-level src/ directory 2016-06-17 10:33:00 -06:00