Commit graph

4709 commits

Author SHA1 Message Date
Ilia Mirkin
2d4787d77e mesa: add NV_viewport_array2 enable, attach to glsl
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4529>
2020-04-15 20:12:00 -04:00
Ilia Mirkin
cc6661bfc8 glsl: add NV_viewport_array2 support
This enables gl_Layer/gl_ViewportIndex when the ext is enabled, as well
as adding the new gl_ViewportMask[] array and viewport_relative layout
qualifier for gl_Layer.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4529>
2020-04-15 20:12:00 -04:00
Ilia Mirkin
54424a3d13 compiler: add VARYING_SLOT_VIEWPORT_MASK
See GL_NV_viewport_array2::gl_ViewportMask for how this is supposed
to work.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4529>
2020-04-15 20:12:00 -04:00
Connor Abbott
abcfb64370 ir3: Fix LDC offset units
I had missed that LDC actually uses vec4 units for its offset. This
means that we have to create a new instruction, and lower it in
ir3_nir_lower_io_offsets, similar to the existing SSBO instructions.
Unfortunately we can't assume that loads are always vec4-aligned, so we
have to use the alignment information that NIR gives us. Unfortunately,
it's currently woefully inadequate, and will have to be fixed to give us
good codegen in the future.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4568>
2020-04-15 22:38:20 +00:00
Danylo Piliaiev
600c91fed8 glsl/list: Fix undefined behaviour of foreach_* macros
These macros produced a lot of errors with ubsan preventing us from
expanding the ubsan coverage on CIs.

C++ spec has such clause:

 "If the prvalue of type "pointer to cv1 B" points to a B that is
  actually a subobject of an object of type D, the resulting pointer
  points to the enclosing object of type D. Otherwise, the result
  of the cast is undefined."

Ubsan error example:

 ../src/compiler/glsl/builtin_functions.cpp:4945:4: runtime error: downcast of address 0x559b926abb50 which does not point to an object of type 'ir_instruction'
 0x559b926abb50: note: object has invalid vptr
  9b 55 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  58 ba 6a 92 9b 55 00 00  01 00 00 00
               ^~~~~~~~~~~~~~~~~~~~~~~
               invalid vptr
     #0 0x559b914dbe1a in call ../src/compiler/glsl/builtin_functions.cpp:4945

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4129>
2020-04-14 19:29:38 +00:00
Tapani Pälli
53e4159eaa glsl: stop processing function parameters if error happened
Fixes: d1fa69ed61 ("glsl: do not attempt assignment if operand type not parsed correctly")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2696
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4341>
2020-04-13 15:53:15 +03:00
Tapani Pälli
e2457bedd3 glsl: remove redudant assignment
CID: 1461087
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4500>
2020-04-12 16:48:53 +03:00
Plamena Manolova
c77dc51203 intel/compiler: Add support for variable workgroup size
Add new builtin parameters that are used to keep track of the group
size.  This will be used to implement ARB_compute_variable_group_size.

The compiler will use the maximum group size supported to pick a
suitable SIMD variant.  A later improvement will be to keep all SIMD
variants (like FS) so the driver can select the best one at dispatch
time.

When variable workgroup size is used, the small workgroup optimization
is disabled as it we can't prove at compile time that the barriers
won't be needed.

Extracted from original i965 patch with additional changes by
Caio Marcelo de Oliveira Filho.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
2020-04-09 19:23:12 -07:00
Connor Abbott
274f3815a5 ir3: Plumb through bindless support
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
2020-04-09 15:56:55 +00:00
Timothy Arceri
52c8bc0130 nir: make opt_if_loop_terminator() less strict
nir_cf_{extract,reinsert}() can't stitch a block together
if the block we are extracting ends in a jump but other jumps
nested in further ifs should be fine to move.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4477>
2020-04-08 01:35:45 +00:00
Caio Marcelo de Oliveira Filho
5dc85abc4f nir: Add per_view attribute to nir_variable
If a nir_variable is tagged with per_view, it must be an array with
size corresponding to the number of views.  For slot-tracking, it is
considered to take just the slot for a single element -- drivers will
take care of expanding this appropriately.

This will be used to implement the ability of having per-view position
in a vertex shader in Intel platforms.

Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
2020-04-07 17:16:09 +00:00
Rob Clark
57557783f6 nir/lower_amul: fix slot calculation
Fixes incorrect indexing in
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.packed.mat2x3

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4455>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4455>
2020-04-06 18:00:17 +00:00
Rob Clark
4638a16a93 nir: add some swizzle helpers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4455>
2020-04-06 18:00:17 +00:00
Jason Ekstrand
e78a7a1825 nir: Assert memory loads are aligned
We've had alignment parameters on these operations for a while but a
bunch of places weren't setting them.  That should be resolved now so we
can start validating that they're always set.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4441>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4441>
2020-04-06 15:57:30 +00:00
Hyunjun Ko
9f174eb2df nir: fix wrong assignment to buffer in xfb_varyings_info
Tested with dEQP-VK.transform_feedback.fuzz.various_buffers.buffers100_instance_array_vertex

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4459>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4459>
2020-04-06 08:55:05 +00:00
Jason Ekstrand
a0a4df7e4f Revert "spirv: Rewrite CFG construction"
This reverts commit fa5a36dbd4.
2020-04-04 09:47:00 -05:00
Rob Clark
0b06adb750 glsl: don't limit fp16 lowering to frag
This restriction doesn't belong in core code.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>
2020-04-04 00:07:10 +00:00
Rob Clark
bf64648864 nir: fix definition of imadsh_mix16 for vectors
Fixes: c27b3758fa ("nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>
2020-04-04 00:07:10 +00:00
Daniel Schürmann
dc69738b0f nir: fix unpack_64_4x16 in lower_alu_to_scalar()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Jason Ekstrand
fa5a36dbd4 spirv: Rewrite CFG construction
This commit completely rewrites the way we extract a structured CFG from
SPIR-V.  The new approach is different in a few ways:

 1. It does a breadth-first search instead of depth-first.  This means
    that we've visited the merge node for a construct before we visit
    any of the nodes inside the construct.  This makes it easier to
    validate things like loop and switch nesting.

 2. We record more information in the CFG.  Earlier commits added a
    parent pointer to vtn_cf_node but we now record all of the merge and
    other special blocks for each CFG node.  This lets us validate
    things more precisely.

 3. It makes heavy use of merge blocks for walking the CFG.  Previously,
    we sort of used them as hints for trying to guess the CFG structure
    but things got dicey whenever a merge was missing.  We had some
    heuristics for how to handle short-circuiting if statements but it
    was a bunch of special cases.

    Now, we make them a fundamental part of walking the CFG.  When we
    encounter a control-flow construct, we add the body components of
    the construct to the BFS work list and then jump to the merge block
    if one exists to continue scanning the current CFG nesting level.
    If no merge block exists, we assume that means that control-flow
    never re-converges in a normal way and that the only way to get back
    to normality is with a direct jump such as a loop break or continue.
    This should make things far more robust when trying to deal with the
    more creative placement (or lack thereof) of merge instructions.

Reviewed-by: Alan Baker <alanbaker@google.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
2020-04-03 20:54:00 +00:00
Jason Ekstrand
2de5a41595 spirv: Add a parent field to vtn_cf_node
This makes it easier to crawl up the CF tree when trying to validate the
incoming SPIR-V control-flow.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
2020-04-03 20:54:00 +00:00
Jason Ekstrand
d94e464a9f spirv: Make vtn_function a vtn_cf_node
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
2020-04-03 20:54:00 +00:00
Jason Ekstrand
255aacbec1 spirv: Make vtn_case a vtn_cf_node
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
2020-04-03 20:54:00 +00:00
Jason Ekstrand
9d7fcf1de0 spirv: Add cast and loop helpers for vtn_cf_node
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
2020-04-03 20:54:00 +00:00
Jason Ekstrand
8c5c65d0d6 spirv: Add a vtn_block() helper
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
2020-04-03 20:54:00 +00:00
Jason Ekstrand
36a32af008 nir/load_store_vectorize: Add support for nir_var_mem_global
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Jason Ekstrand
b6273291b5 nir/load_store_vectorize: Use nir_iadd_imm for offsets
This makes it capable of handling 64-bit offsets

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Jason Ekstrand
04d08ea149 nir/load_store_vectorize: Fix shared atomic info
These were clearly copied and pasted from SSBOs.  The shared atomics
don't have an SSBO index so their offset is src0 and data is src1.

Fixes: ce9205c03b "nir: add a load/store vectorization pass"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Neil Roberts
63b4fcba33 glsl/lower_precision: Use vector.back() instead of vector.end()[-1]
The use of vector.end()[-1] seems to generate warnings in Coverity about
not allowing a negative argument to a parameter. The intention with the
code snippet is just to access the last element of the vector. The
vector.back() call acheives the same thing, is clearer and will
hopefully fix the Coverity warning.

I’m not exactly sure why Coverity thinks the array index can’t be
negative. cplusplus.com says that vector::end() returns a random access
iterator and that the type of the array index operator argument to that
should be the difference type for the container. It then also says that
difference_type for a vector is "a signed integral type".

Reviewed-by: Eric Anholt <eric@anholt.net>
2020-04-03 09:10:17 +02:00
Jason Ekstrand
c71c1f44b0 nir/from_ssa: Only chain movs when a src is also a dest
The algorithm we use for resolving parallel copy instructions plays this
little shell game with the values.  The reason for this is that it lets
us handle cases where, for instance we have a -> b and b -> a and we
need to use a temporary to do a swap.  One result of this algorithm is
that it tends to emit a lot of mov chains which are typcially really bad
for GPUs where a mov is far from free.  For instance, it's likely to
turn this:

    r16 = ssa_0; r17 = ssa_0; r18 = ssa_0; r15 = ssa_0

into this:

    r15 = mov ssa_0
    r18 = mov r15
    r17 = mov r18
    r16 = mov r17

which, if it's the only thing in a block (this is common for phis) is
impossible for a scheduler to fix because of the dependencies and you
end up with significant stalling.  If, on the other hand, we only do the
chaining in the actual case where we need to free up a so that it can be
used as a destination, we can emit this:

    r15 = mov ssa_0
    r18 = mov ssa_0
    r17 = mov ssa_0
    r16 = mov ssa_0

which is far nicer to the scheduler.  On Intel, our copy propagation
pass will undo the chain for us so this has no shader-db impact.
However, for less intelligent back-ends, it's probably a lot better.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4412>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4412>
2020-04-02 19:06:46 +00:00
Timothy Arceri
d259768e62 glsl_to_nir: remove dead code
This code was made unused by the changes described in be2990d8fb.

NIR based Gallium drivers switched to the NIR based lowering in
efa4fc0ebd.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4415>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4415>
2020-04-02 04:49:10 +00:00
Mark Janes
c07bbdbe82 nir: place aligned members after bitfields in shader_info.tess
The placement of new shader_info.tess members unnecessarily wastes
space by interspersing 64bit members between bitfields.

Fixes: f1dd81ae10 ("nir: Collect if shader uses cross-invocation or indirect I/O.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
2020-04-01 20:25:55 +00:00
Mark Janes
90a8b458ac nir: check shader type before writing to shaderinfo.tess union
If the shader is not a tesselation shader, then writing to the tess
member of the shaderinfo union will overwrite other members and crash.

Closes: #2722
Fixes: f1dd81ae10 ("nir: Collect if shader uses cross-invocation or indirect I/O.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4408>
2020-04-01 20:25:55 +00:00
Jason Ekstrand
68f325b256 Revert "spirv: Implement OpCopyObject and OpCopyLogical as blind copies"
This reverts commit 7a53e67816.
2020-04-01 12:40:34 -05:00
Ian Romanick
b097e326b8 nir/algebraic: Remove a redundant fabs pattern
Made redundant by 5544b2cbbd ("nir/algebraic: Use value range analysis
to eliminate useless unary ops").

No shader-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick
af1bc7e0c7 nir/algebraic: Use value range analysis to convert fmax to fsat
This is conceptually similar to the 1-fsat(a) <=> fsat(1-a) rearragement
done in:

3b74790941 ("nir/algebraic: Recognize open-coded flrp(a, b, fsat(c))")

2d259713b7 ("nir/algebraic: Commute 1-fsat(a) to fsat(1-a) for all
non-fmul instructions").

Note: this helps the Aztex Ruins shader that was hurt for spills and
fills on Braodwell in the previous commit, but it does not fix the
spills or fills. :(

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14528985 -> 14526116 (-0.02%)
instructions in affected programs: 477300 -> 474431 (-0.60%)
helped: 2332
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 1.23 x̃: 1
helped stats (rel) min: 0.07% max: 8.89% x̄: 0.88% x̃: 0.64%
95% mean confidence interval for instructions value: -1.27 -1.19
95% mean confidence interval for instructions %-change: -0.92% -0.85%
Instructions are helped.

total cycles in shared programs: 203723684 -> 203692984 (-0.02%)
cycles in affected programs: 4878847 -> 4848147 (-0.63%)
helped: 1764
HURT: 324
helped stats (abs) min: 1 max: 706 x̄: 22.94 x̃: 17
helped stats (rel) min: <.01% max: 17.75% x̄: 1.94% x̃: 1.66%
HURT stats (abs)   min: 1 max: 400 x̄: 30.15 x̃: 10
HURT stats (rel)   min: <.01% max: 17.76% x̄: 1.91% x̃: 0.69%
95% mean confidence interval for cycles value: -16.55 -12.86
95% mean confidence interval for cycles %-change: -1.44% -1.24%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick
62795475e8 nir/algebraic: Distribute source modifiers into instructions
There are three main classes of cases that are helped by this change:

1. When the negation is applied to a value being type converted (e.g.,
   float(-x)).  This could possibly also be handled with more clever
   code generation.

2. When the negation is applied to a phi node source (e.g., x = -(...);
   at the end of a basic block).  This was the original case that caught
   my attention while looking at shader-db dumps.

3. When the negation is applied to the source of an instruction that
   cannot have source modifiers.  This includes texture instructions and
   math box instructions on pre-Gen7 platforms (see more details below).

In many these cases the negation can be propagated into the instructions
that generate the value (e.g., -(a*b) = (-a)*b).

In addition to the operations implemtned in this patch, I also tried:

 - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on
   pre-Gen7.  On Gen6 and earlier, frcp is a math box instruction, and
   math box instructions cannot have source modifiers.

   I suspect this is why so many more shaders are helped on Gen6 than on
   Gen5 or Gen7.  Gen6 supports OpenGL 3.3, so a lot more shaders
   compile on it.  A lot of these shaders may have things like cos(-x)
   or rcp(-x) that could result in an explicit negation instruction.

 - bcsel - Hurt a few shaders with none helped.  bcsel operates on
   integer sources, so the fabs or fneg cannot be a source modifier in
   the bcsel itself.

 - Integer instructions - No changes on any Intel platform.

Some notes about the shader-db results below.

 - On Tiger Lake, a single Deus Ex fragment shader is hurt for both
   spills and fills.

 - On Haswell, a different Deus Ex fragment shader is hurt for both
   spills and fills.

 - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2
   (very high graphics settings, lol) fragment shader that upgrades
   from SIMD8 to SIMD16.

v2: Add support for fsign.  Add some patterns that remove redundant
negations and redundant absolute value rather than trying to push them
down the tree.

Tiger Lake
total instructions in shared programs: 17611333 -> 17586465 (-0.14%)
instructions in affected programs: 3033734 -> 3008866 (-0.82%)
helped: 10310
HURT: 632
helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1
helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01%
HURT stats (abs)   min: 1 max: 47 x̄: 3.21 x̃: 2
HURT stats (rel)   min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.32% -1.27%
Instructions are helped.

total cycles in shared programs: 338365223 -> 338262252 (-0.03%)
cycles in affected programs: 125291811 -> 125188840 (-0.08%)
helped: 5224
HURT: 2031
helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12
helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97%
HURT stats (abs)   min: 1 max: 2882 x̄: 69.50 x̃: 14
HURT stats (rel)   min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74%
95% mean confidence interval for cycles value: -18.71 -9.68
95% mean confidence interval for cycles %-change: -0.80% -0.63%
Cycles are helped.

total spills in shared programs: 8942 -> 8946 (0.04%)
spills in affected programs: 8 -> 12 (50.00%)
helped: 0
HURT: 1

total fills in shared programs: 9399 -> 9401 (0.02%)
fills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 16124348 -> 16102258 (-0.14%)
instructions in affected programs: 2830928 -> 2808838 (-0.78%)
helped: 11294
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72%
95% mean confidence interval for instructions value: -1.99 -1.93
95% mean confidence interval for instructions %-change: -1.34% -1.29%
Instructions are helped.

total cycles in shared programs: 335393932 -> 335325794 (-0.02%)
cycles in affected programs: 123834609 -> 123766471 (-0.06%)
helped: 5034
HURT: 2128
helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11
helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2634 x̄: 70.63 x̃: 16
HURT stats (rel)   min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62%
95% mean confidence interval for cycles value: -13.66 -5.37
95% mean confidence interval for cycles %-change: -0.69% -0.48%
Cycles are helped.

LOST:   0
GAINED: 2

Skylake
total instructions in shared programs: 14949240 -> 14927930 (-0.14%)
instructions in affected programs: 2594756 -> 2573446 (-0.82%)
helped: 11000
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.91
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 324829346 -> 324821596 (<.01%)
cycles in affected programs: 121566087 -> 121558337 (<.01%)
helped: 4611
HURT: 2147
helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10
helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2551 x̄: 67.88 x̃: 16
HURT stats (rel)   min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89%
95% mean confidence interval for cycles value: -4.25 1.96
95% mean confidence interval for cycles %-change: -0.28% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14971203 -> 14949957 (-0.14%)
instructions in affected programs: 2635699 -> 2614453 (-0.81%)
helped: 10982
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.90
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 336215033 -> 336086458 (-0.04%)
cycles in affected programs: 127383198 -> 127254623 (-0.10%)
helped: 4884
HURT: 1963
helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12
helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3401 x̄: 63.33 x̃: 16
HURT stats (rel)   min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70%
95% mean confidence interval for cycles value: -29.99 -7.57
95% mean confidence interval for cycles %-change: -0.89% -0.71%
Cycles are helped.

total fills in shared programs: 24905 -> 24901 (-0.02%)
fills in affected programs: 117 -> 113 (-3.42%)
helped: 4
HURT: 0

LOST:   0
GAINED: 16

Haswell
total instructions in shared programs: 13148927 -> 13131528 (-0.13%)
instructions in affected programs: 2220941 -> 2203542 (-0.78%)
helped: 8017
HURT: 4
helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93%
HURT stats (abs)   min: 1 max: 7 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91%
95% mean confidence interval for instructions value: -2.21 -2.13
95% mean confidence interval for instructions %-change: -1.43% -1.37%
Instructions are helped.

total cycles in shared programs: 321221791 -> 321079870 (-0.04%)
cycles in affected programs: 126886055 -> 126744134 (-0.11%)
helped: 4674
HURT: 1729
helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16
helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3694 x̄: 70.58 x̃: 18
HURT stats (rel)   min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90%
95% mean confidence interval for cycles value: -33.31 -11.02
95% mean confidence interval for cycles %-change: -0.99% -0.78%
Cycles are helped.

total spills in shared programs: 19872 -> 19874 (0.01%)
spills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

total fills in shared programs: 20941 -> 20941 (0.00%)
fills in affected programs: 62 -> 62 (0.00%)
helped: 1
HURT: 1

LOST:   0
GAINED: 8

Ivy Bridge
total instructions in shared programs: 11875553 -> 11853839 (-0.18%)
instructions in affected programs: 1553112 -> 1531398 (-1.40%)
helped: 7304
HURT: 3
helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94%
95% mean confidence interval for instructions value: -3.04 -2.90
95% mean confidence interval for instructions %-change: -1.65% -1.59%
Instructions are helped.

total cycles in shared programs: 178246425 -> 178184484 (-0.03%)
cycles in affected programs: 13702146 -> 13640205 (-0.45%)
helped: 4409
HURT: 1566
helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13
helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02%
HURT stats (abs)   min: 1 max: 356 x̄: 29.48 x̃: 10
HURT stats (rel)   min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70%
95% mean confidence interval for cycles value: -11.60 -9.14
95% mean confidence interval for cycles %-change: -1.19% -0.99%
Cycles are helped.

LOST:   0
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10695740 -> 10667483 (-0.26%)
instructions in affected programs: 2337607 -> 2309350 (-1.21%)
helped: 10720
HURT: 1
helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04%
95% mean confidence interval for instructions value: -2.69 -2.58
95% mean confidence interval for instructions %-change: -1.57% -1.51%
Instructions are helped.

total cycles in shared programs: 153478839 -> 153416223 (-0.04%)
cycles in affected programs: 22050900 -> 21988284 (-0.28%)
helped: 5342
HURT: 2200
helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16
helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86%
HURT stats (abs)   min: 1 max: 335 x̄: 20.93 x̃: 6
HURT stats (rel)   min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30%
95% mean confidence interval for cycles value: -9.18 -7.42
95% mean confidence interval for cycles %-change: -0.82% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 8114882 -> 8105574 (-0.11%)
instructions in affected programs: 1232504 -> 1223196 (-0.76%)
helped: 4109
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.31 -2.21
95% mean confidence interval for instructions %-change: -1.01% -0.96%
Instructions are helped.

total cycles in shared programs: 188504036 -> 188466296 (-0.02%)
cycles in affected programs: 31203798 -> 31166058 (-0.12%)
helped: 3447
HURT: 36
helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13%
HURT stats (abs)   min: 2 max: 30 x̄: 7.33 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10%
95% mean confidence interval for cycles value: -11.16 -10.51
95% mean confidence interval for cycles %-change: -0.22% -0.20%
Cycles are helped.

LOST:   0
GAINED: 1

GM45
total instructions in shared programs: 4989697 -> 4984531 (-0.10%)
instructions in affected programs: 703952 -> 698786 (-0.73%)
helped: 2493
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.13 -2.01
95% mean confidence interval for instructions %-change: -1.07% -0.99%
Instructions are helped.

total cycles in shared programs: 128929136 -> 128903886 (-0.02%)
cycles in affected programs: 21583096 -> 21557846 (-0.12%)
helped: 2214
HURT: 17
helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13%
HURT stats (abs)   min: 2 max: 8 x̄: 4.24 x̃: 4
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09%
95% mean confidence interval for cycles value: -11.75 -10.88
95% mean confidence interval for cycles %-change: -0.25% -0.22%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick
c0bdf37c91 nir/algebraic: Change the default cursor location when replacing a unary op
If the expression tree that is being replaced has a unary operation at
its root, set the cursor (location where new instructions are inserted)
at the source instruction instead.

This doesn't do much now because there are very few patterns that have a
unary operation as the root.  Almost all of the patterns that do have a
unary operation as the root have inot.  All of the shaders that are
affected by this commit have expression trees with an inot at the root.

This change prevents some significant, spurious caused by the next
commit.  There is further explanation in the large comment added in
the code.

I also considered a couple other options that may still be worth exploring.

1. Add some mark-up to the search pattern to denote where new
   instructions should be added.  I considered using "@" to denote the
   cursor location.  For example,

    (('fneg', ('fadd@', a, b)), ...)

2. To prevent other kinds of unintended code motion, add the ability to
   name expressions in the search pattern so that they can be reused in
   the replacement.  For example,

   (('bcsel', ('ige', ('find_lsb=b', a), 0), ('find_lsb', a), -1), b),

   An alternative would be to add some kind of CSE at the time of
   inserting the replacements.  Create a new instruction, then check to
   see if it already exists.  That option might be better overall.

Over the years I know Matt has heard me complain, "I added a pattern
that just deleted an instruction, but it added a bunch of spills!"  This
was always in large, complex shaders that are very hard to analyze.  I
always blamed these cases on the scheduler being dumb.  I am now very
suspicious that unintended code motion was the real problem.

All Gen4+ Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611405 -> 17611333 (<.01%)
instructions in affected programs: 18613 -> 18541 (-0.39%)
helped: 41
HURT: 13
helped stats (abs) min: 1 max: 18 x̄: 4.46 x̃: 4
helped stats (rel) min: 0.27% max: 5.68% x̄: 1.29% x̃: 1.34%
HURT stats (abs)   min: 1 max: 20 x̄: 8.54 x̃: 7
HURT stats (rel)   min: 0.30% max: 4.20% x̄: 2.15% x̃: 2.38%
95% mean confidence interval for instructions value: -3.29 0.63
95% mean confidence interval for instructions %-change: -0.95% 0.02%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 338366118 -> 338365223 (<.01%)
cycles in affected programs: 257889 -> 256994 (-0.35%)
helped: 42
HURT: 15
helped stats (abs) min: 2 max: 120 x̄: 39.38 x̃: 34
helped stats (rel) min: 0.04% max: 2.55% x̄: 0.86% x̃: 0.76%
HURT stats (abs)   min: 6 max: 204 x̄: 50.60 x̃: 34
HURT stats (rel)   min: 0.11% max: 4.75% x̄: 1.12% x̃: 0.56%
95% mean confidence interval for cycles value: -30.39 -1.02
95% mean confidence interval for cycles %-change: -0.66% -0.02%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Timothy Arceri
0f4a81430e nir: fix crash in varying packing on interface mismatch
For example when the outputs are scalars but the inputs are struct
members.

Fixes: 26aa460940 ("nir: rewrite varying component packing")

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4351>
2020-03-31 23:43:31 +00:00
Jason Ekstrand
7a53e67816 spirv: Implement OpCopyObject and OpCopyLogical as blind copies
Because the types etc. are required to logically match, we can just
copy-propagate the guts of the vtn_value.  This was causing issues with
some new CTS tests that are doing an OpCopyObject of a sampler which is
a special-cased type in spirv_to_nir.  Of course, this is only a partial
solution.  Ideally, we've got a bit of work to do to make all the
composite stuff able to handle all types including images, sampler, and
combined image/samplers but this gets some CTS tests passing.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4375>
2020-03-31 17:55:30 +00:00
Jason Ekstrand
9468f0729b nir: Handle vec8/16 in nir_shrink_array_vars
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
c26bf848ba nir: Handle vec8/16 in opt_undef_vecN
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
99540edfde nir: Treat vec8/16 as select in opt_peephole_select
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
e3554a293b nir: Handle vec8/16 in opt_split_alu_of_phi
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
2aab7999e4 nir: Handle vec8/16 in lower_regs_to_ssa
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
1033255952 nir: Handle vec8/16 in lower_phis_to_scalar
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
ac7a940eba nir: Handle vec8/16 in gather_ssa_types
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
a18c4ee7b0 nir: Handle vec8/16 in bool_to_bitsize
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
f5bbdf7621 nir: Copy propagate through vec8s and vec16s
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00
Jason Ekstrand
842338e2f0 nir: Add a nir_op_is_vec helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4365>
2020-03-31 00:18:05 +00:00