Commit graph

377 commits

Author SHA1 Message Date
Connor Abbott
eaf799ddff nir: add nir_foreach_instr_safe_reverse()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 09:49:53 -07:00
Connor Abbott
8eea091747 nir: add nir_instr_is_first() and nir_instr_is_last() helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 09:47:22 -07:00
Iago Toral Quiroga
6b09598d63 nir: add nir_var_shader_storage
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-07-14 07:04:03 +02:00
Kenneth Graunke
efb36271a9 nir: Fix comment above nir_convert_from_ssa() prototype.
Connor renamed the parameter, inverting the sense.
Update the comment accordingly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-07-08 11:28:08 -07:00
Rob Clark
959b47262b nir/lower_phis_to_scalar: undef is trivially scalarizable
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-07-03 08:56:09 -04:00
Jason Ekstrand
89bd5ee64c nir: Don't allow copying SSA destinations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-07-02 15:42:33 -07:00
Connor Abbott
aa7d4cecec nir: remove parent_instr from nir_register
It's no longer used.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-30 11:18:27 -07:00
Connor Abbott
f49e51ef44 nir: remove nir_src_get_parent_instr()
It's now unused.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-30 11:18:27 -07:00
Connor Abbott
2b1a1d8b12 nir/from_ssa: add a flag to not convert everything from SSA
We already don't convert constants out of SSA, and in our backend we'd
like to have only one way of saying something is still in SSA.

The one tricky part about this is that we may now leave some undef
instructions around if they aren't part of a phi-web, so we have to be
more careful about deleting them.

v2: rename and flip meaning of flag (Jason)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-30 11:18:27 -07:00
Rob Clark
dc7e6463d3 nir: cleanup open-coded instruction casts
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-06-30 12:13:44 -04:00
Kenneth Graunke
6026f7e8fb nir: Recognize max(min(a, 1.0), 0.0) as fsat(a).
We already recognize min(max(a, 0.0), 1.0) as a saturate, but neglected
this variant (which is also handled by the GLSL IR pass).

shader-db results on Broadwell:
total instructions in shared programs: 7363046 -> 7362788 (-0.00%)
instructions in affected programs:     11928 -> 11670 (-2.16%)
helped:                                64
HURT:                                  0

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-06-25 02:12:32 -07:00
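
As a rough illustration of why both orderings can be treated as a saturate, the Python sketch below (not Mesa code; `fsat`, `clamp_min_max`, `clamp_max_min` and `variants_agree` are hypothetical names) checks that the two forms agree on ordinary inputs. NaN is deliberately left out, since its result depends on the min/max semantics of the target.

```python
def fsat(a):
    """Saturate: clamp a to the closed interval [0.0, 1.0]."""
    if a < 0.0:
        return 0.0
    if a > 1.0:
        return 1.0
    return a


def clamp_min_max(a):
    # The form the commit message says was already recognized.
    return min(max(a, 0.0), 1.0)


def clamp_max_min(a):
    # The variant this commit adds.
    return max(min(a, 1.0), 0.0)


# Non-NaN samples only (assumption: NaN behaviour is out of scope here).
SAMPLES = [-2.0, -0.0, 0.0, 0.25, 1.0, 1.5, float("inf"), float("-inf")]


def variants_agree(samples=SAMPLES):
    """True if both clamp orderings match fsat on every sample."""
    return all(fsat(a) == clamp_min_max(a) == clamp_max_min(a) for a in samples)
```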
Kenneth Graunke
147cdb53ec nir: Use a switch statement for detecting move-like operations.
Suggested by Jason Ekstrand.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-06-24 10:35:04 -07:00
Kenneth Graunke
1762568fd3 nir: Allow vec2/vec3/vec4 instructions in the select peephole pass.
These are basically just moves, so they should be safe as well.

When disabling i965's GLSL IR level scalarizer (channel expressions)
pass, I started seeing NIR code like this:

        if ssa_21 {
                block block_1:
                /* preds: block_0 */
                vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30
                /* succs: block_3 */
        } else {
                block block_2:
                /* preds: block_0 */
                /* succs: block_3 */
        }
        block block_3:
        /* preds: block_1 block_2 */
        vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2

Previously, the GLSL IR scalarizer pass would break the vec4 into a
series of fmovs, which were allowed by the peephole pass.  But with
the vec4 operation, they were not.  We want to keep getting selects.

Normal i965 on Broadwell:
instructions in affected programs:     200 -> 176 (-12.00%)
helped:                                4

With brw_fs_channel_expressions() disabled:
instructions in affected programs:     1832 -> 1646 (-10.15%)
helped:                                30

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-22 14:08:36 -07:00
Jordan Justen
2867f2e8cd nir: Add barrier intrinsic function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:40 -07:00
Chris Forbes
e7f628c2fc glsl: Add ir node for barrier
v2:
 * Changes suggested by mattst88

[jordan.l.justen@intel.com: Add nir support]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:39 -07:00
Timothy Arceri
86a74e9b6b nir: use src for ssa helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-03 06:50:39 +10:00
Timothy Arceri
5f7b8fa481 nir: remove extra semicolon
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-03 06:50:33 +10:00
Eduardo Lima Mitev
5b226a1242 nir: prevent use-after-free condition in should_lower_phi()
lower_phis_to_scalar() pass recurses the instruction dependence graph to
determine if all the sources of a given instruction are scalarizable.
To prevent cycles, it temporarily marks the phi instruction before recursing in,
then updates the entry with the resulting value. However, it does not consider
that the entry value may have changed after a recursion pass, hence causing
a use-after-free situation and a crash.

This patch fixes this by reloading the entry corresponding to the 'phi'
after recursing and before updating its value.

The crash can be reproduced ~20% of the time with the dEQP test:

dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-02 20:21:49 +02:00
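
The reload-after-recursion pattern described above can be sketched as follows. This is a toy model, not the Mesa implementation: Python cannot exhibit a real use-after-free, so `TinyTable` (a hypothetical class) simulates a hash table whose entries are re-created when it grows, and `should_lower` (also hypothetical) takes a `reload_entry` flag to contrast the two behaviours. Marking the phi as provisionally scalarizable is an assumption of this sketch.

```python
class TinyTable:
    """Toy table whose entry objects are re-created whenever it grows,
    mimicking a C hash table that reallocates its storage."""

    def __init__(self):
        self.entries = []
        self.capacity = 1

    def search(self, key):
        for entry in self.entries:
            if entry[0] == key:
                return entry
        return None

    def insert(self, key, value):
        if len(self.entries) == self.capacity:
            self.capacity *= 2
            # Copy every entry: previously returned entry objects go stale.
            self.entries = [list(e) for e in self.entries]
        entry = [key, value]
        self.entries.append(entry)
        return entry


def should_lower(phi, sources, table, reload_entry=True):
    """sources maps a phi name to a list of other phi names or bool leaves."""
    entry = table.search(phi)
    if entry is not None:
        return entry[1]
    # Provisional mark so that cycles in the graph terminate.
    entry = table.insert(phi, True)
    result = True
    for src in sources[phi]:
        ok = src if isinstance(src, bool) else should_lower(
            src, sources, table, reload_entry)
        if not ok:
            result = False
            break
    if reload_entry:
        # The fix: look the entry up again, the table may have grown.
        entry = table.search(phi)
    entry[1] = result
    return result
```

With `sources = {"phi_a": ["phi_b", False], "phi_b": [True]}`, visiting `phi_b` grows the table while `phi_a` still holds its old entry. Without the reload, the final write lands on the stale copy and the table keeps the provisional value.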
Iago Toral Quiroga
2231cf0ba3 nir: Fix output swizzle in get_mul_for_src
When we compute the output swizzle we want to consider the number of
components in the add operation. So far we were using the writemask
of the multiplication for this instead, which is not correct.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-28 18:25:37 +02:00
Matt Turner
5614bcc416 nir: Remove sRGB colorspace conversion round-trip.
Some shaders in Civilization V and Beyond Earth do

   pow(pow(x, 2.2), 0.454545)

which is converting to and from sRGB colorspace.

A more general rule that replaces pow(pow(a, b), c) with pow(a, b * c)
actually regresses two shaders in Sun Temple in which the result of the
inner pow is used twice, once by another pow and once by another
instruction. Also, since 2.2 * 0.454545 isn't exactly one, the more
general pattern would have still left us with a pow, and I'm 2.2 *
0.454545 percent sure that's not what they want.

instructions in affected programs:     934 -> 886 (-5.14%)
helped:                                16
2015-05-22 11:26:36 -07:00
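
A quick numeric check of the reasoning above, as a Python sketch (the constant and function names are hypothetical): the product of the two exponents is close to, but not exactly, one, so the round trip is only approximately the identity.

```python
import math

GAMMA = 2.2
INV_GAMMA = 0.454545  # the truncated reciprocal quoted in the commit message


def exponent_product():
    """Exponent the general pow(a, b * c) rule would be left with."""
    return GAMMA * INV_GAMMA


def srgb_round_trip(x):
    """pow(pow(x, 2.2), 0.454545), the pattern seen in the shaders."""
    return math.pow(math.pow(x, GAMMA), INV_GAMMA)
```

Replacing the round trip with plain `x` therefore changes results very slightly, which is the trade-off the commit accepts.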
Jason Ekstrand
2126c68e5c nir: Get rid of the array elements parameter on load/store intrinsics
Previously, we used intrinsic->const_index[1] to represent "the number of
array elements to load" for load/store intrinsics.  However, this is set to 1
by every pass that ever creates a load/store intrinsic.  Also, while it
might make some sense for registers, it makes no sense whatsoever in SSA.
On top of that, the i965 backend was the only backend to ever support it;
freedreno and vc4 just assert that it's always 1.  Let's just delete it.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-20 09:28:06 -07:00
Francisco Jerez
d91d6b3f03 nir: Translate memory barrier intrinsics from GLSL IR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
f8f8b31847 nir: Translate image load, store and atomic intrinsics from GLSL IR.
v2: Undefine coordinate components not applicable to the target.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
6de78e6b0c nir: Fix indexing of atomic counter arrays with a constant value.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
f1269a3e01 nir: Add memory barrier intrinsic.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Francisco Jerez
d9e930997f nir: Define image load, store and atomic intrinsics.
v2: Undefine coordinate components not applicable to the target.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-05-12 15:47:57 +03:00
Tapani Pälli
95774ca258 nir: fix sampler lowering pass for arrays
This fixes bugs with special cases where we have arrays of
structures containing samplers or arrays of samplers.

I've verified that the patch results in calculating the same index value as
returned by _mesa_get_sampler_uniform_value for IR. The patch makes the
following ES3 conformance test pass:

	ES3-CTS.shaders.struct.uniform.sampler_array_fragment

v2: remove unnecessary comment (Topi)
    simplify changes and the overall code (Jason)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90114
2015-05-12 14:28:16 +03:00
Kenneth Graunke
d6fb155f30 nir: Fix aggressive typos in nir_from_ssa.c.
s/agressive/aggressive/g

Trivial.
2015-05-08 19:38:14 -07:00
Jason Ekstrand
fb5f411248 nir/search: Save/restore the variables_seen bitmask when matching
Shader-db results on Broadwell:

   total instructions in shared programs: 7152330 -> 7137006 (-0.21%)
   instructions in affected programs:     1330548 -> 1315224 (-1.15%)
   helped:                                5797
   HURT:                                  76
   GAINED:                                0
   LOST:                                  8

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:15 -07:00
Jason Ekstrand
e0cfe59c37 nir/search: Assert that variable id's are in range
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:15 -07:00
Jason Ekstrand
13facfbd5b nir/search: handle explicitly sized sources in match_value
Previously, this case was being handled in match_expression prior to
calling match_value.  However, there is really no good reason for this
given that match_value has all of the information it needs.  Also, they
weren't being handled properly in the commutative case and putting it in
match_value gives us that for free.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:29:14 -07:00
Jason Ekstrand
f752effa08 nir/nir: Use a linked list instead of a hash set for use/def sets
This commit switches us from the current setup of using hash sets for
use/def sets to using linked lists.  Doing so should save us quite a bit of
memory because we aren't carrying around 3 hash sets per register and 2 per
SSA value.  It should also save us CPU time because adding/removing things
from use/def sets is 4 pointer manipulations instead of a hash lookup.

Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists:

   GLSL IR Only:        586.4 +/- 1.653833
   NIR with hash sets:  675.4 +/- 2.502108
   NIR + use/def lists: 641.2 +/- 1.557043

I also ran a memory usage experiment with Ken's patch to delete GLSL IR and
keep NIR.  This patch cuts an additional 42.9 MiB of ralloc'd memory over
and above what we gained by deleting the GLSL IR on the same dota trace.

On the code complexity side of things, some things are now much easier and
others are a bit harder.  One of the operations we perform constantly in
optimization passes is to replace one source with another.  Due to the fact
that an instruction can use the same SSA value multiple times, we had to
iterate through the sources of the instruction and determine if the use we
were replacing was the only one before removing it from the set of uses.
With this patch, uses are per-source not per-instruction so we can just
remove it safely.  On the other hand, trying to iterate over all of the
instructions that use a given value is more difficult.  Fortunately, the
two places we do that are the ffma peephole where it doesn't matter and GCM
where we already gracefully handle duplicate visits to an instruction.

Another aspect here is that using linked lists in this way can be tricky to
get right.  With sets, things were quite forgiving and the worst that
happened if you didn't properly remove a use was that it would get caught
in the validator.  With linked lists, it can lead to linked list corruption
which can be harder to track.  However, we do just as much validation of
the linked lists as we did of the sets so the validator should still catch
these problems.  While working on this series, the vast majority of the
bugs I had to fix were caught by assertions.  I don't think the lists are
going to be that much worse than the sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
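
The per-source use list described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the NIR data structures: `ListNode`, `SSAValue` and `Src` are hypothetical names, and only SSA uses are modelled.

```python
class ListNode:
    """Intrusive doubly linked list node; a lone node points at itself."""

    def __init__(self):
        self.prev = self
        self.next = self

    def insert_after(self, node):
        # Four pointer writes, no hashing.
        node.prev = self
        node.next = self.next
        self.next.prev = node
        self.next = node

    def unlink(self):
        self.prev.next = self.next
        self.next.prev = self.prev
        self.prev = self
        self.next = self


class SSAValue:
    def __init__(self, name):
        self.name = name
        self.uses = ListNode()  # sentinel head of the use list

    def use_count(self):
        count = 0
        node = self.uses.next
        while node is not self.uses:
            count += 1
            node = node.next
        return count


class Src:
    """One source slot of an instruction; it owns its own list link."""

    def __init__(self, instr_name):
        self.instr_name = instr_name
        self.link = ListNode()
        self.value = None

    def rewrite(self, new_value):
        # Uses are per source, so the old link can be dropped without
        # checking whether the instruction uses the value elsewhere.
        if self.value is not None:
            self.link.unlink()
        self.value = new_value
        if new_value is not None:
            new_value.uses.insert_after(self.link)
```

If two sources of the same instruction read the same value, that value's list holds two links, which is why walking the list to find using instructions can visit an instruction more than once.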
Jason Ekstrand
ecc2cfc8b6 nir: Use nir_instr_rewrite_src in copy propagation
We were rolling our own rewrite_src variant in copy-propagation.  Let's
stop doing that and use the ones in core NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
f72a8d1cf0 nir: Add a function for rewriting the condition of an if statement
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
300d729436 nir: Add and use initializer #defines for nir_src and nir_dest
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
6702ebce57 nir: Modernize the out-of-SSA pass
The out-of-SSA pass was one of the first passes written when getting SSA
up-and-going (for obvious reasons).  As such, it came before a lot of the
nifty SSA-based helpers were introduced.  This commit modernizes it so that
we're no longer doing nearly as much manual banging on use/def sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
7ee0216e2d nir/validate: Validate SSA def parent instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Ian Romanick
3bdbc1e436 nir: Delete all traces of nir_op_flog
Nothing produces it, and nothing can consume it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
ad51f9b421 nir: Don't produce nir_op_flog from GLSL IR
All paths that produce GLSL IR for NIR lower ir_unop_log.  All paths
that consume NIR will explode if they get a nir_op_flog.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
e0a17f6e31 nir: Delete all traces of nir_op_fexp
Nothing produces it, and nothing can consume it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
a45d55f17c nir: Don't produce nir_op_fexp from GLSL IR
All paths that produce GLSL IR for NIR lower ir_unop_exp.  All paths
that consume NIR will explode if they get a nir_op_fexp.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Matt Turner
8e029105c2 nir: Allow feq/fne/ieq/ine to be optimized with inot.
instructions in affected programs:     380 -> 376 (-1.05%)
helped:                                2

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
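
The commit body does not spell out the rewrite. One plausible reading, assumed here, is that a logical NOT of an equality comparison folds into the opposite comparison. The Python sketch below (hypothetical helper names) checks that this holds for floats, including NaN.

```python
import itertools


def feq(a, b):
    return a == b


def fne(a, b):
    return a != b


def inot(x):
    return not x


def inot_folds_into_compare(samples):
    """True if inot(feq) == fne and inot(fne) == feq on all sample pairs."""
    for a, b in itertools.product(samples, repeat=2):
        if inot(feq(a, b)) != fne(a, b):
            return False
        if inot(fne(a, b)) != feq(a, b):
            return False
    return True
```

The same identity does not carry over to ordered comparisons such as less-than when NaN is involved, which may be why only the equality operators are named in the subject line.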
Matt Turner
f5cf74d8ba nir: Recognize (a < c || b < c) as min(a, b) < c.
... and (a >= c) || (b >= c) as max(a, b) >= c.

Similar to commit 97e6c1b9.

total instructions in shared programs: 6182276 -> 6182180 (-0.00%)
instructions in affected programs:     6400 -> 6304 (-1.50%)
helped:                                68
HURT:                                  4

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
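
Both identities can be checked exhaustively over a small range, as in this Python sketch (`identities_hold` is a hypothetical helper; NaN inputs are not considered):

```python
import itertools


def identities_hold(values):
    """Check (a < c || b < c) == (min(a, b) < c) and the >= / max dual."""
    for a, b, c in itertools.product(values, repeat=3):
        if ((a < c) or (b < c)) != (min(a, b) < c):
            return False
        if ((a >= c) or (b >= c)) != (max(a, b) >= c):
            return False
    return True
```

The rewritten form trades two comparisons and a logical OR for one min or max and one comparison.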
Matt Turner
ceb8b739ce nir: Recognize trivial min/max.
No changes, but does prevent some regressions in the next commit.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
8ae559971a nir: Recognize i2b(b2i(x)) as x.
Helps the same set of programs as the previous commit.

instructions in affected programs:     4490 -> 4346 (-3.21%)
helped:                                8

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
74697e2844 nir: Recognize imul(b2i(a), b2i(b)) as a logical AND.
Four shaders in Unreal 4's Sun Temple are helped, and gain SIMD16
because we avoid an integer multiplication.

instructions in affected programs:     2353 -> 2245 (-4.59%)
helped:                                4
GAINED:                                4

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
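
A small truth-table check of the identity, as a Python sketch. The helpers `b2i`, `i2b` and `imul` are simplified stand-ins that use Python booleans; they do not model how NIR represents booleans internally. The i2b(b2i(x)) round trip from the entry above is checked as well.

```python
import itertools


def b2i(x):
    """Boolean to integer: True -> 1, False -> 0."""
    return 1 if x else 0


def i2b(x):
    """Integer to boolean: nonzero -> True."""
    return x != 0


def imul(a, b):
    """32-bit wrapping integer multiply."""
    return (a * b) & 0xFFFFFFFF


def rewrites_hold():
    for a, b in itertools.product([False, True], repeat=2):
        # The product of two 0/1 values is 1 only when both are 1.
        if imul(b2i(a), b2i(b)) != b2i(a and b):
            return False
    return all(i2b(b2i(x)) == x for x in (False, True))
```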
Zoë Blade
05e7f7f438 Fix a few typos
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-04-27 17:28:29 +03:00
Matt Turner
f251ea393b nir: Transform pow(x, 4) into (x*x)*(x*x).
2015-04-24 11:39:01 -07:00
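
For illustration, the (x*x)*(x*x) form needs only two multiplies when the square is computed once and reused. A minimal Python sketch (`pow4_by_squaring` is a hypothetical name):

```python
def pow4_by_squaring(x):
    """Compute x**4 as (x*x)*(x*x), sharing the squared term."""
    sq = x * x
    return sq * sq
```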
Jason Ekstrand
125574d1ef nir/lower_source_mods: Don't propagate register sources
The nir_lower_source_mods pass does a weak form of copy propagation to
clean up all of the mov-with-negate's that get generated.  However, we
weren't properly checking that the sources were SSA and so we could end up
moving a register read which is not, in general, valid.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
296131f467 nir: Rewrite instr_rewrite_src
The old code wasn't correctly handling the case where the new value of the
source contains an indirect.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00