Commit graph

406 commits

Author SHA1 Message Date
Connor Abbott
13482111d0 nir/cf: add remove_phi_src() helper
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
f41e108d8b nir: add nir_foreach_phi_src_safe()
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
762ae436ea nir/cf: add insert_phi_undef() helper
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
b49371b8ed nir: move control flow modification to its own file
We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
1c53f89696 nir: make cleanup_cf_node() not use remove_defs_uses()
cleanup_cf_node() is part of the control flow modification code, which
we're going to split into its own file, but remove_defs_uses() is an
internal function used by nir_instr_remove(). Break the dependency by
making cleanup_cf_node() use nir_instr_remove() instead, which simply
calls remove_defs_uses() and then removes the instruction from the list.
nir_instr_remove() does do extra things for jumps, though, so we avoid
calling it on jumps which matches the previous behavior (this will be
fixed later in the series).

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
9d5944053c nir: inline block_add_pred() a few places
It was being used to initialize function impls and loops, even though
it's really a control flow modification helper. It's pretty trivial, so
just inline it to avoid the dependency.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Connor Abbott
c7df141c71 nir/validate: check successors/predecessors more carefully
We should be checking almost everything now.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:41 -07:00
Kenneth Graunke
8e0d4ef341 nir: Delete the nir_function_impl::start_block field.
It's simply the first nir_cf_node in the nir_function_impl::body list,
which is easy enough to access - we don't to store a pointer to it
explicitly.  Removing it means we don't need to maintain the pointer
when, say, splitting the start block when modifying control flow.

Thanks to Connor Abbott for suggesting this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-08-24 13:31:41 -07:00
Martin Peres
80b1707e26 nir: convert the glsl intrinsic image_size to nir_intrinsic_image_size
v2, review from Francisco Jerez:
 - make the destination variable as large as what the nir instrinsic
   defines (4) instead of the size of the return variable of glsl. This
   is still safe for the already existing code because all the intrinsics
   affected returned the same amount of components as expected by glsl IR.
   In the case of image_size, it is not possible to do so because the
   returned number of component depends on the image type and this case
   is not well handled by nir.

v3:
- Style fix

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-08-20 14:07:46 +03:00
Kenneth Graunke
ab83be590d nir: Use nir_builder in nir_lower_io's get_io_offset().
Much more readable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-19 19:29:39 -07:00
Kenneth Graunke
ed2afec3fc nir: Pull nir_lower_io's load_op selection into a helper function.
Makes the function a bit smaller.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-19 19:29:22 -07:00
Thomas Helland
49d0a36bd6 nir: Simplify feq(fneg(a), a)) -> feq(a, 0.0)
The positive and negative value of a float can only
be equal to each other if it is -0.0f and 0.0f.
This is safe for Nan and Inf, as -Nan != Nan, and -Inf != Inf
This gives no changes in my shader-db

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-18 11:34:44 -07:00
Thomas Helland
a39167d594 nir: Simplify fne(fneg(a), a) -> fne(a, 0.0)
-NaN != NaN, and -Inf != Inf, so this should be safe.
Found while working on my VRP pass.

Shader-db results on my IVB:
total instructions in shared programs: 1698267 -> 1698067 (-0.01%)
instructions in affected programs:     15785 -> 15585 (-1.27%)
helped:                                36
HURT:                                  0
GAINED:                                0
LOST:                                  0

Some shaders was found to have the following pattern in NIR:
vec1 ssa_26 = fneg ssa_21
vec1 ssa_27 = fne ssa_21, ssa_26

Make that:
vec1 ssa_27 = fne ssa_21, 0.0f

This is found in Dota2 and Brutal Legend.
One shader is cut by 8%, from 323 -> 296 instructons in SIMD8

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-18 11:34:44 -07:00
Kenneth Graunke
afccbd7256 nir: Add a glsl_uint_type() wrapper.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-16 21:44:19 -07:00
Eric Anholt
a6e75e3cd7 nir: Add support for CSE on textures.
NIR instruction count results on i965:
total instructions in shared programs: 1261954 -> 1261937 (-0.00%)
instructions in affected programs:     455 -> 438 (-3.74%)

One in yofrankie, two in tropics.  Apparently i965 had also optimized all
of these out anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-14 11:39:18 -07:00
Eric Anholt
fb2425a641 nir: Zero out texture instructions when creating them.
There are so many flags in textures, that the CSE pass would have a hard
time referencing the correct set when figuring out if two texture ops are
the same.  By zeroing, we can avoid that fragility.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-14 11:39:18 -07:00
Eric Anholt
d50c182671 nir: Don't try to scalarize unpack ops.
Avoids regressions in vc4 when trying to do our blending in NIR.

v2: Add the other unpack ops I meant to when writing the original commit
    message.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-14 11:39:18 -07:00
Eric Anholt
9e6dc5b64d nir: Add a nir_opt_undef() to handle csels with undef.
We may find a cause to do more undef optimization in the future, but for
now this fixes up things after if flattening.  vc4 was handling this
internally most of the time, but a GLB2.7 shader that did a conditional
discard and assign gl_FragColor in the else was still emitting some extra
code.

total instructions in shared programs: 100809 -> 100795 (-0.01%)
instructions in affected programs:     37 -> 23 (-37.84%)

v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas
    Helland).
v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg
    over to src[0], too (by anholt).

Reviewed-by: Thomas Helland <thomashelland90@gmail.com> (v2)
Tested-by: Thomas Helland <thomashelland90@gmail.com> (v2)
2015-08-14 11:39:18 -07:00
Timothy Arceri
2c61d583f8 nir: add missing type to type_size_vec4()
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-08-05 21:16:45 +10:00
Eric Anholt
6c28ee2041 nir: Add a nir_lower_load_const_to_scalar() pass.
This is useful to increase the CSE opportunities for a scalar backend.  It
avoids regressions when dropping vc4's custom CSE implementation.

v2: Cleanups by Matt (decl in the for loop, and unreachable()).
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-04 20:03:10 -07:00
Eric Anholt
a70f63ab20 nir: Add algebraic opt for no-op iand.
I lazily generated some of these in VC4 NIR lowering.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-08-04 17:19:25 -07:00
Eric Anholt
eae9c3286e Revert "nir: Use a single bit for the dual-source blend index"
This reverts commit ab5b7a0fe6.  We use more
than one bit of value in tgsi_to_nir.
2015-08-04 17:19:01 -07:00
Samuel Iglesias Gonsalvez
418c004f80 nir: Fix output swizzle in get_mul_for_src
Avoid copying an overwritten swizzle, use the original values.

Example:

   Former swizzle[] = xyzw
   src->swizzle[] = zyxx

The expected output swizzle = zyxx but if we reuse swizzle in the loop,
then output swizzle would be zyzz.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-03 09:40:50 -07:00
Iago Toral Quiroga
01f6235020 nir/nir_lower_io: Add vec4 support
The current implementation operates in scalar mode only, so add a vec4
mode where types are padded to vec4 sizes.

This will be useful in the i965 driver for its vec4 nir backend
(and possbly other drivers that have vec4-based shaders).

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-03 09:40:47 -07:00
Timothy Arceri
ab5b7a0fe6 nir: Use a single bit for the dual-source blend index
The only values allowed are 0 and 1, and the value is checked before
assigning.

This is a copy of 8eeca7a56c that seems to have been made to the glsl
ir type after it was copied for use in nir but before nir landed.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-03 21:36:50 +10:00
Matt Turner
4251ccb47b nir: Avoid double promotion.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-07-29 09:34:51 -07:00
Matt Turner
5c7fd67045 glsl: Remove MSVC implementations of copysign and isnormal.
Non-Gallium parts of Mesa require MSVC 2013 which provides these.
2015-07-29 09:34:51 -07:00
Dave Airlie
80511d176a i965: add support for ARB_shader_subroutine
This just adds some missing pieces to nir/i965,
it is lightly tested on my Haswell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-24 10:25:08 +10:00
Dave Airlie
57f24299b7 glsl/types: add new subroutine type (v3.2)
This type will be used to store the name of subroutine types

as in subroutine void myfunc(void);
will store myfunc into a subroutine type.

This is required to the parser can identify a subroutine
type in a uniform decleration as a valid type, and also for
looking up the type later.

Also add contains_subroutine method.

v2: handle subroutine to int comparisons, needed
for lowering pass.
v3: do subroutine to int with it's own IR
operation to avoid hacking on asserts (Kayden)
v3.1: fix warnings in this patch, fix nir,
fix tgsi
v3.2: fixup tests

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>

tests: fix warnings
2015-07-23 17:25:25 +10:00
Connor Abbott
eaf799ddff nir: add nir_foreach_instr_safe_reverse()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 09:49:53 -07:00
Connor Abbott
8eea091747 nir: add nir_instr_is_first() and nir_instr_is_last() helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 09:47:22 -07:00
Iago Toral Quiroga
6b09598d63 nir: add nir_var_shader_storage
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-07-14 07:04:03 +02:00
Kenneth Graunke
efb36271a9 nir: Fix comment above nir_convert_from_ssa() prototype.
Connor renamed the parameter, inverting the sense.
Update the comment accordingly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-07-08 11:28:08 -07:00
Rob Clark
959b47262b nir/lower_phis_to_scalar: undef is trivially scalarizable
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-07-03 08:56:09 -04:00
Jason Ekstrand
89bd5ee64c nir: Don't allow copying SSA destinations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-07-02 15:42:33 -07:00
Connor Abbott
aa7d4cecec nir: remove parent_instr from nir_register
It's no longer used.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-30 11:18:27 -07:00
Connor Abbott
f49e51ef44 nir: remove nir_src_get_parent_instr()
It's now unused.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-30 11:18:27 -07:00
Connor Abbott
2b1a1d8b12 nir/from_ssa: add a flag to not convert everything from SSA
We already don't convert constants out of SSA, and in our backend we'd
like to have only one way of saying something is still in SSA.

The one tricky part about this is that we may now leave some undef
instructions around if they aren't part of a phi-web, so we have to be
more careful about deleting them.

v2: rename and flip meaning of flag (Jason)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-30 11:18:27 -07:00
Rob Clark
dc7e6463d3 nir: cleanup open-coded instruction casts
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2015-06-30 12:13:44 -04:00
Kenneth Graunke
6026f7e8fb nir: Recognize max(min(a, 1.0), 0.0) as fsat(a).
We already recognize min(max(a, 0.0), 1.0) as a saturate, but neglected
this variant (which is also handled by the GLSL IR pass).

shader-db results on Broadwell:
total instructions in shared programs: 7363046 -> 7362788 (-0.00%)
instructions in affected programs:     11928 -> 11670 (-2.16%)
helped:                                64
HURT:                                  0

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-06-25 02:12:32 -07:00
Kenneth Graunke
147cdb53ec nir: Use a switch statement for detecting move-like operations.
Suggested by Jason Ekstrand.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-06-24 10:35:04 -07:00
Kenneth Graunke
1762568fd3 nir: Allow vec2/vec3/vec4 instructions in the select peephole pass.
These are basically just moves, so they should be safe as well.

When disabling i965's GLSL IR level scalarizer (channel expressions)
pass, I started seeing NIR code like this:

        if ssa_21 {
                block block_1:
                /* preds: block_0 */
                vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30
                /* succs: block_3 */
        } else {
                block block_2:
                /* preds: block_0 */
                /* succs: block_3 */
        }
        block block_3:
        /* preds: block_1 block_2 */
        vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2

Previously, the GLSL IR scalarizer pass would break the vec4 into a
series of fmovs, which were allowed by the peephole pass.  But with
the vec4 operation, they were not.  We want to keep getting selects.

Normal i965 on Broadwell:
instructions in affected programs:     200 -> 176 (-12.00%)
helped:                                4

With brw_fs_channel_expressions() disabled:
instructions in affected programs:     1832 -> 1646 (-10.15%)
helped:                                30

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-22 14:08:36 -07:00
Jordan Justen
2867f2e8cd nir: Add barrier intrinsic function
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:40 -07:00
Chris Forbes
e7f628c2fc glsl: Add ir node for barrier
v2:
 * Changes suggested by mattst88

[jordan.l.justen@intel.com: Add nir support]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-06-12 15:12:39 -07:00
Timothy Arceri
86a74e9b6b nir: use src for ssa helper
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-03 06:50:39 +10:00
Timothy Arceri
5f7b8fa481 nir: remove extra semicolon
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-06-03 06:50:33 +10:00
Eduardo Lima Mitev
5b226a1242 nir: prevent use-after-free condition in should_lower_phi()
lower_phis_to_scalar() pass recurses the instruction dependence graph to
determine if all the sources of a given instruction are scalarizable.
To prevent cycles, it temporary marks the phi instruction before recursing in,
then updates the entry with the resulting value. However, it does not consider
that the entry value may have changed after a recursion pass, hence causing
a use-after-free situation and a crash.

This patch fixes this by reloading the entry corresponding to the 'phi'
after recursing and before updating its value.

The crash can be reproduced ~20% of times with the dEQP test:

dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-06-02 20:21:49 +02:00
Iago Toral Quiroga
2231cf0ba3 nir: Fix output swizzle in get_mul_for_src
When we compute the output swizzle we want to consider the number of
components in the add operation. So far we were using the writemask
of the multiplication for this instead, which is not correct.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-28 18:25:37 +02:00
Matt Turner
5614bcc416 nir: Remove sRGB colorspace conversion round-trip.
Some shaders in Civilization V and Beyond Earth do

   pow(pow(x, 2.2), 0.454545)

which is converting to and from sRGB colorspace.

A more general rule that replaces pow(pow(a, b), c) with pow(a, b * c)
actually regresses two shaders in Sun Temple in which the result of the
inner pow is used twice, once by another pow and once by another
instruction. Also, since 2.2 * 0.454545 isn't exactly one, the more
general pattern would have still left us with a pow, and I'm 2.2 *
0.454545 percent sure that's not what they want.

instructions in affected programs:     934 -> 886 (-5.14%)
helped:                                16
2015-05-22 11:26:36 -07:00
Jason Ekstrand
2126c68e5c nir: Get rid of the array elements parameter on load/store intrinsics
Previously, we used intrinsic->const_index[1] to represent "the number of
array elements to load" for load/store intrinsics.  However, this set to 1
by every pass that ever creates a load/store intrinsic.  Also, while it
might make some sense for registers, it makes no sense whatsoever in SSA.
On top of that, the i965 backend was the only backend to ever support it;
freedreno and vc4 just assert that it's always 1.  Let's just delete it.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-05-20 09:28:06 -07:00