Commit graph

231 commits

Author SHA1 Message Date
Jose Fonseca
40a4797384 nir: Use helper macros for dealing with VLAs.
v2:
- Single statement, by using memset return value as suggested by Ian
Romanick.
- No internal declaration, as suggested by Jason Ekstrand.
- Move macros to a header.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-04 10:52:02 +00:00
Jose Fonseca
f320ecf218 nir: Use alloca instead of variable length arrays.
This is to enable the code to build with -Werror=vla in the short term,
and enable the code to build with MSVC2013 soon after.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-27 14:30:36 +00:00
Kenneth Graunke
8e62bd52f8 nir: Introduce nir_intrinsic_discard_if.
This is a conditional discard, which takes a boolean source.

Note that we don't generate ir_discard::condition today, so this
shouldn't break drivers (since none implement this intrinsic yet).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 15:24:52 -08:00
Jason Ekstrand
c750ecaa12 nir/register: Add a parent_instr field
This adds a parent_instr field similar to the one for ssa_def.  The
difference here is that the parent_instr field on a nir_register can be
NULL if the register does not have a unique definition or if that
definition does not dominate all its uses.  We set this field in the
out-of-SSA pass so that backends can get SSA-like information even after
they have gone out of SSA.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-02-24 14:08:04 -08:00
Jason Ekstrand
9b9ef2aeee nir/gcm: Add some missing break statements
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-23 13:20:13 -08:00
Jason Ekstrand
cb4b2ad44a nir: Copy-propagate vecN operations that are actually moves
We were already do this for ALU operations but we haven't for non-ALU
operations.  This changes that.

total NIR instructions in shared programs: 2039883 -> 2022338 (-0.86%)
NIR instructions in affected programs:     1768850 -> 1751305 (-0.99%)
helped:                                    14244
HURT:                                      124

total FS instructions in shared programs: 4083960 -> 4084036 (0.00%)
FS instructions in affected programs:     7302 -> 7378 (1.04%)
helped:                                   12
HURT:                                     51

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-23 13:19:05 -08:00
Eric Anholt
4359954d84 nir: Generalize the optimization of subs of subs from 0.
I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above,
but we can generalize it and make it more potentially useful.  In the
specific original case of a 0 for our new 'a' argument, it'll get further
algebraic optimization once the 0 is an argument to the new add.

No shader-db effects.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
345c2b288a nir: Collapse repeated bcsels on the same argument.
vc4 results:
total instructions in shared programs: 39881 -> 39794 (-0.22%)
instructions in affected programs:     6302 -> 6215 (-1.38%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
a38038ca5e nir: When faced with a csel on !condition, just flip the arguments.
total NIR instructions in shared programs: 39426 -> 39411 (-0.04%)
NIR instructions in affected programs:     3748 -> 3733 (-0.40%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
8e1152cb33 nir: Allow nir_opt_algebraic to see booleanness through &&, ||, ^, !.
We have some useful optimizations to drop things like 'ine a, 0' on a
boolean argument, but if 'a' came from logical operations on bools, it
couldn't tell.  These kinds of constructs appear as a result of TGSI->NIR
quite frequently (at least with if flattening), so being a little more
aggressive in detecting booleans can pay off.

v2: Add ixor as a booleanness-preserving op (Suggestion by Connor).

vc4 results:
total instructions in shared programs: 40207 -> 39881 (-0.81%)
instructions in affected programs:     6677 -> 6351 (-4.88%)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Eric Anholt
dc982f4a85 nir: Add a couple of simplifications of csel operations.
vc4 was already cleaning these up, but it does shave 4 NIR instructions in
shader-db.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-21 14:57:14 -08:00
Kenneth Graunke
b6393d7040 nir: Fix the Mesa build without -DDEBUG.
With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which
doesn't exist, breaking the build:

assert(state->states[index].index < state->states[index].stack_size);

Switch it to ifndef NDEBUG, so the field will exist if the assertion
actually generates code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-20 13:43:44 -08:00
Eric Anholt
bef38f62e0 nir: Drop dependency on mtypes.h for core NIR.
One less new directory necessary for gallium code that wants to interact
with NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Eric Anholt
b53d035825 util: Move Mesa's bitset.h to util/.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-02-20 11:36:34 -08:00
Jason Ekstrand
c7002fad90 nir/GCM: Pull unpinned instructions out of blocks while pinning
This lets us be slightly more efficient by not walking the CFG extra times.
Also, it may make it easier to ensure that GVN happens on only unpinned
instructions.

Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
8dfe6f672f nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned
Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
190073c737 nir: Add a global code motion (GCM) pass
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use nir_dominance_lca for computing least common anscestors
 - Use the block index for comparing dominance tree depths
 - Pin things that do partial derivatives

Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
a52a4b5223 nir/instr: Change "live" to a more generic "pass_flags" field
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
3d25afc51c nir: Make nir_[cf_node/instr]_[prev/next] return null if at the end
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
902b0ccc9a nir/from_ssa: Don't try to read an invalid instruction
Right now, the nir_instr_prev function function blindly looks up the
previous element in the exec list and casts it to an instruction even if
it's the tail sentinel.  The next commit will change this to return null if
it's the first instruction.  Making this change first avoids getting a
segfault between commits.  The only reason we never noticed is that, thanks
to the way things are laid out in nir_block, the casted instruction's type
was never parallal_copy.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
0281fd0786 nir/validate: Validate SSA defs the same way we do for registers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
34952b5671 nir/validate: Validate if_uses on registers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
98ecb25f89 nir: Properly clean up CF nodes when we remove them
Previously, if you remved a CF node that still had instructions in it, none
of the use/def information from those instructions would get cleaned up.
Also, we weren't removing if statements from the if_uses of the
corresponding register or SSA def.  This commit fixes both of these
problems

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
e025943134 nir: use nir_foreach_ssa_def for indexing ssa defs
This is both simpler and more correct.  The old code didn't properly index
load_const instructions.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
0167c38cac nir/from_ssa: Use the nir_block_dominance function instead of our own
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
f481a9425c nir/dominance: Add a constant-time mechanism for comparing blocks
This is mostly thanks to Connor.  The idea is to do a depth-first search
that computes pre and post indices for all the blocks.  We can then figure
out if one block dominates another in constant time by two simple
comparison operations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:17 -08:00
Jason Ekstrand
b4c5489c8a nir/dominance: Expose the dominance intersection function
Being able to find the least common anscestor in the dominance tree is a
useful thing that we may want to do in other passes.  In particular, we
need it for GCM.

v2: Handle NULL inputs by returning the other block

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-19 17:06:16 -08:00
Brian Paul
2f5597787c nir: add missing GLSL_TYPE_DOUBLE case in type_size()
To silence compiler warning about unhandled switch case.
v2: move GLSL_TYPE_DOUBLE to the "not reached" section, per Ilia.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-02-19 15:36:59 -07:00
Eric Anholt
2a135c470e nir: Add an ALU op builder kind of like ir_builder.h
v2: Rebase on the nir_opcodes.h python code generation support.
v3: Use SSA values, and set an appropriate writemask on dot products.
v4: Make the arguments be SSA references as well.  This lets you stack up
    expressions in the arguments of other expressions, at the cost of
    having to insert a fmov/imov if you want to swizzle.  Also, add
    the generated file to NIR_GENERATED_FILES.
v5: Use more pythonish style for iterating the list.
v6: Infer the size of the dest from the size of the srcs, and auto-swizzle
    a single small src out to the appropriate size.
v7: Add little helpers for initializing the struct, add a typedef for the
    struct like other nir types have.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v6)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v7)
2015-02-18 22:28:42 -08:00
Eric Anholt
6eadde51bb nir: Recognize and reduce duplicated fsats.
No effect on vc4 shader-db.

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
1907a3a7ee nir: Add a flag for lowering fsat.
vc4 cse/algebraic-disabled stats:
total instructions in shared programs: 44356 -> 44354 (-0.00%)
instructions in affected programs:     55 -> 53 (-3.64%)

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
e5ecf8e427 nir: Add a flag for lowering ffma.
vc4 cse/algebraic-disabled stats:
total uniforms in shared programs: 13966 -> 13791 (-1.25%)
uniforms in affected programs:     435 -> 260 (-40.23%)
total instructions in shared programs: 44732 -> 44356 (-0.84%)
instructions in affected programs:     9599 -> 9223 (-3.92%)

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
42a8ace66e nir: Add a flag for lowering fneg/ineg.
vc4 cse/algebraic-disabled stats:
total instructions in shared programs: 44911 -> 44732 (-0.40%)
instructions in affected programs:     11371 -> 11192 (-1.57%)

v2: Fix broken iabs(isub(0, a)) transformation.
v3: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:51 -08:00
Eric Anholt
cb95a228e8 nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)).
vc4 cse/algebraic-disabled stats:
total uniforms in shared programs: 13972 -> 13966 (-0.04%)
uniforms in affected programs:     408 -> 402 (-1.47%)
total instructions in shared programs: 44973 -> 44911 (-0.14%)
instructions in affected programs:     1551 -> 1489 (-4.00%)

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:50 -08:00
Eric Anholt
ccf14bca4b nir: Add lowering of POW instructions if the lower flag is set.
This could be done in a separate pass like we do in GLSL IR, but it seems
to me like having the definitions of the transformations in the two
directions next to each other makes a lot of sense.

v2: Reorder the comment about the transformation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-18 14:47:50 -08:00
Eric Anholt
8e9dbfff17 nir: Conditionalize the POW reconstruction on shader compiler options.
Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic
and other passes should try and generate certain types of opcodes or
patterns.  Extend that to NIR by defining our own struct, which is
automatically generated from the Mesa struct in glsl_to_nir and provided
directly by the driver in TGSI-to-NIR.

v2: Split out the previous two prep patches.
v3: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)
2015-02-18 14:47:50 -08:00
Eric Anholt
955a6bb57d nir: Add an optional expression controlling nir_algebraic xforms.
This will be used so that we can customize the transforms for the target
GPU, so we don't un-lower expressions that had already been lowered (or
introduce new lowering transformations that not all GPUs want)

v2: Drop the complication of having the condition->index dictionary, since
    we don't actually expect there to be many different conditions (change
    by Kenneth).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-18 14:47:50 -08:00
Eric Anholt
f90bb54734 nir: Add a nir_shader_compiler_options struct pointed to by the shaders.
This will be used to give the optimization passes a chance to customize
behavior for the particular target device.

v2: Rebase to master (no TGSI->NIR present)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2015-02-18 14:47:50 -08:00
Jason Ekstrand
dd110cdfd8 nir: Make gl_FrontFacing a system_value
GLSL IR labels gl_FrontFacing as an input variable and not a system value.
This commit makes NIR silently translate gl_FrontFacing to a system value
so that it properly gets translated into a load_system_value intrinsic.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-14 13:47:16 -08:00
Jason Ekstrand
929f43851e nir/lower_phis_to_scalar: Fix some logic in is_phi_scalarizable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-14 13:46:59 -08:00
Matt Turner
4c42e1116b nir: Recognize open-coded fmin/fmax.
And unfortunately other shaders do the same thing but with >=/<= which
we can't apply this optimization to because of NaNs.

instructions in affected programs:     23309 -> 22938 (-1.59%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-11 13:50:19 -08:00
Eric Anholt
56e21647e2 nir: Add algebraic opt for int comparisons with identical operands.
No change on shader-db on i965.

v2: Reword the comment due to feedback from Erik Faye-Lund

Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
2015-02-11 11:52:38 -08:00
Eric Anholt
2919bdf466 nir: Fix load_const comparisons for CSE.
We want the size of a float per component, not the size of a whole vec4.

NIR instructions on i965:
total instructions in shared programs: 1261937 -> 1261929 (-0.00%)
instructions in affected programs:     114 -> 106 (-7.02%)

Looking at one of these examples (tesseract), it's from vec4 load_consts
for a MRT solid fill, which do get CSEed now that we don't memcmp off the
end of the const value and into the SSA def.  For the 1-component loads
that are common in i965, we were only memcmping off into the rest of the
usually zero-filled const_value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-11 11:52:38 -08:00
Matt Turner
a9065cef48 nir: Remove casts from void*.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-10 17:48:42 -08:00
Matt Turner
bb1e007157 nir: Replace assert(0) with unreachable().
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-10 17:48:31 -08:00
Matt Turner
942b56ad05 nir: Remove unused has_indirect variable.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-10 17:48:16 -08:00
Kenneth Graunke
480ee1f0b4 nir: Mark nir_print_instr's instr pointer as const.
Printing instructions doesn't modify them, so we can mark the parameter
const.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-10 03:37:55 -08:00
Eric Anholt
bff4cbdafa nir: Fix broken fsat recognizer.
We've probably never seen this ridiculous pattern in the wild, so it
didn't matter.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-06 15:57:55 -08:00
Eric Anholt
6706537dd4 nir: Slightly simplify algebraic code generation by reusing a struct.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-06 15:57:55 -08:00
Connor Abbott
a135f34080 nir: add an optimization to remove useless phi nodes
This removes phi nodes whose sources all point to the same thing.

Shader-db results:

total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%)
NIR instructions in affected programs:     126564 -> 122480 (-3.23%)
helped:                                615
HURT:                                  0

total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%)
FS instructions in affected programs:     24622 -> 23174 (-5.88%)
helped:                                138
HURT:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Tested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-03 16:00:13 -05:00