Commit graph

346 commits

Author SHA1 Message Date
Jason Ekstrand
f752effa08 nir/nir: Use a linked list instead of a hash set for use/def sets
This commit switches us from the current setup of using hash sets for
use/def sets to using linked lists.  Doing so should save us quite a bit of
memory because we aren't carrying around 3 hash sets per register and 2 per
SSA value.  It should also save us CPU time because adding/removing things
from use/def sets is 4 pointer manipulations instead of a hash lookup.

Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists:

   GLSL IR Only:        586.4 +/- 1.653833
   NIR with hash sets:  675.4 +/- 2.502108
   NIR + use/def lists: 641.2 +/- 1.557043

I also ran a memory usage experiment with Ken's patch to delete GLSL IR and
keep NIR.  This patch cuts an aditional 42.9 MiB of ralloc'd memory over
and above what we gained by deleting the GLSL IR on the same dota trace.

On the code complexity side of things, some things are now much easier and
others are a bit harder.  One of the operations we perform constantly in
optimization passes is to replace one source with another.  Due to the fact
that an instruction can use the same SSA value multiple times, we had to
iterate through the sources of the instruction and determine if the use we
were replacing was the only one before removing it from the set of uses.
With this patch, uses are per-source not per-instruction so we can just
remove it safely.  On the other hand, trying to iterate over all of the
instructions that use a given value is more difficult.  Fortunately, the
two places we do that are the ffma peephole where it doesn't matter and GCM
where we already gracefully handle duplicates visits to an instruction.

Another aspect here is that using linked lists in this way can be tricky to
get right.  With sets, things were quite forgiving and the worst that
happened if you didn't properly remove a use was that it would get caught
in the validator.  With linked lists, it can lead to linked list corruption
which can be harder to track.  However, we do just as much validation of
the linked lists as we did of the sets so the validator should still catch
these problems.  While working on this series, the vast majority of the
bugs I had to fix were caught by assertions.  I don't think the lists are
going to be that much worse than the sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
ecc2cfc8b6 nir: Use nir_instr_rewrite_src in copy propagation
We were rolling our own rewrite_src variant in copy-propagation.  Let's
stop doing that and use the ones in core NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
f72a8d1cf0 nir: Add a function for rewriting the condition of an if statement
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
300d729436 nir: Add and use initializer #defines for nir_src and nir_dest
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
6702ebce57 nir: Modernize the out-of-SSA pass
The out-of-SSA pass was one of the first passes written when getting SSA
up-and-going (for obvious reasons).  As such, it came before a lot of the
nifty SSA-based helpers were introduced.  This commit modernizes it so that
we're no longer doing nearly as much manual banging on use/def sets.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Jason Ekstrand
7ee0216e2d nir/validate: Validate SSA def parent instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-05-08 17:16:13 -07:00
Ian Romanick
3bdbc1e436 nir: Delete all traces of nir_op_flog
Nothing produces it, and nothing can consume it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
ad51f9b421 nir: Don't produce nir_op_flog from GLSL IR
All paths that produce GLSL IR for NIR lower ir_unop_log.  All paths
that consume NIR will explode if they geta nir_op_flog.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
e0a17f6e31 nir: Delete all traces of nir_op_fexp
Nothing produces it, and nothing can consume it.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Ian Romanick
a45d55f17c nir: Don't produce nir_op_fexp from GLSL IR
All paths that produce GLSL IR for NIR lower ir_unop_exp.  All paths
that consume NIR will explode if they geta nir_op_fexp.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-05-08 12:12:54 -07:00
Matt Turner
8e029105c2 nir: Allow feq/fne/ieq/ine to be optimized with inot.
instructions in affected programs:     380 -> 376 (-1.05%)
helped:                                2

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
f5cf74d8ba nir: Recognize (a < c || b < c) as min(a, b) < c.
... and (a >= c) || (b >= c) as max(a, b) >= c.

Similar to commit 97e6c1b9.

total instructions in shared programs: 6182276 -> 6182180 (-0.00%)
instructions in affected programs:     6400 -> 6304 (-1.50%)
helped:                                68
HURT:                                  4

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
ceb8b739ce nir: Recognize trivial min/max.
No changes, but does prevent some regressions in the next commit.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
8ae559971a nir: Recognize i2b(b2i(x)) as x.
Helps the same set of programs as the previous commit.

instructions in affected programs:     4490 -> 4346 (-3.21%)
helped:                                8

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Matt Turner
74697e2844 nir: Recognize imul(b2i(a), b2i(b)) as a logical AND.
Four shaders in Unreal 4's Sun Temple are helped, and gain SIMD16
because we avoid an integer multiplication.

instructions in affected programs:     2353 -> 2245 (-4.59%)
helped:                                4
GAINED:                                4

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-05-07 10:51:05 -07:00
Zoë Blade
05e7f7f438 Fix a few typos
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-04-27 17:28:29 +03:00
Matt Turner
f251ea393b nir: Transform pow(x, 4) into (x*x)*(x*x). 2015-04-24 11:39:01 -07:00
Jason Ekstrand
125574d1ef nir/lower_source_mods: Don't propagate register sources
The nir_lower_source_mods pass does a weak form of copy propagation to
clean up all of the mov-with-negate's that get generated.  However, we
weren't properly checking that the sources were SSA and so we could end up
moving a register read which is not, in general, valid.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
296131f467 nir: Rewrite instr_rewrite_src
The old code wasn't correctly handling the case where the new value of the
source contains an indirect.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
d61bd972d8 nir/locals_to_regs: Hanadle indirect accesses of length-1 arrays
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
06f3c98b9d nir/locals_to_regs: Initialize registers with constant initializers
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
4e9b376594 nir/locals_to_regs: Pass around the nir_shader rather than a void * mem_ctx
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
f50f59d3d9 nir: Add a simple growing array data structure
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:41 -07:00
Jason Ekstrand
8b900e7405 nir/types: Make glsl_get_length smarter
Previously, this function returned the number of elements for structures
and arrays and 0 for everything else.  In NIR, this is almost never what
you want because we also treat matricies as arrays so you have to
special-case constantly.  This commit  glsl_get_length treat matrices
as an array of columns by returning the number of columns instead of 0

This also fixes a bug in locals_to_regs caused by not checking for the
matrix case in one place.

v2: Only special-case for matrices and return a length of 0 for vectors as
    we did before.  This was needed to not break the TGSI-based drivers and
    doesn't really affect NIR at the moment.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
7e1d21edbf nir: Move get_const_initializer_load from vars_to_ssa to NIR core
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
ba88760202 nir/lower_vars_to_ssa: Pass around the nir_shader instead of a void mem_ctx
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
e79120afdc nir/print: Print the closing paren on load_const instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
02f03fc0f1 nir/tex: Use the correct return size for query_levels and lod
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
94669cb534 nir: Refactor tex_instr_dest_size to use a switch statement
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:40 -07:00
Jason Ekstrand
73cc76362d nir/lower_vars_to_ssa: Actually look for indirects when determining aliasing
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-22 18:10:39 -07:00
Matt Turner
4dacb212fd nir: Allow abs/neg in select peephole pass.
total instructions in shared programs: 4314531 -> 4308949 (-0.13%)
instructions in affected programs:     429085 -> 423503 (-1.30%)
helped:                                1680
HURT:                                  0
GAINED:                                0
LOST:                                  111

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-17 11:01:34 -07:00
Rob Clark
e14af4c067 nir/builder: add nir_builder_insert_after_instr()
For lowering if/else, I need a way to insert at the end of the previous
block.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 10:34:15 -04:00
Ian Romanick
94aab6cde6 nir: Convert the if-test for num_inputs == 2 to an assertion
Suggested by Jason on a different patch after some comments /
questions by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabott0@gmail.com>
2015-04-16 09:56:49 -07:00
Ian Romanick
4cf5ca5ca5 nir: Try commutative sources in CSE
Shader-db results:

GM45 NIR:
total instructions in shared programs: 4082044 -> 4081919 (-0.00%)
instructions in affected programs:     27609 -> 27484 (-0.45%)
helped:                                44

Iron Lake NIR:
total instructions in shared programs: 5678776 -> 5678646 (-0.00%)
instructions in affected programs:     27406 -> 27276 (-0.47%)
helped:                                45

Sandy Bridge NIR:
total instructions in shared programs: 7329995 -> 7329096 (-0.01%)
instructions in affected programs:     142035 -> 141136 (-0.63%)
helped:                                406
HURT:                                  19

Ivy Bridge NIR:
total instructions in shared programs: 6769314 -> 6768359 (-0.01%)
instructions in affected programs:     140820 -> 139865 (-0.68%)
helped:                                423
HURT:                                  2

Haswell NIR:
total instructions in shared programs: 6183693 -> 6183298 (-0.01%)
instructions in affected programs:     96538 -> 96143 (-0.41%)
helped:                                303
HURT:                                  4

Broadwell NIR:
total instructions in shared programs: 7501711 -> 7498170 (-0.05%)
instructions in affected programs:     266403 -> 262862 (-1.33%)
helped:                                705
HURT:                                  5
GAINED:                                4

v2: Rebase on top of Connor's fix.

v3: Convert the if-test for num_inputs == 2 to an assertion.  Suggested
by Jason after some comments / questions by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
2015-04-15 18:15:59 -07:00
Ian Romanick
bc672e261c nir: Fix typo in "ushr by 0" algebraic replacement
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "10.5" <mesa-stable@lists.freedestkop.org>
2015-04-14 16:41:04 -07:00
Ian Romanick
67a8610caf nir: Silence unused parameter warnings
nir/nir.h: In function 'nir_validate_shader':
nir/nir.h:1567:56: warning: unused parameter 'shader' [-Wunused-parameter]
 static inline void nir_validate_shader(nir_shader *shader) { }
                                                        ^
nir/nir_opt_cse.c: In function 'src_is_ssa':
nir/nir_opt_cse.c:165:32: warning: unused parameter 'data' [-Wunused-parameter]
 src_is_ssa(nir_src *src, void *data)
                                ^
nir/nir_opt_cse.c: In function 'dest_is_ssa':
nir/nir_opt_cse.c:171:35: warning: unused parameter 'data' [-Wunused-parameter]
 dest_is_ssa(nir_dest *dest, void *data)
                                   ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-14 16:41:04 -07:00
Connor Abbott
47a1b4841d nir/cse: fix bug with comparing non-per-component sources
We weren't comparing the right number of components when checking
swizzles. Use nir_ssa_alu_instr_num_src_components() to do the right
thing.

No piglit regressions, and no fixes either.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-04-14 19:07:44 -04:00
Kenneth Graunke
b3e286c457 nir: Store num_direct_uniforms in the nir_shader.
Storing this here is pretty sketchy - I don't know if any driver other
than i965 will want to use it.  But this will make it a lot easier to
generate NIR code at link time.  We'll probably rework it anyway.

(Ian suggested making nir_assign_var_locations_scalar_direct_first
 simply modify the nir_shader's fields, rather than passing pointers
 to them.  If this stays long term, we should do that.  But Jason and
 I suspect we'll be reworking this area again in the near future.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-11 11:39:48 -07:00
Rob Clark
f596135616 nir: fix bit of cargo-culting in lower_idiv
I guess I was looking too much at how lower_system_values worked when
writing lower_idiv.

Since ttn wasn't emitting load_var for sysvals and the only drivers
using lower_idiv were using ttn, I think nothing was broken as a result.
But might as well fix this before it becomes a problem.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-11 10:43:16 -04:00
Rob Clark
58add76791 nir: split out lower_sub from lower_negate
Originally you had to have one or the other.  But actually I don't want
either.  (Or rather I want whatever is the minimum # of instructions.)

TODO: not sure where the best place to insert a check that driver hasn't
set *both* lower_negate and lower_sub?

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-11 10:43:16 -04:00
Kenneth Graunke
500da98e0b nir: Constify nir_lower_sampler's gl_shader_program pointer.
Now that we're not generating linker errors, we don't actually modify
this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:33 -07:00
Kenneth Graunke
709b88ccd8 nir: Remove linker_error calls from nir_lower_samplers().
These should never happen.  Plus, NIR passes really shouldn't be
reporting linker errors - this is past link time.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:31 -07:00
Kenneth Graunke
99264b7f37 nir: Make nir_lower_samplers take a gl_shader_stage, not a gl_program *.
We don't actually need a gl_program struct.  We only used it to
translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage
(i.e. MESA_SHADER_VERTEX).  We may as well just pass that.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:29 -07:00
Kenneth Graunke
4b27391cad nir: Move gl_shader_stage enum from mtypes.h to shader_enums.h.
I want to use this in some code that doesn't currently include mtypes.h.
It seems like a better place for it anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-04-10 02:16:27 -07:00
Jason Ekstrand
11694737fc nir: Make nir_*_instr_create take a nir_shader instead of a void * context
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-04-07 14:34:21 -07:00
Kenneth Graunke
a10d493715 nir: Implement a nir_sweep() pass.
This pass performs a mark and sweep pass over a nir_shader's associated
memory - anything still connected to the program will be kept, and any
dead memory we dropped on the floor will be freed.

The expectation is that this will be called when finished building and
optimizing the shader.  However, it's also fine to call it earlier, and
many times, to free up memory earlier.

v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction)
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself)
- Don't sweep dereference chains (top-level dereferences are owned by
  the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of
  the defining instruction, and registers are handled as part of
  function implementations).
- Just steal instructions; don't walk them (no longer required).

v3: (feedback from Jason Ekstrand)
- Steal indirect sources from nir_src/nir_dest.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:14 -07:00
Kenneth Graunke
de2014cf1e nir: Allocate dereferences out of their parent instruction or deref.
Jason pointed out that variable dereferences in NIR are really part of
their parent instruction, and should have the same lifetime.

Unlike in GLSL IR, they're not used very often - just for intrinsic
variables, call parameters & return, and indirect samplers for
texturing.  Also, nir_deref_var is the top-level concept, and
nir_deref_array/nir_deref_record are child nodes.

This patch attempts to allocate nir_deref_vars out of their parent
instruction, and any sub-dereferences out of their parent deref.
It enforces these restrictions in the validator as well.

This means that freeing an instruction should free its associated
dereference chain as well.  The memory sweeper pass can also happily
ignore them.

v2: Rename make_deref to evaluate_deref and make it take a nir_instr *
    instead of void *.  This involves adding &instr->instr everywhere.
    (Requested by Jason Ekstrand.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:14 -07:00
Kenneth Graunke
4f4b04b7c7 nir: Allocate nir_ssa_def::uses/if_uses out of the instruction.
We can't allocate them out of the nir_ssa_def itself, because it may not
be ralloc'd (for example, nir_dest embeds a nir_ssa_def).

However, allocating them out of the instruction should work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:13 -07:00
Kenneth Graunke
900498bd11 nir: Allocate nir_phi_src values out of the nir_phi_instr.
Phi sources are part of the phi instruction and should have the same
lifetime.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:13 -07:00
Kenneth Graunke
b05d53404c nir: Allocate nir_call_instr::params out of the nir_call itself.
The lifetime of the params array needs to be match the nir_call_instr
itself.  So, allocate it using the instruction itself as the context.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-04-07 14:34:13 -07:00