Commit graph

1760 commits

Author SHA1 Message Date
Eric Engestrom
abc226cf41 tree-wide: replace MAYBE_UNUSED with ASSERTED
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Eric Engestrom
5febd4d575 compiler: replace MAYBE_UNUSED with UNUSED
MAYBE_UNUSED is going away, so let's replace legitimate uses of it with
UNUSED, which the former aliased to so far anyway.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Connor Abbott
a094928abc nir/find_array_copies: Use correct parent array length
instr->type is the type of the array element, not the type of the array
being dereferenced. Rather than fishing out the parent type, just use
parent->num_children which should be the length plus 1. While we're here
add another assert for the issue fixed by the previous commit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251
Fixes: 156306e5e6 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-30 17:14:33 +02:00
Connor Abbott
7788992bc6 nir: Fix comparison for nir_deref_instr_is_known_out_of_bounds()
There was an off-by-one error.

Fixes: 156306e5e6 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-30 17:14:28 +02:00
Jason Ekstrand
4bb6e6817e intel: Use a system value for gl_FragCoord
It's kind-of an anomaly that the Intel drivers are still treating
gl_FragCoord as an input.  It also makes zero sense because we have to
special-case it in the back-end.

Because ANV is the only user of nir_lower_wpos_center, we go ahead and
just update it to look for nir_intrinsic_load_frag_coord as part of this
patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-29 23:30:26 +00:00
Eric Anholt
3c46778b75 nir: Fix helgrind complaints about data race in trivial_swizzle init.
Even if the data race wasn't real (I'm not great at reasoning about
this), helgrind is a nice enough tool that keeping noise out of it is
probably worthwhile.  Besides, typing out the numbers keeps the data
in the read-only data section instead of emitting code to initialize
it every time.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-07-29 12:50:49 -07:00
Connor Abbott
156306e5e6 nir/find_array_copies: Handle wildcards and overlapping copies
This commit rewrites opt_find_array_copies to be able to handle
an array copy sequence with other intervening operations in between. In
particular, this handles the case where we OpLoad an array of structs
and then OpStore it, which generates code like:

foo[0].a = bar[0].a
foo[0].b = bar[0].b
foo[1].a = bar[1].a
foo[1].b = bar[1].b
...

that wasn't recognized by the previous pass.

In order to correctly handle copying arrays of arrays, and in particular
to correctly handle copies involving wildcards, we need to use a tree
structure similar to lower_vars_to_ssa so that we can walk all the
partial array copies invalidated by a particular write, including
ones where one of the common indices is a wildcard. I actually think
that when factoring in the needed hashing/comparing code, a hash table
based approach wouldn't be a lot smaller anyways.

All of the changes come from tessellation control shaders in Strange
Brigade, where we're able to remove the DXVK-inserted copy at the
beginning of the shader. These are the result for radv:

Totals from affected shaders:
SGPRS: 4576 -> 4576 (0.00 %)
VGPRS: 13784 -> 5560 (-59.66 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread
Code Size: 329940 -> 263268 (-20.21 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 330 -> 898 (172.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-29 11:36:25 +02:00
Connor Abbott
c6543efe7a nir: Print array deref indices as decimal
We print the size as decimal too, and using hex without a leading "0x"
was very confusing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-29 11:36:19 +02:00
Sagar Ghuge
d5992ab134 nir: Optimize umod lowering
We don't have calculate final quotient in order to calculate unsigned
modulo result.  Once we are done with error correction we have partial
result which can be used to find out modulo operation result

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-26 11:19:23 -07:00
Lionel Landwerlin
8c330728f3 nir: add access to image_deref intrinsics
SPIRV added the ability to access variables and have expressions non
dynamically uniform and because spirv_to_nir generates deref
instructions, we'll need to have that access there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-26 14:09:55 +00:00
Jonathan Marek
97c8314c5f nir/algebraic: add scmp algebraic optimizations
When 'x' is the result of a scmp op:

x != 0.0 or x == 1.0: passthrough
x == 0.0 or x != 1.0: invert

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Jonathan Marek
9be902097c nir/algebraic: add option to lower fall_equalN/fany_nequalN
Add generic lowerings for fall_equalN/fany_nequalN. These should be optimal
for vec4 backends that doesn't have any special instructions for it, as
long as they support saturate.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Jonathan Marek
397375d3f3 nir/algebraic: add fdot2 optimizations
Add simple fdot2 optimizations that are missing.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Jonathan Marek
1e089d0575 nir/algebraic: add option to lower fdph
For backends that don't have a 'fdph' instructions

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Jonathan Marek
bc3b6168ba nir: replace lower_sincos with algebraic opt
This version has less ops for the same precision.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Jonathan Marek
5a4e71c082 nir/algebraic: allow swizzle in nir_algebraic replace expression
This is to allow optimizations in nir_opt_algebraic not otherwise possible

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Daniel Schürmann
e272fdd508 nir,intel: lower if (cond) demote() to new intrinsic demote_if(cond)
This will effectively enable the optimization in anv.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-24 13:02:18 -05:00
Jason Ekstrand
799f0f7b28 nir/lower_subgroups: Properly lower masks when subgroup_size == 0
Instead of building a constant mask (which depends on knowing the
subgroup size), we build an expression.  Because the pass uses the
nir_shader_lower_instructions helper, subgroup lowering will be run on
any newly emitted instructions as well as the previously existing
instructions.  In particular, if the subgroup size is known, the newly
emitted subgroup_size intrinsic will get turned into a constant and a
later constant folding pass will clean it up.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
Sagar Ghuge
87cef718e1 nir: Add lowering for nir_op_irem and nir_op_imod
Tested on Gen > 9.

v2: 1) Fix lowering
    2) Keep a consistent i/u order (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-24 10:33:09 -07:00
Jason Ekstrand
9700e45463 nir/lower_io: Return SSA defs from helpers
I can't find a single place where nir_lower_io is called after going out
of SSA which is the only real reason why you wouldn't do this. Returning
SSA defs is more idiomatic and is required for the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-23 17:48:49 -05:00
Jason Ekstrand
ae392d73c9 nir/gather_info: Look for uses of helper invocations
The one obvious omission here is gl_HelperInvocation itself.  However,
the spec doesn't require that we generate then when gl_HelperInvocation
is used, it merely mandates that we report them if they are there.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Jason Ekstrand
41ab92a327 nir/gather_info: Move setting uses_64bit out of the switch
Otherwise, as we add things to the switch, we're going to forget and add
some 64-bit op at some point in the future and it'll stop getting
flagged.  There's no reason why we can't do the check for derivatives.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Jason Ekstrand
0e6cb481fa nir: Add a nir_tex_instr_has_implicit_derivatives helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Jason Ekstrand
7a98c7804c nir: Move nir_alu_instr_is_comparison to the ALU section
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-23 13:40:41 -05:00
Andrii Simiklit
79ab2c3e57 nir: use | instead of || operator
warning: use of logical '||' with constant operand
note: use '|' for a bitwise operation

Fixes: 758fdce9fe ("nir: Add some generic helpers for writing lowering passes")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-07-23 18:08:58 +03:00
Eric Engestrom
3acc4278ad nir: don't return void
Fixes: 14531d676b ("nir: make nir_const_value scalar")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-07-23 16:02:37 +01:00
Jason Ekstrand
5c5f11d1dd nir: Remove a bunch of large stack arrays
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-22 16:17:18 -05:00
Jason Ekstrand
6301f80b84 nir: Only rematerialize comparisons with all SSA sources
Otherwise, you may end up moving a register read and that could result
in an incorrect shader.  This commit fixes a rendering issue in Elite:
Dangerous.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152
Fixes: 3ee2e84c60 "nir: Rematerialize compare instructions"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-07-19 19:45:36 +00:00
Caio Marcelo de Oliveira Filho
4061a3f6c9 nir: use a switch when printing intrinsic indices
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-07-19 10:04:52 -07:00
Rhys Perry
e8644122ed nir/algebraic: mark a few comparison simplifications as precise
No vkpipeline-db changes found.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reveiewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-19 16:33:01 +00:00
Rhys Perry
79801b9d7d nir/algebraic: optimize contradictory iand operands
Some of these were found in a few GTAV, Rise of the Tomb Raider and
Shadow of the Tomb Raider shaders.

Results from vkpipeline-db run with ACO:
Totals from affected shaders:
SGPRS: 376 -> 376 (0.00 %)
VGPRS: 220 -> 220 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13492 -> 11560 (-14.32 %) bytes
LDS: 6 -> 6 (0.00 %) blocks
Max Waves: 69 -> 69 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: use False instead of 0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reveiewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-19 16:33:01 +00:00
Timothy Arceri
30038dd5ec nir/lower_clip: add support for geometry shaders
This will be used to enabled compat profile support for geometry
shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
4b08bb4770 nir/lower_clip: add lower_clip_outputs() helper
This will be reused in the following patch to add support for clip
vertex lowering in geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
a59926b3ca nir/lower_clip: add create_clipdist_vars() helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Timothy Arceri
e38b930876 nir/lower_clip: add a find_clipvertex_and_position_outputs() helper
This will allow code sharing in a following patch that adds support
for lowering in geometry shaders. It also allows us to exit early
if there is no lowering to do which allows a small code tidy up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-19 09:25:47 +10:00
Caio Marcelo de Oliveira Filho
b6d4753568 nir/large_constants: De-duplicate constants
If a function has a constant and is called more than once, after
inlining we may end up with different variables representing the same
constant.  This commit look into the data and de-duplicate them.

The first pass now will collect the constant data in a per variable
buffer, then de-duplication happens (by sorting then linear walk), and
the second pass will use the data in var->data.location.

One side-effect of the current implementation is that constants will
be reordered.  If this turns out to be a problem is something that can
be fixed.

An alternative strategy considered was to perform this in a
per-function basis and then merge the results, the problem is that we
would have to fix up the offsets during the merge.  Given the data we
have, the current patch is good enough.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 12:24:24 -07:00
Caio Marcelo de Oliveira Filho
d9b67ad079 nir/large_constants: Use ralloc for var_infos
This will be used later on to allocate constant data for each
variable (and then deduplicate).  Also drop initializing found_read,
as it is already implicitly false in the literal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 12:24:24 -07:00
Eric Anholt
251c64a53d nir: Allow internal changes to the instr in nir_shader_lower_instructions().
v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in
NIR, but doesn't generate a new txf_ms instructions as replacement.  It's
pretty easy to allow that in nir_shader_lower_instructions, and it may be
common in lowering passes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-18 11:28:56 -07:00
Andreas Baierl
f5804f1768 nir: Add gl_PointCoord system value
gl_PointCoord handling needs some special bits set in lima/ppir code
generation. Treating gl_PointCoord as a system value makes it easier
to distinguish from a regular varying.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 13:20:39 +00:00
Connor Abbott
4423552ff0 nir/lower_viewport: Check variable mode first
The location is unused for shader_temp and function_temp variables, and
due to the way we nir_lower_io_to_temproraries demotes shader_out
variables to shader_temp variables, it happened to equal
VARYING_SLOT_POS for the gl_Position temporary, which made this pass
fail with the offline compiler due to this coming before vars_to_ssa.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-18 14:21:41 +02:00
Iago Toral Quiroga
50016d7718 nir: add a V3D-specific intrinsic for per-sample color writes
For per-sample color writes we need the output intrinsic to pack the
sample index, which is not provided with regular store_output intrinsics
unless we figured out a way to encode it into the base or the offset.

v2:
 - Drop the writemask (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-18 08:59:35 +02:00
Caio Marcelo de Oliveira Filho
891a232214 nir/large_constants: Use dominance information to find more constants
Relax the restriction that all the writes need to be in the first
block: now accept variables that have all the writes in the same
block, and all the reads are dominated by that block.

This let the pass identify large constants that are local to a helper
function.  The writes will be at the place that the function is
inlined, possibly not in the first block (but still all in the same
block).

Results for vkpipeline-db in SKL:

total instructions in shared programs: 3624891 -> 3623145 (-0.05%)
instructions in affected programs: 79416 -> 77670 (-2.20%)
helped: 16
HURT: 0

total cycles in shared programs: 1458149667 -> 1458147273 (<.01%)
cycles in affected programs: 30154164 -> 30151770 (<.01%)
helped: 14
HURT: 2

total loops in shared programs: 2437 -> 2437 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 8813 -> 8745 (-0.77%)
spills in affected programs: 2894 -> 2826 (-2.35%)
helped: 8
HURT: 0

total fills in shared programs: 23470 -> 23392 (-0.33%)
fills in affected programs: 12248 -> 12170 (-0.64%)
helped: 6
HURT: 2

LOST:   0
GAINED: 0

Results for shader-db in SKL with Iris:

total instructions in shared programs: 15379442 -> 15379392 (<.01%)
instructions in affected programs: 837 -> 787 (-5.97%)
helped: 2
HURT: 2
helped stats (abs) min: 27 max: 27 x̄: 27.00 x̃: 27
helped stats (rel) min: 10.47% max: 10.67% x̄: 10.57% x̃: 10.57%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 1.23% max: 1.23% x̄: 1.23% x̃: 1.23%
95% mean confidence interval for instructions value: -39.14 14.14
95% mean confidence interval for instructions %-change: -15.51% 6.17%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4880 -> 4880 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 370677237 -> 370676567 (<.01%)
cycles in affected programs: 17852 -> 17182 (-3.75%)
helped: 2
HURT: 1
helped stats (abs) min: 338 max: 356 x̄: 347.00 x̃: 347
helped stats (rel) min: 13.98% max: 14.64% x̄: 14.31% x̃: 14.31%
HURT stats (abs)   min: 24 max: 24 x̄: 24.00 x̃: 24
HURT stats (rel)   min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18%

total spills in shared programs: 11772 -> 11772 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 24948 -> 24948 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   0
GAINED: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-17 12:50:32 -07:00
Jason Ekstrand
812b341578 nir/algebraic: Optimize comparisons and up-casts
These seem like obvious enough optimizations in the world of multiple
integer bit sizes.  The only known thing which hits these at the moment
is some Vulkan CTS tests for 16-bit SSBO values which like to up-cast
and check for equality.  However, it's something that's bound to come up
as we start seeing more integers in shaders.

The optimizations of comparisons of casted values with constants are
something which we would ideally do with range analysis.  However,
lacking that, we can do it in opt_algebraic as long as one side is a
constant.

In dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13, this commit, along
with the previous commit, reduce the number of instructions emitted on
Skylake from 55328 to 44546, a reduction of 20%.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-07-17 18:44:35 +00:00
Jason Ekstrand
e8505e982a nir/algebraic: Optimize comparing unpacked values
We could, in theory, add the same optimization for 64-bit unpack
operations but that's likely to fight with 64-bit integer lowering on
platforms which require it so it will require more infrastructure before
that will be a good idea.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-17 18:44:35 +00:00
Jason Ekstrand
9fed031e4e nir/algebraic: Print out the list of transforms in the C file
This helps greatly when debugging algebraic transform generators because
you can now actually see the output and verify that your transforms are
getting generated.

Acked-by: Matt Turner <mattst88@gmail.com>
2019-07-17 18:44:35 +00:00
Eric Anholt
28a808a11b nir: Fix nir_lower_alu_to_scalar's instr filtering.
It was checking if the dest or src[0] SSA values were vectors, rather than
whether the ALU op was using the source as a vector resulting in a
nir_fdot4 making it through to vc4 and v3d:

vec1 32 ssa_6 = fdot4 ssa_4.xxxx, ssa_5

Fixes: c1cffa4249 ("nir/alu_to_scalar: Use the new NIR lowering framework")
v2: Use Jason's recommendation to look at input_sizes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-17 10:30:43 -07:00
Jason Ekstrand
6fb685fe4b nir/regs_to_ssa: Handle regs in phi sources properly
Sources of phi instructions act as if they occur at the very end of the
predecessor block not the block in which the phi lives.  In order to
handle them correctly, we have to skip phi sources on the normal
instruction walk and handle them as a separate walk over the successor
phis.  While registers in phi instructions is a bit of an oddity it can
happen when we temporarily go out-of-SSA for control-flow manipulations.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-16 23:28:03 +00:00
Jason Ekstrand
548da20b22 nir/lower_doubles: Handle fdiv and fsub directly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
d7d35a9522 nir/lower_doubles: Use the new NIR lowering framework
One advantage of this is that we no longer need to run in a loop because
the new framework handles lowering instructions added by lowering.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00
Jason Ekstrand
197a08dc69 nir/lower_doubles: Use "alu" for the nir_alu_instr
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-16 16:05:16 +00:00