This is a squash of a bunch of individual changes:
nir/builder: Generate 32-bit bool opcodes transparently
nir/algebraic: Remap Boolean opcodes to the 32-bit variant
Use 32-bit opcodes in the NIR producers and optimizations
Generated with a little hand-editing and the following sed commands:
sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
Use 32-bit opcodes in the NIR back-ends
Generated with a little hand-editing and the following sed commands:
sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Here we rework force_unroll_array_access() so that we can reuse
the induction variable detection in a following patch.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Following commits will introduce additional fields such as
guessed_trip_count. Renaming these will help avoid confusion
as our unrolling feature set grows.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In order to be sure loop_terminator_list is an accurate
representation of all the jumps in the loop we need to be sure we
didn't encounter any other complex behaviour such as continues,
nested breaks, etc during analysis.
This will be used in the following patch.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will help later patches with unrolling loops that end with a
break i.e. loops the always exit on their first interation.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We need to add loop terminators to the list in the order we come
across them otherwise if two or more have the same exit condition
we will select that last one rather than the first one even though
its unreachable.
This fix is for simple unrolls where we only have a single exit
point. When unrolling these type of loops the unreachable
terminators and their unreachable branch are removed prior to
unrolling. Because of the logic change we also switch some
list access in the complex unrolling logic to avoid breakage.
Fixes: 6772a17acc ("nir: Add a loop analysis pass")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Note that this patch needs to come late in the series since this pass
can be run after any pass that damages nir_metadata_loop_analysis.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This will be removed at the end of the transition, but add some tracking
plus asserts to help ensure that lowering passes are called at the
correct point (pre or post deref instruction lowering) as passes are
converted and the point where lower_deref_instrs() is called is moved.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
nir_const_value is not needed in get_iteration
Signed-off-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Fixes performance regression in SynMark PSPom caused by loops with float
counters not always unrolling.
For example:
for (float i = 0.02; i < 0.9; i += 0.11)
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This pass detects induction variables and calculates the
trip count of loops to be used for loop unrolling.
V2: Rebase, adapt to removal of function overloads
V3: (Timothy Arceri)
- don't try to find trip count if loop terminator conditional is a phi
- fix trip count for do-while loops
- replace conditional type != alu assert with return
- disable unrolling of loops with continues
- multiple fixes to memory allocation, stop leaking and don't destroy
structs we want to use for unrolling.
- fix iteration count bugs when induction var not on RHS of condition
- add FIXME for && conditions
- calculate trip count for unsigned induction/limit vars
V4: (Timothy Arceri)
- count instructions in a loop
- set the limiting_terminator even if we can't find the trip count for
all terminators. This is needed for complex unrolling where we handle
2 terminators and the trip count is unknown for one of them.
- restruct structs so we don't keep information not required after
analysis and remove dead fields.
- force unrolling in some cases as per the rules in the GLSL IR pass
V5: (Timothy Arceri)
- fix metadata mask value 0x10 vs 0x16
V6: (Timothy Arceri)
- merge loop_variable and nir_loop_variable structs and lists suggested by Jason
- remove induction var hash table and store pointer to induction information in
the loop_variable suggested by Jason.
- use lowercase list_addtail() suggested by Jason.
- tidy up init_loop_block() as per Jasons suggestions.
- replace switch with nir_op_infos[alu->op].num_inputs == 2 in
is_var_basic_induction_var() as suggested by Jason.
- use nir_block_last_instr() in and rename foreach_cf_node_ex_loop() as suggested
by Jason.
- fix else check for is_trivial_loop_terminator() as per Connors suggetions.
- simplify offset for induction valiables incremented before the exit conditions is
checked.
- replace nir_op_isub check with assert() as it should have been lowered away.
V7: (Timothy Arceri)
- use rzalloc() on nir_loop struct creation. Worked previously because ralloc()
was broken and always zeroed the struct.
- fix cf_node_find_loop_jumps() to find jumps when loops contain
nested if statements. Code is tidier as a result.
V8: (Timothy Arceri)
- move is_trivial_loop_terminator() to nir.h so we can use it to assert is
the loop unroll pass
- fix analysis to not bail when looking for terminator when the break is in the else
rather then the if
- added new loop terminator fields: break_block, continue_from_block and
continue_from_then so we don't have to gather these when doing unrolling.
- get correct array length when forcing unrolling of variables
indexed arrays that are the same size as the iteration count
- add support for induction variables of type float
- update trival loop terminator check to allow an if containing
instructions as long as both branches contain only a single
block.
V9: (Timothy)
- bunch of tidy ups and simplifications suggested by Jason.
- rewrote trivial terminator detection, now the only restriction is there
must be no nested jumps, anything else goes.
- rewrote the iteration test to use nir_eval_const_opcode().
- count instruction properly even when forcing an unroll.
- bunch of other tidy ups and simplifications.
V10: (Timothy)
- some trivial tidy ups suggested by Jason.
- conditional fix for break inside continue branch by Jason.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>