Piglit's spec/glsl-1.20/compiler/structure-and-array-operations/
array-selection.vert test contains the following code:
gl_Position = (pick_from_a_or_b ? a : b)[i];
where "a" and "b" are uniform vec4[2] variables.
ast_to_hir creates a temporary vec4[2] variable, conditional_tmp, and
generates an if-block to copy one or the other:
(declare (temporary) (array vec4 2) conditional_tmp)
(if (var_ref pick_from_a_or_b)
((assign () (var_ref conditional_tmp) (var_ref a)))
((assign () (var_ref conditional_tmp) (var_ref b))))
However, we failed to update max_array_access for "a" and "b", so it
remained 0 - here, the whole array is being accessed. At link time,
update_array_sizes() used this bogus information to change the types
of "a" and "b" to vec4[1]. We then had assignments from a vec4[1] to
a vec4[2], which is highly illegal.
This tripped assertions in nir_split_var_copies with scalar VS.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Use common intrastage array validation for interface blocks.
This change also allows us to support interface blocks
that are arrays of arrays.
V2: Reinsert unsized array asserts in interstage_match()
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
v2:
- Single statement, by using memset return value as suggested by Ian
Romanick.
- No internal declaration, as suggested by Jason Ekstrand.
- Move macros to a header.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
The main objective of this change is to enable Linux developers to use
more of C99 throughout Mesa, with confidence that the portions that need
to be built with MSVC -- and only those portions --, stay portable.
This is achieved by using the appropriate -Werror= options only on the
places they need to be used.
Unfortunately we still need MSVC 2008 on a few portions of the code
(namely llvmpipe and its dependencies). I hope to eventually eliminate
this so that we can use C99 everywhere, but there are technical/logistic
challenges (specifically, newer Windows SDKs no longer bundle MSVC,
instead require a full installation of Visual Studio, and that has
hindered adoption of newer MSVC versions on our build processes.)
Thankfully we have more directy control over our OpenGL driver, which is
why we're now able to migrate to MSVC 2013 for most of the tree.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit 79daa510c7.
I apparently hadn't done a clean build when testing this; it broke the
build for Tom, Ben, and myself. We like the idea; let's try a v2.
The main objective of this change is to enable Linux developers to use
more of C99 throughout Mesa, with confidence that the portions that need
to be built with MSVC -- and only those portions --, stay portable.
This is achieved by using the appropriate -Werror= options only on the
places they need to be used.
Unfortunately we still need MSVC 2008 on a few portions of the code
(namely llvmpipe and its dependencies). I hope to eventually eliminate
this so that we can use C99 everywhere, but there are technical/logistic
challenges (specifically, newer Windows SDKs no longer bundle MSVC,
instead require a full installation of Visual Studio, and that has
hindered adoption of newer MSVC versions on our build processes.)
Thankfully we have more directy control over our OpenGL driver, which is
why we're now able to migrate to MSVC 2013 for most of the tree.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is to enable the code to build with -Werror=vla in the short term,
and enable the code to build with MSVC2013 soon after.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There were some bugs, and the code was really difficult to follow. We
would optimize
min(max(x, b), 1.0) into max(sat(x), b)
but not pay attention to the order of min/max and also do
max(min(x, b), 1.0) into max(sat(x), b)
Corrects four shaders from Champions of Regnum that do
min(max(x, 1), 10)
and corrects rendering of Mass Effect under VMware Workstation.
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Doubles are always packed, but a single double will never cross a slot
boundary -- single slots can still be wasted in some situations.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
st_glsl_to_tgsi and ir_to_mesa have handled conditional discards for a
long time; the previous patch added that capability to i965.
i965 (Haswell) shader-db stats:
Without NIR:
total instructions in shared programs: 5792133 -> 5776360 (-0.27%)
instructions in affected programs: 737585 -> 721812 (-2.14%)
helped: 6300
HURT: 68
GAINED: 2
With NIR:
total instructions in shared programs: 5787538 -> 5769569 (-0.31%)
instructions in affected programs: 767843 -> 749874 (-2.34%)
helped: 6522
HURT: 35
GAINED: 6
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This is a conditional discard, which takes a boolean source.
Note that we don't generate ir_discard::condition today, so this
shouldn't break drivers (since none implement this intrinsic yet).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
opt_constant_folding() already detects conditional assignments where the
condition is constant, and either deletes the assignment or the
condition.
Make it handle discards in the same fashion.
Spotted happening in the wild in Tropico 5 shaders.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This pass wasn't prepared to handle conditional discards.
Instead of initializing the "discarded" temporary to "true", set it to
the condition. Then, refer to the variable for the condition, to avoid
duplicating the expression tree.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This was forgotten.
I omitted the NULL check since we don't check ir_assignment::condition
either.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Copy and pasted from the ir_if::condition handling, plus a NULL check.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This adds a parent_instr field similar to the one for ssa_def. The
difference here is that the parent_instr field on a nir_register can be
NULL if the register does not have a unique definition or if that
definition does not dominate all its uses. We set this field in the
out-of-SSA pass so that backends can get SSA-like information even after
they have gone out of SSA.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
We were already do this for ALU operations but we haven't for non-ALU
operations. This changes that.
total NIR instructions in shared programs: 2039883 -> 2022338 (-0.86%)
NIR instructions in affected programs: 1768850 -> 1751305 (-0.99%)
helped: 14244
HURT: 124
total FS instructions in shared programs: 4083960 -> 4084036 (0.00%)
FS instructions in affected programs: 7302 -> 7378 (1.04%)
helped: 12
HURT: 51
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above,
but we can generalize it and make it more potentially useful. In the
specific original case of a 0 for our new 'a' argument, it'll get further
algebraic optimization once the 0 is an argument to the new add.
No shader-db effects.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
We have some useful optimizations to drop things like 'ine a, 0' on a
boolean argument, but if 'a' came from logical operations on bools, it
couldn't tell. These kinds of constructs appear as a result of TGSI->NIR
quite frequently (at least with if flattening), so being a little more
aggressive in detecting booleans can pay off.
v2: Add ixor as a booleanness-preserving op (Suggestion by Connor).
vc4 results:
total instructions in shared programs: 40207 -> 39881 (-0.81%)
instructions in affected programs: 6677 -> 6351 (-4.88%)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
vc4 was already cleaning these up, but it does shave 4 NIR instructions in
shader-db.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which
doesn't exist, breaking the build:
assert(state->states[index].index < state->states[index].stack_size);
Switch it to ifndef NDEBUG, so the field will exist if the assertion
actually generates code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
One less new directory necessary for gallium code that wants to interact
with NIR.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
GLSL 1.50 and GLSL 4.40 specs, they both say the same in
"Interface Blocks" section:
"If optional qualifiers are used, they can include interpolation qualifiers,
auxiliary storage qualifiers, and storage qualifiers and they must declare
an input, output, or uniform member consistent with the interface qualifier
of the block"
From GLSL ES 3.0, chapter 4.3.7 "Interface Blocks", page 38:
"GLSL ES 3.0 does not support interface blocks for shader inputs or outputs."
and from GLSL ES 3.0, chapter 4.6.1 "The invariant qualifier", page 52.
"Only variables output from a shader can be candidates for invariance."
This patch fixes the following dEQP tests:
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_fragment
No piglit regressions.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
v2:
- Enable this check for GLSL.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This lets us be slightly more efficient by not walking the CFG extra times.
Also, it may make it easier to ensure that GVN happens on only unpinned
instructions.
Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
v2 Jason Ekstrand <jason.ekstrand@intel.com>:
- Use nir_dominance_lca for computing least common anscestors
- Use the block index for comparing dominance tree depths
- Pin things that do partial derivatives
Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Right now, the nir_instr_prev function function blindly looks up the
previous element in the exec list and casts it to an instruction even if
it's the tail sentinel. The next commit will change this to return null if
it's the first instruction. Making this change first avoids getting a
segfault between commits. The only reason we never noticed is that, thanks
to the way things are laid out in nir_block, the casted instruction's type
was never parallal_copy.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Previously, if you remved a CF node that still had instructions in it, none
of the use/def information from those instructions would get cleaned up.
Also, we weren't removing if statements from the if_uses of the
corresponding register or SSA def. This commit fixes both of these
problems
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>