Previously, lower_jumps.cpp would break out of its loop after lowering
a jump instruction in just the then- or else-branch of a conditional,
and it would fail to lower a jump instruction occurring in the other
branch.
Without this patch, lower_jumps.cpp may require multiple passes in
order to lower all jumps. This results in sub-optimal output because
lower_jumps.cpp produces a brand new set of temporary variables each
time it is run, and the redundant temporary variables are not
guaranteed to be eliminated by later optimization passes.
Fixes unit test test_lower_returns_4.
The visitor class in lower_jumps.cpp never removes or replaces the
instruction being visited, but it frequently alters or removes the
instructions that follow it. Therefore, to make sure the altered IR
is visited, it needs to iterate through exec_lists using foreach_list
rather than visit_exec_list().
Without this patch, lower_jumps.cpp may require multiple passes in
order to lower all jumps. This results in sub-optimal output because
lower_jumps.cpp produces a brand new set of temporary variables each
time it is run, and the redundant temporary variables are not
guaranteed to be eliminated by later optimization passes.
Also, certain invariants assumed by lower_jumps.cpp may fail to hold,
causing assertion failures.
Fixes unit tests test_lower_pulled_out_jump,
test_lower_unified_returns, test_lower_guarded_conditional_break,
test_lower_return_non_void_at_end_of_loop, and test_lower_returns_3.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, lower_jumps.cpp would only lower return and continue
statements that appeared inside conditionals. This patch makes it
lower unconditional returns and continue statements that occur inside
a loop.
Such unconditional flow control statements would be unlikely to be
explicitly coded by a reasonable user, however they might arise as a
result of other optimizations.
Without this patch, lower_jumps.cpp might not lower certain return and
continue statements, causing some backends to fail.
Fixes unit tests test_lower_return_void_at_end_of_loop and
test_remove_continue_at_end_of_loop.
Previously, lower_jumps.cpp only lowered return statements that
appeared inside of an if statement.
Without this patch, lower_jumps.cpp might not lower certain return
statements, causing some back-ends to fail (as in bug #36669).
Fixes unit test test_lower_returns_1.
Previously, do_lower_jumps.cpp determined whether to lower return
statements in ir_lower_jumps_visitor::should_lower_jumps(). Moved
this logic to ir_lower_jumps_visitor::visit(ir_function_signature *),
so that it can be used in determining whether to lower a return
statement at the end of a function.
Previously ir_reader was only able to handle return of non-void.
This patch is necessary in order to allow optimization passes to be
tested in isolation.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes an assertion failure in the piglib out-01.frag
ARB_explicit_attrib_location test. The locations set via the layout
qualifier in fragment shader were not being applied to the shader
outputs. As a result all of these variables still had a location of
-1 set.
This may need some more work for pre-3.0 contexts. The problem is
dealing with generic outputs that lack a layout qualifier. There is
no way for the application to specify a location
(glBindFragDataLocation is not supported) or query the location
assigned by the linker (glGetFragDataLocation is not supported).
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38624
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Vinson Lee <vlee@vmware.com>
The set of values initially available (before any kills) must be
tracked with each constant in the set. Otherwise the wrong component
can be selected after earlier components have been killed.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37383
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Matthias Bentrup <matthias.bentrup@googlemail.com>
MOD_TO_FRACT was designed to lower the GLSL 1.20 mod() function, which
operates on floating point values. However, we also use ir_binop_mod
for GLSL 1.30's % operator, which operates on integers.
For now, make MOD_TO_FRACT only apply to floating-point mod operations.
In the future, we may want to add a lowering pass for integer-based mod.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
f2i results in an int/ivec; we need i2u to get a uint/uvec.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously, it would simply say "type error" in three different cases:
- The LHS is not an integer
- The RHS is not an integer
- The LHS and RHS have different base types (int vs. uint)
Now the error messages state the specific problem.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously, ir_function::matching_signature had a fatal bug: if a
function had more than one non-exact match, it would simply return NULL.
This occured, for example, when looking for max(uvec3, uvec3):
- max(vec3, vec3) -> score 1 (found first)
- max(ivec3, ivec3) -> score 1 (found second...used to return NULL here)
- max(uvec3, uvec3) -> score 0 (exact match...the right answer)
This did not occur for max(ivec3, ivec3) since the second match found
was an exact match.
The new behavior is to return a match with the lowest score. If there
is an exact match, that will be returned. Otherwise, a match with the
least number of implicit conversions is chosen.
Fixes piglit tests max-uvec3.vert and glsl-inexact-overloads.shader_test.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reverts commit f41e1db327
"fix conversions from uint to bool and from float/bool to uint"
f2i, b2i, and b2i should not accept uint types. Use i2u and u2i.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
These are necessary to handle int/uint constructor conversions. For
example, the following code currently results in a type mismatch:
int x = 7;
uint y = uint(x);
In particular, uint(x) still has type int.
This commit simply adds the new operations; it does not generate them,
nor does it add backend support for them.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
We almost never want to specify a condition, and when we do we're
already thinking about it (because we're writing a lowering pass
generating the condition), so a default argument should make the code
more pleasant to read.
NOTE: This is a candidate for the 7.11 branch (we want to be able to
cherry-pick future code).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Our copy propagation tends to be bad at handling the later array
accesses of the matrix argument we moved to a temporary. Generally we
don't need to move it to a temporary, though, so this avoids needing
more copy propagation complexity.
Reduces instruction count of some Unigine Tropics and Sanctuary
fragment shaders that do operations on uniform matrix arrays by 5.9%
on gen6.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were constrained to using temporaries because we were assuming
variables all over. This simplifies things a bit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This awkward typing was to avoid shadowing the function argument (the
matrix) with the temporary deref (the column) before the
get_column()/get_element()s were moved into the expression/assignment
constructors. They're about to become not-variables, so the current
names had to go. This change is almost mechanical (other than
column_expr), so it should make the next diff clearer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I think this makes the code more obvious by moving the declarations to
their single usage (now that we aren't using them to get at the ->type
field for expression constructors).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Instead of using a chain of manually maintained if/else blocks to
handle "#extension" directives, we now consult a table that specifies,
for each extension, the circumstances under which it is available, and
what flags in _mesa_glsl_parse_state need to be set in order to
activate it.
This makes it easier to add new GLSL extensions in the future, and
fixes the following bugs:
- Previously, _mesa_glsl_process_extension would sometimes set the
"_enable" and "_warn" flags for an extension before checking whether
the extension was supported by the driver; as a result, specifying
"enable" behavior for an unsupported extension would sometimes cause
front-end support for that extension to be switched on in spite of
the fact that back-end support was not available, leading to strange
failures, such as those in
https://bugs.freedesktop.org/show_bug.cgi?id=38015.
- "#extension all: warn" and "#extension all: disable" had no effect.
Notes:
- All extensions are currently marked as unavailable in geometry
shaders. This should not have any adverse effects since geometry
shaders aren't supported yet. When we return to working on geometry
shader support, we'll need to update the table for those extensions
that are available in geometry shaders.
- Previous to this commit, if a shader mentioned
ARB_shader_texture_lod, extension ARB_texture_rectangle would be
automatically turned on in order to ensure that the types
sampler2DRect and sampler2DRectShadow would be defined. This was
unnecessary, because (a) ARB_shader_texture_lod works perfectly well
without those types provided that the builtin functions that
reference them are not called, and (b) ARB_texture_rectangle is
enabled by default in non-ES contexts anyway. I eliminated this
unnecessary behavior in order to make the behavior of all extensions
consistent.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
These were previously 1-bit-wide bitfields. Changing them to bools
has a negligible performance impact, and allows them to be accessed by
offset as well as by direct structure access.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
From the OpenGL docs for GL_ARB_explicit_attrib_location:
This extension provides a method to pre-assign attribute locations to
named vertex shader inputs and color numbers to named fragment shader
outputs.
This was accidentally implemented for fragment shader inputs. This
patch fixes it to apply to fragment shader outputs.
Fixes piglit tests
spec/ARB_explicit_attrib_location/1.{10,20}/compiler/layout-{01,03,06,07,08,09,10}.frag
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38624
Previously, if max_depth were 1, the following code would see the
first if-statement (correctly) not get flattened, but the second
if-statement would (incorrectly) get flattened:
void main()
{
if (a)
gl_Position = vec4(0);
if (b)
gl_Position = vec4(1);
}
This is because the visit_leave(ir_if*) method would not decrement the
depth before returning on the first if-statement.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, the builtins in OES_texture_3D.{frag,vert} were only
compiling properly as a consequence of bug 38015, which allows
unsupported extensions to be enabled. This fix eliminates the builtin
compiler's reliance on bug 38015, so that bug 38015 can be fixed.
Previously it was up to the driver or later code generator to reject
these shaders. It turns out that nobody did this.
This will need changes to support geometry shaders.
NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37743
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable release branches (and don't forget
to re-run "make builtins" after cherry-picking.)
Commit 56ef62d988
"glsl: Generate readable unique names at print time."
changed ir_print_visitor to not generate @0x1234567 suffixes except
where necessary. So there's no need to manually remove them.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The function was named "find_unconditional_discard", but didn't
actually check that the discard statement found was unconditional.
Fixes piglit glsl-fs-discard-04.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
ir_print_visitor::visit(ir_constant *) was failing to index properly
into ir->type->fields.structure, so the first field name was being
reprinted for every field in the structure.
Signed-off-by: Brian Paul <brianp@vmware.com>
ast_expression::print() had an incorrect index into the subexpressions
array, so (a ? b : c) was being incorrectly rendered as (a ? b : b).
Signed-off-by: Brian Paul <brianp@vmware.com>
It's just an alias of the ARB variant with some GLSL compiler changes.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Lots of code (deleted by this patch) tried to make type == result->type,
but not all cases did. Don't pretend; just use result->type.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The expression
x = y, 5, 3;
will generate
0:7(9): warning: left-hand operand of comma expression has no effect
The warning is only emitted for the left-hand operands, becuase the
right-most operand is the result of the expression. This could be
used in an assignment, etc.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The 095-recursive-define test case was triggering infinite recursion
with the following test case:
#define A(a, b) B(a, b)
#define C A(0, C)
C
Here's what was happening:
1. "C" was pushed onto the active list to expand the C node
2. While expanding the "0" argument, the active list would be
emptied by the code at the end of _glcpp_parser_expand_token_list
3. When expanding the "C" argument, the active list was now empty,
so lather, rinse, repeat.
We fix this by adjusting the final popping at the end of
_glcpp_parser_expand_token_list to never pop more nodes then this
particular invocation had pushed itself. This is as simple as saving
the original state of the active list, and then interrupting the
popping when we reach this same state.
With this fix, all of the glcpp-test tests now pass.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32835
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>