Setting this flag prevents declarations of uniforms from being removed
from the IR. Since the IR is directly used by several API functions
that query uniforms in shaders, uniform declarations cannot be removed
after the locations have been set. However, it should still be safe
to reorder the declarations (this is not tested).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41980
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Bryan Cain <bryancain3@gmail.com>
Cc: Vinson Lee <vlee@vmware.com>
Cc: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Very simple shaders don't actually use GLSL built-ins. For example:
- gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
- gl_FragColor = vec4(0.0);
Both of the shaders used by _mesa_meta_glsl_Clear() also qualify.
By waiting to initialize the built-ins until the first time we need to
look for a signature, we can avoid the overhead entirely in these cases.
Makes piglit run roughly 18% faster (255 vs. 312 seconds).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
We were splitting on each side of an unlinked program, and the two
sides lost track of which variables they referenced, resulting in
assertion failure during validation. Fixes piglit
link-struct-uniform-usage.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
It's the same as GL_AMD_conservative_depth. The specs have slight
differences in wording, but don't differ in content or behavior.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Instead of using a chain of manually maintained if/else blocks to
handle "#extension" directives, we now consult a table that specifies,
for each extension, the circumstances under which it is available, and
what flags in _mesa_glsl_parse_state need to be set in order to
activate it.
This makes it easier to add new GLSL extensions in the future, and
fixes the following bugs:
- Previously, _mesa_glsl_process_extension would sometimes set the
"_enable" and "_warn" flags for an extension before checking whether
the extension was supported by the driver; as a result, specifying
"enable" behavior for an unsupported extension would sometimes cause
front-end support for that extension to be switched on in spite of
the fact that back-end support was not available, leading to strange
failures, such as those in
https://bugs.freedesktop.org/show_bug.cgi?id=38015.
- "#extension all: warn" and "#extension all: disable" had no effect.
Notes:
- All extensions are currently marked as unavailable in geometry
shaders. This should not have any adverse effects since geometry
shaders aren't supported yet. When we return to working on geometry
shader support, we'll need to update the table for those extensions
that are available in geometry shaders.
- Previous to this commit, if a shader mentioned
ARB_shader_texture_lod, extension ARB_texture_rectangle would be
automatically turned on in order to ensure that the types
sampler2DRect and sampler2DRectShadow would be defined. This was
unnecessary, because (a) ARB_shader_texture_lod works perfectly well
without those types provided that the builtin functions that
reference them are not called, and (b) ARB_texture_rectangle is
enabled by default in non-ES contexts anyway. I eliminated this
unnecessary behavior in order to make the behavior of all extensions
consistent.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
ast_expression::print() had an incorrect index into the subexpressions
array, so (a ? b : c) was being incorrectly rendered as (a ? b : b).
Signed-off-by: Brian Paul <brianp@vmware.com>
It's just an alias of the ARB variant with some GLSL compiler changes.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Also make the GL_ARB_draw_instanced block follow the same pattern as
the other blocks.
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This patch cleans up many of the extra copies in GLSL IR introduced by
i965's scalarizing passes. It doesn't result in a statistically
significant performance difference on nexuiz high settings (n=3) or my
demo (n=10), due to brw_fs.cpp's register coalescing covering most of
those extra moves anyway. However, it does make the debug of wine's
GLSL shaders much more tractable, and reduces instruction count of
glsl-fs-convolution-2 from 376 to 288.
Passing ralloc_vasprintf_append a 0-byte allocation doesn't work. If
passed a non-NULL argument, ralloc calls strlen to find the end of the
string. Since there's no terminating '\0', it runs off the end.
Fixes a crash introduced in 14880a510a.
Previously we'd happily compile GLSL 1.30 shaders on any driver. We'd
also happily compile GLSL 1.10 and 1.20 shaders in an ES2 context.
This has been a long standing FINISHME in the compiler.
NOTE: This is a candidate for the 7.9 and 7.10 branches
This should save on the overhead of tree-walking and provide a
convenient place to add more instruction lowering in the future.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
This adds proper support for the GL_ARB_shader_stencil_export extension
to the GLSL compiler. Thanks to Ian for pointing out where I need to add things.
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style
This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.
Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.
It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
for the main function and/or other functions
This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported
Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.
Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.
Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and trigger the unique "break".
Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.
Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
This increases the chance that GLSL programs will actually work.
Note that continues and returns are not yet lowered, so linking
will just fail if not supported.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
This should make it easier to change the default version based on the
API (say, version 1.00 for OpenGL ES).
Also, synchronize the symbol table's version with the parse state's
version just before doing AST-to-HIR. This way, it will be set when
it matters, but the main initialization code doesn't have to care about
the symbol table.
With the glsl2-965 branch, the optimization of glsl-algebraic-rcp-rcp
regressed due to noop swizzles hiding information from ir_algebraic.
This cleans up those noop swizzles for us.
This prevents using uninitialized data in _msea_glsl_error in some
cases, (including at least 6 piglit tests). Thanks to valgrind for
pointing out the problem!