This code had no relation to ir_to_mesa.cpp, since it was also used by
intel and state_tracker, and most of it was duplicated with the standalone
compiler (which has periodically drifted from the Mesa copy).
v2: Split from the ir_to_mesa to shaderapi.c changes.
Acked-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were duplicating this code all over the place, and they all would need
updating for the next set of shader targets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Previously we only supported "#version 150". This patch recognizes
"compatibility" to give the user a more descriptive error message.
Fixes Piglit's version-150-core-profile test.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
This GLSL extension requires that AMD_vertex_shader_layer be
enabled by the driver.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This will eventually replace do_vec_index_to_cond_assign. This lowering
pass is called in all the places where do_vec_index_to_cond_assign or
do_vec_index_to_swizzle is called.
v2: Use WRITEMASK_* instead of integer literals. Use a more concise
method of generating broadcast_index. Both suggested by Eric.
v3: Use a series of scalar compares instead of a single vector compare.
Suggested by Eric and Ken. It still uses 'if (cond) v.x = y;' instead
of conditional assignments because ir_builder doesn't do conditional
assignments, and I'd rather keep the code simple.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This pass flips (matrix * vector) operations to (vector *
matrixTranspose) for certain built-in matrices (currently
gl_ModelViewProjectionMatrix and gl_TextureMatrix).
This is equivalent, but results in dot products rather than multiplies
and adds. On some hardware, this is more efficient.
This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set
to indicate they prefer dot products.
Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10)
on a Haswell GT2 system. Passes Piglit on Ivybridge.
v2: Use struct gl_shader_compiler_options instead of plumbing through
another boolean flag for this purpose.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
do_common_optimization may need to make choices about whether to emit
certain kinds of instructions. gl_context::ShaderCompilerOptions
contains exactly that information, so it makes sense to pass it in.
Rather than passing the whole array, pass the structure for the stage
that's currently being worked on.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits
Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits
etc.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
GLBenchmark 2.7's shaders contain conditional blocks like:
if (x) {
if (y) {
...
}
}
where the outer conditional's then clause contains exactly one statement
(the nested if) and there are no else clauses. This can easily be
optimized into:
if (x && y) {
...
}
This saves a few instructions in GLBenchmark 2.7:
total instructions in shared programs: 11833 -> 11649 (-1.55%)
instructions in affected programs: 8234 -> 8050 (-2.23%)
It also helps CS:GO slightly (-0.05%/-0.22%). More importantly,
however, it simplifies the control flow graph, which could enable other
optimizations.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
v2 [mattst88]:
- Rebase.
- #define GL_ARB_texture_query_lod to 1.
- Remove comma after ir_lod in ir.h for MSVC.
- Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp,
opt_tree_grafting.cpp.
- Rename textureQueryLOD to textureQueryLod, see
https://www.khronos.org/bugzilla/show_bug.cgi?id=821
- Fix ir_reader of (lod ...).
v3 [mattst88]:
- Rename textureQueryLod to textureQueryLOD, pending resolution of
Khronos 821.
- Add ir_lod case to ir_to_mesa.cpp.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This ends up reusing the dynamic ID support, so a silly enum gets to go
away. We don't assign good IDs to different messages yet, but at least
that's tractable now.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
V2: - emit `sample` parameter properly for multisample texelFetch()
- fix spurious whitespace change
- introduce a new opcode ir_txf_ms rather than overloading the
existing ir_txf further. This makes doing the right thing in
the driver somewhat simpler.
V3: - fix weird whitespace
V4: - don't forget to include the new opcode in tex_opcode_strs[]
(thanks Kenneth for spotting this)
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Eric Anholt <eric@anholt.net>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
When the user specifies an unsupported GLSL version,
_mesa_glsl_parse_state::process_version_directive() nicely gives them
an error message telling them which GLSL versions are supported.
Previous to this patch, the logic for determining whether a given
language version was supported was independent from the logic to
generate this error message string; as a result, we had a bug where
GLSL 3.00 would never be listed in the error message as an available
language version, even if it was really available.
To make matters worse, the code for generating the error message
string assumed that desktop GL versions were always separated by 0.10,
an assumption that will be wrong as soon as we support GLSL 3.30.
This patch fixes both problems by adding a table of supported GLSL
versions to _mesa_glsl_parse_state; this table is used both to
generate the error message and to check whether a given version is
supported.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Note that while 'packed' is a reserved word in GLSL ES, row_major is not.
This means that we have to use the string-based matching for that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
These constants need to be made available to shaders in GLSL 3.00 ES.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Carl Worth <cworth@cworth.org>
Note that GLSL 1.00 is selected using "#version 100", so "#version 100
es" is prohibited.
v2: Check for GLES3 before allowing '#version 300 es'
v3: Make sure a correct language_version is set in
_mesa_glsl_parse_state::process_version_directive.
Signed-off-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
Version directive handling is going to have to be used within two
parser rules, one for desktop-style version directives (e.g. "#version
130") and one for the new ES-style version directive (e.g. "#version
300 es"), so this patch moves it to a function that can be called from
both rules.
No functional change.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
With the advent of GLSL 3.00 ES, the version checks we perform in the
GLSL compiler (to determine which language features are present) will
become more complicated. To reduce the complexity, this patch adds
functions check_version() and is_version() to _mesa_glsl_parse_state.
These functions take two version numbers: a desktop GLSL version and a
GLSL ES version, and return a boolean indicating whether the GLSL
version being compiled is at least the required version. So, for
example, is_version(130, 300) returns true if the GLSL version being
compiled is at least desktop GLSL 1.30 or GLSL 3.00.
The check_version() function additionally produces an error message if
the version check fails, informing the user of which GLSL version(s)
support the given feature.
[v2, idr]: Add PRINTFLIKE annotation to the new method. The numbering of th
parameters is correct because GCC is silly.
[v3, idr]: Fix copy-and-paste error in the comment before
_mesa_glsl_parse_state::is_version. Noticed by Ken.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
This will be useful in generating more helpful error messages,
especially with the addition of GLSL 3.00 ES support.
[v2, idr]: Rename ctx parameter to mem_ctx
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Carl Worth <cworth@cworth.org>
This adds all the new builtins + the new sampler types,
and hooks them up if the extension is supported.
v2: fix missing signatures for grad/lod
fix missing textureSize clarifications
fix compare vs starts with usage
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Previously, we advertised the extension but the builtin functions
were enabled only for GLSL and not for ES.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52003
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I ended up having to add rallocing of the ast_type_qualifier in order
to avoid pulling in ast.h for glsl_parser_extras.h, because I wanted
to track an ast_type_qualifier in the state.
Fixes piglit ARB_uniform_buffer_object/row-major.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The previous implementation required a flag in _mesa_glsl_parse_state
and line of code to initialize it for every version of the shading
language we intend to support. As we look to add 150, 330, 400, 410,
420, and beyond, this gets rather unwieldy.
This patch retains the switch statement (to reject, say, #version 111),
but removes all the bits. Code to check for ctx->API == API_OPENGL_CORE
could easily be added to the 110 and 120 cases to reject those.
v2: Use _mesa_is_desktop_gl to preserve the existing behavior in the
presence of the new API_OPENGL_CORE enumeration.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I've been trying to derive from this for UBO support, and the slightly
obfuscated types were putting me over the edge.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
That adds support for activating the extension. It doesn't actually
*do* anything yet, of course.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I've had this code laying around almost done for a long time. The
idea is like opt_structure_splitting, that we've got a bunch of
transforms at the GLSL IR level that only understand scalars and
vectors, which just skip complicated dereferences. While driver
backends may manage some optimization after they split matrices up
themselves, it would be better to bring all of our optimization to
bear on the problem.
While I wasn't expecting changes quite yet, a few programs end up
winning: a gstreamer convolution shader, and the Humus dynamic
branching demo:
Total instructions: 269430 -> 269342
3/2148 programs affected (0.1%)
1498 -> 1410 instructions in affected programs (5.9% reduction)
Nothing actually relied on them being mutable, and there was at least
one cast which discarded const qualifiers. The next patch would have
introduced many more.
Casting away const qualifiers should be avoided if at all possible.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This stuffs them all in a struct for sanity. Fixes piglit
glsl-1.30/execution/switch/fs-uniform-nested.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Up until now modifying the GLSL compiler has been pretty straightforward.
This is where things get interesting. But still pretty straightforward.
Switch statements can be thought of a series of if/then/else statements.
Case labels are compared with the value of a test expression and the case
statements are executed if the comparison is true.
There are a couple of aspects of switch statements that complicate this simple
view of the world. The primary one is that cases can fall through sequentially
to subsequent case, unless a break statement is encountered, in which case,
the switch statement exits completely.
But break handling is further complicated by the fact that a break statement
can impact the exit of a loop. Thus, we need to coordinate break processing
between switch statements and loop statements.
The code generated by a switch statement maintains three temporary state
variables:
int test_value;
bool is_fallthru;
bool is_break;
test_value is initialized to the value of the test expression at the head of
the switch statement. This is the value that case labels are compared against.
is_fallthru is used to sequentially fall through to subsequent cases and is
initialized to false. When a case label matches the test expression, this
state variable is set to true. It will also be forced to false if a break
statement has been encountered. This forcing to false on break MUST be
after every case test. In practice, we defer that forcing to immediately after
the last case comparison prior to executing a case statement, but that is
an optimization.
is_break is used to indicate that a break statement has been executed and is
initialized to false. When a break statement is encountered, it is set to true.
This state variable is then used to conditionally force is_fallthru to to false
to prevent subsequent case statements from executing.
Code generation for break statements depends on whether the break statement is
inside a switch statement or inside a loop statement. If it inside a loop
statement is inside a break statement, the same code as before gets generated.
But if a switch statement is inside a loop statement, code is emitted to set
the is_break state to true.
Just as ASTs for loop statements are managed in a stack-like
manner to handle nesting, we also add a bool to capture the innermost switch
or loop condition. Note that we still need to maintain a loop AST stack to
properly handle for-loop code generation on a continue statement. Technically,
we don't (yet) need a switch AST stack, but I am using one for orthogonality
with loop statements, in anticipation of future use. Note that a simple
boolean stack would have sufficed.
We will illustrate a switch statement with its analogous conditional code that
a switch statement corresponds to by examining an example.
Consider the following switch statement:
switch (42) {
case 0:
case 1:
gl_FragColor = vec4(1.0, 2.0, 3.0, 4.0);
case 2:
case 3:
gl_FragColor = vec4(4.0, 3.0, 2.0, 1.0);
break;
case 4:
default:
gl_FragColor = vec4(0.0, 0.0, 0.0, 0.0);
}
Note that case 0 and case 1 fall through to cases 2 and 3 if they occur.
Note that case 4 and the default case must be reached explicitly, since cases
2 and 3 break at the end of their case.
Finally, note that case 4 and the default case don't break but simply fall
through to the end of the switch.
For this code, the equivalent code can be expressed as:
int test_val = 42; // capture value of test expression
bool is_fallthru = false; // prevent initial fall through
bool is_break = false; // capture the execution of a break stmt
is_fallthru |= (test_val == 0); // enable fallthru on case 0
is_fallthru |= (test_val == 1); // enable fallthru on case 1
is_fallthru &= !is_break; // inhibit fallthru on previous break
if (is_fallthru) {
gl_FragColor = vec4(1.0, 2.0, 3.0, 4.0);
}
is_fallthru |= (test_val == 2); // enable fallthru on case 2
is_fallthru |= (test_val == 3); // enable fallthru on case 3
is_fallthru &= !is_break; // inhibit fallthru on previous break
if (is_fallthru) {
gl_FragColor = vec4(4.0, 3.0, 2.0, 1.0);
is_break = true; // inhibit all subsequent fallthru for break
}
is_fallthru |= (test_val == 4); // enable fallthru on case 4
is_fallthru = true; // enable fallthru for default case
is_fallthru &= !is_break; // inhibit fallthru on previous break
if (is_fallthru) {
gl_FragColor = vec4(0.0, 0.0, 0.0, 0.0);
}
The code generate for |= and &= uses the conditional assignment capabilities
of the IR.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We now tie the grammar to the ctors of the ASTs they reference.
This requires that we actually have definitions of the ctors.
In addition, we also need to define "print" and "hir" methods for the AST
classes. The Print methods are pretty simple to flesh out. However, at this
stage of the development, we simply stub out the "hir" methods and flesh
them out later.
Also, since actual class instances get returned by the productions in the
grammar, we also need to designate the type of the productions that
reference those instances.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This extension introduces a new sampler type: samplerExternalOES.
texture2D (and texture2DProj) can be used to do a texture look up in an
external texture.
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>