This gives the compiler the chance to inline and not export class symbols
even in the absence of LTO. Saves about 60kb on disk.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This information will be useful in the i965 back end, since we can
save some compilation effort if we know from the outset that the
shader never calls EndPrimitive().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
During compilation, we'll use this to determine built-in availability.
The plan is to have a single shader containing every built-in in every
version of the language, but filter out the ones that aren't actually
available to the shader being compiled.
At link time, we don't actually need this filtering capability: we've
already imported prototypes for every built-in that the shader actually
calls, and they're flagged as is_builtin(). The linker doesn't import
any additional prototypes, so it won't pull in any unavailable
built-ins. When resolving prototypes to function definitions, the
linker ensures the values of is_builtin() match, which means that a
shader can't trick the linker into importing the body of an unavailable
built-in by defining a suspiciously similar prototype.
In other words, during linking, we can just pass in NULL. It will work
out fine.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
A signature is a built-in if and only if builtin_info != NULL, so we
don't actually need a separate flag bit. Making a boolean-valued
method allows existing code to ask the same question while not worrying
about the internal representation.
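
A minimal sketch of the idea, using hypothetical stand-in types (builtin_info_sketch and signature_sketch are illustrative names, not the real IR classes): the boolean question is answered by a method that inspects the pointer, so no separate flag bit is stored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for whatever data describes a built-in.
struct builtin_info_sketch {
    int min_glsl_version;
};

// Hypothetical, stripped-down function signature type.
struct signature_sketch {
    const builtin_info_sketch *builtin_info;

    signature_sketch() : builtin_info(NULL) {}

    // A signature is a built-in if and only if it carries built-in info,
    // so callers ask this method instead of reading a stored flag.
    bool is_builtin() const { return builtin_info != NULL; }
};
```

Callers keep writing sig.is_builtin(), which leaves the internal representation free to change again later.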
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This patch extracts the following logic from
validate_vertex_shader_executable():
(a) Generate an error if the shader writes to both gl_ClipDistance and
gl_ClipVertex.
(b) Record whether the shader writes to gl_ClipDistance in
gl_shader_program for use by the back-end.
(c) Record the size of gl_ClipDistance in gl_shader_program for use by
transform feedback logic.
And moves it into a function that is shared between vertex and
geometry shaders.
Strictly speaking we only need to have shared logic for (b) and (c)
right now (since (a) only matters in compatibility contexts, and we're
only implementing geometry shaders in core contexts right now). But
the three are closely related enough that it seems sensible to keep
them together.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested by examining generated TGSI shaders from piglit/glsl-routing.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Henri Verbeet <hverbeet@gmail.com>
Tested-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Commit 7cfefe6965 introduced a check for whether linked->Type equals
GL_GEOMETRY_SHADER. However, linked may be NULL due to an earlier error
condition.
Since the entire function after the error path is (or should be) guarded
by linked != NULL checks, we may as well just return early and remove
the checks.
Fixes crashes in 9 Piglit tests.
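
A minimal sketch of the early-return shape described above. The enum, struct, and function names are hypothetical stand-ins, not the linker's actual code; only the pattern (check for NULL once, then dereference freely) is taken from the description.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the real shader types.
enum shader_type_sketch { SKETCH_VERTEX_SHADER, SKETCH_GEOMETRY_SHADER };

struct linked_shader_sketch {
    shader_type_sketch Type;
};

// Returns true if geometry-shader-specific post-processing would run.
// `linked` may be NULL when an earlier step reported an error.
bool post_link_sketch(const linked_shader_sketch *linked)
{
    // Return early instead of guarding every later statement.
    if (linked == NULL)
        return false;

    // From here on, dereferencing `linked` is safe.
    return linked->Type == SKETCH_GEOMETRY_SHADER;
}
```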
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Section 4.3.8.1 (Input Layout Qualifiers) of the GLSL 1.50 spec
contains some tricky rules for how the sizes of geometry shader input
arrays are related to the input layout specification. In essence,
those rules boil down to the following:
- If an input array declaration does not specify a size, and it
follows an input layout declaration, it is sized according to the
input layout.
- If an input layout declaration follows an input array declaration
that didn't specify a size, the input array declaration is given a
size at the time the input layout declaration appears.
- All input layout declarations and input array sizes must ultimately
match. Inconsistencies are reported as soon as they are detected,
at compile time if the inconsistency is within one compilation unit,
otherwise at link time.
- At least one compilation unit must contain an input layout
declaration.
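
The first three rules can be sketched for a single input array as follows. This is an illustration under stated assumptions, not the code in this patch: the enum, vertices_per_prim_sketch(), and resolve_input_array_size_sketch() are hypothetical names, and a declared size of 0 is used to mean "unsized". The per-primitive vertex counts are the standard ones for the five geometry shader input layouts.

```cpp
#include <cassert>

// Hypothetical enum covering the five GS input layout qualifiers.
enum gs_input_prim_sketch {
    PRIM_POINTS,
    PRIM_LINES,
    PRIM_LINES_ADJACENCY,
    PRIM_TRIANGLES,
    PRIM_TRIANGLES_ADJACENCY
};

// Number of vertices the geometry shader sees per input primitive.
unsigned vertices_per_prim_sketch(gs_input_prim_sketch prim)
{
    switch (prim) {
    case PRIM_POINTS:              return 1;
    case PRIM_LINES:               return 2;
    case PRIM_LINES_ADJACENCY:     return 4;
    case PRIM_TRIANGLES:           return 3;
    case PRIM_TRIANGLES_ADJACENCY: return 6;
    }
    return 0; // not reached for valid enum values
}

// declared_size == 0 means the array was declared without a size.
// Returns false when the declared size contradicts the input layout.
bool resolve_input_array_size_sketch(bool have_layout,
                                     gs_input_prim_sketch prim,
                                     unsigned declared_size,
                                     unsigned *resolved_size)
{
    if (!have_layout) {
        // No layout seen yet: the array keeps whatever size it has,
        // possibly still unsized until a layout declaration appears.
        *resolved_size = declared_size;
        return true;
    }

    const unsigned layout_size = vertices_per_prim_sketch(prim);
    if (declared_size != 0 && declared_size != layout_size)
        return false; // inconsistency: report it as soon as it is seen

    *resolved_size = layout_size;
    return true;
}
```

The same check applies whether the layout declaration comes before or after the array declaration; only the moment at which it runs differs.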
(Note: the geom_array_resize_visitor class was contributed by Bryan
Cain <bryancain3@gmail.com>.)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This gets piglit's geometry-basic test running.
TODO: We still need to validate that the GS layout qualifiers aren't used
in places where they shouldn't be (such as on an interface block, or on a
particular shader input or output).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
From section 2.15 (Geometry Shaders) of the OpenGL 3.2 spec:
A program object that includes a geometry shader must also include
a vertex shader; otherwise a link error will occur.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Since geometry shader inputs are arrays (where the array index
indicates which vertex is being examined), varying packing needs to
treat them differently.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This commit adds all of the parsing and semantics for GLSL 150 style
geometry shaders.
v2 (Paul Berry <stereotype441@gmail.com>): Add a few missing calls to
get_pipeline_stage(). Fix some signed/unsigned comparison warnings.
Fix handling of NULL consumer in assign_varying_locations().
v3 (Bryan Cain <bryancain3@gmail.com>): fix indexing order of 2D
arrays. Also, allow interpolation qualifiers in geometry shaders.
v4 (Paul Berry <stereotype441@gmail.com>): Eliminate
get_pipeline_stage()--it is no longer needed thanks to 030ca23 (mesa:
renumber shader indices according to their placement in pipeline).
Remove 2D stuff. Move vertices_per_prim() to ir.h, so that it will be
accessible from outside the linker. Remove
inject_num_vertices_visitor. Rework for GLSL 1.50.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v5 (Paul Berry <stereotype441@gmail.com>): Split out
do_set_program_inouts() argument refactoring to a separate patch.
Move geom_array_resizing_visitor to later in the series.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There's no reason to be clever about this. By making separate
allocations for vertex and fragment shaders, we'll allow geometry
shaders to be added without introducing any complication.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Commit 586b4b5 (glsl: Also update implicit sizes of varyings at link
time) extended update_array_sizes() to apply to both uniforms and
shader ins/outs. However, doing so creates problems for geometry
shaders, because update_array_sizes() assumes that variables with
matching names in different parts of the pipeline should have the same
sizes. With the addition of geometry shaders, this is no longer true
(e.g. both vertex and geometry shaders have a gl_ClipDistance output
variable, but there's no reason these variables should have the same
sizes).
The original reason for commit 586b4b5 (avoid problems with
gl_TexCoord being 0 length) has since been addressed by commit 6f53921
(linker: Ensure that unsized arrays have a size after linking). So go
ahead and switch update_array_sizes() back to only acting on uniforms.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Our previous justification for leaving this function out of glsl_type
was that it implemented counting rules that were specific to GLSL
1.50. However, these counting rules also describe the number of
varying slots that Mesa will assign to a varying in the absence of
varying packing. That's useful to be able to compute from outside of
the linker code (a future patch will use it from
ir_set_program_inouts.cpp). So go ahead and move it to glsl_type.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This patch changes link_shaders() so that it sets prog->LinkStatus to
true when it starts, and then relies on linker_error() to set it to
false if a link failure occurs.
Previously, link_shaders() would set prog->LinkStatus to true halfway
through its execution; as a result, linker functions that executed
during the first half of link_shaders() would have to do their own
success/failure tracking; if they didn't, then calling linker_error()
would add an error message to the log, but not cause the link to fail.
Since it wasn't always obvious from looking at a linker function
whether it was called before or after link_shaders() set
prog->LinkStatus to true, this carried a high risk of bugs.
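
The new control flow can be sketched as below. The struct and the _sketch function names are hypothetical simplifications (the real functions take more arguments); the point is only that the status starts out true and the error helper is the single place that clears it.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced stand-in for the shader program object.
struct program_sketch {
    bool LinkStatus;
    std::string InfoLog;
    program_sketch() : LinkStatus(false) {}
};

// Logging an error also fails the link, no matter how early it is called.
void linker_error_sketch(program_sketch *prog, const std::string &msg)
{
    prog->InfoLog += "error: " + msg + "\n";
    prog->LinkStatus = false;
}

void link_shaders_sketch(program_sketch *prog, bool early_check_fails)
{
    prog->LinkStatus = true; // assume success from the very start
    prog->InfoLog.clear();

    // Any helper, early or late, can simply report an error.
    if (early_check_fails)
        linker_error_sketch(prog, "early validation failed");

    if (!prog->LinkStatus)
        return;

    // ... later linking phases would run here ...
}
```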
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Previously we failed to link (which is correct), but we did not output
an error message, which could have been confusing for users.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
A comment in link_intrastage_shaders(), and an if-test that followed
it, seemed to indicate that link_uniform_blocks() would return a
negative value in the event of an error. But this is not the
case--all error checking has already been performed by
validate_intrastage_interface_blocks(), and link_uniform_blocks() can
only return unsigned values.
So get rid of the if-test and change the return type of
link_intrastage_shaders() to clarify that it can only return unsigned
values.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
All compilation units need to agree on the binding point, if they
specify one at all.
v2: Use binding, not constant_value.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.
v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
- use snprintf
- disable the optimization for GLES2
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.
For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
See my explanation in mtypes.h.
v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We were duplicating this code all over the place, and every copy would need
updating for the next set of shader targets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We were counting uniforms located in UBOs against the default uniform
block limit, while not doing any counting against the specific combined
limit.
Note that I couldn't quite find justification for the way I did this, but
I think it's the only sensible thing: The spec talks about components, so
each "float" in a std140 block would count as 1 component and a "vec4"
would count as 4, though they occupy the same amount of space. Since GPU
limits on uniform buffer loads are surely going to be about the size of
the blocks, I just counted them that way.
Fixes link failures in piglit
arb_uniform_buffer_object/maxuniformblocksize when ported to geometry
shaders on Paul's GS branch, since in that case the max block size is
bigger than the default uniform block component limit.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Verify that interface blocks match when linking separate shader
stages into a program.
Fixes piglit glsl-1.50 tests:
* linker/interface-blocks-vs-fs-member-count-mismatch.shader_test
* linker/interface-blocks-vs-fs-member-order-mismatch.shader_test
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Verify that interface blocks match when combining compilation
units at the same stage. (For example, when merging all vertex
shaders.)
Fixes piglit glsl-1.50 test:
* linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test
v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h
file for consistency with other linker code. Remove "ok" variable.
Fold cross_validate_interface_blocks into its caller.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Convert interface blocks with instance names into flat
interface blocks without an instance name.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
do_common_optimization may need to make choices about whether to emit
certain kinds of instructions. gl_context::ShaderCompilerOptions
contains exactly that information, so it makes sense to pass it in.
Rather than passing the whole array, pass the structure for the stage
that's currently being worked on.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits
Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits
etc.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Since half of ir_validate uses assert() (the other half uses printf()
followed by abort()), there's not much use in calling it in a release
build. Cuts 6.3% of the startup time of TF2.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
The parsing logic is moved to a new function in the GLSL module,
parse_program_resource_name(). This name was chosen because it should
eventually be useful for handling everything that OpenGL 4.3 calls
"program resources" (e.g. uniforms, vertex inputs, fragment outputs,
and transform feedback varyings).
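
As an illustration only, here is a sketch of the kind of parsing such a function might do, assuming the task is to split a trailing array subscript off a resource name such as "color[3]". The function name, signature, and exact acceptance rules below are assumptions made for this example and are not taken from the patch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the subscript of a trailing "[N]", or -1 if
// the name has no well-formed subscript. *base_len receives the length of
// the name without the subscript (the full length when -1 is returned).
// Overflow of very long digit strings is not handled in this sketch.
long parse_resource_name_sketch(const std::string &name, std::size_t *base_len)
{
    *base_len = name.size();
    if (name.empty() || name[name.size() - 1] != ']')
        return -1;

    const std::size_t close = name.size() - 1;
    const std::size_t open = name.rfind('[');
    if (open == std::string::npos || open == 0 || open + 1 == close)
        return -1; // no '[', empty base name, or empty subscript

    long index = 0;
    for (std::size_t i = open + 1; i < close; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(name[i])))
            return -1; // only decimal digits are accepted here
        index = index * 10 + (name[i] - '0');
    }

    *base_len = open;
    return index;
}
```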
Future patches will make use of this function for linking transform
feedback varyings.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Use the function added in the previous commit.
This temporarily causes the gles3conform tests
uniform_buffer_object_index_of_not_active_block,
uniform_buffer_object_inherit_and_override_layouts, and
uniform_buffer_object_repeat_global_scope_layouts to fail an assertion.
This is fixed in the next commit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The way a variable is tested for this property is about to change, and
this makes the code easier to modify.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This patch replaces the three ir_variable_mode enums:
- ir_var_in
- ir_var_out
- ir_var_inout
with the following five:
- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout
This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR. This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.
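
A sketch of what the split makes possible. Only the five modes named above are listed (the enum name and the two helper functions are hypothetical, and any other variable modes are left out); with the new names, a pass can classify a variable from its mode alone.

```cpp
#include <cassert>

// Only the five modes introduced by this patch; other modes are omitted.
enum ir_variable_mode_sketch {
    ir_var_shader_in,
    ir_var_shader_out,
    ir_var_function_in,
    ir_var_function_out,
    ir_var_function_inout
};

// Hypothetical helper: true for shader-level inputs and outputs.
bool is_shader_inout_sketch(ir_variable_mode_sketch mode)
{
    return mode == ir_var_shader_in || mode == ir_var_shader_out;
}

// Hypothetical helper: true for function parameters of any direction.
bool is_function_param_sketch(ir_variable_mode_sketch mode)
{
    return mode == ir_var_function_in ||
           mode == ir_var_function_out ||
           mode == ir_var_function_inout;
}
```

Before the split, answering either question required finding the declaration in the IR.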
In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables. Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This looks like a copy-and-paste leftover.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
linker.cpp is getting pretty big, and we're about to add even more
varying packing code, so split out the linker code that concerns
varyings to its own file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously this macro existed in 3 separate places, some inside the
intel driver and some outside of it. It makes more sense to have it
in main/macros.h
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Not sure what was going on here, but running piglit with debug builds
might be a good plan :-)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch implements varying packing between varyings.
Previously, each varying occupied components 0 through N-1 of its
assigned varying slot, so there was no way to pack two varyings into
the same slot. For example, if the varyings were a float, a vec2, a
vec3, and another vec2, they would be stored as follows:
<----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
 *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
flt  x   x   x  <vec2->  x   x  <--vec3--->  x  <vec2->  x   x   varyings
(Each * represents a varying component, and the "x"s represent wasted
space).
This change packs the varyings together to eliminate wasted space
between varyings, like so:
<----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
 *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
<vec2-> <vec2-> flt <--vec3--->  x   x   x   x   x   x   x   x   varyings
Note that we take advantage of the sort order introduced in previous
patches (vec4's first, then vec2's, then scalars, then vec3's) to
minimize how often a varying is "double parked" (split across varying
slots).
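
The packed layout in the second diagram can be reproduced with a short sketch. The struct and function names are hypothetical, slots are numbered from 0 here rather than from 1 as in the diagrams, and the input is assumed to be already sorted and to belong to one packing class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: which slot a varying starts in, and at which
// component (0..3) within that slot.
struct packed_location_sketch {
    unsigned slot;
    unsigned frac;
};

// Place each varying immediately after the previous one, component by
// component, without padding to the next slot boundary.
std::vector<packed_location_sketch>
pack_varyings_sketch(const std::vector<unsigned> &component_counts)
{
    std::vector<packed_location_sketch> result;
    unsigned next_component = 0;
    for (std::size_t i = 0; i < component_counts.size(); ++i) {
        packed_location_sketch loc = { next_component / 4, next_component % 4 };
        result.push_back(loc);
        next_component += component_counts[i];
    }
    return result;
}
```

Feeding it the sorted sizes 2, 2, 1, 3 (vec2, vec2, float, vec3) places both vec2s in the first slot and the float and vec3 in the second, which matches the diagram.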
Reviewed-by: Eric Anholt <eric@anholt.net>
v2: Skip varying packing if ctx->Const.DisableVaryingPacking is true.
This patch implements varying packing within varyings that are
composed of multiple vectors of size less than 4 (e.g. arrays of
vec2's, or matrices with height less than 4).
Previously, such varyings used up a full 4-wide varying slot for each
constituent vector, meaning that some of the components of each
varying slot went unused. For example, a mat4x3 would be stored as
follows:
<----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
 *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
<-column1->  x  <-column2->  x  <-column3->  x  <-column4->  x   matrix
(Each * represents a varying component, and the "x"s represent wasted
space). In addition to wasting precious varying components, this
layout complicated transform feedback, since the constituents of the
varying are expected to be output to the transform feedback buffer
contiguously (e.g. without gaps between the columns, in the case of a
matrix).
This change packs the constituents of each varying together so that
all wasted space is at the end. For the mat4x3 example, this looks
like so:
<----slot1----> <----slot2----> <----slot3----> <----slot4---->  slots
 *   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
<-column1-> <-column2-> <-column3-> <-column4->  x   x   x   x   matrix
Note that matrix columns 2 and 3 now cross a boundary between varying
slots (a characteristic I call "double parking" of a varying).
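
Whether a given vector is double parked follows directly from its starting component and its width. A small sketch, with a hypothetical function name:

```cpp
#include <cassert>

// A vector is "double parked" when its first and last components fall in
// different 4-component varying slots.
bool is_double_parked_sketch(unsigned first_component, unsigned num_components)
{
    assert(num_components >= 1 && num_components <= 4);
    const unsigned last_component = first_component + num_components - 1;
    return first_component / 4 != last_component / 4;
}
```

For the packed mat4x3, the four columns start at components 0, 3, 6, and 9 and are 3 wide each, so the check is true for the second and third columns only.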
We don't bother trying to eliminate the wasted space at the end of the
varying, since the patch that follows will take care of that.
Since compiler back-ends don't (yet) support this packed layout, the
lower_packed_varyings function is used to rewrite the shader into a
form where each varying occupies a full varying slot. Later, if we
add native back-end support for varying packing, we can make this
lowering pass optional.
Reviewed-by: Eric Anholt <eric@anholt.net>
v2: Skip varying packing if ctx->Const.DisableVaryingPacking is true.
This patch paves the way for varying packing by adding a sorting step
before varying assignment, which sorts the varyings into an order that
increases the likelihood of being able to find an efficient packing.
First, varyings are sorted into "packing classes" by considering
attributes that can't be mixed during varying packing--at the moment
this includes base type (float/int/uint/bool) and interpolation mode
(smooth/noperspective/flat/centroid), though later we will hopefully
be able to relax some of these restrictions. The number of packing
classes places an upper limit on the amount of space that must be
wasted by varying packing, since in theory a shader might have 4n+1
components worth of varyings in each of m packing classes, resulting
in 3m components worth of wasted space.
Then, within each packing class, varyings are sorted by vector size,
with vec4's coming first, then vec2's, then scalars, and then finally
vec3's. The motivation for this order is that it ensures that the
only vectors that might be "double parked" (with part of the vector in
one varying slot and the remainder in another) are vec3's.
Note that the varyings aren't actually packed yet, merely placed in an
order that will facilitate packing.
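
The ordering can be sketched as a comparator. The struct and function names are hypothetical, and the packing class is reduced to a plain integer here; in the patch it is derived from base type and interpolation mode.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, reduced description of a varying for sorting purposes.
struct varying_sketch {
    unsigned packing_class; // stands in for base type + interpolation mode
    unsigned vector_size;   // 1..4 components
};

// Rank vector sizes in the order vec4, vec2, scalar, vec3.
unsigned size_rank_sketch(unsigned vector_size)
{
    switch (vector_size) {
    case 4:  return 0;
    case 2:  return 1;
    case 1:  return 2;
    case 3:  return 3;
    default: return 4; // unexpected sizes sort last
    }
}

bool packing_order_less_sketch(const varying_sketch &a, const varying_sketch &b)
{
    if (a.packing_class != b.packing_class)
        return a.packing_class < b.packing_class;
    return size_rank_sketch(a.vector_size) < size_rank_sketch(b.vector_size);
}

void sort_for_packing_sketch(std::vector<varying_sketch> &varyings)
{
    std::stable_sort(varyings.begin(), varyings.end(),
                     packing_order_less_sketch);
}
```

With this order, vec4s fill whole slots, vec2s always start at an even component, and scalars fit anywhere, so a vec3 is the only vector that can end up straddling a slot boundary.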
Reviewed-by: Eric Anholt <eric@anholt.net>
This patch further subdivides the loop that assigns varying locations
into two phases: one phase to match up the varyings between shader
stages, and one phase to assign them varying locations.
In between the two phases the matched varyings are stored in a new
data structure called varying_matches. This will free us to be able
to assign varying locations in any order, which will pave the way for
packing varyings.
Note that the new varying_matches::assign_locations() function returns
the number of varying slots that were used; this return value will be
used in a future patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.
This paves the way for varying packing, which will require us to
further subdivide the first phase.
In addition, it lets us avoid a clumsy O(n^2) algorithm, since we can
now record the locations of all transform feedback varyings in a
single pass through the tfeedback_decls array, rather than have to
iterate through the array after assigning each varying.
Reviewed-by: Eric Anholt <eric@anholt.net>
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4. In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.
This patch introduces a field ir_variable::location_frac, which
represents the offset within a vec4 where a varying's value is stored.
Varyings that are not subject to packing will always have a
location_frac value of zero.
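
The relationship between the two fields can be sketched as follows. The struct and helper are hypothetical and keep only the two fields discussed here; the arithmetic assumes 4 components per varying slot, as described above.

```cpp
#include <cassert>

// Hypothetical reduction of the two location fields.
struct varying_location_sketch {
    unsigned location;      // which vec4-sized slot
    unsigned location_frac; // component offset within that slot, 0..3
};

// Absolute component index of the first component of the varying.
unsigned first_component_sketch(const varying_location_sketch &v)
{
    assert(v.location_frac < 4);
    return v.location * 4 + v.location_frac;
}
```

A vec2 stored in the second half of slot 5 would have location 5 and location_frac 2, so its first component is component 22.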
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.
This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.
In future patches, this will allow us to separate the process of
matching varyings between shader stages from the processes of
assigning locations to those varyings. That will in turn pave the way
for packing varyings.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations(). This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback, link_invalidate_variable_locations() wasn't being
called for the varyings.
This patch migrates the calls to link_invalidate_variable_locations()
to link_shaders(), so that they will be called in all circumstances.
In addition, it modifies the call semantics so that
link_invalidate_variable_locations() need only be called once per
shader stage (rather than once for inputs and once for outputs).
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>