This change allows used defined inputs/outputs with explicit locations
to be removed if they are detected to not be used between shaders
at link time.
To enable this we change the is_unmatched_generic_inout field to be
flagged when we have a user defined varying. Previously
explicit_location was assumed to be set only in builtins however SSO
allows the user to set an explicit location.
We then add a function to match explicit locations between shaders.
V2: call match_explicit_outputs_to_inputs() after
is_unmatched_generic_inout has been initialised.
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This will help avoid eliminating inputs/outputs needed by SSOs.
Cc: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
ARB_explicit_uniform_location allows the index for subroutine functions
to be explicitly set in the shader.
This patch reduces the restriction on the index qualifier in
validate_layout_qualifiers() to allow it to be applied to subroutines
and adds the new subroutine qualifier validation to ast_function::hir().
ast_fully_specified_type::has_qualifiers() is updated to allow the
index qualifier on subroutine functions when explicit uniform locations
is available.
A new check is added to ast_type_qualifier::merge_qualifier() to stop
multiple function qualifiers from being defied, before this patch this
would cause a segfault.
Finally a new variable is added to ir_function_signature to store the
index. This value is validated and the non explicit values assigned in
link_assign_subroutine_types().
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
We've assumed that we could lower per-component vector access from
vec[i] = scalar
to
vec = ir_triop_vector_insert(vec, scalar, i)
but with SSBOs (and compute shader SLM and tesselation outputs) this is
no longer valid. If a vector is "externally visible", multiple threads
can write independent components simultaneously. With lowering to
ir_triop_vector_insert, each thread read the entire vector, changes one
component, then writes out the entire vector. This is racy.
Instead of generating a ir_binop_vector_extract when we see v[i], we
generate ir_dereference_array. We then add a lowering pass to lower the
ir_dereference_array to ir_binop_vector_extract for rvalues and for to
vector_insert for lvalues in a separate lowering pass.
The resulting IR is the same as before, but we now have a window between
ast->ir conversion and the lowering pass where v[i] appears in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (eg for SSBO load/store) before we lower the per-component
access to full vector writes.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
All GLSL IR consumers run this lowering pass so we can move it to the
linker. This moves the pass up quite a bit, but that's the point: it
needs to run before we throw away information about per-component vector
access.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
SSBO support now exists as of commits f24e5e and f408a13dd3.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
These helpers are ran for same case the same loop. Here joined
their operation so the loop is ran just once. Also fixed
out-of-memory condition here.
v2: Make the loop simpler to read as per Tapani's suggestion
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
This makes sure that user is still able to query properties about
variables that have gotten removed by opt_dead_builtin_varyings pass.
Fixes following OpenGL ES 3.1 test:
ES31-CTS.program_interface_query.output-layout
No Piglit regressions.
v2: cleanup, drop extra parenthesis (Topi)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
When a TCS is present, the TES input gl_PatchVerticesIn is actually a
constant - it's simply the # of output vertices specified by the TCS
layout qualifiers. So, we can replace the system value with a constant,
which may allow further optimization, and will likely be more efficient.
If the TCS is absent, we can't do this optimization.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Explicit locations are only used with uniform variables.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
The split of Uniform blocks and shader storage block only loops
up to MESA_SHADER_FRAGMENT and igonres compute shaders.
This cause segfault when running the OpenGL ES 3.1 CTS tests
with GL_ARB_compute_shader enabled.
V2: Changed to use MESA_SHADER_STAGES instead of
MESA_SHADER_COMPUTE
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Patch moves existing calculation code from shader_query.cpp to happen
during program resource list creation.
No Piglit or CTS regressions were observed during testing.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Currently, these arrays in gl_shader and gl_shader_program hold both
UBOs and SSBOs, so this looks like a better name. We were already
using NumBufferInterfaceBlocks in gl_shader_program, so this makes
things more consistent as well.
In a later patch we will add {Num}UniformBlocks and
{Num}ShaderStorageBlocks which will contain only references to
UBOs and SSBOs respectively that will provide backends with
a separate index space for both types of objects.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Varyings can be considered inputs or outputs of a program only when
SSO is in use. With multi-stage programs, inputs contain only inputs
for first stage and outputs contains outputs of the final shader stage.
I've tested that fix works for Assault Android Cactus (demo version)
and does not cause Piglit or CTS regressions in glGetProgramiv tests.
Following ES 3.1 CTS separate shader tests that do query properties
of varyings in SSO shader programs pass:
ES31-CTS.program_interface_query.separate-programs-vertex
ES31-CTS.program_interface_query.separate-programs-fragment
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122
The uniform will only be of a single type so store the data for
opaque types in a single array.
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
From ARB_program_interface_query:
"For an active shader storage block member declared as an array, an
entry will be generated only for the first array element, regardless
of its type. For arrays of aggregate types, the enumeration rules are
applied recursively for the single enumerated array element."
v2:
- Simplify 'if' conditions and return true if it is not a buffer
variable, because these rules only apply to buffer variables (Timothy).
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Because it counts shader storage blocks too.
v2:
- Use NumBufferInterfaceBlocks instead (Jordan).
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Including TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE queries.
v2:
- Use std430_array_stride() to get top level array stride following std430's rules.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Notice that we should differentiate between shader storage blocks and
uniform blocks, since they have different limits.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Otherwise, generate a link time error as per the
ARB_shader_storage_buffer_object spec.
v2:
- Fix error message (Jordan)
v3:
- Move std140_size() changes to its own patch (Kristian)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
They only can be defined in the last position of the shader
storage blocks.
When an unsized array is used in different shaders, it might be
converted in different sized arrays, avoid get a linker error
in that case.
v2:
- Rework error condition and error messages (Timothy Arceri)
v3:
- Move OpenGL ES check to its own patch.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
This makes sure that user is still able to query properties about
variables that have gotten packed by lower_packed_varyings pass.
Fixes following OpenGL ES 3.1 test:
ES31-CTS.program_interface_query.separate-programs-vertex
v2: fix 'name included in packed list' check (Ilia Mirkin)
v3: iterate over instances of name using strtok_r (Ilia Mirkin)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Also rename to _mesa_get_main_function_signature.
We will call it near the end of compilation to insert some code into
main for initializing some compute shader global variables.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
In various versions of OpenGL and GLSL, it's possible to declare
multiple VS input variables with aliasing attribute locations.
So, when computing the storage requirements for vertex attributes,
we can't simply add up the sizes. Instead, we need to look at the
enabled slots.
This patch begins tracking which attributes are double types that
are larger than 128-bits (i.e. take up two vec4 slots). We then
count normal attributes once, and count the double-size attributes
a second time.
Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests
on i965, which regressed with commit ad208d975a.
No Piglit changes on llvmpipe (which actually supports dvecs).
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The name of both the GLSL built-in variable and the glGetInteger param
with the same value changed in GLSL ES 3.1 and GL 4.5. Its semantics
also changed slightly, since the limit now also takes into account the
number of SSBs in use. Switch our internal data structures to the
up-to-date name.
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Stage ref cannot be queried for transform feedback.
Also simplify the build_stageref function by passing the
correct mode for uniforms.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Currently stage reference mask is built using the variable name
only. However it can happen that input of one stage has same name
as output from another stage. Adding check of variable mode makes
sure we do not pick wrong variable.
Fixes some subcases from
ES31-CTS.program_interface_query.no-locations
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
This fixes the remaining failing tests in:
ES31-CTS.program_interface_query.uniform-types
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Same check is made for glBindFragDataLocationIndexed but it was missing
when using layout qualifiers.
Fixes following Piglit test:
arb_blend_func_extended-output-location
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Change function to get all gl_constants for inspection, this is used
by follow-up patch.
v2: rebase, update function documentation
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
This adds linker support for subroutine uniforms, they
have some subtle differences from real uniforms, we also hide
them and they are given internal uniform names.
This also adds the subroutine locations and subroutine uniforms
to the program resource tracking for later use.
v1.1: drop is_subroutine_def
v2: handle explicit location properly, ARB_explicit_location
has a lot of language for subroutine shaders.
Calculate a link time the number of compatible subroutines
for a uniform, to make program resource easier later.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With the exception of always-taken switch cases (which are
indistinguishable from straight line code in our IR), this
disallows use of the builtin barrier() function in all the
places it may not appear.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>