Unlike uniforms, the limit on shared memory size is not called out
explicitly in the list of things that cause linker errors, but presumably
that's just an oversight in the spec.
Fixes dEQP-GLES31.functional.debug.negative_coverage.{callbacks,get_error,log}.compute.exceed_shared_memory_size_limit
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Prevent an overflow caused by too many output variables. To limit the
scope of the issue, write to the assigned array only for the non-ES
fragment shader path, which is the only place where it's needed.
Since the function will bail with an error when output variables with
overlapping components are found, (max # of FS outputs) * 4 is an upper
limit to the space we need.
Found by address sanitizer.
Fixes dEQP-GLES3.functional.attribute_location.bind_aliasing.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The functionality is used by glsl and mesa. With the latter already
depending on the former.
With this in place the src/util/ static library libmesautil.la no longer
has a C++ dependency. Thus objects which use it (like libEGL) don't need
the C++ link.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: James Harvey <lothmordor@gmail.com>
Here we also make use of the UseSTD430AsDefaultPacking constant
and call the new get_internal_ifc_packing() helper.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The main motivation for this is that threaded compilation can fall
over if we were to allocate IR inside constant_expression_value()
when calling it on a builtin. This is because builtins are shared
across the whole OpenGL context.
f81ede4699 worked around the problem by cloning the entire
builtin before constant_expression_value() could be called on
it. However cloning the whole function each time we referenced
it lead to a significant reduction in the GLSL IR compiler
performance. This change along with the following patch
helps fix that performance regression.
Other advantages are that we reduce the number of calls to
ralloc_parent(), and for loop unrolling we free constants after
they are used rather than leaving them hanging around.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We are currently copying the name for each member dereference
but we can just share a single instance of the string provided
by the type.
This change also stops us recalculating the field index
repeatedly.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Otherwise, the padding bits remain undefined, which leads to valgrind
errors when storing the gl_shader_variable in the disk cache.
v2: use rzalloc instead of an explicit padding member variable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Save some passes over the IR.
v2: redesign to make the users of find_assignments more readable
v3:
- fix missing !
- add some comments and make the num_found check more explicit (Timothy)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
We always have stage == first and stage == last when first == last, so
drop the special case. Also rephrase the comment to make the logic
clearer.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
xfb only applies to the latest stage before the fragment shader, so
there is no need to invoke it in the fragment shader.
Fixes:
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list_and_api
v2: do reset only if shaders provide an explicit stride
v3: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
(Timothy)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Unnamed struct types are now equal across stages based on the fields they
contain, so overriding the type to make sure names match has become
unnecessary.
The check was originally introduced in commit 955c93dc08 ("glsl: Match
unnamed record types across stages.")
v2: clarify the commit message
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
From section 4.4.6 of the ARB_bindless_texture spec:
"If both bindless_sampler and bound_sampler, or bindless_image
and bound_image, are declared at global scope in any
compilation unit, a link- time error will be generated."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Due to a max limit of 65,536 entries on the index table that we use to
decide if we can skip compiling individual shaders, it is very likely
we will have collisions.
To avoid doing too much work when the linked program may be in the
cache this patch delays calling the optimisations until link time.
Improves cold cache start-up times on Deus Ex by ~20 seconds.
When deleting the cache index to simulate a worst case scenario
of collisions in the index, warm cache start-up time improves by
~45 seconds.
V2: fix indentation, make sure to call optimisations on cache
fallback, make sure optimisations get called for XFB.
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
From page 45 (page 52 of the PDF) of the GLSL ES 3.00 v.6 spec:
" When instance names are present on matched block names, it is
allowed for the instance names to differ; they need not match for
the blocks to match.
From page 51 (page 57 of the PDF) of the GLSL 4.30 v.8 spec:
" When instance names are present on matched block names, it is
allowed for the instance names to differ; they need not match for
the blocks to match."
Therefore, no cross linking validation is needed for the instance name
of an Interface Block.
This patch will make that no link error will be reported on a program
like this:
"# VS
layout(binding = 1) Block1 {
vec4 color;
} uni_block;
...
# FS
layout(binding = 2) Block2 {
vec4 color;
} uni_block;
..."
Fixes GL45-CTS.enhanced_layouts.ssb_layout_qualifier_conflict
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
While it's legal to have an active blocks count > 0 on link failure.
Unless we actually assign memory for the blocks array we can end up
segfaulting in calls such as glUniformBlockBinding().
To avoid having to NULL check these api calls we simply reset the
block count to 0 if the array was not created.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
OpenGL allows the TCS to be missing and supplies an implicit passthrough
shader, but OpenGL ES does not (see section 7.3 of the ES 3.2 spec,
cited above in the code).
One open question is how to handle this for ARB_ES3_2_compatibility.
This patch raises the link error for all ES shading language programs,
but it might make sense to base it on the API. The approach taken in
this patch is more restrictive, but should still allow any valid ES
programs to work in GL.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
From ARB_post_depth_coverage:
"This extension allows the fragment shader to control whether values in
gl_SampleMaskIn[] reflect the coverage after application of the early
depth and stencil tests. This feature can be enabled with the following
layout qualifier in the fragment shader:
layout(post_depth_coverage) in;
Use of this feature implicitly enables early fragment tests."
And a bit later it also adds:
"early_fragment_tests" requests that fragment tests be performed before
fragment shader execution, as described in section 15.2.4 "Early Fragment
Tests" of the OpenGL Specification. If neither this nor post_depth_coverage
are declared, per-fragment tests will be performed after fragment shader
execution."
Fixes:
GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The hash key for glsl metadata is a hash of the hashes of each GLSL
source string.
This commit uses the put_key/get_key support in the cache put the SHA-1
hash of the source string for each successfully compiled shader into the
cache. This allows for early, optimistic returns from glCompileShader
(if the identical source string had been successfully compiled in the past),
in the hope that the final, linked shader will be found in the cache.
This is based on the intial patch by Carl.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
TCS and TES inputs without an array size are implicitly sized to
gl_MaxPatchVertices. But TCS outputs are apparently not:
"If no size is specified, it will be taken from the output patch size
(gl_VerticesOut) declared in the shader."
Fixes dEQP-GLES31.functional.program_interface_query.program_output.
array_size.separable_tess_ctrl.var.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
OpenGL ES actually has spec text to prohibit this. It's just OpenGL
that's confusing.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
For the on-disk shader cache we want to be able to differentiate
between a program that was linked and one that was loaded from cache.
V2:
- don't return the new enum directly to the application when queried,
instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome
corruptions when using cache.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Previously the constant array would not get copy propagated until the backend
did its GLSL IR opt loop. I plan on removing that from i965 shortly which
caused huge regressions in Deus-ex and Tomb Raider which have large
constant arrays. Moving lowering before the opt loop in the GLSL linker
fixes this and unexpectedly improves some compute shaders also.
shader-db results BDW:
instructions helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%)
instructions helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%)
instructions helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%)
cycles helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%)
cycles helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%)
cycles helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%)
total instructions in shared programs: 13060313 -> 13059877 (-0.00%)
instructions in affected programs: 1756 -> 1320 (-24.83%)
helped: 3
HURT: 0
total cycles in shared programs: 256586698 -> 256561910 (-0.01%)
cycles in affected programs: 2066104 -> 2041316 (-1.20%)
helped: 3
HURT: 0
V3: only call the opt loop if lowering progressed (Suggested by Eric)
V2: call opts before and after lowering (Suggested by Ken)
Reviewed-by: Eric Anholt <eric@anholt.net>
There are some line wrapping violations here but those lines will get
deleted in the following patch.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Now that the i965 backend doesn't depend on this field we can
make it more generic and short circuit a bunch of code paths.
The new field will be used in a following patch for another
clean-up.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Here we remove the single use of this field in gl_linked_shader
which allows us to move the field out of gl_shader_info
While we are at it we rewrite link_xfb_stride_layout_qualifiers()
to be more clear.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
There is no reason for this to be in the shared gl_shader_info or
to copy it to gl_program at the end of linking (its already there).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is never used in gl_linked_shader other than as a temp
during linking so just use a temp instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We also move EarlyFragmentTests out of the gl_shader_info struct
as it is now only used by gl_shader.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>