From ARB_post_depth_coverage:
"This extension allows the fragment shader to control whether values in
gl_SampleMaskIn[] reflect the coverage after application of the early
depth and stencil tests. This feature can be enabled with the following
layout qualifier in the fragment shader:
layout(post_depth_coverage) in;
Use of this feature implicitly enables early fragment tests."
And a bit later it also adds:
"early_fragment_tests" requests that fragment tests be performed before
fragment shader execution, as described in section 15.2.4 "Early Fragment
Tests" of the OpenGL Specification. If neither this nor post_depth_coverage
are declared, per-fragment tests will be performed after fragment shader
execution."
Fixes:
GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
It will return the current variable ('var') or the earlier declaration ('earlier') in
case of redeclaration of that variable.
In order to distinguish between both, 'is_redeclaration' boolean will indicate in which
case we are.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.
However, the same 'var' is referenced later in ast_declarator_list::hir().
This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.
This error was detected by Address Sanitizer.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Do not wrap header inclusion in extern C since it can cause issues.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
If an unsized declared array is not the last in an SSBO
and an implicit size can not be defined on linking time,
the linker should raise an error instead of reaching
an assertion on GL.
This reverts part of commit 3da08e1664
getting back to the behavior of commit 5b2675093e
The original patch was correct for GLES that should produce
a compile-time error but the linker error is still necessary
in desktop GL.
Fixes the following piglit tests:
tests/spec/arb_shader_storage_buffer_object/non_integral_size_array_member.shader_test
tests/spec/arb_shader_storage_buffer_object/unsized_array_member.shader_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
NIR is a typeless IR and the two opcodes, when considered bitwise, do
exactly the same thing. There's no reason to have two versions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In order to avoid costly fallback recompiles when cache items are
created with an old version of Mesa or for a different gpu on the
same system we want to create directories that look like this:
./{TIMESTAMP}_{LLVM_TIMESTAMP}/{GPU_ID}
Note: The disk cache util will take a single timestamp string, it is
up to the backend to concatenate the llvm string with the mesa string
if applicable.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Here we skip the recreation of uniform storage if we are relinking
after a cache miss. This is improtant because uniform values may
have already been set by the application and we don't want to reset
them.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This will allow us to skip certain things when falling back to
a full recompile on a cache miss such as avoiding reinitialising
uniforms.
In this change we use it to avoid reading the program metadata
from the cache and skipping linking during a fallback.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
These may be lowered constant arrays or uniform values that we set before linking
so we need to cache the actual uniform values.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
V2: don't store pointers use an enum instead to flag what should be
restored. Also do the work in a helper that we will later use for
the subroutine remap table.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The three additional tables are AttributeBindings, FragDataBindings,
and FragDataIndexBindings.
The first table (AttributeBindings) was identified as missing by
trying to test the shader cache with a program that called
glGetAttribLocation.
Many thanks to Tapani Pälli <tapani.palli@intel.com>, as it was review
of related work that he had done previously that pointed me to the
necessity to also save and restore FragDataBindings and
FragDataIndexBindings.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The scenario is:
glShaderSource
glCompileShader <-- deferred due to cache hit of shader
glShaderSource <-- with new source code
glAttachShader
glLinkProgram <-- no cache hit for program
At this point we need to compile the original source when we
fallback.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The hash key for glsl metadata is a hash of the hashes of each GLSL
source string.
This commit uses the put_key/get_key support in the cache put the SHA-1
hash of the source string for each successfully compiled shader into the
cache. This allows for early, optimistic returns from glCompileShader
(if the identical source string had been successfully compiled in the past),
in the hope that the final, linked shader will be found in the cache.
This is based on the intial patch by Carl.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This uses disk_cache.c to write out a serialization of various
state that's required in order to successfully load and use a
binary written out by a drivers backend, this state is referred to as
"metadata" throughout the implementation.
This initial version is intended to work with all stages beside
compute.
This patch is based on the initial work done by Carl.
V2: extend the file's doxygen comment to cover some of the
design decisions.
V3:
- skip cache for fixed function shaders
- add int64 support
- fix glsl IR program parameter caching/restore and cache the
parameter values which are used by gallium backends.
- use new link status enum
V4:
- add compute program support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
For GS input arrays, we may turn a packed_type of ivec4 into an
array of ivec4s. We still want flat qualification.
Found by inspection. Not known to help anything.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Passes the newly added piglit test for this extension on i965.
V2: Fix comments by Ilia.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
TCS and TES inputs without an array size are implicitly sized to
gl_MaxPatchVertices. But TCS outputs are apparently not:
"If no size is specified, it will be taken from the output patch size
(gl_VerticesOut) declared in the shader."
Fixes dEQP-GLES31.functional.program_interface_query.program_output.
array_size.separable_tess_ctrl.var.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
OpenGL ES actually has spec text to prohibit this. It's just OpenGL
that's confusing.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
"If an array is declared as the last member of a shader storage block
and the size is not specified at compile-time, it is sized at run-time.
In all other cases, arrays are sized only at compile-time."
In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.
With this patch Mesa reports a compilation error as glslang does with
the following shader:
buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}
Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed
an OpenGL 3.3 compatibility profile context (with various unimplemented
features and bugs), but still refused to compile shaders with
#version 330 compatibility
This patch simply adds a small bit of plumbing to let that through.
Of course the same caveats apply: compatibility profile is still not
supported (and will not be supported), so there are no guarantees that
anything will work.
Tested-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
For the on-disk shader cache we want to be able to differentiate
between a program that was linked and one that was loaded from cache.
V2:
- don't return the new enum directly to the application when queried,
instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome
corruptions when using cache.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
As per the spec -
"The functions memoryBarrierShared() and groupMemoryBarrier() are
available only in compute shaders; the other functions are available
in all shader types."
Conform to this by adding another delegate to check for compute
shader support instead of only whether the current stage is compute
This allows some fragment shaders in Dirt Rally to compile
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This addresses several issues of the current atan2 implementation:
- Negative zero (and negative denorms which end up getting flushed to
zero) isn't handled correctly by the current implementation. The
reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
on which side of the branch cut the argument is, which causes us to
return incorrect results (off by up to 2π) for very small negative
values.
- There is a serious precision problem for x values of large enough
magnitude introduced by the floating point division operation being
implemented as a mul+rcp sequence. This can lead to the quotient
getting flushed to zero in some cases introducing an error of over
8e6 ULP in the result -- Or in the most catastrophic case will
cause us to return NaN instead of the correct value ±π/2 for y=±∞
and x very large. We can fix this easily by scaling down both
arguments when the absolute value of the denominator goes above
certain threshold. The error of this atan2 implementation remains
below 25 ULP in most of its domain except for a neighborhood of y=0
where it reaches a maximum error of about 180 ULP.
- It emits a bunch of instructions including no less than three
if-else branches per scalar component that don't seem to get
optimized out later on. This implementation uses about 13% less
instructions on Intel SKL hardware and doesn't emit any control
flow instructions.
v2: Fix up argument scaling to take into account the range and
precision of exotic FP24 hardware. Flip coordinate system for
arguments along the vertical line as if they were on the left
half-plane in order to avoid division by zero which may give
unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in
some more comments.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Will avoid a regression in a future commit that introduces some
additional rcp operations. According to the GLSL 4.10 specification:
"Dividing by 0 results in the appropriately signed IEEE Inf."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>