Several layout-qualifier validations are duplicated in the
merge_qualifier and validate_in_qualifier methods.
We would rather have them refactored into single calls.
Suggested by Timothy.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
The point mode value in an ast_type_qualifier can only be true if the
flag is already set since this layout-id-qualifier can only be or not
be present in a shader.
Hence, it is useless to check for its value if the flag is already
set. Just replaced with an assert.
V2: assert instead of checking for coherence and raising a compilation
error. Suggested by Timothy.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
The validation of the default in layout qualifier already assures that
we won't have 2 ast_gs_input_layout objects with different primitive
type values. In fact, the validation already assures that we won't
have 2 ast_gs_input_layout objects in the AST tree at all.
The check for an error in the shader has been replaced by an assert.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
The merge into the default in layout qualifier duplicates a lot of
code that can be reused from the generic merge method.
Now, we use the generic merge method inside the specific merge for the
default in layout qualifier. The generic merge method has been
completed with some bits that were only present in the merge for the
default in layout qualifier and the specific validation bits have been
moved to the validation method for the default in layout qualifier.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Currently, the default in layout qualifier merge performs specific
validation and merge.
We want to split out the validation from the merge so they can be done
independently.
Additionally, for simplification, the direction of the validation and
merge is changed so the ast_type_qualifier calling the method is the
one validated and merged against the default in qualifier.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Currently, the default out layout qualifier merge performs specific
validation and merge.
We want to split out the validation from the merge so they can be done
independently.
Additionally, for simplification, the direction of the validation and
merge is changed so the ast_type_qualifier calling the method is the
one validated and merged against the default out qualifier.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Consider this example:
" #version 150 core
#extension GL_ARB_shading_language_420pack: require
#extension GL_ARB_explicit_attrib_location: require
layout(location=0) out vec4 o;
layout(binding=2) layout(binding=3, std140) uniform U {
vec4 a;
} u[2];"
As there is 2 layout-qualifiers for the uniform U and the binding
layout-qualifier-id is duplicated, the rules set by the
ARB_shading_language_420pack spec state that the rightmost should
prevail.
Our ast_type_qualifier merges with others in a way that if the value
for a layout-qualifier-id is set in both, the object being merged
overwrites the value of the object invoking the merge. Hence, the
merge has to happen from the left layout towards the right one and
this was not happening for interface blocks because we were merging
into the default layout qualifier.
Now, the merge is done from left to right and, as a last step, we
merge into the default layout qualifier if needed, so the values of
the explicit layouts prevail over it.
V2: added a default_layout variable instead of a layout_helper and
make the merge directly over the layout one. Suggested by Timothy.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
When a layout contains a duplicated layout-qualifier-name in a single
declaration, only the last occurrence should be taken into account.
From page 59 (page 65 of the PDF) of the GLSL 4.40 spec:
" More than one layout qualifier may appear in a single
declaration. Additionally, the same layout-qualifier-name can
occur multiple times within a layout qualifier or across multiple
layout qualifiers in the same declaration. When the same
layout-qualifier-name occurs multiple times, in a single
declaration, the last occurrence overrides the former
occurrence(s)."
Consider this example:
" #version 150
#extension GL_ARB_enhanced_layouts: enable
layout(max_vertices=2, max_vertices=3) out;
layout(max_vertices=3) out;"
Although different values for "max_vertices" results in a compilation
error. The above code is valid because max_vertices=2 is ignored.
When merging qualifiers in an ast_type_qualifier, we now simply ignore
new appearances of a same layout-qualifier-name if the
"is_single_layout_merge" parameter is true, this works because the GLSL
parser processes qualifiers from right to left.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Fixes multiple Vulkan CTS tests that combine anisotropy and VK_FILTER_NEAREST
in dEQP-VK.texture.filtering_anisotropy.*
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The commit 8e430ff8b0 broke support for
LLVM 3.9 and older versions in Clover. This patch restores it and
refactors the support using Clover compatibility layer for LLVM.
v2: merged #ifdef blocks
v3: added support for LLVM 3.6-3.8
v4: add missing #ifdef around <memory>
v5: simplify using templates and lambda
Signed-off-by: Vedran Miletić <vedran@miletic.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98740
Tested-by[v4]: Pierre Moreau <pierre.morrow@free.fr>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Not needed anymore.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The code didn't limit the offsets to the number supplied, so
if we expected 3 but only got 2 we were accessing undefined memory.
This fixes random failures in:
dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There is an specific list of texture targets that can be used with
glGetTextureImage. From OpenGL 4.5 spec, section '8.11 Texture Queries',
page 234 of the PDF:
"An INVALID_ENUM error is generated if the effective target is
not one of TEXTURE_1D , TEXTURE_2D , TEXTURE_3D , TEXTURE_1D_-
ARRAY , TEXTURE_2D_ARRAY , TEXTURE_CUBE_MAP_ARRAY , TEXTURE_-
RECTANGLE , one of the targets from table 8.19 (for GetTexImage
and GetnTexImage only), or TEXTURE_CUBE_MAP (for GetTextureImage
only)."
We are currently not validating the target for glGetTextureImage. As
an example, calling this function on a texture with target
GL_TEXTURE_2D_MULTISAMPLE should return INVALID_ENUM, but instead it
hits an assertion down the road in the i965 driver.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
_mesa_lookup_texture_err() is not currently checking that the
texture-id can be zero, but _mesa_HashLookup() doesn't expect the key
to be zero, and will fail an assertion.
Considering that _mesa_lookup_texture_err() is called from
_mesa_GetTextureImage and _mesa_GetTextureSubImage with user provided
arguments, we must validate the texture-id before looking it up in the
hash-table.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This fixes rendering in Dolphin on Vulkan since we enabled clip
distances. (Dolphin on GL has a similar bug because the linker
fails to eliminate unused clip distance built-in arrays, but it
isn't using SSO...so that needs more fixing.)
Also fixes a Piglit test:
spec/glsl-1.50/execution/geometry.clip-distance-vs-gs-out-sso
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Apparently the hw wedges otherwise, as mentioned in i965 comments.
Reported-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Currently clears only operate on the 0th array index (ignoring surface
layout parameters). Instead normalize to take a RTAI like all the
load/store tile logic does, and use ComputeSurfaceAddress to properly
take the surface state's lod/array index into account.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
The non-fast-clear path was never updated after clear colors were passed
in as floats. Remove the now-harmful conversion from unorm8.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
When switching render target array indexes (as might happen in a GS, or
in a future change, with layered clears), if the previous state is
HOTTILE_CLEAR, we should actually clear the tile before saving it off.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Commit 6d416bcd84 (i965: Use arrays in
Gen7+ URB code.) introduced a regression which caused us to fail to
allocate all of our URB space.
- total_wants -= ds_wants;
+ total_wants -= additional;
The new line should have been total_wants -= wants[i].
Fixes a large performance regression in TessMark.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98815
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Gen6-7.5 specify the user clip distance enable bitmask in 3DSTATE_CLIP.
Gen8+ normally uses the new internal signalling mechanism to select the
one specified in the last enabled shader stage (3DSTATE_VS, DS, or GS).
This is a pretty good fit for Vulkan, or even newer GL, where the
bitmask comes entirely from the shader. But with glClipPlane(),
this is dynamic state, and we have to listen to _NEW_TRASNFORM.
Clip plane enables are the only reason the VS/DS/GS atoms need to
listen to _NEW_TRANSFORM. 3DSTATE_CLIP already has to listen to it
in order to support ARB_clip_control settings.
Setting the "Use the 3DSTATE_CLIP bitmask" force enable bit allows
us to drop _NEW_TRANSFORM from all the shader stage atoms, so we can
re-emit them less often.
Improves performance of OglBatch7 (version 6) by 2.70773% +/- 0.491257%
(n = 38) at 1024x768 on Cherryview.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This fixes:
dEQP-VK.api.copy_and_blit.blit_image.simple_tests.mirror*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is just a cleanup before I rework this code to fix mirrored
blits.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
radv_tex_wrap() already supports VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE,
so all that's needed is to advertise support for the extension.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Ported from radeonsi.
Note that si_make_texture_descriptor() already sets img7 to the mask
value referred to in the comment.
Reviewed-by: Dave Airlie <airlied@redhat.com>
We can't render to 8x MSAA if the width is greater than 64 bits. (see
brw_render_target_supported)
Fixes ES31-CTS.sample_variables.mask.rgba32f.samples_8.mask_*
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Driver should enumerate only up-to min2(num_available, num_requested)
properties and return VK_INCOMPLETE if the # of requested props is
smaller than the ones available.
Presently we assert out in such cases.
Inspired by a similar fix for RADV.
v2: Use MIN2 + typed_memcpy (Jason).
Should fix: dEQP-VK.api.info.device.extensions
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2 (Jason):
- Use PRM citation for SKL now that it is available
- Also return false for gen < 8 mipmapped/arrayed
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
instead of in intel_miptree_init_mcs(). For lossless compression
the status is immediately overwritten in
intel_miptree_alloc_non_msrt_mcs() while the status for
non-compressed non-msaa miptrees is explicitly set in
do_blorp_clear().
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This path is not yet taken for fast cleared or compressed buffers
but later patches will enable it.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is useful when checking if any slice is in unresolved state.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
And fix a mangled comment while at it.
v2 (Ben): Return the converted color.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This was detected when examining CCS_E failures with piglit test:
"fbo-generatemipmap-formats". Test creates a 2D texture with
dimensions 293x277. It manually loops over all levels and calls
glTexImage2D(). Level one triggers creation of full miptree:
intel_alloc_texture_image_buffer() realizes that there is only one
level in the miptree and calls intel_miptree_create_for_teximage()
to re-allocate the miptree with all 9 levels. However, the end result
is a miptree with level zero dimensions of 292x276.
Related, and possibly calling for treatment of its own is mip-map
generation:
After calling glTexImage2D() against every level test continues by
replacing content for levels one to eight with data derived from level
zero by calling glGenerateMipmapEXT(). This results into the miptree
being allocated anew for every level:
Mip-map generation goes thru meta which ends up validating the texture
(brw_validate_textures()->intel_finalize_mipmap_tree()->
intel_miptree_match_image()) where one finds texture with base level
size 292:276. This results into new miptree being created for the npot
size 293:277. Only here intel_finalize_mipmap_tree() is asked for only
one level, and therefore such is created. Generation for level one in
turn finds right base level size but only one level when two is needed.
And the same goes on for all eight levels.
This patch prevents the shrink maintaining the NPOT size of 293x277.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In get_tex_memcpy, when copying texture data directly from source
to destination (when row strides match for both src and dst), the
copy size is currently calculated using the full texture height
instead of the sub-region height parameter that was passed.
This can cause a read past the end of the mapped buffer when y-offset
is greater than zero, leading to a segfault.
Fixes CTS test (from crash to pass):
* GL45-CTS/get_texture_sub_image/functional_test
v2: (Jason) Use the passed 'height' instead of copying til the
end of the buffer (tex-height - yoffset).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Besides the logical operation involved, these also require that we test if the
operands are ordered / unordered.
For ordered operations, both operands must be ordered (and they must pass the
conditional test) while for unordered operations it is sufficient if only one
of the operands is unordered (or they pass the logical test).
Fixes the following Vulkan CTS tests:
dEQP-VK.spirv_assembly.instruction.compute.opfunord.equal
dEQP-VK.spirv_assembly.instruction.compute.opfunord.greater
dEQP-VK.spirv_assembly.instruction.compute.opfunord.greaterequal
dEQP-VK.spirv_assembly.instruction.compute.opfunord.less
dEQP-VK.spirv_assembly.instruction.compute.opfunord.lessequal
v2: Fixed typo: s/nir_eq/nir_feq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Since bind image memory started memsetting surfaces, the
device node can't be NULL, since we lookup device->info.has_llc.
Not sure why it ever was NULL before.
Fixes some things on my Ivybridge.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes gl-coord-replace-doesnt-eliminate-frag-tex-coords
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>