Now that information about which array-of-arrays elements are accessed
is tracked, use that information to only mark an instance array element
as used-by-stage if, in fact, it is.
Fixes GL45-CTS.program_interface_query.uniform-block-types.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
If there's a better way to provide access to ir_array_refcount_entry
private members to the test functions, I am very interested to know
about it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Very soon this visitor will get more complicated. The users of the
existing ir_variable_refcount visitor won't need the coming
functionality, and this use doesn't need much of the functionality of
ir_variable_refcount.
v2: ir_array_refcount_visitor::get_variable_entry cannot return NULL, so
don't check it. Suggested by Timothy.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
One for the array parts and one for the leaf members. This will
simplify later changes.
The indentation is wonkey after this patch. This was done to make it
more obvious that the function is just getting split. The next patch
will fix the indentation.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This fixes 555 dEQP tests (using the nougat-cts-dev branch), Piglit's
arb_program_interface_query/arb_program_interface_query-resource-query,
and GL45-CTS.program_interface_query.separate-programs-{tess-control,
tess-eval,geometry}. Only one dEQP program interface failure remains.
I would have liked to split this up into several distinct changes, but
I wasn't sure how to do that given thet tangled nature of these issues.
So, the issues:
* We need to treat interface blocks declared as an array of instances
as a single block - removing the outer array. The resource list
entry's name should not include the array length. Properties such
as GL_ARRAY_SIZE should refer to the variable inside the block, not
the interface block's array properties.
* We need to do this prefixing even for structure variables.
* We need to do this for built-ins (such as gl_PerVertex.gl_Position).
* After interface array unwrapping, any variable which is an array
should have [0] appended. It doesn't matter if it's a TCS/TES/GS
input or TCS output - that looked like an attempt to unwrap for
per-vertex variables, but that didn't consider per-patch variables,
and as far as I can tell there's nothing to justify this.
Several Mesa developers have suggested that Issue 16 contradicts the
main specification, but I believe that it doesn't - the main spec just
isn't terribly clear. The main ARB_program_interface query spec says:
"* For an active interface block not declared as an array of block
instances, a single entry will be generated, using the block name from
the shader source.
* For an active interface block declared as an array of instances,
separate entries will be generated for each active instance. The name
of the instance is formed by concatenating the block name, the "["
character, an integer identifying the instance number, and the "]"
character."
Issue 16 says that built-ins should be named "gl_PerVertex.gl_Position",
but several people suggested the second bullet above means that it
should be named "gl_PerVertex[array length].gl_Position".
There are two important things to note. Those bullet points say
"an active interface block", while the others say "variable" or "active
shader storage block member". They also don't mention applying the
rules recursively (unlike the other bullets). Both suggest that
these rules apply to blocks themselves, not members of blocks.
In fact, for GL_UNIFORM_BLOCK queries, we do have "block[0]",
"block[1]", ... resource list entries - so those rules are real,
and actually used. So if they don't apply to block members, then how
should members be named? Unfortunately, I don't see any rules outside
of issue 16 - where the rationale is very unclear. I hope to clarify
the spec in the future.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
stage_mask is a bitmask of shader stages, so the proper comparison would
be (1 << MESA_SHADER_VERTEX), not MESA_SHADER_VERTEX itself.
But we only care for structure types, and VS inputs cannot be structs.
So we can just drop this entirely.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The strategy is to do the same thing that the GLSL lower_offset_arrays
pass does - create 4 separate texture gather ops, one per offset, and
read in the results from each gather's w component to recreate the
desired result.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
We do nothing but initialize it, add to it, and delete it.
This is a fallout from removing constant initializer support.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reverting the previous attempt at this a5502a721f resulted in
the following Vulkan test failing.
dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex
This time we use the num_components from the alu dest rather than
num_inputs to the op to determine the size of the undef.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99100
We should make the dest in the textureLod() operation have the same number
of components as the destination in the original textureGrad()
Fixes regression in ES3-CTS.gtf.GL3Tests.shadow
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99072
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit 6aa730000f.
This was changing the size of the undef to always be 1 (the number of inputs
to imov and fmov) which is wrong, we could be moving a vec4 for example.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Even if lower_txd_cube_map isn't. Suggested by Ken to make the flag more
consistent with its name.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is ported from the Intel lowering pass that we use with GLSL IR.
This takes care of lowering texture gradients on shadow samplers other
than cube maps. Intel hardware requires this for gen < 8.
v2 (Ken):
- Use the helper function to retrieve ddx/ddy
- Swizzle away size components we are not interested in
v3:
- Get rid of the ddx/ddy helper and use nir_tex_instr_src_index
instead (Ken, Eric)
v4:
- Add a 'continue' statement if the lowering makes progress because it
replaces the original texture instruction
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v3)
This is ported from the Intel lowering pass that we use with GLSL IR.
The NIR pass only handles cube maps, not shadow samplers, which are
also lowered for gen < 8 on Intel hardware. We will add support for
that in a later patch, at which point we should be able to remove
the GLSL IR lowering pass.
v2:
- added a helper to retrieve ddx/ddy parameters (Ken)
- No need to make size.z=1.0, we are only using component x anyway (Iago)
v3:
- Get rid of the ddx/ddy helper and use nir_tex_instr_src_index
instead (Ken, Eric)
v4:
- When emitting the textureLod operation, copy all texture parameters
from the original textureGrad() (except for ddx/ddy) using a loop
- Add a 'continue' statement if the lowering makes progress because it
replaces the original texture instruction
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v3)
This was written specifically for RECT samplers. Make it more generic so
we can call this from the gradient lowerings too.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'
Just happened to notice this in a patch that was sent and included one
of the tokens in question.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
In 19a541f (nir: Get rid of nir_constant_data) a number of places that
operated on nir_constant::values were mechanically converted to operate
on the whole array without regard for the base type. Only
GLSL_TYPE_FLOAT and GLSL_TYPE_DOUBLE can be matrices, so only those
types can have data in the non-0 array element.
See also b870394.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
The new implementation is more correct because it clamps the incoming value
to 10 to avoid floating-point overflow. It also uses a much reduced
version of the formula which only requires 1 exp() rather than 2. This
fixes all of the dEQP-VK.glsl.builtin.precision.tanh.* tests.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
The formula we have used in the past is a trivial reduction from the
definition by simply multiplying both the numerator and denominator of the
formula by 2. However, multiplying by e^x, you can further reduce it.
This allows us to get rid of one side of the clamp and two of exponential
functions which should make it faster. The new formula still passes the
dEQP precision tests for tanh so it should be fine.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Clamp input scalar value to range [-10, +10] to avoid precision problems
when the absolute value of input is too large.
Fixes dEQP-GLES3.functional.shaders.builtin_functions.precision.tanh.* test
failures.
v2: added more explanation in the comment.
v3: fixed a typo in the comment.
Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
This extension allows the fragment shader to control whether values in
gl_SampleMaskIn[] reflect the coverage after application of the early
depth and stencil tests.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Noticed while adding support for 64-bit integer types.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Trivial, this just resurrects the code which was there once upon a time
(the code can't lower instructions generated in the lowering pass there,
and even if it could it would probably be suboptimal).
This fixes piglit mesa_shader_integer_functions fs-ldexp.shader_test and
vs-ldexp.shader_test with llvmpipe.
Reviewed-by: Matt Turner <mattst88@gmail.com>
All of these are happily set from glsl_to_nir or spirv_to_nir but their
values are never used for anything.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Constant initializers have been a constant (ha!) pain for quite some time.
While they're useful from a language perspective, people writing passes or
backends really don't want deal with them most of the time. This commit
removes most of the constant initializer support from NIR. It is expected
that you call nir_lower_constant_initializers VERY EARLY to ensure that
they're gone before you do anything interesting.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
It's only ever called on single-function shaders. At this point, there are
a lot of helpers that can make it all much simpler.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This has bothered me for about as long as NIR has been around. Why do we
have two different unions for constants? No good reason other than one of
them is a direct port from GLSL IR.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This is mostly just used during linking however the st uses it
when updating textures.
In order to store gl_program in the CurrentProgram array
rather than gl_shader_program we need to move this field to
the shared gl_shader_program_data struct.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This should be faster than looping over every stage and null checking, but
will also make the code a bit cleaner when we switch to getting more fields
from gl_program rather than from gl_linked_shader as we can just copy the
pointer and not need to worry about null checking then copying.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This will be used to enable us to store the current gl_program
rather than gl_shader_program in the gl_pipline_object allowing
us to simplify handing of validation.
Also we should not be depending on _LinkedShader for this information
as it may contain shaders from a failed linking attempt rather than
the current program still in use.
We could also use this mask to iterate over the stages during linking
with _mesa_bit_scan() rather then the current method of NULL checking
each stage.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This only useful for spir-v shaders, but I keep finding myself
having to add it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
07fe2d565b introduced a big hack in order to return
NumSubroutineUniforms when querying ACTIVE_RESOURCES for
<shader>_SUBROUTINE_UNIFORM interfaces. However this is the
wrong fix we are meant to be returning the number of active
resources i.e. the count of subroutine uniforms in the
resource list which is what the code was previously doing,
anything else will cause trouble when trying to retrieve
the resource properties based on the ACTIVE_RESOURCES count.
The real problem is that NumSubroutineUniforms was counting
array elements as separate uniforms but the innermost array
is always considered a single uniform so we fix that count
instead which was counted incorrectly in 7fa0250f9.
Idealy we could probably completely remove
NumSubroutineUniforms and just compute its value when needed
from the resource list but this works for now.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
This reverts commit aaa69c79cd.
The commit was erroneous because the ast_layout_expression class is
meant to hold a list used for an after check that all the declared
values for a layout-qualifier-name are consistent.
Therefore, the check for the possibility of duplicated values was
previously fixed to happen much sooner, in the GLSL parser and the
merge of layout qualifiers, and the process_qualifier_constant method
only needs to check that the values are consistent.
By now, those layout-qualifier-name represented as a
ast_layout_expression are "max_vertices", "invocations", "vertices",
"local_size_[x|y|z]" and "xfb_stride".
From page 40 (page 46 of the PDF) of the GLSL 1.50 spec:
" All geometry shader output layout declarations in a program must
declare the same layout and same value for max_vertices."
From page 44 (page 50 of the PDF) of the GLSL 4.00 spec:
" If an invocation count is declared, all such declarations must
specify the same count."
From page 47 (page 53 of the PDF) of the GLSL 4.00 spec:
" All tessellation control shader layout declarations in a program
must specify the same output patch vertex count."
From page 60 (page 66 of the PDF) of the GLSL 4.30 spec:
" Also, if such a layout qualifier is declared more than once in the
same shader, all those declarations must set the same set of local
work-group sizes and set them to the same values; otherwise a
compile-time error results. If multiple compute shaders attached
to a single program object declare local work-group size, the
declarations must be identical; otherwise a link-time error
results."
From page 73 (page 79 of the PDF) of the GLSL 4.40 spec:
" While xfb_stride can be declared multiple times for the same
buffer, it is a compile-time or link-time error to have different
values specified for the stride for the same buffer."
Fixes GL44-CTS.enhanced_layouts.xfb_duplicated_stride
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>