The ES31-CTS.compute_shader.pipeline-compute-chain test case generates
an unsigned index by using gl_LocalInvocationID.x and
gl_LocalInvocationID.y as array indices.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
The ES31-CTS.compute_shader.pipeline-compute-chain test case generates
an unsigned index by using gl_LocalInvocationID.x and
gl_LocalInvocationID.y as array indices.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
The split of Uniform blocks and shader storage block only loops
up to MESA_SHADER_FRAGMENT and igonres compute shaders.
This cause segfault when running the OpenGL ES 3.1 CTS tests
with GL_ARB_compute_shader enabled.
V2: Changed to use MESA_SHADER_STAGES instead of
MESA_SHADER_COMPUTE
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Patch moves existing calculation code from shader_query.cpp to happen
during program resource list creation.
No Piglit or CTS regressions were observed during testing.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Patch adds 2 new fields to gl_uniform_storage so that we don't need to
calculate these values during runtime shader queries. This is required by
upcoming changes to free GLSL IR after linking.
Patch moves 3 booleans inside structure so that structure size stays the
same after this change.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Currently, these arrays in gl_shader and gl_shader_program hold both
UBOs and SSBOs, so this looks like a better name. We were already
using NumBufferInterfaceBlocks in gl_shader_program, so this makes
things more consistent as well.
In a later patch we will add {Num}UniformBlocks and
{Num}ShaderStorageBlocks which will contain only references to
UBOs and SSBOs respectively that will provide backends with
a separate index space for both types of objects.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
We get these when we operate on vector variables with array accessors
(i.e. things like a[0] where 'a' is a vec4). When we call variable_referenced()
on these expressions we want to return a reference to 'a' instead of NULL.
This fixes a problem where we pass a[0] as the first argument to an atomic
SSBO function that expects a buffer variable. In order to check this, we use
variable_referenced(), but that is currently returning NULL in this case, since
the underlying rvalue is a vector_extract expression.
Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
NIR is typeless so this is the only way to keep track of the
type to select the proper atomic to use.
v2:
- Use imin,imax,umin,umax for the intrinsic names (Connor Abbott)
- Change message for unreachable paths (Michael Schellenberger)
Tested-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is the result of applying several rules:
From OpenGL 4.3 spec, section 7.6.2.2 "Standard Uniform Block Layout":
"2. If the member is a two- or four-component vector with components
consuming N basic machine units, the base alignment is 2N or 4N,
respectively."
[...]
"4. If the member is an array of scalars or vectors, the base alignment
and array stride are set to match the base alignment of a single array
element, according to rules (1), (2), and (3), and rounded up to the
base alignment of a vec4."
[...]
"7. If the member is a row-major matrix with C columns and R rows, the
matrix is stored identically to an array of R row vectors with C
components each, according to rule (4)."
[...]
"When using the std430 storage layout, shader storage blocks will be
laid out in buffer storage identically to uniform and shader storage
blocks using the std140 layout, except that the base alignment and
stride of arrays of scalars and vectors in rule 4 and of structures in
rule 9 are not rounded up a multiple of the base alignment of a vec4."
In summary: vec2 has a base alignment of 2*N, a row-major mat2xY is
stored like an array of Y row vectors with 2 components each. Because
of std430 storage layout, the base alignment of the array of vectors
is not rounded up to vec4, so it is still 2*N.
Fixes 15 dEQP tests:
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x4
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x3
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x4
v2:
- Add spec quote in both commit log and code (Timothy)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Fixes:
ES3-CTS.shaders.negative.constant_sequence
spec/glsl-es-3.00/compiler/global-initializer/from-sequence.vert
spec/glsl-es-3.00/compiler/global-initializer/from-sequence.frag
v2: Fix a couple copy-and-paste mistake in the spec quotations.
Suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
This will be used in the next patch to enforce some language sematics.
v2: Fix inverted logic in
ast_function_expression::has_sequence_subexpression. The method
originally had a different name and a different meaning. I fixed the
logic in ast_to_hir.cpp, but I only changed the names in
ast_function.cpp.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
v2: Combine this check with the existing const and uniform checks. This
change depends on the previous patch (glsl: Only set
ir_variable::constant_value for const-decorated variables).
Fixes:
ES2-CTS.shaders.negative.initialize
ES3-CTS.shaders.negative.initialize
spec/glsl-es-1.00/compiler/global-initializer/from-attribute.vert
spec/glsl-es-1.00/compiler/global-initializer/from-uniform.vert
spec/glsl-es-1.00/compiler/global-initializer/from-uniform.frag
spec/glsl-es-1.00/compiler/global-initializer/from-global.vert
spec/glsl-es-1.00/compiler/global-initializer/from-global.frag
spec/glsl-es-1.00/compiler/global-initializer/from-varying.frag
spec/glsl-es-3.00/compiler/global-initializer/from-uniform.vert
spec/glsl-es-3.00/compiler/global-initializer/from-uniform.frag
spec/glsl-es-3.00/compiler/global-initializer/from-in.vert
spec/glsl-es-3.00/compiler/global-initializer/from-in.frag
spec/glsl-es-3.00/compiler/global-initializer/from-global.vert
spec/glsl-es-3.00/compiler/global-initializer/from-global.frag
Note: spec/glsl-es-3.00/compiler/global-initializer/from-sequence.*
still fail because the result of a sequence operator is still considered
to be a constant expression.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92304
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Right now we're also setting for uniforms, and that doesn't seem to hurt
things. The next patch will make general global variables in GLSL ES,
and those definitely should not have constant_value set!
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
This even matches the comment "uniform initializers are precious, and
could get used by another stage."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
In d4a24745 (August 2012), Paul made functions calls not be constant
expressions in GLSL ES 1.00. Since this feature was added in desktop
GLSL 1.20, we believed that it was added in GLSL ES 3.00. That turns
out to be completely wrong. Built-in functions have always been allowed
as constant expressions in GLSL ES, and the patch adds the (many) spec
quotations to prove it.
While we never previously encountered this, a later patch enforces a GLSL
ES 1.00 rule that global variable initializers must be constant
expressions. Without this fix, several dEQP tests fail.
Fixes:
tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.frag
tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.vert
tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.frag
tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.vert
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.0 10.1 10.2 10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org>
Yes, I know we don't maintain stable branches that far back, but that
*is* how far back this bug goes!
Also fix style / wrong indentation along the way and make the messages
more uniform.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
GLSL Spec 4.20.8, 4.3 Storage Qualifiers:
"Initializers in global declarations may only be used in declarations of
global variables with no storage qualifier, with a const qualifier or
with a uniform qualifier."
We do this for input variables, but not for output variables. AMD and NVIDIA
proprietary drivers don't allow this either.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
First step towards inverting the dependency between glsl and nir (so nir
can be used without glsl). Also solves this issue with 'make distclean'
Making distclean in mesa
make[2]: Entering directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa'
Makefile:2486: ../glsl/.deps/shader_enums.Plo: No such file or directory
make[2]: *** No rule to make target '../glsl/.deps/shader_enums.Plo'. Stop.
make[2]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa'
Makefile:684: recipe for target 'distclean-recursive' failed
make[1]: *** [distclean-recursive] Error 1
make[1]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src'
Makefile:615: recipe for target 'distclean-recursive' failed
make: *** [distclean-recursive] Error 1
Reported-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
This was originally added to nir_instrs_equal() instead of
nir_instr_can_cse() incorrectly, but this was fixed when moving to the
instruction set API (as it had to be, otherwise hashing wouldn't work).
Now, this is dead code since instr_can_rewrite() will only return true
for texture instructions that use an index, so we can turn the check into
an assert. This also means that now nir_instrs_equal(instr, instr) will
always return true unless it assert-fails.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
This was previously tied to CSE, since it would only work for
instructions where nir_can_cse() (now instr_can_rewrite()) returned
true. Now that CSE uses the instruction set abstraction which only uses
this internally, we can make it local to nir_instr_set.c.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
This replaces an O(n^2) algorithm with an O(n) one, while allowing us to
import most of the infrastructure required for GVN. The idea is to walk
the dominance tree depth-first, similar when converting to SSA, and
remove the instructions from the set when we're done visiting the
sub-tree of the dominance tree so that the only instructions in the set
are the instructions that dominate the current block.
No piglit regressions. No shader-db changes.
Compilation time for full shader-db:
Difference at 95.0% confidence
-35.826 +/- 2.16018
-6.2852% +/- 0.378975%
(Student's t, pooled s = 3.37504)
v2:
- rebase on start_block removal
- remove useless state struct
- change commit message
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
This will replace direct usage of nir_instrs_equal() in the CSE pass,
which reduces an O(n^2) algorithm with an effectively O(n) one. It'll
also be useful for implementing GVN on top of GCM.
v2:
- Add texture support.
- Add more comments.
- Rename instr_can_hash() to instr_can_rewrite() since it's really more
about whether its uses can be rewritten, and it's implicitly used by
nir_instrs_equal() as well.
- Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason).
- Make the HASH() macro less magical (Topi).
- Rewrite the commit message.
v3:
- For sorting phi sources, use a VLA, store pointers to the sources, and
compare the predecessor pointer directly (Jason).
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Right now nir_instrs_equal() is tied pretty tightly to CSE, but we're
going to introduce the idea of an instruction set and tie it to that
instead. In anticipation of that, move this into its own file where
we'll add the rest of the instruction set implementation later.
v2: Rebase on texture support.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Adds nir_src_is_dynamically_uniform which returns true if the source
is known to be dynamically uniform. This will be used in a later patch
to add a workaround for cases that only work with dynamically uniform
sources. Note that the function is not definitive, it can return false
negatives (but not false positives). Currently it only detects
constants and uniform accesses. It could easily be extended to include
more cases.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Previously the name of the nir shader was being freed prematurely during
nir_sweep. Since 756613ed35 the name was later being used to generate
filenames for the optimiser debug output and these would end up with
garbage from the dangling pointer.
Co-authored-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Varyings can be considered inputs or outputs of a program only when
SSO is in use. With multi-stage programs, inputs contain only inputs
for first stage and outputs contains outputs of the final shader stage.
I've tested that fix works for Assault Android Cactus (demo version)
and does not cause Piglit or CTS regressions in glGetProgramiv tests.
Following ES 3.1 CTS separate shader tests that do query properties
of varyings in SSO shader programs pass:
ES31-CTS.program_interface_query.separate-programs-vertex
ES31-CTS.program_interface_query.separate-programs-fragment
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122
With NIR, it actually hurts things.
total instructions in shared programs: 6529329 -> 6528888 (-0.01%)
instructions in affected programs: 14833 -> 14392 (-2.97%)
helped: 299
HURT: 1
In all affected programs I inspected (including the single hurt one) the
pass CSE'd some multiplies and caused some reassociation (e.g., caused
(A * B) * C to be A * (B * C)) when the original intermediate result was
reused elsewhere.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes following Piglit test:
global-scope-binding-qualifier.frag
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
The uniform will only be of a single type so store the data for
opaque types in a single array.
Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Geometry and tessellation shaders process multiple vertices; their
inputs are arrays indexed by the vertex number. While GLSL makes
this look like a normal array, it can be very different behind the
scenes.
On Intel hardware, all inputs for a particular vertex are stored
together - as if they were grouped into a single struct. This means
that consecutive elements of these top-level arrays are not contiguous.
In fact, they may sometimes be in completely disjoint memory segments.
NIR's existing load_input intrinsics are awkward for this case, as they
distill everything down to a single offset. We'd much rather keep the
vertex ID separate, but build up an offset as normal beyond that.
This patch introduces new nir_intrinsic_load_per_vertex_input
intrinsics to handle this case. They work like ordinary load_input
intrinsics, but have an extra source (src[0]) which represents the
outermost array index.
v2: Rebase on earlier refactors.
v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
get_io_offset() already walks the dereference chain and discovers
whether or not we have an indirect; we can just return that rather than
computing it a second time via deref_has_indirect(). This means moving
the call a bit earlier.
By returning a nir_ssa_def *, we can pass back both an existence flag
(via NULL checking the pointer) and the value in one parameter. It
also simplifies the code somewhat. nir_lower_samplers works in a
similar fashion.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
This is a common enough operation that it's nice to not have to think about
the arguments to foreach_list_typed every time.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
That way, if we do the usual thing of multiplying vector_elements by
matrix_columns we get the actual number of components in the type as per
component_slots().
While we're at it, we also switch to using the actual C++ field
initializers for vector_elements and matrix_columns.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Commit 0a1adaf11d (nir: Report progress
from nir_lower_system_values().) introduced a bug caught by Valgrind:
==823== Conditional jump or move depends on uninitialised value(s)
==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68)
==823== by 0xB079FB8: foreach_cf_node (nir.c:1310)
==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336)
==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79)
...
==823== Uninitialised value was created by a stack allocation
==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76)
which is trivially fixed by initializing progress.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>