generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers eventuating in abort/segfault.
Fixes: fd01840c0b ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438
(cherry picked from commit f09c2cefdd)
Commit 259fc50545 added linker error for
mismatching uniform precision, as required by GLES 3.0 specification and
conformance test-suite.
Several Android applications, including Forge of Empires, have shaders
which violate this rule, on a dead varying that will be eliminated.
The problem affects a big number of applications using Cocos2D engine
and other GLES implementations accept this, this poses a serious
application compatibility issue.
Starting from GLSL ES 3.0, declarations with conflicting precision
qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
clearly specify the behavior, except that
"Uniforms are defined to behave as if they are using the same storage in
the vertex and fragment processors and may be implemented this way.
If uniforms are used in both the vertex and fragment shaders, developers
should be warned if the precisions are different. Conversion of
precision should never be implicit."
The word "used" is not clear in this context and might refer to
1) declared (same as GLES 3.x)
2) referred after post-processing, or
3) linked after all optimizations are done.
Looking at existing applications, 2) or 3) seems to be widely adopted.
To avoid compatibility issues, turn the error into a warning if GLSL ES
version is lower than 3.0 and the data is dead in at least one of the
shaders.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0886be093f)
Fixes: 94d669b0d2 ("glsl: enforce fragment shader input restrictions in
GLSL ES 3.10")
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit a6932faae1)
The GL spec will soon be revised to clarify that a buffer binding for
a transform feedback buffer is only required if a variable is actually
defined to use the buffer binding point. Previously a declaration for
the default transform buffer would make it require a binding even if
nothing was declared to use the default buffer.
Affects:
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4dc8458cd1)
This patch is mostly a patch done by Ilia Mirkin.
It fixes KHR-GL45.enhanced_layouts.varying_structure_locations.
v2: fix locations for TCS/TES/GS inputs and outputs (Ilia)
CC: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit d5a641106b)
This turned out to be a dead end, it is much easier and less error
prone to just cache the IR used by the drivers backend e.g. TGSI or
NIR.
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit cf05bb506a)
There are two issues with the current implementation. First, it relies
on the layout(local_size_*) happening in the same shader as the main
function, and secondly it doesn't work for variable group sizes.
In both cases, the simplest fix is to move the setup of these derived
values to a later time, similar to how the gl_VertexID workarounds are
done. There already exist system values defined for both of the derived
values, so we use them unconditionally, and lower them after linking is
performed.
While we're at it, we move to using gl_LocalGroupSizeARB instead of
gl_WorkGroupSize for variable group sizes.
Also the dead code elimination avoidance can be removed, since there
can be situations where gl_LocalGroupSizeARB is needed but has not been
inserted for the shader with main function. As a result, the lowering
code has to insert its own copies of the system values if needed.
Reported-by: Stephane Chevigny <stephane.chevigny@polymtl.ca>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103393
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4d24a7cb97)
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.
v2: compute location slots properly depending on shader stage and
variable type / direction
Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
We won't split varyings marked as always active because there
is no point in doing so. This means we need to mark both
sides of the interface as always active otherwise we will have
a mismatch and start removing things we shouldn't.
Reviewed-by: Eric Anholt <eric@anholt.net>
Since blob.h moved up to src/compiler the test should include that
instead of src/compiler/glsl
fixes: 0e3bd56c6e ("compiler: Move blob up a level")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This looks like a copy+paste error. They don't actually write into that
variable as would be implied by putting the return there.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Despite the name, it could only be used if you immediately wrote to the
pointer. Noboby was using it outside of one test, so clearly this
behavior wasn't that useful. Instead, make it return an offset into the
data buffer so that the result isn't invalidated if you later write to
the blob. In conjunction with blob_overwrite_bytes(), this will be
useful for leaving a placeholder and then filling it in later, which
we'll need to do for handling phi nodes when serializing NIR.
v2 (Jason Ekstrand):
- Detect overflow in the offset + to_write computation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
There's no reason why that tiny bit of memory needs to be on the heap.
We always put blob_reader on the stack, so why not do the same with the
writable blob.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
We're going to want to use the blob for Vulkan pipeline caching so it
makes sense to have it in libcompiler not libglsl.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Otherwise we could have a failure followed by a smaller write that
succeeds and get a corrupted blob. If we ever OOM, we should stop.
v2 (Jason Ekstrand):
- Initialize the new boolean member in create_blob
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Otherwise, if you have a large read fail and then try to do a small
read, the small read may succeed even though it's at the wrong offset.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Unlike uniforms, the limit on shared memory size is not called out
explicitly in the list of things that cause linker errors, but presumably
that's just an oversight in the spec.
Fixes dEQP-GLES31.functional.debug.negative_coverage.{callbacks,get_error,log}.compute.exceed_shared_memory_size_limit
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
c7affbf687 enabled GLSLOptimizeConservatively on some
drivers. The idea was to speed up compile times by running
the GLSL IR passes only once each time do_common_optimization()
is called. However loop unrolling can create a big mess and
with large loops can actually case compile times to increase
significantly due to a bunch of redundant if statements being
propagated to other IRs.
Here we make sure to clean things up before moving on.
There was no measureable difference in shader-db compile times,
but it makes compile times of some piglit tests go from a couple
of seconds to basically instant.
The shader-db results seemed positive also:
Totals:
SGPRS: 2829456 -> 2828376 (-0.04 %)
VGPRS: 1720793 -> 1721457 (0.04 %)
Spilled SGPRs: 7707 -> 7707 (0.00 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 3140 -> 2060 (-34.39 %)
Scratch size: 3308 -> 2180 (-34.10 %) dwords per thread
Code Size: 79441464 -> 79214616 (-0.29 %) bytes
LDS: 436 -> 436 (0.00 %) blocks
Max Waves: 558670 -> 558571 (-0.02 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The old code assumed that loop terminators will always be at
the start of the loop, resulting in otherwise unrollable
loops not being unrolled at all. For example the current
code would unroll:
int j = 0;
do {
if (j > 5)
break;
... do stuff ...
j++;
} while (j < 4);
But would fail to unroll the following as no iteration limit was
calculated because it failed to find the terminator:
int j = 0;
do {
... do stuff ...
j++;
} while (j < 4);
Also we would fail to unroll the following as we ended up
calculating the iteration limit as 6 rather than 4. The unroll
code then assumed we had 3 terminators rather the 2 as it
wasn't able to determine that "if (j > 5)" was redundant.
int j = 0;
do {
if (j > 5)
break;
... do stuff ...
if (bool(i))
break;
j++;
} while (j < 4);
This patch changes this pass to be more like the NIR unrolling pass.
With this change we handle loop terminators correctly and also
handle cases where the terminators have instructions in their
branches other than a break.
V2:
- fixed regression where loops with a break in else were never
unrolled in v1.
- fixed confusing/wrong naming of bools in complex unrolling.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
do-while loops can increment the starting value before the
condition is checked. e.g.
do {
ndx++;
} while (ndx < 3);
This commit changes the code to detect this and reduces the
iteration count by 1 if found.
V2: fix terminator spelling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
These instructions will be executed on every iteration of the loop
we cannot drop them.
V2:
- move removal of unreachable terminators from the terminator list
to the same place they are removed from the IR as suggested by
Nicolai.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
This gets pretty much the entire classic tree building, as well as
i965, including the various glapis. There are some workarounds for bugs
that are fixed in meson 0.43.0, which is due out on October 8th.
I have tested this with piglit using glx.
v2: - fix typo "vaule" -> "value"
- use gtest dep instead of linking to libgtest (rebase error)
- use gtest dep instead of linking against libgtest (rebase error)
- copy the megadriver, then create hard links from that, then delete
the megadriver. This matches the behavior of the autotools build.
(Eric A)
- Use host_machine instead of target_machine (Eric A)
- Put a comment in the right place (Eric A)
- Don't have two variables for the same information (Eric A)
- Put pre_args at top of file in this patch (Eric A)
- Fix glx generators in this patch instead of next (Eric A)
- Remove -DMESON hack (Eric A)
- add sha1_h to mesa in this patch (Eric A)
- Put generators in loops when possible to reduce code in
mapi/glapi/gen (Eric A)
v3: - put HAVE_X11_PLATFORM in this patch
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
It's inside an if-statement that already checks that the variables are
not NULL.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
The optimization as done in opt_copy_propagation would have to be
removed in the next patch. If we just eliminate that optimization
altogether, shader-db results, even on platforms that use NIR, are hurt
quite substantially. I have not investigated why NIR isn't picking up
the slack here.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Instead of generating a sequence like:
run_default = true;
if (i == 3) // some label that appears after default
run_default = false;
if (i == 4) // some label that appears after default
run_default = false;
...
if (run_default) {
...
}
generate something like:
run_default = !((i == 3) || (i == 4) || ...);
if (run_default) {
...
}
This eliminates one use of conditional assignment, and it enables the
elimination of another.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Previously the instruction stream was walked looking for comparisons
with case-label values. This should generate nearly identical code.
For at least fs-default-notlast-fallthrough.shader_test, the code is
identical.
This change will make later changes possible.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
The values being compared are scalars, so these are the same. While
I'm here, simplify the run_default condition to just deref the flag
(instead of comparing a scalar bool with true).
There is a bit of extra change in this patch. When constructing an
ir_binop_equal ir_expression, there is an assertion that the types are
the same. There is no such assertion for ir_binop_all_equal, so
passing glsl_type::uint_type with glsl_type::int_type was previously
fine. A bunch of the code motion is to deal with that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
This happens to work now because ir_binop_all_equal is used. This
causes vector typed init-expressions to produce scalar Boolean values
after comparison.
The next commit changes ir_binop_all_equal to ir_binop_equal. Vector
typed init-expressions will then produce vector Boolean values, and, in
debug builds, the ir_assignment constructor will fail an assertion.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Mostly tabs-before-spaces, but there was some other trivium too.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
This is basically a wash now, but it simplifies later patches that
convert to using ir_builder.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Mostly tabs-before-spaces, but there was some other trivium too.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
It leads to surprising states with integer inputs and outputs on
vertex processing stages (e.g. geometry stages). Instead, rely on the
driver to choose smooth interpolation by default.
We still allow varyings to match when one stage declares it as smooth
and the other declares it without interpolation qualifiers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.
Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
This allows building and installing the Intel "anv" Vulkan driver using
meson and ninja, the driver has been tested against the CTS and has
seems to pass the same series of tests (they both segfault when the CTS
tries to run wayland wsi tests).
There are still a mess of TODO, XXX, and FIXME comments in here. Those
are mostly for meson bugs I'm trying to fix, or for additional things to
implement for other drivers/features.
I have configured all intermediate libraries and optional tools to not
build by default, meaning they will only be built if they're pulled in
as a dependency of a target that will actually be installed) this allows
us to avoid massive if chains, while ensuring that only the bits that
need to be built are.
v2: - enable anv, x11, and wayland by default
- add configure option to disable valgrind
v3: - fix typo in meson_options (Nicholas)
v4: - Remove dead code (Eric)
- Remove change to generator that was from v0 (Eric)
- replace if chain with loop (Eric)
- Fix typos (Eric)
- define HAVE_DLOPEN for both libdl and builtin dl cases (Eric)
v5: - rebase on util string buffer implementation
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v4)
Length of the token was already calculated by flex and stored in yyleng,
no need to implicitly call strlen() via linear_strdup().
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
V2: Also convert this pattern in glsl_lexer.ll
V3: Remove a misplaced comment
V4: Use a temporary char to avoid type change
Remove bogus +1 on length check of identifier
Migrate removal of line continuations to string_buffer. Before this
it used ralloc_strncat() to append strings, which internally
each time calculates strlen() of its argument. Its argument is
entire shader, so it multiple time scans the whole shader text.
Signed-off-by: Vladislav Egorov <vegorov180@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
V2: Adapt to different API of string buffer (Thomas Helland)
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
V2: Pointed out by Timothy
- Fix pp.c reralloc size issue and comment
V3 - Use vprintf instead of printf where we should
- Fixes failing make-check tests
V4 - Use buffer_append_char in a couple places
- Use append_char in even more places
This will be used by the nir linking pass so that we don't remove
otherwise unused varyings.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Will be used in nir link pass to decided if we can remove a varying
or not.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>