At this point there is no reason not to be using the linked shaders,
using the linked shaders should be faster and will make things simpler
for upcoming shader cache work.
The previous variable name suggests the linked shaders were intended
to be used here anyway.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Likewise, rename the enum type to glsl_interp_mode.
Beyond the GLSL front-end, talking about "interpolation modes" seems
more natural than "interpolation qualifiers" - in the IR, we're removed
from how exactly the source language specifies how to interpolate an
input. Also, SPIR-V calls these "decorations" rather than "qualifiers".
Generated by:
$ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \
-e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \
-e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \;
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Dave Airlie <airlied@redhat.com>
I recently refactored this to share code between load and atomic
lowering. loads used intrin->num_components, while atomics used
intrin->dest.ssa.num_components. They should be equivalent, but
Jason wanted me to use the latter. I missed applying his review.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
This is more readable and also offers assertions that protect against
setting const_index fields on the wrong kind of intrinsic.
Suggested by Jason.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The original function was becoming a bit hard to read, with the details
of creating and filling out load/store/atomic atomics all in one
function.
This patch makes helpers for creating each type of intrinsic, and also
combines them with the *_op() helpers, as they're closely coupled and
not too large.
v2: Minor style nits from Jason.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This can't happen, the caller asserts that mode is shader_out or shared.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Both loads and atomics had identical code to rewrite destinations,
and all cases had the same two lines to replace instructions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The load/store/atomic cases all duplicated the get_io_offset code, with
a few tiny differences: stores didn't bother checking for per-vertex
inputs, because they can't be stored to, and atomics didn't check at
all, since shared variables aren't per-vertex.
However, it's harmless to check, and allows us to share more code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Less typing and word wrapping issues than intrin->variables[0]->var.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is way better than the stupid string approach especially since you
could overflow the string. Again, I thought I had something better at one
point but it obviously got lost.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
It was returning true if the function types have different lengths rather
than false. This was new with the SPIR-V to NIR pass and I thought I'd
fixed it a while ago but it may have gotten lost in rebasing somewhere.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
I have no idea why we were multiplying by 4 before. The offsets we get
from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to
do any adjustment whatsoever.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
This was appearing in vc4 VS/CS in mupen64, due to vertex attrib lowering
producing some constants that were getting compared.
total instructions in shared programs: 112276 -> 112198 (-0.07%)
instructions in affected programs: 2239 -> 2161 (-3.48%)
total estimated cycles in shared programs: 283102 -> 283038 (-0.02%)
estimated cycles in affected programs: 2365 -> 2301 (-2.71%)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.
This is a small part of the problem pointed out by the below bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
This just stops counting and assigning a storage location for
these uniforms, the count is only used to create the uniform storage.
These uniform types don't use this storage.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Some games are sloppy.. perhaps because it is defined behavior for DX or
perhaps because nv blob driver defaults things to zero.
So add driconf param to force uninitialized variables to default to zero.
This issue was observed with rust, from steam store. But has surfaced
elsewhere in the past.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The linker deals with atomic counters in terms of uniforms but the
data structure are called after the atomic counters.
Renamed the data structures used in the linker for disambiguation.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Currently the linker uses the uniform count for the total number of
atomic counters. However uniforms don't include the innermost array
dimension in their count, but atomic counters are expected to include
them.
Although the spec doesn't directly state this, it's clear how offsets
will be assigned for arrays.
From OpenGL 4.2 (Core Profile), page 98:
" * Arrays of type atomic_uint are stored in memory by element
order, with array element member zero at the lowest offset. The
difference in offsets between each pair of elements in the
array in basic machine units is referred to as the array
stride, and is constant across the entire array. The stride can
be queried by calling GetIntegerv with a pname of
ATOMIC_COUNTER_- ARRAY_STRIDE after a program is linked."
From that it is clear how arrays of atomic counters will interact with
GL_MAX_ATOMIC_COUNTER_BUFFER_SIZE.
For other kinds of uniforms it's also clear that each entry in an
array counts against the relevant limits.
Hence, although inferred, this is the expected behavior.
Fixes GL44-CTS.arrays_of_arrays_gl.AtomicDeclaration
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.
The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.
We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
This will allow us to split gl_shader into two different structs, one for
shader objects and one for linked shaders.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Rather than passing in gl_shader we now pass in the IR. This will
allow us to later split gl_shader into two structs. One for use
as a linked per stage shader struct and one for use as a GL shader
object.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The only part of an ir_texture which can be an array is the
offsets array in textureGatherOffsets() calls. We don't want
to lower those, because they're required to remain constants.
Fixes textureGatherOffsets with Gallium drivers such as llvmpipe,
which commit ef78df8d3b regressed.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The intent was to continue down the indirect chain, not to call ourselves
with unchanged input arguments. Found by code inspection, and comparison
to copy_prop_alu_src().
We haven't hit this because callers of NIR's copy prop are doing so in
SSA, before indirect variable dereferences have been lowered to registers.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It defaults to true so default behavior doesn't change but it allows you to
do NIR_VALIDATE=false if you don't want validation. Disabling validation
can substantially speed up shader compiles so you frequently want to turn
it off if compiler invariants aren't in question.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in fly-by
v2:
* proper commit message and non-joke title;
* replace two 'as is' followed by 'is' to 'as-is'.
v3:
* 'a integer' => 'an integer' and similar (originally spotted by
Jason Ekstrand, I fixed a few other similar ones while at it)
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Constant propagation on arrays doesn't make a lot of sense. If the
array is only accessed with constant indexes, then opt_array_splitting
would split it up. Otherwise, we have variable indexing. If there's
multiple accesses, then constant propagation would end up replicating
the data.
The lower_const_arrays_to_uniforms pass creates uniforms for each
ir_constant with array type that it encounters. This means that it
creates redundant uniforms for each copy of the constant, which means
uploading too much data. It can even mean exceeding the maximum number
of uniform components, causing link failures.
We could try and teach the pass to de-duplicate the data by hashing
constants, but it makes more sense to avoid duplicating it in the first
place. We should promote constant arrays to uniforms, then propagate
the uniform access.
Fixes the TressFX shaders from Tomb Raider, which exceeded the maximum
number of uniform components by a huge margin and failed to link.
On Broadwell:
total instructions in shared programs: 9067702 -> 9068202 (0.01%)
instructions in affected programs: 10335 -> 10835 (4.84%)
helped: 10 (Hoard, Shadow of Mordor, Amnesia: The Dark Descent)
HURT: 20 (Natural Selection 2)
loops in affected programs: 4 -> 0
The hurt programs appear to no longer have a constarray uniform, as
all constants were successfully propagated. Apparently before this
patch, we successfully unrolled a loop containing array access, but
only after promoting constant arrays to uniforms. With this patch,
we unroll it first, so all array access is direct, and the array
is split up, and individual constants are propagated. This seems
better.
Cc: mesa-stable@lists.freedesktop.org
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
There's really no point in looking at ir_dereference_array of a
constant. It also misses cases like:
(assign () (var_ref tmp) (constant (array ...) ...))
No changes in shader-db, but keeps it working after the next commit.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
The new uniform may need precise as well.
Fixes copy propagation of constant array uniforms in Tomb Raider shaders.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Previously, we failed to split constant arrays. Code such as
int[2] numbers = int[](1, 2);
would generates a whole-array assignment:
(assign () (var_ref numbers)
(constant (array int 4) (constant int 1) (constant int 2)))
opt_array_splitting generally tried to visit ir_dereference_array nodes,
and avoid recursing into the inner ir_dereference_variable. So if it
ever saw a ir_dereference_variable, it assumed this was a whole-array
read and bailed. However, in the above case, there's no array deref,
and we can totally handle it - we just have to "unroll" the assignment,
creating assignments for each element.
This was mitigated by the fact that we constant propagate whole arrays,
so a dereference of a single component would usually get the desired
single value anyway. However, I plan to stop doing that shortly;
early experiments with disabling constant propagation of arrays
revealed this shortcoming.
This patch causes some arrays in Gl32GSCloth's geometry shaders to be
split, which allows other optimizations to eliminate unused GS inputs.
The VS then doesn't have to write them, which eliminates the entire VS
(5 -> 2 instructions). It still renders correctly.
No other change in shader-db.
v2: Drop !AOA check and improve a comment (feedback from Tim Arceri).
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
opt_constant_propagation.cpp contains constant folding code which can
actually do constant propagation in some cases. It was happily
propagating constants into the left-hand-side of assignments.
For example,
(assign () (var_ref temp) (constant ...))
would brilliantly be turned into:
(assign () (constant ...) (constant ....))
This is a bigger hammer than necessary - it prevents propagation
into the left-hand-side altogether. We could certainly do better
someday. Notably, the constant propagation pass itself already
takes this approach - it's just the constant propagation pass's
built-in constant folding code (which actually propagates, too)
that was broken.
No change in shader-db, but prevents regressions after future commits.
It seems plausible that this could be hit today, but I haven't seen it
happen.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
We already store these in gl_shader and gl_program here we
remove it from gl_shader_program and just use the values
from gl_shader.
This will allow us to keep the shader cache restore code as
simple as it can be while making it somewhat clearer where these
values originate from.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>