Generally you'd see the gl_Color reference first and get some cursor set.
However, in piglit draw-pixel-with-texture we're now seeing the TexCoord
dereferenced first.
Reviewed-by: Rob Clark <robdclark@gmail.com>
In the GLSL-to-NIR conversion of VC4, I had a bit of trouble with what I
was calling the "state uniforms" that I was putting into the NIR fighting
with its other lowering passes. Instead of using magic uniform base
numbers in the backend, follow the lead of load_user_clip_plane and just
define system values for them.
v2: Fix unintended change to channel_num, drop unspecified const_index
value on blend_const_color_r_float.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Implementation previously used value itself as the key, however after
hash implementation change by ee02a5e we cannot use 0 as key.
v2: use constant pointer as the key and implement comparison
for contents (Eric Anholt)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
We're going to handle output qualifiers here too, and calling it "inout"
seems to be the going convention.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
vc4 wants to have per-scalar IO load/stores so that dead code elimination
can happen on a more granular basis, which it has been doing in the
backend using a multiplication by 4 of the intrinsic's driver_location.
We can represent it properly in the NIR using the first_component field,
though.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The previous nir_load_system_value(b, nir_intrinsic_load_whatever), 0) was
rather verbose, when system values should be easy to generate.
The index is left out because only one system value had an index included
in it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
GLSL-to-NIR generates system value usage, and vc4/freedreno would both
like the system value instead of the varying, so switch this pass over to
it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
According to Connor, it's safe to assume that the first operand of
bcsel, as well as the operand of b2f and b2i, must be well formed
booleans.
https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html
With the previous improvements to a@bool handling, this now has no
change in shader-db instruction counts on Broadwell.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
I don't want src_is_bool() and src_is_type(x, nir_type_bool) to behave
differently. Having the logic spread out over three functions makes it
harder to decide where to put new logic, as well.
So, combine them all. It's a bit simpler because there's now only one
recursive function rather than a pair of mutually recursive functions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Currently, 'a@type' can only match if 'a' is produced by an ALU
instruction. This is rather limited - there are other cases we
can easily detect which we should handle.
Extending the code in-place would be fairly messy, so we introduce
a new src_is_type() helper.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The first simply picks the bany_inequal[234] opcodes based on the SSA
def's number of components. The latter implicitly compares with zero
to achieve the same semantics of GLSL's any().
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
No change except to the copyright symbol. The next patch will generate
this file with Python, and Unicode + Python = pure rage.
v2: Massive rebase... I guess a lot can change in a year.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This won't affect the output, but it was, technically, wrong.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This won't affect the output, but it was, technically, wrong.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
ir_unop_fract already forbade integer types in ir_validate. ir_unop_rcp,
ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Some shaders include code that looks like:
uniform int i;
uniform vec4 bones[...];
foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);
CSE would do some work on this:
x = i * 3
foo(bones[x], bones[x + 1], bones[x + 2]);
The compiler may then add '<< 4 + base' to the index calculations.
This results in expressions like
x = i * 3
foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);
Just rearranging the math to produce (i * 48) + 16 saves an
instruction, and it allows CSE to do more work.
x = i * 48;
foo(bones[x], bones[x + 16], bones[x + 32]);
So, ~6 instructions becomes ~3.
Some individual shader-db results look pretty bad. However, I have a
really, really hard time believing the change in estimated cycles in,
for example, 3dmmes-taiji/51.shader_test after looking that change in
the generated code.
G45
total instructions in shared programs: 4020840 -> 4010070 (-0.27%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0
total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0
Ironlake
total instructions in shared programs: 6418887 -> 6408117 (-0.17%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0
total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0
Sandy Bridge
total instructions in shared programs: 8357887 -> 8339251 (-0.22%)
instructions in affected programs: 432715 -> 414079 (-4.31%)
helped: 2795
HURT: 0
total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
cycles in affected programs: 6114626 -> 6037854 (-1.26%)
helped: 2478
HURT: 317
Ivy Bridge
total instructions in shared programs: 7669390 -> 7653822 (-0.20%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0
total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
cycles in affected programs: 1972658 -> 1854360 (-6.00%)
helped: 2458
HURT: 307
Haswell
total instructions in shared programs: 7082636 -> 7067068 (-0.22%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0
total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
cycles in affected programs: 1891820 -> 1773958 (-6.23%)
helped: 2459
HURT: 261
Broadwell
total instructions in shared programs: 9002466 -> 8985875 (-0.18%)
instructions in affected programs: 658784 -> 642193 (-2.52%)
helped: 2795
HURT: 5
total cycles in shared programs: 78503092 -> 78450404 (-0.07%)
cycles in affected programs: 2873304 -> 2820616 (-1.83%)
helped: 2275
HURT: 415
Skylake
total instructions in shared programs: 9156978 -> 9140387 (-0.18%)
instructions in affected programs: 682625 -> 666034 (-2.43%)
helped: 2795
HURT: 5
total cycles in shared programs: 75591392 -> 75550574 (-0.05%)
cycles in affected programs: 3192120 -> 3151302 (-1.28%)
helped: 2271
HURT: 425
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ian recently changed the preprocessor to allow this in most GLSL
versions, but not GLSL ES 3.00+. This patch converts the existing
test that expects a failure to a #version 300 es shader, and adds
a #version 110 shader to make sure that it's allowed.
Fixes 'make check'.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
For regular ast_add, we can implicitly change either a or b's type.
However in an assignment situation, the type of the lvalue is fixed. So
if the implicit conversion logic decides to change it, it means that the
rhs's type could not be converted to the lhs type.
Emit a specific error for this rather than the rather mysterious "is not
an lvalue" error that results from having a i2f or other operation as
the lvalue.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96729
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
The GL spec is very unclear on this point. Apparently this is discussed
without resolution in the closed Khronos bugtracker at
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The
recommendation is to allow dropping the [0] for looking up the bindings.
The approach taken in this patch is to instead tack on [0]'s for each
arrayness level of the output's type, and doing the lookup again. That
way, for
out vec4 foo[2][2][2]
we will end up looking for bindings for foo, foo[0], foo[0][0], and
foo[0][0][0], in that order of preference.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant"
to be specified on both inputs and outputs, and match when linking.
New languages only allow outputs to be qualified as "invariant"
and remove the "invariant must match" restriction when linking
varyings (because no input can have that qualifier).
Commit 426a50e208 introduced the new
behavior for ES 3.00. It also removed the "must match" restriction
for ES 1.00 shaders, which I believe is incorrect. This patch adds
that back, as well as making 4.30+ follow the new rules.
Thanks to Qiankun Miao for noticing this discrepancy.
Fixes a WebGL 2.0 conformance test when run in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
The previous commit fixed xfb_buffer handling, which was largely copy
and pasted from the stream handling. The difference is that stream
was set in input_layout_mask, so it worked.
However, that's totally rubbish: stream is only valid on geometry shader
outputs. Presumably this was to hack around inout. Instead, apply the
solution I used in the previous fix.
Really, we just need to separate shader interface and parameter
qualifier handling so this isn't a mess, but this patch at least
tidies it slightly.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
inout variables have q.in and q.out set. We were trying to set
xfb_buffer = 1 for shader output variables (and inadvertantly setting
it on inout parameters, too). But input_layout_mask doesn't have
xfb_buffer set, so it was seen as in invalid input qualifier.
This meant that all 'inout' parameters were broken.
Caught by running a WebGL conformance test in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2
Fixes Piglit's tests/spec/glsl-4.40/compiler/inout-parameter-qualifier.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:
It is an error to undefine or to redefine a built-in (pre-defined)
macro name.
The GLSL ES 1.00 spec does not contain this text.
Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:
#define and #undef functionality are defined as is standard for C++
preprocessors for macro definitions both with and without macro
parameters.
At least as far as I can tell GCC allow '#undef __FILE__'. Furthermore,
there are desktop OpenGL conformance tests that expect '#undef
__VERSION__' and '#undef GL_core_profile' to work.
Fixes:
GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
Builtins already have locations assigned so this shouldn't
change anything. We want to call it earlier so we can tranform
GLSL IR to NIR earlier.
Reviewed-by: Eric Anholt <eric@anholt.net>
Here a new function link_varyings_and_uniforms() is created this
should help make it easier to follow the code in link_shader()
which was getting very large.
Note the end of the new function contains a for loop with some
lowering calls that currently don't seem related to varyings or
uniforms but they are a dependancy for converting to NIR ealier
so we move things here now to keep things easy to follow.
Reviewed-by: Eric Anholt <eric@anholt.net>
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.
This is a small part of the problem pointed out by the below bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
As requested with the initial creation of util/bitscan.h
now move other bitscan related functions into util.
v2: Split into two patches.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Previously we would not print a swizzle on ssa_52 when only its .x
component is used (as seen in the definition of ssa_53):
vec3 ssa_52 = fadd ssa_51, ssa_51
vec1 ssa_53 = flog2 ssa_52
vec1 ssa_54 = flog2 ssa_52.y
vec1 ssa_55 = flog2 ssa_52.z
But this makes the interpretation of the RHS of the definition difficult
to understand and dependent on the size of the LHS. Just print swizzles
when they are not the identity swizzle, so the previous example is now
printed as:
vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
vec1 ssa_53 = flog2 ssa_52.x
vec1 ssa_54 = flog2 ssa_52.y
vec1 ssa_55 = flog2 ssa_52.z
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This reverts commit a37e46323c.
It broke the game Overlord such that it hung a GCN GNU. While I don't know
how the hang happened because of its randomness and gfx corruption precedes
it, many of the shaders contain this:
out vec4 FragData[gl_MaxDrawBuffers];
These are largely identical, except that the GS version has a few
extra error conditions. We can just pass in the stage and skip these.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0.
Since "patch" only applies to inputs and outputs, we can just handle
this once outside the switch statement, rather than replicating the
check twice and complicating the earlier conditions.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
These are lowered to gl_TessLevel{Outer,Inner}MESA. We need them to
appear in the program resource list with their original names and types.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
This assertion is bogus. Varying structs, and arrays of structs, are
allowed by GLSL, and we can see them here. While we currently don't
have any partial-variable support for those, simply returning false
and marking the entire thing as used is certainly legitimate.
I believe this is often swept under the rug by varying packing,
but that's disabled in certain tessellation situations.
Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
This better matches the grammar in section 4.3.9 of the GLSL 4.5 spec,
and also removes some redundant code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>