Commit graph

880 commits

Author SHA1 Message Date
Eric Anholt
3f607f9e4f nir: Use the system-value front face for twoside lowering.
GLSL-to-NIR generates system value usage, and vc4/freedreno would both
like the system value instead of the varying, so switch this pass over to
it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Kenneth Graunke
7d0554f341 nir: Rely on the fact that bcsel takes a well formed boolean.
According to Connor, it's safe to assume that the first operand of
bcsel, as well as the operand of b2f and b2i, must be well formed
booleans.

https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html

With the previous improvements to a@bool handling, this now has no
change in shader-db instruction counts on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-19 02:05:23 -07:00
Kenneth Graunke
3a9e6102b4 nir/search: Extend 'a@bool' to handle a couple of system values.
load_front_face and load_helper_invocation produce booleans.

On Broadwell:

total instructions in shared programs: 11638956 -> 11638011 (-0.01%)
instructions in affected programs: 115093 -> 114148 (-0.82%)
helped: 628
HURT: 14

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-18 01:27:27 -07:00
Kenneth Graunke
e8543feba7 nir/search: Fold src_is_bool()/alu_instr_is_bool() into src_is_type().
I don't want src_is_bool() and src_is_type(x, nir_type_bool) to behave
differently.  Having the logic spread out over three functions makes it
harder to decide where to put new logic, as well.

So, combine them all.  It's a bit simpler because there's now only one
recursive function rather than a pair of mutually recursive functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-18 01:27:15 -07:00
Kenneth Graunke
241870fe5b nir/search: Introduce a src_is_type() helper for 'a@type' handling.
Currently, 'a@type' can only match if 'a' is produced by an ALU
instruction.  This is rather limited - there are other cases we
can easily detect which we should handle.

Extending the code in-place would be fairly messy, so we introduce
a new src_is_type() helper.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-18 01:26:47 -07:00
Kenneth Graunke
d8971128ac nir/builder: Add bany_inequal and bany helpers.
The first simply picks the bany_inequal[234] opcodes based on the SSA
def's number of components.  The latter implicitly compares with zero
to achieve the same semantics of GLSL's any().

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-08-18 00:46:04 -07:00
Ian Romanick
607ab6d3bf glsl: Pull enum ir_expression_operation out to its own file
No change except to the copyright symbol.  The next patch will generate
this file with Python, and Unicode + Python = pure rage.

v2: Massive rebase... I guess a lot can change in a year.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 13:48:25 +01:00
Ian Romanick
de71bc9eb6 glsl: Make the generated sources build rules more like NIR
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 13:48:25 +01:00
Ian Romanick
2ec3a3e151 glsl: Add missing ir_quadop_vector constant evaluation for Boolean types
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
cf58e3f522 glsl: Fix typo in ir_unop_f2u implementation
This won't affect the output, but it was, technically, wrong.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
8b123b08cb glsl: Fix typo in ir_unop_b2i implementation
This won't affect the output, but it was, technically, wrong.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
cd8764737e glsl: Don't support integer types for operations that can't handle them
ir_unop_fract already forbade integer types in ir_validate.  ir_unop_rcp,
ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
437e612bd7 glsl: Don't support ir_unop_abs or ir_unop_sign for unsigned integers
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
cceb50e14e nir/algebraic: Optimize common array indexing sequence
Some shaders include code that looks like:

   uniform int i;
   uniform vec4 bones[...];

   foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);

CSE would do some work on this:

   x = i * 3
   foo(bones[x], bones[x + 1], bones[x + 2]);

The compiler may then add '<< 4 + base' to the index calculations.
This results in expressions like

   x = i * 3
   foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);

Just rearranging the math to produce (i * 48) + 16 saves an
instruction, and it allows CSE to do more work.

   x = i * 48;
   foo(bones[x], bones[x + 16], bones[x + 32]);

So, ~6 instructions becomes ~3.

Some individual shader-db results look pretty bad.  However, I have a
really, really hard time believing the change in estimated cycles in,
for example, 3dmmes-taiji/51.shader_test after looking that change in
the generated code.

G45
total instructions in shared programs: 4020840 -> 4010070 (-0.27%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0

total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0

Ironlake
total instructions in shared programs: 6418887 -> 6408117 (-0.17%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0

total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0

Sandy Bridge
total instructions in shared programs: 8357887 -> 8339251 (-0.22%)
instructions in affected programs: 432715 -> 414079 (-4.31%)
helped: 2795
HURT: 0

total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
cycles in affected programs: 6114626 -> 6037854 (-1.26%)
helped: 2478
HURT: 317

Ivy Bridge
total instructions in shared programs: 7669390 -> 7653822 (-0.20%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0

total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
cycles in affected programs: 1972658 -> 1854360 (-6.00%)
helped: 2458
HURT: 307

Haswell
total instructions in shared programs: 7082636 -> 7067068 (-0.22%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0

total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
cycles in affected programs: 1891820 -> 1773958 (-6.23%)
helped: 2459
HURT: 261

Broadwell
total instructions in shared programs: 9002466 -> 8985875 (-0.18%)
instructions in affected programs: 658784 -> 642193 (-2.52%)
helped: 2795
HURT: 5

total cycles in shared programs: 78503092 -> 78450404 (-0.07%)
cycles in affected programs: 2873304 -> 2820616 (-1.83%)
helped: 2275
HURT: 415

Skylake
total instructions in shared programs: 9156978 -> 9140387 (-0.18%)
instructions in affected programs: 682625 -> 666034 (-2.43%)
helped: 2795
HURT: 5

total cycles in shared programs: 75591392 -> 75550574 (-0.05%)
cycles in affected programs: 3192120 -> 3151302 (-1.28%)
helped: 2271
HURT: 425

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-17 10:52:38 +01:00
Kenneth Graunke
1f47f78fc3 glcpp: Update tests for new #undef of built-in macro rules.
Ian recently changed the preprocessor to allow this in most GLSL
versions, but not GLSL ES 3.00+.  This patch converts the existing
test that expects a failure to a #version 300 es shader, and adds
a #version 110 shader to make sure that it's allowed.

Fixes 'make check'.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2016-08-15 22:55:34 -07:00
Ilia Mirkin
a32c87f74b glsl: emit a specific error when ast_*_assign changes type
For regular ast_add, we can implicitly change either a or b's type.
However in an assignment situation, the type of the lvalue is fixed. So
if the implicit conversion logic decides to change it, it means that the
rhs's type could not be converted to the lhs type.

Emit a specific error for this rather than the rather mysterious "is not
an lvalue" error that results from having a i2f or other operation as
the lvalue.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96729
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-12 22:45:20 -04:00
Ilia Mirkin
1baae00089 glsl: look for frag data bindings with [0] tacked onto the end for arrays
The GL spec is very unclear on this point. Apparently this is discussed
without resolution in the closed Khronos bugtracker at
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The
recommendation is to allow dropping the [0] for looking up the bindings.

The approach taken in this patch is to instead tack on [0]'s for each
arrayness level of the output's type, and doing the lookup again. That
way, for

out vec4 foo[2][2][2]

we will end up looking for bindings for foo, foo[0], foo[0][0], and
foo[0][0][0], in that order of preference.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-12 20:21:08 -04:00
Kenneth Graunke
f9f462936a glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.
Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant"
to be specified on both inputs and outputs, and match when linking.

New languages only allow outputs to be qualified as "invariant"
and remove the "invariant must match" restriction when linking
varyings (because no input can have that qualifier).

Commit 426a50e208 introduced the new
behavior for ES 3.00.  It also removed the "must match" restriction
for ES 1.00 shaders, which I believe is incorrect.  This patch adds
that back, as well as making 4.30+ follow the new rules.

Thanks to Qiankun Miao for noticing this discrepancy.

Fixes a WebGL 2.0 conformance test when run in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-11 23:56:53 -07:00
Kenneth Graunke
0ed316360f glsl: Tidy stream handling in merge_qualifier().
The previous commit fixed xfb_buffer handling, which was largely copy
and pasted from the stream handling.  The difference is that stream
was set in input_layout_mask, so it worked.

However, that's totally rubbish: stream is only valid on geometry shader
outputs.  Presumably this was to hack around inout.  Instead, apply the
solution I used in the previous fix.

Really, we just need to separate shader interface and parameter
qualifier handling so this isn't a mess, but this patch at least
tidies it slightly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-11 23:56:48 -07:00
Kenneth Graunke
dffa371665 glsl: Fix inout qualifier handling in GLSL 4.40.
inout variables have q.in and q.out set.  We were trying to set
xfb_buffer = 1 for shader output variables (and inadvertantly setting
it on inout parameters, too).  But input_layout_mask doesn't have
xfb_buffer set, so it was seen as in invalid input qualifier.

This meant that all 'inout' parameters were broken.

Caught by running a WebGL conformance test in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2

Fixes Piglit's tests/spec/glsl-4.40/compiler/inout-parameter-qualifier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-11 23:56:40 -07:00
Timothy Arceri
33b3815773 glsl/tests: fix segfault in uniform initializer test
Caused by 549222f5

Tested-by: Aaron Watry <awatry@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97286
2016-08-11 14:57:18 +10:00
Ian Romanick
50b49d242d glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders
Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:

   It is an error to undefine or to redefine a built-in (pre-defined)
   macro name.

The GLSL ES 1.00 spec does not contain this text.

Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:

   #define and #undef functionality are defined as is standard for C++
   preprocessors for macro definitions both with and without macro
   parameters.

At least as far as I can tell GCC allow '#undef __FILE__'.  Furthermore,
there are desktop OpenGL conformance tests that expect '#undef
__VERSION__' and '#undef GL_core_profile' to work.

Fixes:

    GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-08-10 16:42:02 -07:00
Ian Romanick
eda6349346 glcpp: Track the actual version instead of just the version_resolved flag
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-08-10 16:42:02 -07:00
Timothy Arceri
30e5ff7067 glsl: remove remaining tabs in link_uniform_initializers.cpp
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:38 +10:00
Timothy Arceri
549222f5f8 glsl: use UniformHash to find storage location
There is no need to be looping over all the uniforms.

Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:30 +10:00
Timothy Arceri
82e153daff glsl: remove dead builtins before assigning varying locations
Builtins already have locations assigned so this shouldn't
change anything. We want to call it earlier so we can tranform
GLSL IR to NIR earlier.

Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:21 +10:00
Timothy Arceri
588702cc41 glsl: split out varying and uniform linking code
Here a new function link_varyings_and_uniforms() is created this
should help make it easier to follow the code in link_shader()
which was getting very large.

Note the end of the new function contains a for loop with some
lowering calls that currently don't seem related to varyings or
uniforms but they are a dependancy for converting to NIR ealier
so we move things here now to keep things easy to follow.

Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:12 +10:00
Eric Anholt
ac6966360f mesa: Use a temporary set to track whether we've added a resource yet.
Saves another .1s on servo.trace.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-10 12:27:22 -07:00
Eric Anholt
60f1b436b9 nir: Drop an unused program/hash_table.h include.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-10 12:27:22 -07:00
Ilia Mirkin
bc5df3b321 Re-apply "glsl: don't try to lower non-gl builtins as if they were gl_FragData"
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.

This is a small part of the problem pointed out by the below bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-10 15:43:36 +02:00
Mathias Fröhlich
027cbf00f2 util: Move _mesa_fsl/util_last_bit into util/bitscan.h
As requested with the initial creation of util/bitscan.h
now move other bitscan related functions into util.

v2: Split into two patches.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-09 21:20:46 +02:00
Timothy Arceri
8c4d9afb7e nir: make use of nir_cf_list_extract() helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-09 13:21:30 +10:00
Matt Turner
b1d9c742e9 nir: Always print non-identity swizzles.
Previously we would not print a swizzle on ssa_52 when only its .x
component is used (as seen in the definition of ssa_53):

   vec3 ssa_52 = fadd ssa_51, ssa_51
   vec1 ssa_53 = flog2 ssa_52
   vec1 ssa_54 = flog2 ssa_52.y
   vec1 ssa_55 = flog2 ssa_52.z

But this makes the interpretation of the RHS of the definition difficult
to understand and dependent on the size of the LHS. Just print swizzles
when they are not the identity swizzle, so the previous example is now
printed as:

   vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
   vec1 ssa_53 = flog2 ssa_52.x
   vec1 ssa_54 = flog2 ssa_52.y
   vec1 ssa_55 = flog2 ssa_52.z

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-08 17:52:35 -07:00
Marek Olšák
1ebf3c4b67 Revert "glsl: don't try to lower non-gl builtins as if they were gl_FragData"
This reverts commit a37e46323c.

It broke the game Overlord such that it hung a GCN GNU. While I don't know
how the hang happened because of its randomness and gfx corruption precedes
it, many of the shaders contain this:

out vec4 FragData[gl_MaxDrawBuffers];
2016-08-08 23:24:20 +02:00
Kenneth Graunke
522b5d4566 glsl: Implicitly enable OES_shader_io_blocks if geom/tess are enabled.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-08-08 09:59:03 -07:00
Kenneth Graunke
0eaa84e8af glsl: Expose gl_PointSize if OES/EXT_tessellation_point_size is enabled.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-08-08 09:59:03 -07:00
Kenneth Graunke
58709d36d7 glsl: Add extension plumbing for OES/EXT_tessellation_shader.
This adds the #extension directive support, built-in #defines,
lexer keyword support, and updates has_tessellation_shader().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-08-08 09:59:03 -07:00
Kenneth Graunke
bd1bd03268 glsl: Combine GS and TES array resizing visitors.
These are largely identical, except that the GS version has a few
extra error conditions.  We can just pass in the stage and skip these.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:53:59 -07:00
Kenneth Graunke
398428f406 glsl: Fix location bias for patch variables.
We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0.

Since "patch" only applies to inputs and outputs, we can just handle
this once outside the switch statement, rather than replicating the
check twice and complicating the earlier conditions.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:53:42 -07:00
Kenneth Graunke
1556f16e46 glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].
These are lowered to gl_TessLevel{Outer,Inner}MESA.  We need them to
appear in the program resource list with their original names and types.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:53:28 -07:00
Kenneth Graunke
4a49851da1 glsl: Delete bogus ir_set_program_inouts assert.
This assertion is bogus.  Varying structs, and arrays of structs, are
allowed by GLSL, and we can see them here.  While we currently don't
have any partial-variable support for those, simply returning false
and marking the entire thing as used is certainly legitimate.

I believe this is often swept under the rug by varying packing,
but that's disabled in certain tessellation situations.

Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:51:21 -07:00
Kenneth Graunke
86915b495b glsl: Simplify interface qualifier parsing.
This better matches the grammar in section 4.3.9 of the GLSL 4.5 spec,
and also removes some redundant code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:48:48 -07:00
Kenneth Graunke
d0642c52fc glsl: Add a has_tessellation_shader() helper.
Similar to has_geometry_shader(), has_compute_shader(), and so on.
This will make it easier to add more conditions here later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:47:55 -07:00
Andres Gomez
591869e921 glsl: fix indentation, comments and line lengths in ast_function.cpp
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:11 +03:00
Andres Gomez
8f98a120f3 glsl: apply_implicit_conversion is static again
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:11 +03:00
Andres Gomez
1443c10d74 glsl: struct constructors/initializers only allow implicit conversions
When an argument for a structure constructor or initializer doesn't
match the expected type, only Section 4.1.10 “Implicit Conversions”
are allowed to try to match that expected type.

From page 32 (page 38 of the PDF) of the GLSL 1.20 spec:

  " The arguments to the constructor will be used to set the structure's
    fields, in order, using one argument per field. Each argument must
    be the same type as the field it sets, or be a type that can be
    converted to the field's type according to Section 4.1.10 “Implicit
    Conversions.”"

From page 35 (page 41 of the PDF) of the GLSL 4.20 spec:

  " In all cases, the innermost initializer (i.e., not a list of
    initializers enclosed in curly braces) applied to an object must
    have the same type as the object being initialized or be a type that
    can be converted to the object's type according to section 4.1.10
    "Implicit Conversions". In the latter case, an implicit conversion
    will be done on the initializer before the assignment is done."

v2: Remove also the now redundant constant conversion, the
    constant_record_constructor helper and the replacement code
    (Timothy).

Fixes GL44-CTS.shading_language_420pack.initializer_list_negative

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:03 +03:00
Andres Gomez
de60d549b9 glsl: Refactor implicit conversion into its own helper
v2: Refactor also the conversion to constant and replacement code
    (Timothy).

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:03 +03:00
Andres Gomez
af796d756e glsl/types: disallow implicit conversions before GLSL 1.20
Implicit conversions were added in the GLSL 1.20 spec version.

v2: Join the checks for GLSL 1.10 and ESSL (Timothy).

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:03 +03:00
Kenneth Graunke
144cbf8987 nir: Make nir_opt_remove_phis see through moves.
I found a shader in Tales of Maj'Eyal that contains:

        if ssa_21 {
                block block_1:
                /* preds: block_0 */
                ...instructions that prevent the select peephole...
                vec1 32 ssa_23 = imov ssa_4
                vec1 32 ssa_24 = imov ssa_4.y
                vec1 32 ssa_25 = imov ssa_4.z
                /* succs: block_3 */
        } else {
                block block_2:
                /* preds: block_0 */
                vec1 32 ssa_26 = imov ssa_4
                vec1 32 ssa_27 = imov ssa_4.y
                vec1 32 ssa_28 = imov ssa_4.z
                /* succs: block_3 */
        }
        block block_3:
        /* preds: block_1 block_2 */
        vec1 32 ssa_29 = phi block_1: ssa_23, block_2: ssa_26
        vec1 32 ssa_30 = phi block_1: ssa_24, block_2: ssa_27
        vec1 32 ssa_31 = phi block_1: ssa_25, block_2: ssa_28

Here, copy propagation will bail because phis cannot perform swizzles,
and CSE won't do anything because there is no dominance relationship
between the imovs.  By making nir_opt_remove_phis handle identical moves,
we can eliminate the phis and rewrite everything to use ssa_4 directly,
so all the moves become dead and get eliminated.

I don't think we need to check "exact" - just the alu sources.
Presumably phi sources should match in their exactness.

On Broadwell:

total instructions in shared programs: 11639872 -> 11638535 (-0.01%)
instructions in affected programs: 134222 -> 132885 (-1.00%)
helped: 338
HURT: 0

v2: Fix return value to be NULL, not false (caught by Iago).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-08-04 00:42:12 -07:00
Kenneth Graunke
7603b4d3a1 nir: Make nir_alu_srcs_equal non-static.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-08-04 00:41:07 -07:00