896 commits

Author SHA1 Message Date
Francisco Jerez
6b3d23dcc0 glsl/ast: Allow redeclaration of gl_LastFragData with different precision qualifier.
v2: No need to check the GLSL version. (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-24 13:28:31 -07:00
Francisco Jerez
5e1d34394e glsl: Don't attempt to do dead varying elimination on gl_LastFragData arrays.
Apparently this pass can only handle elimination of a single built-in
fragment output array, so the presence of gl_LastFragData (which it
wouldn't split correctly anyway) could prevent it from splitting the
actual gl_FragData array.  Just match gl_FragData by name since it's
the only built-in it can handle.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-24 13:28:31 -07:00
Francisco Jerez
6b33eab959 glsl: Define a gl_LastFragData built-in for older GLSL versions.
The EXT_shader_framebuffer_fetch extension defines alternative
language for GLES2 shaders where user-defined fragment outputs are not
allowed.  Instead of using inout user-defined fragment outputs the
shader is expected to read from the gl_LastFragData built-in array.
In addition this allows using the same language on desktop GLSL
versions prior to 4.2 that support the deprecated gl_FragData built-in
in preparation for the MESA_shader_framebuffer_fetch desktop GL
extension.

Both legacy and user-defined inout outputs have a common
representation at the GLSL IR level, so it shouldn't make any
difference for optimization passes and back-ends whether the
application is using gl_LastFragData or user-defined outputs, all
they'll see is a variable dereference of a fragment output at a
certain interface location with the fb_fetch_output bit set to one.

v2: Don't define the built-in variable on GLSL versions for which
    gl_FragData exists but is deprecated. (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-24 13:28:31 -07:00
Francisco Jerez
19e929a177 glsl: Handle the inout qualifier in fragment shader output declarations.
According to the EXT_shader_framebuffer_fetch extension the inout
qualifier can be used on ESSL 3.0+ shaders to declare a special kind
of fragment output that gets implicitly initialized with the previous
framebuffer contents at the current fragment coordinates.  In addition
we allow using the same language to define FB fetch outputs in GLSL
1.3+ shaders in preparation for the desktop MESA_shader_framebuffer_fetch
extensions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-24 13:28:30 -07:00
Francisco Jerez
b49d8f20f4 glsl: Add support for representing framebuffer fetch in the GLSL IR.
The GLSL IR representation of framebuffer fetch amounts to a single
bit in the ir_variable object applicable to fragment shader outputs.
The flag indicates that the variable will be implicitly initialized to
the previous contents of the render buffer at the same fragment
coordinates and sample index.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-24 13:28:30 -07:00
Francisco Jerez
d7cd7b9c49 glsl: Add parser state enables for the framebuffer fetch extensions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-24 13:28:30 -07:00
Kenneth Graunke
eb1a0ddfd5 glsl: Mark tessellation qualifier maps static const.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-08-23 21:15:59 -07:00
Timothy Arceri
8ee909ee42 nir: avoid segfault when ssa src not found
Without this the following line will segfault and we don't get to
see the results of the validate_assert() above.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-08-23 09:06:29 +10:00
Eric Anholt
3ef1853f7d nir: Fix crash in nir_lower_drawpixels.
Generally you'd see the gl_Color reference first and get some cursor set.
However, in piglit draw-pixel-with-texture we're now seeing the TexCoord
dereferenced first.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-08-22 11:52:27 -07:00
Eric Anholt
0a8ff1681b nir: Fix a comment typo in nir_lower_drawpixels.
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-08-22 11:52:26 -07:00
Eric Anholt
e8378fee0c nir: Define system values for vc4's blending-lowering arguments.
In the GLSL-to-NIR conversion of VC4, I had a bit of trouble with what I
was calling the "state uniforms" that I was putting into the NIR fighting
with its other lowering passes.  Instead of using magic uniform base
numbers in the backend, follow the lead of load_user_clip_plane and just
define system values for them.

v2: Fix unintended change to channel_num, drop unspecified const_index
    value on blend_const_color_r_float.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-22 11:52:26 -07:00
Tapani Pälli
68233801ae glsl: fix key used for hashing switch statement cases
Implementation previously used value itself as the key, however after
hash implementation change by ee02a5e we cannot use 0 as key.

v2: use constant pointer as the key and implement comparison
    for contents (Eric Anholt)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
2016-08-22 07:36:33 +03:00
Kenneth Graunke
7db81d9a87 glsl: Rename link_fs_input_layout_qualifiers to "inout".
We're going to handle output qualifiers here too, and calling it "inout"
seems to be the going convention.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-08-20 13:52:25 -07:00
Eric Anholt
9f1411d1ec nir: Add an IO scalarizing pass using the intrinsic's first_component.
vc4 wants to have per-scalar IO load/stores so that dead code elimination
can happen on a more granular basis, which it has been doing in the
backend using a multiplication by 4 of the intrinsic's driver_location.
We can represent it properly in the NIR using the first_component field,
though.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
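
The per-scalar split described in 9f1411d1ec can be pictured with a small Python model. This is a sketch only, not the NIR pass: `IoLoad` is a hypothetical record, and its field names are borrowed from the commit message rather than from Mesa's data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoLoad:
    """Hypothetical stand-in for an IO load intrinsic."""
    driver_location: int   # slot assigned by the driver
    first_component: int   # first channel read within that slot
    num_components: int    # number of channels read


def scalarize(load):
    """Split a vector load into one single-component load per channel.

    The slot stays the same; only first_component advances, so no
    multiplication of driver_location is needed.
    """
    return [
        IoLoad(load.driver_location, load.first_component + i, 1)
        for i in range(load.num_components)
    ]
```

Once every load covers a single channel, dead code elimination can drop unused channels one at a time.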
Eric Anholt
c35f979220 nir: Add nir_builder support for individual system value loads.
The previous nir_load_system_value(b, nir_intrinsic_load_whatever, 0) was
rather verbose, when system values should be easy to generate.

The index is left out because only one system value had an index included
in it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Eric Anholt
24728637e2 nir: Move the undef of nir_intrinsics.h macros to the .h.
I wanted to include this from nir_builder as well, so it also needed the
undefs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Eric Anholt
3f607f9e4f nir: Use the system-value front face for twoside lowering.
GLSL-to-NIR generates system value usage, and vc4/freedreno would both
like the system value instead of the varying, so switch this pass over to
it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Kenneth Graunke
7d0554f341 nir: Rely on the fact that bcsel takes a well formed boolean.
According to Connor, it's safe to assume that the first operand of
bcsel, as well as the operand of b2f and b2i, must be well formed
booleans.

https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html

With the previous improvements to a@bool handling, this now has no
change in shader-db instruction counts on Broadwell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-19 02:05:23 -07:00
Kenneth Graunke
3a9e6102b4 nir/search: Extend 'a@bool' to handle a couple of system values.
load_front_face and load_helper_invocation produce booleans.

On Broadwell:

total instructions in shared programs: 11638956 -> 11638011 (-0.01%)
instructions in affected programs: 115093 -> 114148 (-0.82%)
helped: 628
HURT: 14

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-18 01:27:27 -07:00
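
A rough Python model of the single recursive check that the nir/search commits here describe. It is a sketch under stated assumptions: instructions are hypothetical `(kind, op, sources)` tuples, and the sets of comparison and logic opcodes are illustrative guesses, not the lists used in nir_search.c. Only the two intrinsics come from the commit message.

```python
# Intrinsics named in the commit message as producing Booleans.
BOOL_INTRINSICS = {"load_front_face", "load_helper_invocation"}

# Assumed for illustration: ALU ops whose result is always Boolean.
BOOL_ALU_OPS = {"flt", "feq", "ilt", "ieq"}

# Assumed for illustration: ops that are Boolean only if all their
# sources are Boolean.
LOGIC_OPS = {"iand", "ior", "inot"}


def src_is_bool(instr):
    """Return True if the value produced by instr is known to be Boolean."""
    kind, op, srcs = instr
    if kind == "intrinsic":
        return op in BOOL_INTRINSICS
    if kind == "alu":
        if op in BOOL_ALU_OPS:
            return True
        if op in LOGIC_OPS:
            return all(src_is_bool(s) for s in srcs)
    return False
```

Keeping the logic in one function means a new producer of Booleans, such as a system value, only needs one extra case.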
Kenneth Graunke
e8543feba7 nir/search: Fold src_is_bool()/alu_instr_is_bool() into src_is_type().
I don't want src_is_bool() and src_is_type(x, nir_type_bool) to behave
differently.  Having the logic spread out over three functions makes it
harder to decide where to put new logic, as well.

So, combine them all.  It's a bit simpler because there's now only one
recursive function rather than a pair of mutually recursive functions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-18 01:27:15 -07:00
Kenneth Graunke
241870fe5b nir/search: Introduce a src_is_type() helper for 'a@type' handling.
Currently, 'a@type' can only match if 'a' is produced by an ALU
instruction.  This is rather limited - there are other cases we
can easily detect which we should handle.

Extending the code in-place would be fairly messy, so we introduce
a new src_is_type() helper.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-18 01:26:47 -07:00
Kenneth Graunke
d8971128ac nir/builder: Add bany_inequal and bany helpers.
The first simply picks the bany_inequal[234] opcodes based on the SSA
def's number of components.  The latter implicitly compares with zero
to achieve the same semantics as GLSL's any().

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2016-08-18 00:46:04 -07:00
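
The relationship between the two helpers in d8971128ac can be modelled in plain Python. This is a sketch of the semantics only, not the nir_builder code: vectors are tuples, and the function names simply mirror the helper names in the commit message.

```python
def bany_inequal(a, b):
    """True if any pair of corresponding components differs."""
    if len(a) != len(b) or not 2 <= len(a) <= 4:
        raise ValueError("expected two vectors of 2, 3 or 4 components")
    return any(x != y for x, y in zip(a, b))


def bany(v):
    """Model of GLSL any(): compare every component against zero."""
    return bany_inequal(v, (0,) * len(v))
```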
Ian Romanick
607ab6d3bf glsl: Pull enum ir_expression_operation out to its own file
No change except to the copyright symbol.  The next patch will generate
this file with Python, and Unicode + Python = pure rage.

v2: Massive rebase... I guess a lot can change in a year.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 13:48:25 +01:00
Ian Romanick
de71bc9eb6 glsl: Make the generated sources build rules more like NIR
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 13:48:25 +01:00
Ian Romanick
2ec3a3e151 glsl: Add missing ir_quadop_vector constant evaluation for Boolean types
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
cf58e3f522 glsl: Fix typo in ir_unop_f2u implementation
This won't affect the output, but it was, technically, wrong.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
8b123b08cb glsl: Fix typo in ir_unop_b2i implementation
This won't affect the output, but it was, technically, wrong.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
cd8764737e glsl: Don't support integer types for operations that can't handle them
ir_unop_fract already forbade integer types in ir_validate.  ir_unop_rcp,
ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
437e612bd7 glsl: Don't support ir_unop_abs or ir_unop_sign for unsigned integers
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-17 10:52:39 +01:00
Ian Romanick
cceb50e14e nir/algebraic: Optimize common array indexing sequence
Some shaders include code that looks like:

   uniform int i;
   uniform vec4 bones[...];

   foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);

CSE would do some work on this:

   x = i * 3
   foo(bones[x], bones[x + 1], bones[x + 2]);

The compiler may then add '<< 4 + base' to the index calculations.
This results in expressions like

   x = i * 3
   foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);

Just rearranging the math to produce (i * 48) + 16 saves an
instruction, and it allows CSE to do more work.

   x = i * 48;
   foo(bones[x], bones[x + 16], bones[x + 32]);

So, ~6 instructions becomes ~3.

Some individual shader-db results look pretty bad.  However, I have a
really, really hard time believing the change in estimated cycles in,
for example, 3dmmes-taiji/51.shader_test after looking at that change in
the generated code.

G45
total instructions in shared programs: 4020840 -> 4010070 (-0.27%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0

total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0

Ironlake
total instructions in shared programs: 6418887 -> 6408117 (-0.17%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0

total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0

Sandy Bridge
total instructions in shared programs: 8357887 -> 8339251 (-0.22%)
instructions in affected programs: 432715 -> 414079 (-4.31%)
helped: 2795
HURT: 0

total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
cycles in affected programs: 6114626 -> 6037854 (-1.26%)
helped: 2478
HURT: 317

Ivy Bridge
total instructions in shared programs: 7669390 -> 7653822 (-0.20%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0

total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
cycles in affected programs: 1972658 -> 1854360 (-6.00%)
helped: 2458
HURT: 307

Haswell
total instructions in shared programs: 7082636 -> 7067068 (-0.22%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0

total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
cycles in affected programs: 1891820 -> 1773958 (-6.23%)
helped: 2459
HURT: 261

Broadwell
total instructions in shared programs: 9002466 -> 8985875 (-0.18%)
instructions in affected programs: 658784 -> 642193 (-2.52%)
helped: 2795
HURT: 5

total cycles in shared programs: 78503092 -> 78450404 (-0.07%)
cycles in affected programs: 2873304 -> 2820616 (-1.83%)
helped: 2275
HURT: 415

Skylake
total instructions in shared programs: 9156978 -> 9140387 (-0.18%)
instructions in affected programs: 682625 -> 666034 (-2.43%)
helped: 2795
HURT: 5

total cycles in shared programs: 75591392 -> 75550574 (-0.05%)
cycles in affected programs: 3192120 -> 3151302 (-1.28%)
helped: 2271
HURT: 425

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-17 10:52:38 +01:00
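
The rearrangement in cceb50e14e rests on the identity (i * 3 + k) << 4 == i * 48 + 16 * k. A short Python check of that arithmetic follows. It is a sketch, not the algebraic rule itself: the function names are made up for illustration, and Python integers do not wrap, although both forms agree modulo 2^32 as well.

```python
def index_before(i, k):
    """Offset as first generated: multiply, add, then shift."""
    return (i * 3 + k) << 4


def index_after(i, k):
    """Rearranged form: one multiply shared by all three accesses."""
    return i * 48 + 16 * k


def forms_agree(max_i=64):
    """Compare both forms for bones[i * 3 + k] with k in 0..2."""
    return all(
        index_before(i, k) == index_after(i, k)
        for i in range(max_i)
        for k in range(3)
    )
```

With the rearranged form, `i * 48` is a single common subexpression and the remaining terms are constants.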
Kenneth Graunke
1f47f78fc3 glcpp: Update tests for new #undef of built-in macro rules.
Ian recently changed the preprocessor to allow this in most GLSL
versions, but not GLSL ES 3.00+.  This patch converts the existing
test that expects a failure to a #version 300 es shader, and adds
a #version 110 shader to make sure that it's allowed.

Fixes 'make check'.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97307
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2016-08-15 22:55:34 -07:00
Ilia Mirkin
a32c87f74b glsl: emit a specific error when ast_*_assign changes type
For regular ast_add, we can implicitly change either a or b's type.
However in an assignment situation, the type of the lvalue is fixed. So
if the implicit conversion logic decides to change it, it means that the
rhs's type could not be converted to the lhs type.

Emit a specific error for this rather than the rather mysterious "is not
an lvalue" error that results from having a i2f or other operation as
the lvalue.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96729
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-12 22:45:20 -04:00
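
The distinction a32c87f74b draws between a plain binary operation and a compound assignment can be sketched in Python. This is a simplified model, not the AST-to-HIR code: types are plain strings, and the only implicit conversion assumed here is int to float.

```python
# Assumption for illustration: a single implicit conversion, int -> float.
IMPLICIT = {("int", "float")}


def arithmetic_result_type(a, b):
    """Result type of 'a + b': either operand may be converted."""
    if a == b:
        return a
    if (a, b) in IMPLICIT:
        return b
    if (b, a) in IMPLICIT:
        return a
    raise TypeError(f"no implicit conversion between {a} and {b}")


def compound_assign_type(lhs, rhs):
    """Result type of 'lhs += rhs': the lvalue type is fixed."""
    result = arithmetic_result_type(lhs, rhs)
    if result != lhs:
        # The conversion logic wanted to change the lvalue, which means
        # rhs cannot be converted to the type of lhs.
        raise TypeError(f"could not implicitly convert {rhs} to {lhs}")
    return lhs
```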
Ilia Mirkin
1baae00089 glsl: look for frag data bindings with [0] tacked onto the end for arrays
The GL spec is very unclear on this point. Apparently this is discussed
without resolution in the closed Khronos bugtracker at
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The
recommendation is to allow dropping the [0] for looking up the bindings.

The approach taken in this patch is to instead tack on [0]'s for each
arrayness level of the output's type, and do the lookup again. That
way, for

out vec4 foo[2][2][2]

we will end up looking for bindings for foo, foo[0], foo[0][0], and
foo[0][0][0], in that order of preference.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-12 20:21:08 -04:00
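
The lookup order described in 1baae00089 can be sketched in Python. This is an illustration only: `bindings` is a hypothetical dict from name to location, and neither function name comes from Mesa.

```python
def binding_lookup_names(name, array_depth):
    """Candidate names in order of preference: foo, foo[0], foo[0][0], ..."""
    return [name + "[0]" * depth for depth in range(array_depth + 1)]


def find_binding(bindings, name, array_depth):
    """Return the first binding found among the candidates, else None."""
    for candidate in binding_lookup_names(name, array_depth):
        if candidate in bindings:
            return bindings[candidate]
    return None
```

For `out vec4 foo[2][2][2]` the array depth is 3, which yields the four names listed in the commit message.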
Kenneth Graunke
f9f462936a glsl: Fix invariant matching in GLSL 4.30 and GLSL ES 1.00.
Old languages (GLSL <= 4.20 and GLSL ES 1.00) require "invariant"
to be specified on both inputs and outputs, and match when linking.

New languages only allow outputs to be qualified as "invariant"
and remove the "invariant must match" restriction when linking
varyings (because no input can have that qualifier).

Commit 426a50e208 introduced the new
behavior for ES 3.00.  It also removed the "must match" restriction
for ES 1.00 shaders, which I believe is incorrect.  This patch adds
that back, as well as making 4.30+ follow the new rules.

Thanks to Qiankun Miao for noticing this discrepancy.

Fixes a WebGL 2.0 conformance test when run in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96971
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-11 23:56:53 -07:00
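
The version rule stated in f9f462936a fits in a few lines of Python. This is a sketch of the rule as the commit message words it, not the linker code; versions are assumed to be the integers used by `#version` (for example 420 or 100), and the function name is made up.

```python
def invariant_must_match(version, is_es):
    """True if 'invariant' has to match between outputs and inputs."""
    if is_es:
        # GLSL ES 1.00 follows the old rules; ES 3.00 and later do not.
        return version < 300
    # Desktop GLSL up to and including 4.20 follows the old rules.
    return version <= 420
```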
Kenneth Graunke
0ed316360f glsl: Tidy stream handling in merge_qualifier().
The previous commit fixed xfb_buffer handling, which was largely copy
and pasted from the stream handling.  The difference is that stream
was set in input_layout_mask, so it worked.

However, that's totally rubbish: stream is only valid on geometry shader
outputs.  Presumably this was to hack around inout.  Instead, apply the
solution I used in the previous fix.

Really, we just need to separate shader interface and parameter
qualifier handling so this isn't a mess, but this patch at least
tidies it slightly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-11 23:56:48 -07:00
Kenneth Graunke
dffa371665 glsl: Fix inout qualifier handling in GLSL 4.40.
inout variables have q.in and q.out set.  We were trying to set
xfb_buffer = 1 for shader output variables (and inadvertently setting
it on inout parameters, too).  But input_layout_mask doesn't have
xfb_buffer set, so it was seen as an invalid input qualifier.

This meant that all 'inout' parameters were broken.

Caught by running a WebGL conformance test in Chromium:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles3/shaders/qualification_order.html?webglVersion=2

Fixes Piglit's tests/spec/glsl-4.40/compiler/inout-parameter-qualifier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-11 23:56:40 -07:00
Timothy Arceri
33b3815773 glsl/tests: fix segfault in uniform initializer test
Caused by 549222f5

Tested-by: Aaron Watry <awatry@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97286
2016-08-11 14:57:18 +10:00
Ian Romanick
50b49d242d glcpp: Only disallow #undef of pre-defined macros on GLSL ES >= 3.00 shaders
Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:

   It is an error to undefine or to redefine a built-in (pre-defined)
   macro name.

The GLSL ES 1.00 spec does not contain this text.

Section 3.3 (Preprocessor) of the GLSL 1.30 spec says:

   #define and #undef functionality are defined as is standard for C++
   preprocessors for macro definitions both with and without macro
   parameters.

At least as far as I can tell, GCC allows '#undef __FILE__'.  Furthermore,
there are desktop OpenGL conformance tests that expect '#undef
__VERSION__' and '#undef GL_core_profile' to work.

Fixes:

    GL45-CTS.shaders.preprocessor.definitions.undefine_version_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_version_fragment
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_vertex
    GL45-CTS.shaders.preprocessor.definitions.undefine_core_profile_fragment

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-08-10 16:42:02 -07:00
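
The rule that 50b49d242d implements can be written as a small Python predicate. This is a sketch, not the glcpp grammar: the function name is made up, and the set of built-in names is limited to the three macros mentioned in the commit message.

```python
# Illustrative subset: only the macros named in the commit message.
BUILTIN_MACROS = {"__FILE__", "__VERSION__", "GL_core_profile"}


def undef_is_error(macro, version, is_es):
    """True if '#undef macro' must be rejected for this shader version."""
    if macro not in BUILTIN_MACROS:
        return False
    # Only GLSL ES 3.00 and later forbid undefining built-in macros.
    return is_es and version >= 300
```

Deciding this requires the real version number, which is presumably why the companion commit eda6349346 tracks the version rather than only a version_resolved flag.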
Ian Romanick
eda6349346 glcpp: Track the actual version instead of just the version_resolved flag
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2016-08-10 16:42:02 -07:00
Timothy Arceri
30e5ff7067 glsl: remove remaining tabs in link_uniform_initializers.cpp
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:38 +10:00
Timothy Arceri
549222f5f8 glsl: use UniformHash to find storage location
There is no need to be looping over all the uniforms.

Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:30 +10:00
Timothy Arceri
82e153daff glsl: remove dead builtins before assigning varying locations
Builtins already have locations assigned so this shouldn't
change anything. We want to call it earlier so we can transform
GLSL IR to NIR earlier.

Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:21 +10:00
Timothy Arceri
588702cc41 glsl: split out varying and uniform linking code
Here a new function link_varyings_and_uniforms() is created. This
should help make it easier to follow the code in link_shader(),
which was getting very large.

Note the end of the new function contains a for loop with some
lowering calls that currently don't seem related to varyings or
uniforms, but they are a dependency for converting to NIR earlier,
so we move things here now to keep things easy to follow.

Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-11 08:33:12 +10:00
Eric Anholt
ac6966360f mesa: Use a temporary set to track whether we've added a resource yet.
Saves another .1s on servo.trace.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-10 12:27:22 -07:00
Eric Anholt
60f1b436b9 nir: Drop an unused program/hash_table.h include.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-10 12:27:22 -07:00
Ilia Mirkin
bc5df3b321 Re-apply "glsl: don't try to lower non-gl builtins as if they were gl_FragData"
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.

This is a small part of the problem pointed out by the below bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-10 15:43:36 +02:00
Mathias Fröhlich
027cbf00f2 util: Move _mesa_fsl/util_last_bit into util/bitscan.h
As requested with the initial creation of util/bitscan.h
now move other bitscan related functions into util.

v2: Split into two patches.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-09 21:20:46 +02:00
Timothy Arceri
8c4d9afb7e nir: make use of nir_cf_list_extract() helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-09 13:21:30 +10:00
Matt Turner
b1d9c742e9 nir: Always print non-identity swizzles.
Previously we would not print a swizzle on ssa_52 when only its .x
component is used (as seen in the definition of ssa_53):

   vec3 ssa_52 = fadd ssa_51, ssa_51
   vec1 ssa_53 = flog2 ssa_52
   vec1 ssa_54 = flog2 ssa_52.y
   vec1 ssa_55 = flog2 ssa_52.z

But this makes the interpretation of the RHS of the definition difficult
to understand and dependent on the size of the LHS. Just print swizzles
when they are not the identity swizzle, so the previous example is now
printed as:

   vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
   vec1 ssa_53 = flog2 ssa_52.x
   vec1 ssa_54 = flog2 ssa_52.y
   vec1 ssa_55 = flog2 ssa_52.z

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-08 17:52:35 -07:00
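
The printing rule from b1d9c742e9 can be sketched in Python. This is an illustration, not nir_print.c: the function name and arguments are made up, and "identity" is taken here to mean that the swizzle selects every component of the source in order. Under that reading, `ssa_51.xyz` in the example implies that ssa_51 has four components, which is an assumption.

```python
def format_src(name, swizzle, src_components):
    """Format a source, printing the swizzle unless it is the identity.

    swizzle is a sequence of component indices (0 = x, 1 = y, ...), and
    src_components is the size of the value being read.
    """
    if list(swizzle) == list(range(src_components)):
        return name
    return name + "." + "".join("xyzw"[c] for c in swizzle)
```

The output no longer depends on the size of the destination, only on the source and the components selected from it.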
Marek Olšák
1ebf3c4b67 Revert "glsl: don't try to lower non-gl builtins as if they were gl_FragData"
This reverts commit a37e46323c.

It broke the game Overlord such that it hung a GCN GPU. While I don't know
how the hang happened because of its randomness and gfx corruption precedes
it, many of the shaders contain this:

out vec4 FragData[gl_MaxDrawBuffers];
2016-08-08 23:24:20 +02:00