Commit graph

3726 commits

Author SHA1 Message Date
Eric Anholt
6ff3341fc7 mesa: Move varying slots and FS output names to shader_enums.h
They're used by glsl_to_nir.cpp, and I want to use them in TGSI-to-NIR as
well (our use of the var->index slot to store slot properties no longer
works since it got truncated).

The *_MAX defines are left in mtypes.h, because they depend on config.h.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-18 17:40:21 -07:00
Thomas Helland
49d0a36bd6 nir: Simplify feq(fneg(a), a)) -> feq(a, 0.0)
The positive and negative value of a float can only
be equal to each other if it is -0.0f and 0.0f.
This is safe for Nan and Inf, as -Nan != Nan, and -Inf != Inf
This gives no changes in my shader-db

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-18 11:34:44 -07:00
Thomas Helland
a39167d594 nir: Simplify fne(fneg(a), a) -> fne(a, 0.0)
-NaN != NaN, and -Inf != Inf, so this should be safe.
Found while working on my VRP pass.

Shader-db results on my IVB:
total instructions in shared programs: 1698267 -> 1698067 (-0.01%)
instructions in affected programs:     15785 -> 15585 (-1.27%)
helped:                                36
HURT:                                  0
GAINED:                                0
LOST:                                  0

Some shaders was found to have the following pattern in NIR:
vec1 ssa_26 = fneg ssa_21
vec1 ssa_27 = fne ssa_21, ssa_26

Make that:
vec1 ssa_27 = fne ssa_21, 0.0f

This is found in Dota2 and Brutal Legend.
One shader is cut by 8%, from 323 -> 296 instructons in SIMD8

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-18 11:34:44 -07:00
Tapani Pälli
a0cea8f642 glsl: add missing MS sampler builtin types for GLSL ES 3.10
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-17 08:25:04 +03:00
Kenneth Graunke
afccbd7256 nir: Add a glsl_uint_type() wrapper.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-16 21:44:19 -07:00
Eric Anholt
a6e75e3cd7 nir: Add support for CSE on textures.
NIR instruction count results on i965:
total instructions in shared programs: 1261954 -> 1261937 (-0.00%)
instructions in affected programs:     455 -> 438 (-3.74%)

One in yofrankie, two in tropics.  Apparently i965 had also optimized all
of these out anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-14 11:39:18 -07:00
Eric Anholt
fb2425a641 nir: Zero out texture instructions when creating them.
There are so many flags in textures, that the CSE pass would have a hard
time referencing the correct set when figuring out if two texture ops are
the same.  By zeroing, we can avoid that fragility.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-14 11:39:18 -07:00
Eric Anholt
d50c182671 nir: Don't try to scalarize unpack ops.
Avoids regressions in vc4 when trying to do our blending in NIR.

v2: Add the other unpack ops I meant to when writing the original commit
    message.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-14 11:39:18 -07:00
Eric Anholt
9e6dc5b64d nir: Add a nir_opt_undef() to handle csels with undef.
We may find a cause to do more undef optimization in the future, but for
now this fixes up things after if flattening.  vc4 was handling this
internally most of the time, but a GLB2.7 shader that did a conditional
discard and assign gl_FragColor in the else was still emitting some extra
code.

total instructions in shared programs: 100809 -> 100795 (-0.01%)
instructions in affected programs:     37 -> 23 (-37.84%)

v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas
    Helland).
v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg
    over to src[0], too (by anholt).

Reviewed-by: Thomas Helland <thomashelland90@gmail.com> (v2)
Tested-by: Thomas Helland <thomashelland90@gmail.com> (v2)
2015-08-14 11:39:18 -07:00
Timothy Arceri
b8f63b3c10 glsl: make linker error message more informative
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-13 21:57:20 +10:00
Timothy Arceri
fe55ab2d12 glsl: Add missing spec quote about atomic counter in structs
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2015-08-11 21:07:31 +10:00
Timothy Arceri
42d283a0cc glsl: remove stage ref generation for transform feedback
Stage ref cannot be queried for transform feedback.

Also simplify the build_stageref function by passing the
correct mode for uniforms.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-07 10:20:08 +10:00
Marek Olšák
7d3939f0de mesa: save which transform feedback buffer is associated with which stream
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-08-06 20:11:43 +02:00
Michel Dänzer
f7ac4ef4ee glsl: Initialize patch member of glsl_struct_field
There is apparently a subtle difference in C++ between

    F f;

and

    F f();

The former will use the default constructor.  If there is no default
constructor specified, the compiler provides one that simply invokes the
default constructor for each field.  For built-in basic types, the
default constructor does nothing.  The later will, according to
http://stackoverflow.com/questions/2417065/does-the-default-constructor-initialize-built-in-types)
perform value-initialization of the type.  For built-in types this means
initializing to zero.

The per_vertex_accumulator constructor is:

    per_vertex_accumulator::per_vertex_accumulator()
       : fields(),
         num_fields(0)
    {
    }

This is the second form of constructor, so the glsl_struct_field
objects were previously zero initialized.  With the addition of an empty
default constructor in commit 7ac946e5, per_vertex_accumulator::fields
receive no initialization.

Fixes a bunch of random (mostly tessellation related) piglit failures
since commit 7ac946e5 ("glsl: Add constuctors for the common cases of
glsl_struct_field").

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91544
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-08-06 11:53:43 +09:00
Timothy Arceri
2c61d583f8 nir: add missing type to type_size_vec4()
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-08-05 21:16:45 +10:00
Tapani Pälli
18c5cdb943 glsl: add variable mode check to build_stageref
Currently stage reference mask is built using the variable name
only. However it can happen that input of one stage has same name
as output from another stage. Adding check of variable mode makes
sure we do not pick wrong variable.

Fixes some subcases from
   ES31-CTS.program_interface_query.no-locations

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-05 07:25:53 +03:00
Eric Anholt
6c28ee2041 nir: Add a nir_lower_load_const_to_scalar() pass.
This is useful to increase the CSE opportunities for a scalar backend.  It
avoids regressions when dropping vc4's custom CSE implementation.

v2: Cleanups by Matt (decl in the for loop, and unreachable()).
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-08-04 20:03:10 -07:00
Eric Anholt
a70f63ab20 nir: Add algebraic opt for no-op iand.
I lazily generated some of these in VC4 NIR lowering.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-08-04 17:19:25 -07:00
Eric Anholt
eae9c3286e Revert "nir: Use a single bit for the dual-source blend index"
This reverts commit ab5b7a0fe6.  We use more
than one bit of value in tgsi_to_nir.
2015-08-04 17:19:01 -07:00
Matt Turner
3c050222b0 mesa: Use _mesa_lroundevenf() in some more places. 2015-08-04 10:32:39 -07:00
Alejandro Seguí
e23cbaadaa glsl: replace old hash table with new and faster one
The util/hash_table was intended to be a fast hash table
replacement for the program/hash_table see 35fd61bd99 and
72e55bb688.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-04 12:31:05 +10:00
Ian Romanick
7ac946e546 glsl: Add constuctors for the common cases of glsl_struct_field
Fixes a giant pile of GCC warnings:

builtin_types.cpp:60:1: warning: missing initializer for member 'glsl_struct_field::stream' [-Wmissing-field-initializers]

I had to add a default constructor because a non-default constructor
was added.  Otherwise the only constructor would be the one with
parameters, and all the plases like

    glsl_struct_field foo;

would fail to compile.

I wanted to do this in two patches.  All of the initializers of
glsl_struct_field structures had to be converted to use the
constructor because C++ apparently forces you to do one or the other:

builtin_types.cpp:61:1: error: could not convert '{glsl_type::float_type, "near", -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0, -1}' from '<brace-enclosed initializer list>' to 'glsl_struct_field'

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
2015-08-03 11:07:04 -07:00
Samuel Iglesias Gonsalvez
418c004f80 nir: Fix output swizzle in get_mul_for_src
Avoid copying an overwritten swizzle, use the original values.

Example:

   Former swizzle[] = xyzw
   src->swizzle[] = zyxx

The expected output swizzle = zyxx but if we reuse swizzle in the loop,
then output swizzle would be zyzz.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-03 09:40:50 -07:00
Iago Toral Quiroga
01f6235020 nir/nir_lower_io: Add vec4 support
The current implementation operates in scalar mode only, so add a vec4
mode where types are padded to vec4 sizes.

This will be useful in the i965 driver for its vec4 nir backend
(and possbly other drivers that have vec4-based shaders).

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-03 09:40:47 -07:00
Timothy Arceri
ab5b7a0fe6 nir: Use a single bit for the dual-source blend index
The only values allowed are 0 and 1, and the value is checked before
assigning.

This is a copy of 8eeca7a56c that seems to have been made to the glsl
ir type after it was copied for use in nir but before nir landed.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-03 21:36:50 +10:00
Matt Turner
616355160d glsl: Initialize parse-state in constructor of lower_subroutine.
Static analysis tools don't like partial object initializations.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-07-31 10:33:03 -07:00
Timothy Arceri
75a96cedf7 glsl: set stage flag for structs and arrays in resource list
This fixes the remaining failing tests in:
ES31-CTS.program_interface_query.uniform-types

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-07-30 19:33:33 +10:00
Matt Turner
23bba717e1 glsl: Avoid double promotion. 2015-07-29 09:34:52 -07:00
Matt Turner
4251ccb47b nir: Avoid double promotion.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-07-29 09:34:51 -07:00
Matt Turner
5c7fd67045 glsl: Remove MSVC implementations of copysign and isnormal.
Non-Gallium parts of Mesa require MSVC 2013 which provides these.
2015-07-29 09:34:51 -07:00
Kenneth Graunke
e235ca159f glsl: Fix a bug where LHS swizzles of swizzles were too small.
A simple shader such as

   vec4 color;
   color.xy.x = 1.0;

would cause ir_assignment::set_lhs() to generate bogus IR:

   (swiz xy (swiz x (constant float (1.0))))

We were setting the number of components of each new RHS swizzle based
on the highest channel used in the LHS swizzle.  So, .xy.y would
generate (swiz xy (swiz xx ...)), while .xy.x would break.

Our existing Piglit test happened to use .xzy.z, which worked, since
'z' is the third component, resulting in an xxx swizzle.

This patch sets the number of swizzle components based on the size of
the LHS swizzle's inner value, so we always have the correct number
at each step.

Fixes new Piglit tests glsl-vs-swizzle-swizzle-lhs-[23].
Fixes ir_validate assertions in in Metro 2033 Redux.

v2: Move num_components updating completely out of update_rhs_swizzle
    (suggested by Timothy Arceri).  Simplify.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-07-28 22:56:10 -07:00
Tapani Pälli
e17056f5a2 glsl: verify location when dual source blending
Same check is made for glBindFragDataLocationIndexed but it was missing
when using layout qualifiers.

Fixes following Piglit test:
	arb_blend_func_extended-output-location

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-07-29 08:17:55 +03:00
Tapani Pälli
b868971e78 glsl: move max_index calc to assign_attribute_or_color_locations
Change function to get all gl_constants for inspection, this is used
by follow-up patch.

v2: rebase, update function documentation

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-07-29 08:17:12 +03:00
Ilia Mirkin
4b15cb6daa glsl: enable conservative depth, ssbo based on GLSL version
Add in missed version checks in the GLSL parser

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-07-27 12:11:00 -04:00
Ilia Mirkin
b42444ffed glsl: recognize ARB_shading_language_420pack to be enabled with 4.20+
The 420pack extension enables various GLSL rules that need to be applied
to any GLSL 4.20+ shader even if the extension is not explicitly
enabled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-07-24 18:25:06 -04:00
Samuel Iglesias Gonsalvez
30f97b5e52 glsl/glcpp: fix SIGSEGV when checking error condition for macro redefinition
Commit a6e9cd14c does not take into account than node_{a,b}->next could be NULL
in some circumstances, such as in a shader containing this code:

  #define A 1 /* comment */
  #define A 1 /* comment */

This patch fixes the segmentation fault for cases like that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91290
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2015-07-24 07:01:13 +02:00
Dave Airlie
80511d176a i965: add support for ARB_shader_subroutine
This just adds some missing pieces to nir/i965,
it is lightly tested on my Haswell.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-24 10:25:08 +10:00
Dave Airlie
60266863d8 glsl: add uniform and program resource support (v2)
This adds linker support for subroutine uniforms, they
have some subtle differences from real uniforms, we also hide
them and they are given internal uniform names.

This also adds the subroutine locations and subroutine uniforms
to the program resource tracking for later use.

v1.1: drop is_subroutine_def

v2: handle explicit location properly, ARB_explicit_location
has a lot of language for subroutine shaders.
Calculate a link time the number of compatible subroutines
for a uniform, to make program resource easier later.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:43 +10:00
Dave Airlie
7dd429e8f7 glsl/ir: add subroutine lowering pass (v2.3)
This lowers the enhanced ir_call using the lookaside table
of subroutines into an if ladder. This initially was done
at the AST level but it caused some ordering issues so a separate
pass was required.

v2: clone return value derefs.
v2.1: update for subroutine->int convert.
v2.2: add a clone for the array index

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:37 +10:00
Dave Airlie
65ac360823 glsl: add ast/parser support for subroutine parsing storage (v3.2)
This is the guts of the GLSL parser and AST support for
shader subroutines.

The code creates a subroutine type in the parser, and
uses that there to validate the identifiers. The parser
also distinguishes between subroutine types/function prototypes
/uniforms and subroutine defintions for functions.

Then in the AST conversion it recreates the types, and
stores the subroutine definition info or subroutine info
into the ir_function along with a side lookup table in
the parser state. It also converts subroutine calls into
the enhanced ir_call.

v2: move to handling method calls in
function handling not in field selection.
v3: merge Chris's previous parser patches in here, to
make it clearer what's changed in one place.
v3.1: add more documentation, drop unused include
v3.2: drop is_subroutine_def

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:35 +10:00
Dave Airlie
884df9ef83 glsl/ir: allow ir_call to handle subroutine calling
This adds a ir_variable which contains the subroutine uniform
and an array rvalue for the deref of that uniform, these
are stored in the ir_call and lowered later.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:34 +10:00
Dave Airlie
30681c3bb8 glsl/ir: add subroutine information storage to ir_function (v1.1)
We need to store two sets of info into the ir_function,
if this is a function definition with a subroutine list
(subroutine_def) or if it a subroutine prototype.

v1.1: add some more documentation.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:32 +10:00
Dave Airlie
f73ef82486 glsl: don't eliminate subroutine types.
This stops dead code from removing subroutines types,
we need these for the queries to work properly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:27 +10:00
Dave Airlie
57f24299b7 glsl/types: add new subroutine type (v3.2)
This type will be used to store the name of subroutine types

as in subroutine void myfunc(void);
will store myfunc into a subroutine type.

This is required to the parser can identify a subroutine
type in a uniform decleration as a valid type, and also for
looking up the type later.

Also add contains_subroutine method.

v2: handle subroutine to int comparisons, needed
for lowering pass.
v3: do subroutine to int with it's own IR
operation to avoid hacking on asserts (Kayden)
v3.1: fix warnings in this patch, fix nir,
fix tgsi
v3.2: fixup tests

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>

tests: fix warnings
2015-07-23 17:25:25 +10:00
Chris Forbes
d16ff8ac78 glsl: Make subroutine a reserved keyword
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:23 +10:00
Chris Forbes
cc172fddf3 glsl: Add extension plumbing and define for ARB_shader_subroutine
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 17:25:15 +10:00
Dave Airlie
18955e8a80 glsl/tests: fix varying_test since tess changes.
This fixes make check since the tess changes.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-07-23 12:46:42 +10:00
Marek Olšák
0af240e940 glsl: use separate varying slots for patch varyings
The idea is to allow 32 normal varyings and 32 patch varyings,
a total of 64. Previously, only a total of 32 was allowed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-07-23 00:59:29 +02:00
Marek Olšák
d070238944 glsl: fix locations of 2-dimensional varyings without varying packing (v2)
v2: renamed producer/consumer_type -> producer/consumer_stage

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-07-23 00:59:29 +02:00
Marek Olšák
41acdae2e9 glsl: don't demote tess control shader outputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-07-23 00:59:29 +02:00