Commit graph

846 commits

Author SHA1 Message Date
Kenneth Graunke
565ff67688 glsl: "Copyright", not "Constantright"
Clearly this started out as ir_copy_propagation.cpp, but the search and
replace was a bit overzealous.
2010-09-28 21:17:33 -07:00
Eric Anholt
586b4b500f glsl: Also update implicit sizes of varyings at link time.
Otherwise, we'll often end up with gl_TexCoord being 0 length, for
example.  With ir_to_mesa, things ended up working out anyway, as long
as multiple implicitly-sized arrays weren't involved.
2010-09-28 14:37:26 -07:00
Eric Anholt
5e8ed7a79b glsl: Add validation that a swizzle only references valid channels.
Caught the bug in the previous commit.
2010-09-27 15:52:56 -07:00
Eric Anholt
668cdbe129 glsl: Fix broadcast_index of lower_variable_index_to_cond_assign.
It's trying to get an int smeared across all channels, not trying to
get a 1:1 mapping of a subset of a vector's channels.  This usually
ended up not mattering with ir_to_mesa, since it just smears floats
into every chan of a vec4.

Fixes:
glsl1-temp array with swizzled variable indexing
2010-09-27 15:52:56 -07:00
Eric Anholt
3ffab36768 glsl: Fix copy'n'wasted ir_noop_swizzle conditions.
It considered .xyyy a noop for vec4 instead of .xyzw, and similar for vec3.
2010-09-22 13:09:51 -07:00
Eric Anholt
b39e6f33b6 glsl: Rework assignments with write_masks to have LHS chan count match RHS.
It turns out that most people new to this IR are surprised when an
assignment to (say) 3 components on the LHS takes 4 components on the
RHS.  It also makes for quite strange IR output:

(assign (constant bool (1)) (x) (var_ref color) (swiz x (var_ref v) ))
(assign (constant bool (1)) (y) (var_ref color) (swiz yy (var_ref v) ))
(assign (constant bool (1)) (z) (var_ref color) (swiz zzz (var_ref v) ))

But even worse, even we get it wrong, as shown by this line of our
current step(float, vec4):

(assign (constant bool (1)) (w)
	(var_ref t)
	(expression float b2f (expression bool >=
		    (swiz w (var_ref x))(var_ref edge))))

where we try to assign a float to the writemasked-out x channel and
don't supply anything for the actual w channel we're writing.  Drivers
right now just get lucky since ir_to_mesa spams the float value across
all the source channels of a vec4.

Instead, the RHS will now have a number of components equal to the
number of components actually being written.  Hopefully this confuses
everyone less, and it also makes codegen for a scalar target simpler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-22 13:09:51 -07:00
Brian Paul
2b95525429 glsl2: fix typo in error msg 2010-09-21 14:57:10 -06:00
Eric Anholt
b5bb215629 glsl: Add definition of gl_TextureMatrix inverse/transpose builtins.
Fixes glsl2/builtin-texturematrix.
Bug #30196.
2010-09-21 10:09:46 -07:00
Kenneth Graunke
6ea16b6c51 glsl: Fix broken handling of ir_binop_equal and ir_binop_nequal.
When ir_binop_all_equal and ir_binop_any_nequal were introduced, the
meaning of these two opcodes changed to return vectors rather than a
single scalar, but the constant expression handling code was incorrectly
written and only worked for scalars.  As a result, only the first
component of the returned vector would be properly initialized.
2010-09-20 17:33:13 +02:00
Kenneth Graunke
14eea26828 glsl: Add comments to clarify the types of comparison binops. 2010-09-20 17:31:16 +02:00
Brian Paul
1739124159 glsl2: silence compiler warnings in printf() calls
Such as: "ir_validate.cpp:143: warning: format ‘%p’ expects type ‘void*’,
but argument 2 has type ‘ir_variable*’"
2010-09-20 08:22:54 -06:00
Ian Romanick
e053d62aa5 glsl: Add doxygen comments 2010-09-20 07:09:03 -07:00
Kenneth Graunke
dbd2480507 glsl/builtins: Switch comparison functions to just return an expression. 2010-09-18 16:23:48 +02:00
Kenneth Graunke
52f9156e88 glsl/builtins: Fix equal and notEqual builtins.
Commit 309cd4115b incorrectly converted
these to all_equal and any_nequal, which is the wrong operation.
2010-09-18 16:23:48 +02:00
Kenneth Graunke
ca92ae2699 glsl: Properly handle nested structure types.
Fixes piglit test CorrectFull.frag.
2010-09-18 11:21:34 +02:00
Tilman Sauerbeck
3894fddccc glsl2: Fixed cloning of ir_call error instructions.
Those have the callee field set to the null pointer, so
calling the public constructor will segfault.

Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
2010-09-18 09:19:57 +02:00
Vinson Lee
a822ae3f1a glsl: Fix 'control reaches end of non-void function' warning.
Fixes this GCC warning.

lower_variable_index_to_cond_assign.cpp:
In member function
'bool variable_index_to_cond_assign_visitor::needs_lowering(ir_dereference_array*) const':

lower_variable_index_to_cond_assign.cpp:261:
warning: control reaches end of non-void function
2010-09-18 00:14:20 -07:00
Tilman Sauerbeck
19f8f32a96 glsl2: Empty functions can be inlined.
Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2010-09-18 01:28:47 +02:00
Ian Romanick
a6ecd1c372 glsl2: Add flags to enable variable index lowering 2010-09-17 11:00:24 +02:00
Ian Romanick
6e4fe39da2 glsl2: Refactor testing for whether a deref is of a matrix or array 2010-09-17 11:00:24 +02:00
Luca Barbieri
a47539c7a1 glsl: add pass to lower variable array indexing to conditional assignments
Currenly GLSL happily generates indirect addressing of any kind of
arrays.

Unfortunately DirectX 9 GPUs are not guaranteed to support any of them in
general.

This pass fixes that by lowering such constructs to a binary search on the
values, followed at the end by vectorized generation of equality masks, and
4 conditional assignments for each mask generation.

Note that this requires the ir_binop_equal change so that we can emit SEQ
to generate the boolean masks.

Unfortunately, ir_structure_splitting is too dumb to turn the resulting
constant array references to individual variables, so this will need to
be added too before this pass can actually be effective for temps.

Several patches in the glsl2-lower-variable-indexing were squashed
into this commit.  These patches fix bugs in Luca's original
implementation, and the individual patches can be seen in that branch.
This was done to aid bisecting in the future.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-17 10:58:58 +02:00
Kenneth Graunke
8fbe968a62 glsl: Don't print blank (function ...) headers for built-ins.
Fixes a regression caused when I added my GLSL ES support.
2010-09-16 03:09:25 -07:00
Kenneth Graunke
81f0339398 glsl: Change from has_builtin_signature to has_user_signature.
The print visitor needs this, and the only existing user can work with
has_user_signature just as well.
2010-09-16 02:52:25 -07:00
Vinson Lee
f20f2cc330 glsl: Fix 'format not a string literal and no format arguments' warning.
Fix the following GCC warning.
loop_controls.cpp: In function 'int calculate_iterations(ir_rvalue*, ir_rvalue*, ir_rvalue*, ir_expression_operation)':
loop_controls.cpp:88: warning: format not a string literal and no format arguments
2010-09-15 05:17:57 -07:00
Brian Paul
1c0644e9da glsl2: add case for ir_unop_noise
Silences a compiler warning.  Still need to add some assertions
for this case.
2010-09-14 09:15:20 -06:00
Brian Paul
2b04ead569 glsl2: fix comments 2010-09-14 09:05:46 -06:00
Ian Romanick
309cd4115b glsl2: Port equal() and notEqual() to ir_unop_all_equal and ir_unop_any_nequal 2010-09-13 17:53:32 -07:00
Luca Barbieri
4dfb89904c glsl: introduce ir_binop_all_equal and ir_binop_any_equal, allow vector cmps
Currently GLSL IR forbids any vector comparisons, and defines "ir_binop_equal"
and "ir_binop_nequal" to compare all elements and give a single bool.

This is highly unintuitive and prevents generation of optimal Mesa IR.

Hence, first rename "ir_binop_equal" to "ir_binop_all_equal" and
"ir_binop_nequal" to "ir_binop_any_nequal".

Second, readd "ir_binop_equal" and "ir_binop_nequal" with the same semantics
as less, lequal, etc.

Third, allow all comparisons to acts on vectors.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-13 17:53:04 -07:00
Luca Barbieri
2cdbced10d loop_unroll: unroll loops with (lowered) breaks
If the loop ends with an if with one break or in a single break unroll
it.  Loops that end with a continue will have that continue removed by
the redundant jump optimizer.  Likewise loops that end with an
if-statement with a break at the end of both branches will have the
break pulled out after the if-statement.

Loops of the form

   for (...) {
      do_something1();
      if (cond) {
	 do_something2();
	 break;
      } else {
	 do_something3();
      }
   }

will be unrolled as

   do_something1();
   if (cond) {
      do_something2();
   } else {
      do_something3();
      do_something1();
      if (cond) {
	 do_something2();
      } else {
	 do_something3();
	 /* Repeat inserting iterations here.*/
      }
   }

ir_lower_jumps can guarantee that all loops are put in this form
and thus all loops are now potentially unrollable if an upper bound
on the number of iterations can be found.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-13 16:20:40 -07:00
Ian Romanick
8f2214f489 glsl2: Add pass to remove redundant jumps 2010-09-13 14:25:26 -07:00
Ian Romanick
e79a1bb02a glsl: Explain file naming convention 2010-09-13 14:06:32 -07:00
Luca Barbieri
710d41131b loop_controls: fix analysis of already analyzed loops
The loop_controls pass didn't look at the counter values it put in ir_loop
on previous iterations, so while the first iteration worked, subsequent
ones couldn't determine max_iterations.
2010-09-13 13:03:10 -07:00
Luca Barbieri
3361cbac2a glsl: add continue/break/return unification/elimination pass (v2)
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style

This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.

Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.

It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
   for the main function and/or other functions

This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported

Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.

Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.

Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and trigger the unique "break".

Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.

Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
2010-09-13 13:03:09 -07:00
Luca Barbieri
55adbebc62 glsl: add ir_control_flow_visitor
This is just a subclass of ir_visitor with empty implementations of all
the visit methods for non-control flow nodes.

Used to avoid duplicating that in ir_visitor subclasses.

ir_hierarchical_visitor is another way to solve this, but is less natural
for some applications.
2010-09-13 13:03:09 -07:00
Jakob Bornecrantz
379e2e77ba glsl2: Fix scons build for all platforms 2010-09-10 01:17:56 +02:00
Ian Romanick
1f3c7d968c glsl2: Implement noise[1234] built-in functions using ir_unop_noise 2010-09-09 15:39:52 -07:00
Ian Romanick
547131ac87 glsl2: Add lowering pass to remove noise opcodes 2010-09-09 15:39:51 -07:00
Ian Romanick
3a5ce85cfa glsl2: Add ir_unop_noise 2010-09-09 15:39:51 -07:00
Kenneth Graunke
6dcca5a308 glsl/builtins: normalize of a negative scalar should be -1.0. 2010-09-09 15:23:12 -07:00
Luca Barbieri
e591c4625c glsl: add several EmitNo* options, and MaxUnrollIterations
This increases the chance that GLSL programs will actually work.

Note that continues and returns are not yet lowered, so linking
will just fail if not supported.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-08 20:36:37 -07:00
Eric Anholt
5ecd9c70ce glsl: Add info about talloc and optimization passes to the README. 2010-09-08 18:09:05 -07:00
Eric Anholt
e04f90712d glsl: Update README talking about multi-instruction operations.
The previous thing taking multiple instructions ended up being handled
at the IR level, as we suggested would be the common result.  Pick a
new one.
2010-09-08 18:05:22 -07:00
Kenneth Graunke
7fc882643c glsl/builtins: Set the API in the fake context.
Otherwise it gets used uninitialized.
2010-09-08 17:38:42 -07:00
Ian Romanick
f69a6647fb glsl2: Clear out profile pointers in _mesa_glsl_release_functions
Otherwise builtin_profiles contains dangling pointers the next time
_mesa_read_profile is called.  I suspect this may fix bugzilla #29847,
but I was never able to reproduce it.
2010-09-08 17:16:49 -07:00
Kenneth Graunke
fc1daab2a2 glsl: Fix for scalar float built-in definitions.
These need abs, and we need more tests.
2010-09-08 15:38:09 -07:00
Eric Anholt
c3db43df04 glsl: regenerate builtins 2010-09-08 15:01:02 -07:00
Eric Anholt
aa973d3533 glsl: Fix typo in builtin step() using a wrong channel. 2010-09-08 14:54:08 -07:00
Kenneth Graunke
368dc76f04 ir_validate: Ensure ir_binop_dot is only used on vector types. 2010-09-08 12:09:42 -07:00
Kenneth Graunke
4b2ffa0a42 glsl: Refresh automatically generated file builtin_function.cpp. 2010-09-08 12:09:42 -07:00
Kenneth Graunke
1f7e6e1e72 glsl/builtins: Don't use ir_binop_dot on floating point values.
ir_binop_dot is only defined for vector types.  Use ir_binop_mul.
2010-09-08 12:09:41 -07:00