These functions need to return the final computed value.
Now expressions such as a = (b += c) work properly.
Also, there is no need to use __asm intrinsics in these functions. Ordinary
arithmetic operators generate the same code and make the source more legible.
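As a rough illustration of the behaviour described above, here is a minimal C sketch of a compound-assignment helper that updates its operand and also returns the final value. The name add_assign is hypothetical and is not taken from the patch; the real functions are shading-language built-ins.

```c
/* Hypothetical stand-in for a "+=" operator function (not the patch's code).
 * It updates *a in place and returns the final value, so the call can be
 * used as a sub-expression, as in a = (b += c). */
static float add_assign(float *a, float b)
{
   *a = *a + b;   /* ordinary arithmetic operator, no intrinsics */
   return *a;     /* returning the result is what makes chaining work */
}
```

With b holding 1.5f, the expression a = add_assign(&b, 2.0f) leaves both a and b equal to 3.5f.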
There was a note in state.c about _Active deserving to die, and there were
potential issues with it due to i965 forgetting to set _UseTexEnvProgram.
Removing both simplifies things.
Reviewed-by: Brian Paul <brianp@vmware.com>
Add a "max complexity" heuristic to allow unrolling long loops with small
bodies and short loops with large bodies.
The loop unroll limits may need further tweaking...
Loops such as this will be unrolled:
   for (i = 0; i < 4; ++i) {
      body;
   }
where 'body' isn't too large.
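One way such a heuristic could look is sketched below in C. It assumes that "complexity" means iteration count times body size; that definition, the should_unroll name and the MAX_COMPLEXITY value are all illustrative assumptions, not the limits used by the patch.

```c
#include <stdbool.h>

/* Hypothetical limit on (iterations * body size); the real value may differ. */
#define MAX_COMPLEXITY 256

/* Decide whether a loop is worth unrolling. A long loop with a small body
 * and a short loop with a large body can both pass, as long as the product
 * stays under the limit. */
static bool should_unroll(int iterations, int body_size)
{
   if (iterations <= 0 || body_size <= 0)
      return false;
   /* Same test as iterations * body_size <= MAX_COMPLEXITY, written with a
    * division so the multiplication cannot overflow. */
   return body_size <= MAX_COMPLEXITY / iterations;
}
```

Under these assumed numbers, a 4-iteration loop with a body of size 10 and a 64-iteration loop with a body of size 2 are both unrolled, while a 64-iteration loop with a body of size 100 is not.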
This also helps to fix the issue reported in bug #19190. The problem there
is indexing vector types with a variable index. For example:
   vec4 v;
   v[2] = 1.0;  // equivalent to v.z = 1.0
   v[i] = 2.0;  // variable index into vector!!
Since the for-i loop can be unrolled, we can avoid the problems associated
with variable indexing into a vector (at least in this case).
This fixes cases such as:
   vec4 v4;
   vec2 v2;
   v4.xz.yx = v2;
The last line now correctly compiles into MOV TEMP[1].xz, TEMP[0].yyxw;
Helps to fix the Humus Domino demo. See bug #19189.
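The composition of the two swizzles in v4.xz.yx can be worked through with a small C sketch. This is not the compiler's code: compose_lvalue_swizzle and comp_index are hypothetical names, and leaving unwritten source components at their identity value is an assumption chosen to match the MOV shown above.

```c
#include <string.h>

/* Map a component letter to its index, or -1 if it is not one of xyzw. */
static int comp_index(char c)
{
   switch (c) {
   case 'x': return 0;
   case 'y': return 1;
   case 'z': return 2;
   case 'w': return 3;
   default:  return -1;
   }
}

/* Hypothetical helper: fold base.outer.inner (used as an assignment target)
 * into a writemask and a source swizzle. 'outer' selects from the base
 * vector, 'inner' selects from the result of 'outer'. Both outputs are
 * 4 characters plus NUL; unwritten writemask slots are '.'.
 * Returns 0 on success, -1 on an invalid swizzle. Repeated destination
 * components are not rejected in this sketch. */
static int compose_lvalue_swizzle(const char *outer, const char *inner,
                                  char writemask[5], char src_swizzle[5])
{
   static const char names[] = "xyzw";
   size_t outer_len = strlen(outer);
   size_t inner_len = strlen(inner);
   size_t i;

   if (outer_len > 4 || inner_len > 4)
      return -1;

   strcpy(writemask, "....");
   strcpy(src_swizzle, "xyzw");   /* identity for unwritten components */

   for (i = 0; i < inner_len; i++) {
      int sel = comp_index(inner[i]);
      int dst;
      if (sel < 0 || (size_t) sel >= outer_len)
         return -1;               /* inner picks a component outer lacks */
      dst = comp_index(outer[sel]);
      if (dst < 0)
         return -1;
      writemask[dst] = names[dst];   /* base component dst is written */
      src_swizzle[dst] = names[i];   /* and takes source component i */
   }
   return 0;
}
```

For outer "xz" and inner "yx" this yields the writemask "x.z." (that is, .xz) and the source swizzle "yyxw", which agrees with the instruction quoted above: v4.x receives v2.y and v4.z receives v2.x.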
This adds all of the `mglu' symbols to the list of symbol exports
for GLU. Without this patch, mangled GLU symbols are considered
`internal' symbols, and calling any of them results in undefined references.
Now only the samplers that are actually used by texture() functions are
saved in the uniform variable list. Before, we could run out of samplers
if too many were declared while only some of them were actually used.
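The idea of counting only the samplers that are actually used can be sketched in C as follows. The assign_sampler_units function, the used[] flags and the MAX_SAMPLERS value are hypothetical and only illustrate the approach; they are not the data structures from the patch.

```c
#include <stdbool.h>

#define MAX_SAMPLERS 16   /* hypothetical limit for this sketch */

/* Assign consecutive sampler units to the declared samplers that are
 * actually used. unit_out[i] receives the unit for declared sampler i,
 * or -1 if that sampler is never used. Returns the number of units
 * consumed, or -1 if the used samplers alone exceed MAX_SAMPLERS. */
static int assign_sampler_units(const bool *used, int num_declared,
                                int *unit_out)
{
   int next = 0;
   int i;

   for (i = 0; i < num_declared; i++) {
      if (!used[i]) {
         unit_out[i] = -1;      /* declared but unused: costs nothing */
         continue;
      }
      if (next >= MAX_SAMPLERS)
         return -1;             /* too many samplers really in use */
      unit_out[i] = next++;
   }
   return next;
}
```

With 20 declared samplers of which only two are used, this consumes two units rather than failing because 20 exceeds the limit.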
The maximum number of texture coord units is still 8, and all the
fixed-function paths are still limited to 8 too. But GLSL shaders can use
more samplers now.
Note that some texcoord-related data structures are declared to be 16
elements in size rather than 8. This just simplifies the code in a few
places; the extra elements aren't accessible to the user.
These changes haven't been extensively tested yet, but sanity checking has
been done.
It should be possible to increase the max image units/samplers to 32 without
doing anything special. Beyond that we'll need longer bitfields in a few
places.
This lets us avoid software fallbacks when clients forget to turn some state
off (engine demo) or just do crazy things to test conformance (OGLC).
This should probably be moved into the generic Mesa code so other drivers can
make use of it.
Bug #19016.