Fixed a boneheaded error in the generation of SPU code that calculates
the results of the stencil test. Basically, all the greater-than/less-than
calculations were exactly inverted: they were coded as though the
given comparison took the stencil value as a left-hand operand and the
reference value as a right-hand operand, but the actual semantics always
put the reference as the left-hand operand and the stencil as the right-hand
operand.
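
The operand order described above can be sketched in plain C as follows.
This is only an illustration, not the generated SPU code: the enum and
function names are hypothetical, and the masking follows the usual
OpenGL rule that both values are ANDed with the stencil mask before
the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enum for illustration; not the Mesa/Gallium names. */
enum cmp_func { CMP_LESS, CMP_LEQUAL, CMP_GREATER, CMP_GEQUAL };

/* The reference value is the LEFT-hand operand and the stored stencil
 * value is the RIGHT-hand operand: the test is "ref FUNC stencil". */
static inline bool
stencil_test(enum cmp_func func, uint8_t ref, uint8_t stencil, uint8_t mask)
{
   const uint8_t r = ref & mask;
   const uint8_t s = stencil & mask;

   switch (func) {
   case CMP_LESS:    return r <  s;
   case CMP_LEQUAL:  return r <= s;
   case CMP_GREATER: return r >  s;
   case CMP_GEQUAL:  return r >= s;
   }
   return false;
}
```

With the operands swapped, CMP_LESS would behave like CMP_GREATER and
CMP_LEQUAL like CMP_GEQUAL, which is the inversion the fix removes.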
With this fix, tests/dinoshade runs, as do all the other Mesa tests
and samples that use stencil (and that don't use texture formats
unsupported by Cell).
With these changes, the tests/stencil_twoside test now works.
- Eliminate blending from the stencil_twoside test, as it produces an
unneeded dependency on having blending working
- The spe_splat() function will now work if the register being splatted
and the destination register are the same
- Separate fragment code is now generated for front-facing and back-facing
fragments. Often these are the same; if two-sided stenciling is on,
they can be different. This is easier and faster than generating
code that does both tests and merges the results.
- Fixed a cut/paste bug where, if the back Z-pass stencil operation
differed from all the other operations, the back Z-fail
results were incorrect.
This ensures all batchbuffers have the same cliprect mode after calling
_intel_batchbuffer_flush even if there aren't invalid commands in the
current batch buffer. (fix bug#18362).
See bug 18445.
When getting array results, __glXReadReply() always reads a multiple of
four bytes. This can cause writing to invalid memory when 'n' is not a
multiple of four.
Special-case the glAreTexturesResident() functions now.
To fix the bug, we use a temporary buffer that's a multiple of four bytes
in length.
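
A minimal sketch of the temporary-buffer approach, assuming a reader
that always writes a whole number of four-byte words. read_padded() and
read_array_safely() are hypothetical stand-ins, not the actual
__glXReadReply() code path.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reader that always writes a multiple of
 * four bytes into dest. */
static inline void
read_padded(const unsigned char *src, void *dest, size_t nbytes_padded)
{
   memcpy(dest, src, nbytes_padded);
}

/* Copy n reply bytes into out without ever writing past out[n - 1].
 * src must hold at least n rounded up to a multiple of four bytes.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static inline int
read_array_safely(const unsigned char *src, unsigned char *out, size_t n)
{
   const size_t padded = (n + 3) & ~(size_t) 3;   /* round up to 4 */
   unsigned char *tmp;

   if (padded == 0)
      return 0;

   tmp = malloc(padded);
   if (!tmp)
      return -1;

   read_padded(src, tmp, padded);   /* the over-read lands in tmp */
   memcpy(out, tmp, n);             /* only n bytes reach the caller */
   free(tmp);
   return 0;
}
```

The extra bytes land in the temporary buffer, so a caller that passes a
buffer of exactly n bytes is never written past its end.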
NOTE: this commit also reverts part of commit 919ec22ecf
(glx/x11: Added some #ifdef GLX_DIRECT_RENDERING protection) which
directly edited the indirect.c file rather than the python generator!
I'm not repairing that issue at this time.
- Use a lookup table for log2.
- Compute (float) (1 << ipart) by tweaking the exponent directly to
avoid integer overflow and float conversion.
- Also table negative exponents to avoid float division and branching.
- Implement util_fast_exp as a function of util_fast_exp2.
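
Two of the points above can be sketched as follows, assuming IEEE-754
single-precision floats. pow2_int() and exp_via_exp2() are hypothetical
names for illustration and are not the code in u_math.h; the lookup
tables are not shown.

```c
#include <stdint.h>
#include <string.h>

/* 2^ipart for ipart in [-126, 127], built by writing the biased
 * exponent field of an IEEE-754 single directly. There is no integer
 * shift of 1 that could overflow and no int-to-float conversion. */
static inline float
pow2_int(int ipart)
{
   const uint32_t bits = (uint32_t) (ipart + 127) << 23;
   float f;

   memcpy(&f, &bits, sizeof f);   /* reinterpret the bits as a float */
   return f;
}

/* exp(x) = 2^(x * log2(e)), so exp can be layered on any exp2
 * implementation, passed in here as exp2_fn. */
static inline float
exp_via_exp2(float x, float (*exp2_fn)(float))
{
   const float log2e = 1.44269504f;   /* log2(e) */

   return exp2_fn(x * log2e);
}
```

Because negative values of ipart simply produce a smaller exponent
field, the same construction covers 2^-n without a division.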
--------
Cherry-picked from gallium-0.2: 8415d06d90
This fixes some pow() glitches seen in fslight.c, spectex.c, etc.
Conflicts:
src/gallium/auxiliary/util/u_math.h
This allows us to use SSE codegen with debug builds again.
When PIPE_ARCH_SSE is set (w/ gcc -msse -msse2) we will also use the
gcc SSE intrinsic functions.