Commit graph

39832 commits

Author SHA1 Message Date
Roland Scheidegger
175cdfd491 gallivm: optimize soa linear clamp to edge wrap mode a bit
Clamp against 0 instead of -0.5, which simplifies things.
The former version would have resulted in both int coords being zero
(in case of coord being smaller than 0) and some "unused" weight value,
whereas now the int coords will be 0 and 1, but weight will be 0, hence the
lerp should produce the same value.
Still not happy about differences between normalized and non-normalized...
2010-10-09 00:36:38 +02:00
Roland Scheidegger
2cc6da85d6 gallivm: avoid unnecessary URem in linear wrap repeat case
Haven't looked at what code this exactly generates but URem can't be fast.
Instead of using two URem only use one and replace the second one with
select/add (this is what the corresponding aos code already does).
2010-10-09 00:36:38 +02:00
Roland Scheidegger
318bb080b0 gallivm: more linear tex wrap mode calculation simplification
Rearrange order of operations a bit to make some clamps easier.
All calculations should be equivalent.
Note there seems to be some inconsistency in the clamp to edge case
wrt normalized/non-normalized coords, could potentially simplify this too.
2010-10-09 00:36:38 +02:00
Roland Scheidegger
99ade19e6e gallivm: optimize some tex wrap mode calculations a bit
Sometimes coords are clamped to positive numbers before doing conversion
to int, or clamped to 0 afterwards, in this case can use itrunc
instead of ifloor which is easier. This is only the case for nearest
calculations unfortunately, except linear MIRROR_CLAMP_TO_EDGE which
for the same reason can use a unsigned float build context so the
ifloor_fract helper can reduce this to itrunc in the ifloor helper itself.
2010-10-09 00:36:38 +02:00
Roland Scheidegger
1e17e0c4ff gallivm: replace sub/floor/ifloor combo with ifloor_fract 2010-10-09 00:36:37 +02:00
Roland Scheidegger
cb3af2b434 gallivm: faster iround implementation for sse2
sse2 supports round to nearest directly (or rather, assuming default nearest
rounding mode in MXCSR). Use intrinsic to use this rather than round (sse41)
or bit manipulation whenever possible.
2010-10-09 00:36:37 +02:00
Roland Scheidegger
0ed8c56bfe gallivm: fix trunc/itrunc comment
trunc of -1.5 is -1.0 not 1.0...
2010-10-09 00:36:37 +02:00
Vinson Lee
0f4984a0fb i915: Silence unused variable warning in non-debug builds.
Fixes this GCC warning.
i830_vtbl.c: In function 'i830_assert_not_dirty':
i830_vtbl.c:704: warning: unused variable 'i830'
2010-10-08 15:35:35 -07:00
Ian Romanick
48156b87bc docs: Update status of GL 3.x related extensions 2010-10-08 14:55:27 -07:00
Ian Romanick
1d3027b507 docs: skeleton for 7.10 release notes 2010-10-08 14:50:34 -07:00
Ian Romanick
0ea8b99332 glsl: Remove const decoration from inlined function parameters
The constness of the function parameter gets inlined with the rest of
the function.  However, there is also an assignment to the parameter.
If this occurs inside a loop the loop analysis code will get confused
by the assignment to a read-only variable.

Fixes bugzilla #30552.

NOTE: this is a candidate for the 7.9 branch.
2010-10-08 14:29:11 -07:00
Ian Romanick
dc459f8756 intel: Enable GL_ARB_explicit_attrib_location 2010-10-08 14:21:23 -07:00
Ian Romanick
dbc6c9672d main: Enable GL_ARB_explicit_attrib_location for swrast 2010-10-08 14:21:23 -07:00
Ian Romanick
68a4fc9d5a glsl: Add linker support for explicit attribute locations 2010-10-08 14:21:23 -07:00
Ian Romanick
eee68d3631 glsl: Track explicit location in AST to IR translation 2010-10-08 14:21:23 -07:00
Ian Romanick
2b45ba8bce glsl: Regenerate files changes by previous commit 2010-10-08 14:21:23 -07:00
Ian Romanick
7f68cbdc4d glsl: Add parser support for GL_ARB_explicit_attrib_location layouts
Only layout(location=#) is supported.  Setting the index requires GLSL
1.30 and GL_ARB_blend_func_extended.
2010-10-08 14:21:22 -07:00
Ian Romanick
eafebed5bd glcpp: Regenerate files changes by previous commit 2010-10-08 14:21:22 -07:00
Ian Romanick
e0c9f67be5 glcpp: Add the define for ARB_explicit_attrib_location when present 2010-10-08 14:21:22 -07:00
Ian Romanick
5ed6610d11 glsl: Regenerate files modified by previous commits 2010-10-08 14:21:22 -07:00
Ian Romanick
e24d35a5b5 glsl: Wrap ast_type_qualifier contents in a struct in a union
This will ease adding non-bit fields in the near future.
2010-10-08 14:21:22 -07:00
Ian Romanick
5ff4cfb788 glsl: Clear type_qualifier using memset 2010-10-08 14:21:22 -07:00
Ian Romanick
fd2aa7d313 glsl: Slight refactor of error / warning checking for ARB_fcc layout 2010-10-08 14:21:22 -07:00
Ian Romanick
dd93035a4d glsl: Refactor 'layout' grammar to match GLSL 1.60 spec grammar 2010-10-08 14:21:22 -07:00
Ian Romanick
4b5489dd6f glsl: Fail linking if assign_attribute_locations fails 2010-10-08 14:21:22 -07:00
Vinson Lee
3b16c591a4 r600g: Silence uninitialized variable warning. 2010-10-08 14:17:14 -07:00
Vinson Lee
36b65a373a r600g: Silence uninitialized variable warning. 2010-10-08 14:14:16 -07:00
Vinson Lee
131485efae r600g: Silence uninitialized variable warning. 2010-10-08 14:08:50 -07:00
Vinson Lee
5e90971475 gallivm: Remove unnecessary header. 2010-10-08 14:03:10 -07:00
Eric Anholt
c52a0b5c7d i965: Add register coalescing to the new FS backend.
Improves performance of my GLSL demo 14.3% (+/- 4%, n=4) by
eliminating the moves used in ir_assignment and ir_swizzle handling.
Still 16.5% to go to catch up to the Mesa IR backend, presumably
because instructions are almost perfectly mis-scheduled now.
2010-10-08 13:22:27 -07:00
Eric Anholt
80c0077a6f i965: Enable attribute swizzling (repositioning) in the gen6 SF.
We were trying to remap a fully-filled array down to only handing the
WM the components it uses.  This is called attribute swizzling, and if
you don't enable it you just get 1:1 mappings of inputs to outputs.

This almost fixes glsl-routing, except for the highest gl_TexCoord[]
indices.
2010-10-08 12:00:04 -07:00
Eric Anholt
cac04a9397 i965: Fix new FS gen6 interpolation for sparsely-populated arrays.
We'd overwrite the same element twice.
2010-10-08 11:59:19 -07:00
Eric Anholt
624ce6f61b i965: Fix gen6 WM push constants updates.
We would compute a new buffer, but never point the hardware at the new
buffer.  This partially fixes glsl-routing, as now it get the updated
uniform for which attribute to draw.
2010-10-08 11:59:19 -07:00
José Fonseca
3fde8167a5 gallivm: Help for combined extraction and broadcasting.
Doesn't change generated code quality, but saves some typing.
2010-10-08 19:48:16 +01:00
José Fonseca
438390418d llvmpipe: First minify the texture size, then broadcast. 2010-10-08 19:11:52 +01:00
José Fonseca
f5b5fb32d3 gallivm: Move into the as much of the second level code as possible.
Also, pass more stuff trhough the sample build context, instead of
arguments.
2010-10-08 19:11:52 +01:00
Eric Anholt
5b24d69fcd i965: Handle swizzles in the addition of YUV texture constants.
If someone happened to land a set in a different swizzle order, we
would have assertion failed.
2010-10-08 10:24:30 -07:00
Eric Anholt
0534e958c9 i965: Drop the check for YUV constants in the param list.
_mesa_add_unnamed_constant() already does that.
2010-10-08 10:24:29 -07:00
Eric Anholt
fa8aba9da4 i965: Drop the check for duplicate _mesa_add_state_reference.
_mesa_add_state_reference does that check for us anyway.
2010-10-08 10:24:29 -07:00
Eric Anholt
e310c22bb7 mesa: Simplify a bit of _mesa_add_state_reference using memcmp. 2010-10-08 10:24:29 -07:00
José Fonseca
6b0c79e058 gallivm: Warn when doing inefficient integer comparisons. 2010-10-08 17:43:15 +01:00
José Fonseca
d5ef59d8b0 gallivm: Avoid control flow for two-sided stencil test. 2010-10-08 17:43:15 +01:00
Keith Whitwell
ef3407672e llvmpipe: fix off-by-one in tri_16 2010-10-08 17:30:08 +01:00
Keith Whitwell
0ff132e5a6 llvmpipe: add rast_tri_4_16 for small lines and points 2010-10-08 17:30:08 +01:00
Keith Whitwell
eeb13e2352 llvmpipe: clean up setup_tri a little 2010-10-08 17:30:08 +01:00
Keith Whitwell
e191bf4a85 gallivm: round rather than truncate in new 4x4f->1x16ub conversion path 2010-10-08 17:30:08 +01:00
José Fonseca
f91b4266c6 gallivm: Use the wrappers for SSE pack intrinsics.
Fixes assertion failures on LLVM 2.6.
2010-10-08 17:30:08 +01:00
Keith Whitwell
607e3c542c gallivm: special case conversion 4x4f to 1x16ub
Nice reduction in the number of operations required for final color
output in many shaders.
2010-10-08 17:30:08 +01:00
Keith Whitwell
29d6a1483d llvmpipe: avoid overflow in triangle culling
Avoid multiplying fixed-point values.  Calculate triangle area in
floating point use that for culling.

Lift area calculations up a level as we are already doing this in the
triangle_both() case.

Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
2010-10-08 17:30:08 +01:00
Keith Whitwell
ad6730fadb llvmpipe: fail gracefully on oom in scene creation 2010-10-08 17:26:29 +01:00