51169 commits

Author SHA1 Message Date
Kenneth Graunke
85cd30406f i965: Implement guardband clipping on Sandybridge.
Improves performance in Citybench:
- 320x240:  19.8008% +/- 0.937818%
- 1280x480: 6.53856% +/- 0.859083%

No apparent difference in OpenArena nor Xonotic.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-05-15 14:52:24 -07:00
José Fonseca
5994a641d8 llvmpipe: Add a test for lp_build_sgn.
Covers floating point only, but that is better than nothing.
2012-05-15 22:39:25 +01:00
José Fonseca
9fb4eef6a1 gallivm: Fix lp_build_sgn for normalized/fixed-point integers.
These types got broken with the recent commit that fixed lp_build_sgn
for negative integers.
2012-05-15 22:39:24 +01:00
José Fonseca
c95cea50a9 gallivm: Fix lp_build_const_xxx for negative integers.
Do proper rounding.

Thanks to Olivier Galibert for investigating this.
2012-05-15 22:39:24 +01:00
Brian Paul
1459c18f45 svga: fix FBO / viewport bugs
When drawing to a FBO, the viewport wasn't always set correctly.  It
was fine in the usual case of the viewport dims matching the surface
dims but broken otherwise.  In particular, this was happening because
the viewport scale is negative for FBO rendering.

The piglit fbo-viewport test exercises this.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-05-15 14:56:54 -06:00
Vadim Girlin
4a8d47c264 radeon/llvm: add support for texture offsets, fix TEX_LD
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:53:20 +04:00
Vadim Girlin
fa5a963dd6 radeon/llvm: add SET_GRADIENTS*, fix SAMPLE_G
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:53:06 +04:00
Vadim Girlin
b655f78b25 radeon/llvm: increase const regs count
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:48:26 +04:00
Vadim Girlin
12a2374da3 radeon/llvm: use IntrNoMem property for intrinsics where possible
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:48:16 +04:00
Vadim Girlin
63a8595271 radeon/llvm: use correct intrinsic for CEIL
Should be round_posinf instead of round_neginf.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:48:06 +04:00
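As a rough illustration of the CEIL fix in 63a8595271: ceiling rounds toward positive infinity, while rounding toward negative infinity gives floor. The sketch below uses Python's math.ceil and math.floor as stand-ins. The helper names are hypothetical and only echo the intrinsic names mentioned in the commit message; they are not the driver's code.

```python
import math


def round_posinf(x: float) -> int:
    """Round toward +infinity, which is the behaviour CEIL needs."""
    return math.ceil(x)


def round_neginf(x: float) -> int:
    """Round toward -infinity, which is FLOOR rather than CEIL."""
    return math.floor(x)
```

The two only agree on values that are already integral, which is why the wrong intrinsic produces visibly wrong results for any fractional input.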
Vadim Girlin
0298238bdd radeon/llvm: improve ABS_i32 lowering
We can save one instruction by lowering it to:
  SUB_INT tmp, 0, src
  MAX_INT dst, src, tmp

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:47:53 +04:00
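The ABS_i32 lowering described in 0298238bdd can be sketched as follows. This is a minimal model of the two-instruction sequence, not the backend code; the function name is hypothetical, and Python integers do not wrap, so the 32-bit overflow case for the most negative value is not modeled here.

```python
def abs_via_sub_max(src: int) -> int:
    """Absolute value as max(src, 0 - src), mirroring SUB_INT + MAX_INT."""
    tmp = 0 - src          # SUB_INT tmp, 0, src
    return max(src, tmp)   # MAX_INT dst, src, tmp
```

Whichever of src and its negation is non-negative wins the max, so no compare or select instruction is needed.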
Vadim Girlin
76e4898ba3 radeon/llvm: fix BUILD_VECTOR lowering for replicated value
We expect that all elements will be assigned even if they are equal.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:47:38 +04:00
Vadim Girlin
4b8db65dbf radeon/llvm: add names for AMDGPU* passes
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:47:22 +04:00
Vadim Girlin
76ba7e2205 radeon/llvm: add generated files to .gitignore
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-05-15 18:47:02 +04:00
Paul Berry
821c34ecd9 Add .gitignore files for recently-added gallium projects
This patch adds .gitignore files to ignore the makefiles generated by
the gallium pipe loader and the clover OpenCL state tracker.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-05-15 07:39:05 -07:00
José Fonseca
e88f9b9546 glsl: Fix lower_discard_flow prototype mismatch.
Should fix MSVC link failure.
2012-05-15 12:27:15 +01:00
Eric Anholt
9e9ae280e2 Revert "i965/fs: Jump from discard statements to the end of the program when done."
This reverts commit 31866308fc.

Fixes piglit glsl-fs-discard-exit-3 and unigine tropics rendering.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-05-14 17:03:53 -07:00
Eric Anholt
3de1395fa5 glsl: Implement the GLSL 1.30+ discard control flow rule in GLSL IR.
Previously, I tried implementing this in the i965 driver, but did so
in a way that violated the intent of the spec, and broke Tropics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-05-14 17:03:51 -07:00
Eric Anholt
e21b9f1f19 glsl: Remove the opt_discard_simplification pass.
This conflicts with the GLSL 1.30+ rules for derivatives after a
discard has occurred.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-05-14 17:03:44 -07:00
Eric Anholt
f42cdc7984 i965/fs: Remove the requirement of no dead code for interference checks.
This will be convenient when I want to comment out optimization code
to see the raw program being optimized, but more importantly will let
the interference check be used during optimization.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2012-05-14 16:53:19 -07:00
Eric Anholt
d7787adda8 i965/fs: Add support for copy propagation.
We could do more by handling abs/negate and non-GRF sources, but this is
a good start.  Improves tropics performance 0.30% +/- .17% (n=43).

shader-db results:
Total instructions: 208032 -> 207184
60/1246 programs affected (4.8%)
23286 -> 22438 instructions in affected programs (3.6% reduction)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-05-14 16:53:19 -07:00
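The idea behind the copy propagation pass in d7787adda8 can be sketched on a toy instruction list. This is a simplified model under stated assumptions: instructions are hypothetical (dst, op, srcs) tuples within a single basic block, and, like the commit, it ignores abs/negate modifiers and non-register sources. It is not the i965 implementation.

```python
def copy_propagate(block):
    """Replace uses of a MOV destination with the MOV source while both stay unmodified."""
    copies = {}  # dst register -> source register of a still-valid MOV
    out = []
    for dst, op, srcs in block:
        # Rewrite each source through any copy that is still valid.
        srcs = tuple(copies.get(s, s) for s in srcs)
        out.append((dst, op, srcs))
        # Writing dst invalidates copies that define dst or read from dst.
        copies = {d: s for d, s in copies.items() if d != dst and s != dst}
        if op == "MOV" and srcs[0] != dst:
            copies[dst] = srcs[0]
    return out
```

For example, after `MOV t0, a`, a following `ADD t1, t0, b` becomes `ADD t1, a, b`, but once `a` is overwritten later uses of `t0` are left alone.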
Eric Anholt
f7a71e2570 i965/fs: When doing no work for live interval calculation, do no allocation.
When I had a bug causing the backend to never finish optimizing, it
also sent me deep into swap.  This avoids extra memory allocation per
trip through optimization, and thus may reduce the peak memory
allocation of the driver even in the success case.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-05-14 16:51:00 -07:00
Eric Anholt
206eca631b i965/gen7: Set tile_x/y to 0 in the no-stencil case.
Fixes compiler warnings.
2012-05-14 16:51:00 -07:00
Eric Anholt
1e188f2dae intel: Fix signed/unsigned comparison warnings.
2012-05-14 16:51:00 -07:00
Eric Anholt
1c1040dcf0 intel: Fix compile warning from 7b6424143d
2012-05-14 16:51:00 -07:00
Eric Anholt
cdca6e3c9f intel: Fix compiler warning from 3cd7bee48f
2012-05-14 16:51:00 -07:00
Kenneth Graunke
a4e9b5a768 i965/fs: Add a local common subexpression elimination pass.
Total instructions: 18210 -> 17836
49/163 programs affected (30.1%)
12888 -> 12514 instructions in affected programs (2.9% reduction)

This reduces Lightsmark's "Scale down filter" shader from 395
instructions to 283, a whopping 28%.  It also reduces register pressure
significantly: the SIMD8 program now uses 29 registers instead of 101,
giving us more than enough room for a SIMD16 program.

v2: Add && !inst->conditional_mod to the "skip some instructions" check.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-05-14 15:13:55 -07:00
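A local common subexpression elimination pass of the kind added in a4e9b5a768 can be sketched on a toy instruction list. The model below assumes hypothetical (dst, op, srcs) tuples within one basic block and does not model the conditional_mod check mentioned in the v2 note; it illustrates the general technique rather than the driver's code.

```python
def local_cse(block):
    """Replace a repeated (op, srcs) computation with a MOV from its earlier result."""
    available = {}  # (op, srcs) -> register currently holding that value
    out = []
    for dst, op, srcs in block:
        key = (op, srcs)
        holder = available.get(key)
        if holder is not None:
            # Same expression already computed: copy instead of recomputing.
            out.append((dst, "MOV", (holder,)))
        else:
            out.append((dst, op, srcs))
        # Writing dst invalidates expressions stored in dst or reading dst.
        available = {
            k: reg
            for k, reg in available.items()
            if reg != dst and dst not in k[1]
        }
        if holder is None and dst not in srcs:
            available[key] = dst
    return out
```

Because the table is rebuilt per basic block and pruned on every write, the pass never reuses a value whose inputs have changed.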
Kenneth Graunke
d1029f9988 i965/fs: Use a const reference in fs_reg::equals instead of a pointer.
This lets you omit some ampersands and is more idiomatic C++.  Using
const also marks the function as not altering either register (which
was obvious, but nice to enforce).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-05-14 15:12:46 -07:00
Oliver McFadden
bf78806133 mesa: print the Git SHA1 in GL_VERSION for ES1 and ES2.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-05-14 19:13:44 +03:00
Oliver McFadden
60e8a49440 mesa: GLES specifies restrictions on uniform matrix transpose.
GL_INVALID_VALUE is generated if transpose is not GL_FALSE.

http://www.khronos.org/opengles/sdk/docs/man/xhtml/glUniform.xml

Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-05-14 19:13:43 +03:00
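The GLES restriction from 60e8a49440 amounts to one extra validation step. The sketch below is illustrative only: the function name and the is_gles flag are hypothetical, while the constant values match the standard GL headers.

```python
GL_FALSE = 0
GL_NO_ERROR = 0
GL_INVALID_VALUE = 0x0501


def uniform_matrix_transpose_error(is_gles: bool, transpose: int) -> int:
    """Return the GL error for a glUniformMatrix* transpose argument."""
    if is_gles and transpose != GL_FALSE:
        # GLES requires transpose to be GL_FALSE.
        return GL_INVALID_VALUE
    return GL_NO_ERROR
```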
Michel Dänzer
8969de7e98 radeonsi: Keep around copies of original sampler states.
Fixes crashes when restoring sampler states after blits.
2012-05-14 17:56:03 +02:00
Michel Dänzer
1deb2be2b7 radeonsi: Flesh out shader interpolation related code.
Handle perspective interpolation and centroid vs. center.
2012-05-14 17:56:03 +02:00
Michel Dänzer
de52a56a0e radeonsi: Add proper SI family names.
2012-05-14 17:56:02 +02:00
Michel Dänzer
23e4fe2a53 radeonsi: Separate states for samplers and sampler views.
And reset nregs on updates. Prevents eventual assertion failure.
2012-05-14 17:56:02 +02:00
Michel Dänzer
36abadd0db radeonsi: Fixups for drawing with an index buffer.
Mostly using the DRAW_INDEX_2 type 3 packet instead of DRAW_INDEX, which is
no longer supported on SI.
2012-05-14 17:56:02 +02:00
Vinson Lee
599140119e vl: Initialize pipe_vertex_buffer.user_buffer fields.
Fix uninitialized scalar variable defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-05-14 08:44:16 -07:00
James Benton
24678700ed llvmpipe: Calculate fixed point coordinates for triangle setup earlier.
This allows us to calculate the triangle's area using fixed point;
previously it was calculated in floating point space. It was possible
that a triangle which had negative area in floating point space had
a positive area in fixed point space.

Fixes fdo 40920.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-05-14 16:07:49 +01:00
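The sign mismatch described in 24678700ed can be reproduced with a thin sliver triangle. The sketch below is a toy model, not the llvmpipe setup code: the sub-pixel precision of 8 bits, the helper names, and the sample coordinates are all assumptions chosen for illustration.

```python
SUBPIXEL_BITS = 8  # assumed sub-pixel precision, for illustration only


def to_fixed(v: float) -> int:
    """Snap a floating point coordinate to the fixed point grid."""
    return round(v * (1 << SUBPIXEL_BITS))


def signed_area2(p0, p1, p2):
    """Twice the signed area of a triangle (2D cross product of two edges)."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def fixed_area2(p0, p1, p2):
    """Signed area computed after snapping every vertex to fixed point."""
    snapped = [(to_fixed(x), to_fixed(y)) for x, y in (p0, p1, p2)]
    return signed_area2(*snapped)


tri = ((0.0, 0.0), (1.0, 0.002), (0.5, 0.0015))
float_area = signed_area2(*tri)   # positive in floating point
fixed_area = fixed_area2(*tri)    # negative after snapping
```

For this sliver the two areas have opposite signs, so a facing decision made in floating point can disagree with the fixed point coordinates used for rasterization. Snapping first and computing the area from the snapped vertices removes the disagreement.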
Tom Stellard
ec201667bf radeon/llvm: Coding style fixes for R600CodeEmitter.cpp
2012-05-14 10:40:58 -04:00
Tom Stellard
224e187f98 radeon/llvm: Lower bitcast instructions to copies
2012-05-14 10:40:58 -04:00
Marek Olšák
ed9955dc29 radeonsi: remove slab allocator for pipe_resource (used mainly for user buffers)
2012-05-13 14:32:57 +02:00
Marek Olšák
05ea705c7c r600g: remove slab allocator for pipe_resource (used mainly for user buffers)
2012-05-13 14:32:57 +02:00
Marek Olšák
b2d6386086 r600g: handle R16G16B16_FLOAT and R32G32B32_FLOAT in translate_colorswap (EG)
2012-05-12 23:13:45 +02:00
Marek Olšák
b496136af8 gallium: remove user_buffer_create from the interface
Nothing uses it now.
2012-05-12 23:13:45 +02:00
Marek Olšák
1a840cc592 gallium/graw: stop using user_buffer_create
This is compile-tested.
2012-05-12 23:13:45 +02:00
Marek Olšák
685a28fd8a gallium/util: remove unused parameter nr_vertex_buffers in util_draw_max_index
2012-05-12 23:13:45 +02:00
Francisco Jerez
b70736fa82 clover: Fix build on i386.
2012-05-12 19:43:06 +02:00
Francisco Jerez
fcab4d4a34 clover: Check the total work-group size provided to clEnqueueNDRangeKernel.
2012-05-12 19:43:01 +02:00
Christoph Bumiller
5c9bccc97e clover, gallium: add PIPE_COMPUTE_CAP_MAX_THREADS_PER_BLOCK
This is not necessarily the product of MAX_BLOCK_SIZE[i].

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-05-12 19:33:48 +02:00
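The point of 5c9bccc97e, together with the work-group check from fcab4d4a34, is that a launch can satisfy every per-dimension limit and still exceed the total thread limit. The sketch below is a hypothetical validation helper with made-up cap values; it does not reproduce clover's code or its error codes.

```python
def check_work_group(local_size, max_block_size, max_threads_per_block):
    """Validate a work-group size against per-dimension and total limits."""
    # Each dimension must respect its own MAX_BLOCK_SIZE[i] limit.
    for size, limit in zip(local_size, max_block_size):
        if size > limit:
            return "per-dimension limit exceeded"
    # The product of all dimensions must respect MAX_THREADS_PER_BLOCK,
    # which may be smaller than the product of the per-dimension limits.
    total = 1
    for size in local_size:
        total *= size
    if total > max_threads_per_block:
        return "total work-group size exceeded"
    return "ok"
```

With assumed caps of (1024, 1024, 64) per dimension and 512 threads in total, a (32, 32, 1) group passes the per-dimension test but fails the total check.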
Francisco Jerez
ec848d2730 r600g: Handle compute caps.
2012-05-12 19:17:18 +02:00
Francisco Jerez
4065639310 r300g: Handle compute caps.
2012-05-12 19:17:13 +02:00