
16176 commits

Author SHA1 Message Date
José Fonseca
f5c41e16d7 gallium/tgsi: Don't declare temps individually when they are all similar.
tgsi_ureg was recently enhanced to support local temporaries, and as a
result temps are declared individually.

This change avoids many TEMP register declarations on common shaders.

(And fixes a performance regression due to mismatches against
performance-sensitive shaders.)

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-07-02 12:14:53 +01:00
José Fonseca
e75fe7ba08 gallivm: Cleanup the 4 x float -> 16 ub special path in lp_build_conv.
No behaviour change intended.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-07-02 12:13:52 +01:00
José Fonseca
63e0e4b8f5 gallium/util: Add ULL suffix to large constants.
As suggested by Andy Furniss: it looks like some old gcc versions
require it.
2012-07-02 12:12:42 +01:00
Tom Stellard
1d21bd057a clover: Handle NULL devs argument in clBuildProgram
If devs is NULL, then the kernel should be compiled for all devices
associated with the program.
2012-07-01 15:45:24 +02:00
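The fallback described in this commit can be sketched as follows. This is a minimal illustration, not clover's code: `device_sketch`, `program_sketch`, and `select_build_targets` are hypothetical names, and the real `clBuildProgram` entry point works on OpenCL handle types.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real OpenCL/clover objects.
struct device_sketch { int id; };
struct program_sketch { std::vector<device_sketch *> devices; };

// Return the devices a build should target: the caller's list if one was
// given, otherwise every device associated with the program.
std::vector<device_sketch *>
select_build_targets(const program_sketch &prog,
                     std::size_t num_devs, device_sketch *const *devs)
{
   if (devs == nullptr)
      return prog.devices;
   return std::vector<device_sketch *>(devs, devs + num_devs);
}
```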
Francisco Jerez
c6bb41c28b clover: Define non-templated copy constructor for clover::ref_ptr.
The templated copy constructor doesn't prevent the compiler from
emitting a default copy constructor, which leads to inconsistent
memory handling and was reported to cause segfaults when doing event
manipulation.

Reported-by: Tom Stellard <thomas.stellard@amd.com>
2012-07-01 15:37:30 +02:00
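The language rule behind this fix is that a constructor template is never a copy constructor, so the compiler still generates an implicit member-wise copy that skips the reference count. The sketch below shows the pattern with both constructors defined; `counted_sketch` and `ref_ptr_sketch` are hypothetical names and this is not the clover::ref_ptr source.

```cpp
#include <utility>

// Hypothetical reference-counted object.
struct counted_sketch { int refs = 0; };

template<typename T>
class ref_ptr_sketch {
public:
   explicit ref_ptr_sketch(T *p = nullptr) : p(p) { retain(); }

   // Non-templated copy constructor: without it, the compiler-generated
   // copy would duplicate the pointer without calling retain().
   ref_ptr_sketch(const ref_ptr_sketch &o) : p(o.get()) { retain(); }

   // Converting constructor for compatible pointer types. Being a template,
   // it is never selected as the copy constructor.
   template<typename S>
   ref_ptr_sketch(const ref_ptr_sketch<S> &o) : p(o.get()) { retain(); }

   ~ref_ptr_sketch() { release(); }

   // Copy-and-swap assignment: the by-value parameter takes the new
   // reference and its destructor drops the old one.
   ref_ptr_sketch &operator=(ref_ptr_sketch o) {
      std::swap(p, o.p);
      return *this;
   }

   T *get() const { return p; }

private:
   void retain() { if (p) ++p->refs; }
   void release() { if (p) --p->refs; }
   T *p;
};
```

With the explicit copy constructor in place, copying a `ref_ptr_sketch` raises the count and destroying the copy lowers it again, so the count returns to zero only when the last owner goes away.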
Brian Paul
db2b6ca504 llvmpipe: fix comment typo
2012-06-29 17:19:12 -06:00
Tom Stellard
ca8fa02308 clover: Add a function internalizer pass before LTO v2
The function internalizer pass marks non-kernel functions as internal,
which enables optimizations like function inlining and global dead-code
elimination.

v2:
  - Pass vector arguments by const reference
2012-06-29 18:46:18 +00:00
Tom Stellard
a31b2f7107 radeon/llvm: Enable vec4 loads on R600
2012-06-29 18:46:18 +00:00
Tom Stellard
e17c586d08 radeon/llvm: Enable floating point stores on R600
2012-06-29 18:46:18 +00:00
Tom Stellard
b66ef1f48c radeon/llvm: Handle floating point loads on R600
2012-06-29 18:46:18 +00:00
Tom Stellard
c01199dfc0 radeon/llvm: Expand UDIV and UREM nodes
2012-06-29 18:46:18 +00:00
Tom Stellard
2c485cda20 radeon/llvm: Emit raw ISA for vertex fetch instructions
2012-06-29 18:46:18 +00:00
José Fonseca
16e0ebccb6 gallium/util: Truly disable INF/NAN tests on MSVC.
Thanks to Brian for spotting this.
2012-06-29 14:49:23 +01:00
José Fonseca
c9bada497c gallium/util: Disable INF/NAN tests on MSVC.
Somehow they are not recognized as constants.
2012-06-29 13:39:07 +01:00
José Fonseca
fa8dcb848f translate: Free elt8_func/elt16_func too.
These were leaking.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-06-29 12:21:08 +01:00
James Benton
6dd8e6f9cb util: Reimplement half <-> float conversions.
Removed u_half.py, which was used to generate the table for the previous
method.

The previous implementation of float-to-half conversion was faulty for
denormals and NaNs and would have required extra logic to fix, negating
the speedup from using tables.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:21:02 +01:00
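To illustrate what a table-free conversion has to handle, here is a minimal sketch of the simpler direction, half to float, covering zero, denormals, Inf, and NaN explicitly. It assumes IEEE 754 binary16 and binary32 layouts; `half_to_float_sketch` is a hypothetical name and this is not the code from the commit.

```cpp
#include <cstdint>
#include <cstring>

// Convert an IEEE 754 binary16 bit pattern to a float without lookup tables.
float half_to_float_sketch(std::uint16_t h)
{
   std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
   std::uint32_t exp = (h >> 10) & 0x1fu;   // 5-bit biased exponent
   std::uint32_t man = h & 0x3ffu;          // 10-bit mantissa
   std::uint32_t bits;

   if (exp == 0) {
      if (man == 0) {
         bits = sign;                        // signed zero
      } else {
         // Denormal: shift the mantissa up until the implicit bit appears,
         // lowering the exponent by one for each shift.
         int shift = 0;
         while ((man & 0x400u) == 0) {
            man <<= 1;
            ++shift;
         }
         man &= 0x3ffu;
         bits = sign |
                (static_cast<std::uint32_t>(127 - 15 + 1 - shift) << 23) |
                (man << 13);
      }
   } else if (exp == 0x1fu) {
      bits = sign | 0x7f800000u | (man << 13);   // Inf (man == 0) or NaN
   } else {
      bits = sign | ((exp + 127 - 15) << 23) | (man << 13);   // normal
   }

   float f;
   std::memcpy(&f, &bits, sizeof f);   // bit cast without aliasing issues
   return f;
}
```

The float-to-half direction additionally needs rounding and overflow handling, which is where the commit message says the table-based version went wrong.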
James Benton
c8d3481cdb tests: Updated tests to properly handle NaN for half floats.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:59 +01:00
James Benton
60dca53833 util: Updated u_format_tests to rigidly test half-float boundary values.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:57 +01:00
James Benton
d069d8ef38 util: Added functions for checking NaN / Inf for double and half-floats.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:54 +01:00
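For half-floats, such checks reduce to bit tests on the 16-bit pattern. The sketch below assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits); the function names are hypothetical and need not match the ones added by the commit.

```cpp
#include <cstdint>

// NaN: exponent field all ones and a non-zero mantissa.
bool half_is_nan_sketch(std::uint16_t h)
{
   return (h & 0x7c00u) == 0x7c00u && (h & 0x03ffu) != 0;
}

// Infinity: exponent field all ones and a zero mantissa, either sign.
bool half_is_inf_sketch(std::uint16_t h)
{
   return (h & 0x7fffu) == 0x7c00u;
}
```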
James Benton
34075d4133 util: Added util_format_is_array.
This function checks whether a format description is in a simple array format.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-29 12:20:37 +01:00
José Fonseca
638779e445 gallivm: Refactor lp_build_broadcast(_scalar) to share code.
Doesn't really change the generated assembly, but produces more compact IR,
and of course, makes code more consistent.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-28 20:20:34 +01:00
Johannes Obermayr
bf679ce1dc gallivm: Fix potential buffer overflow in strncat.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-06-28 11:47:23 +01:00
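A common source of this class of bug is passing the destination size as the third argument of `strncat`, which is actually the maximum number of characters to append (a terminating NUL is always written on top of that). The sketch below shows a bounded append under that reading; `append_bounded` is a hypothetical helper and the commit's actual change may differ.

```cpp
#include <cstddef>
#include <cstring>

// Append src to the NUL-terminated string in dst without writing past
// dst_size bytes in total.
void append_bounded(char *dst, std::size_t dst_size, const char *src)
{
   std::size_t used = std::strlen(dst);
   if (used + 1 >= dst_size)
      return;   // no room left for anything but the terminator
   // Leave one byte for the NUL that strncat always appends.
   std::strncat(dst, src, dst_size - used - 1);
}
```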
Marcin Slusarz
1906d2b46b nv50: dynamically allocate space for shader local storage
Fixes 21 piglit tests:
spec/glsl-1.10/execution/variable-indexing/
fs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-row-wr

spec/glsl-1.20/execution/variable-indexing/
fs-temp-array-mat3-index-col-row-rd
fs-temp-array-mat3-index-row-rd
fs-temp-array-mat4-col-row-wr
fs-temp-array-mat4-index-col-row-rd
fs-temp-array-mat4-index-col-row-wr
fs-temp-array-mat4-index-row-rd
fs-temp-array-mat4-index-row-wr
vs-temp-array-mat3-index-col-row-rd
vs-temp-array-mat3-index-col-row-wr
vs-temp-array-mat3-index-row-rd
vs-temp-array-mat3-index-row-wr
vs-temp-array-mat4-col-row-wr
vs-temp-array-mat4-index-col-row-rd
vs-temp-array-mat4-index-col-row-wr
vs-temp-array-mat4-index-col-wr
vs-temp-array-mat4-index-row-rd
vs-temp-array-mat4-index-row-wr
vs-temp-array-mat4-index-wr

... and prevents a lot of GPU lockups
2012-06-28 00:01:02 +02:00
Marcin Slusarz
0fceaee4fd nv50: streamline screen_create error handling
Remove the macro which changes control flow (it's evil).
Make all failure paths print a (correct) error message.
2012-06-28 00:01:02 +02:00
Marcin Slusarz
96259b5128 nv50/ir: make colorful ir dump output optional
2012-06-28 00:01:02 +02:00
Brian Paul
098aa5f9ab softpipe: fix numFragsEmitted debug code
2012-06-27 07:50:57 -06:00
Brian Paul
81e2a238bc gallium: minor whitespace, comment changes
2012-06-27 07:50:57 -06:00
José Fonseca
d1c5ea9207 gallium/util: Fix parsing of options with underscore.
For example

  GALLIVM_DEBUG=no_brilinear

which was being parsed as two options, "no" and "brilinear".
2012-06-27 11:16:18 +01:00
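The behaviour described above amounts to treating the underscore as part of an option name when splitting the environment string. A minimal sketch of such a tokenizer follows; `split_option_names` is a hypothetical function, not the gallium/util parser.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split a debug-option string into names. Letters, digits, and '_' belong
// to a name; every other character acts as a separator.
std::vector<std::string> split_option_names(const std::string &s)
{
   std::vector<std::string> names;
   std::string cur;
   for (char ch : s) {
      if (std::isalnum(static_cast<unsigned char>(ch)) || ch == '_') {
         cur += ch;
      } else if (!cur.empty()) {
         names.push_back(cur);
         cur.clear();
      }
   }
   if (!cur.empty())
      names.push_back(cur);
   return names;
}
```

With this rule, `no_brilinear` yields a single name, whereas a tokenizer that accepts only alphanumeric characters would split it into `no` and `brilinear`.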
James Benton
789436f1e0 gallivm: Added a generic lp_build_print_value which prints a LLVMValueRef.
Updated lp_build_printf to share common code.
Removed specific lp_build_print_vecX.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-06-27 11:16:18 +01:00
Stéphane Marchesin
45fc069600 i915g: Implement sRGB textures
Since we don't have them in hw, we emulate them in the shader. Although not
recommended by the spec, it is legit.

As a side effect we also get GL 2.1. I think this is as far as we can take
the i915.
2012-06-26 23:18:15 -07:00
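For reference, emulating sRGB textures means applying the sRGB decode to each sampled color channel after the fetch. The sketch below shows the standard piecewise sRGB-to-linear function on the CPU; it is an assumption that the driver implements something equivalent, and a shader on this hardware may well use an approximation instead. `srgb_to_linear_sketch` is a hypothetical name.

```cpp
#include <cmath>

// Standard sRGB decode for one color channel in [0, 1].
// Alpha is not sRGB-encoded and would be passed through unchanged.
float srgb_to_linear_sketch(float c)
{
   if (c <= 0.04045f)
      return c / 12.92f;                            // linear segment near black
   return std::pow((c + 0.055f) / 1.055f, 2.4f);   // power-law segment
}
```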
Brian Paul
3bc39414ab svga: return 120 for PIPE_CAP_GLSL_FEATURE_LEVEL
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-06-26 17:03:33 -06:00
Brian Paul
ac8613c298 llvmpipe: return 120 for PIPE_CAP_GLSL_FEATURE_LEVEL
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-06-26 17:03:33 -06:00
Jerome Glisse
b75f1d973c r600g: enable DUAL_EXPORT mode when possible on r6xx/r7xx
DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export
and there is no depth/stencil export.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-06-27 02:06:55 +04:00
Vadim Girlin
470d00c0e2 r600g: enable DUAL_EXPORT mode when possible
It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export
mode (EXPORT_4C_16BPC), there is at least one CB, and the PS doesn't export
depth/stencil.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-06-27 02:06:55 +04:00
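The conditions listed for evergreen can be written as a single predicate. This is a sketch only: `export_state_sketch` and its fields are hypothetical and do not correspond to r600g structures.

```cpp
// Hypothetical summary of the state that matters for the decision.
struct export_state_sketch {
   unsigned nr_cbufs;            // number of bound color buffers
   bool all_cbufs_16bit;         // every CB uses the 16-bit export mode
   bool exports_depth_stencil;   // pixel shader writes depth or stencil
};

// DUAL_EXPORT is allowed only if all three conditions hold.
bool can_use_dual_export(const export_state_sketch &s)
{
   return s.nr_cbufs > 0 && s.all_cbufs_16bit && !s.exports_depth_stencil;
}
```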
Vadim Girlin
0c47d9dcab r600g: avoid unnecessary shader exports v2
In some cases a TGSI shader has more color outputs than the number of CBs,
so it seems we need to limit the number of color exports. This requires
different shader variants depending on nr_cbufs, but on the other hand
we are doing fewer exports, which are very costly.

v2: fix various piglit regressions

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-06-27 02:06:55 +04:00
Vadim Girlin
4acf71f01e r600g: cache shader variants instead of rebuilding v3
Shader variants are stored in a list. The lookup key is based on the states
that require different hw shaders: currently rctx->two_side (all gpus) and
rctx->nr_cbufs (evergreen/cayman, when the writes_all property is set).

v2:
 - use simple list instead of keymap as suggested by Marek on irc
 - call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx
   (r600_shader_select isn't used for vertex shaders currently)

v3:
 - fix call to r600_adjust_gprs - do it after updating current shader

Improves performance for some apps, e.g. FlightGear -
see https://bugs.freedesktop.org/show_bug.cgi?id=50360

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2012-06-27 02:06:55 +04:00
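The caching scheme described in this commit (a simple list searched by key, with a build only on a miss) can be sketched as below. All type and member names are hypothetical, the key holds only the two states named in the message, and the "build" is reduced to a counter.

```cpp
#include <list>

// Hypothetical lookup key: the states that require a different hw shader.
struct variant_key_sketch {
   bool two_side;
   unsigned nr_cbufs;
   bool operator==(const variant_key_sketch &o) const {
      return two_side == o.two_side && nr_cbufs == o.nr_cbufs;
   }
};

struct shader_variant_sketch {
   variant_key_sketch key;
   int build_id;   // stands in for the compiled hw shader
};

struct shader_selector_sketch {
   std::list<shader_variant_sketch> variants;
   int builds = 0;

   // Return the cached variant for this key, building it only on a miss.
   shader_variant_sketch &select(const variant_key_sketch &key) {
      for (shader_variant_sketch &v : variants)
         if (v.key == key)
            return v;
      variants.push_back(shader_variant_sketch{key, builds++});
      return variants.back();
   }
};
```

A linear list keeps the code simple and is adequate when only a handful of key combinations exist per shader, which matches the v2 note about preferring a list over a keymap.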
Brian Paul
55a89889ba svga: handle missing PIPE_CAP_x queries
And fix incorrect error message for a bad shader type/number.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-06-26 15:03:44 -06:00
Brian Paul
056e9b4511 llvmpipe: handle more PIPE_CAP_x queries
As with the previous commit for softpipe.

v2: remove 'default' case to get compile-time warning

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-06-26 15:03:44 -06:00
Brian Paul
7d23dcdacc softpipe: handle more PIPE_CAP_x queries
These all return zero.  Add a debug_printf() to catch the default case so
we don't accidentally mishandle something important in the future.

v2: remove 'default' case to get compile-time warning

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-06-26 15:03:43 -06:00
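The v2 note relies on a compiler feature: when a switch over an enum has no `default` label, GCC and Clang can warn (via `-Wswitch`, part of `-Wall`) about any enumerator that is not handled, so a newly added cap is flagged at build time. A minimal sketch with a hypothetical enum, not the real PIPE_CAP list:

```cpp
// Hypothetical capability enum.
enum cap_sketch {
   CAP_SKETCH_A,
   CAP_SKETCH_B,
   CAP_SKETCH_C
};

int get_param_sketch(cap_sketch cap)
{
   // No 'default' label: adding a new enumerator without a matching case
   // makes the compiler report the unhandled value.
   switch (cap) {
   case CAP_SKETCH_A:
      return 1;
   case CAP_SKETCH_B:
      return 0;
   case CAP_SKETCH_C:
      return 0;
   }
   return 0;   // reached only for values outside the enum
}
```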
Brian Paul
80efb524ee svga: return 1 for PIPE_CAP_MIXED_COLORBUFFER_FORMATS
This is actually required for GL_ARB_framebuffer_object, but the state
tracker doesn't currently check it.
Direct3D 9 allows mixed format color buffers with some restrictions.
Setting this allows Unigine Heaven 2.5 and 3.0 to run.  Tested both on
GL and D3D hosts.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2012-06-26 15:03:43 -06:00
Olivier Galibert
27e94ba4ea u2f_emit: Fix type parameter in LLVM call.
The type is the destination type (i.e. float vector) and not the
source type.  Fixes piglit fs-{in,de}crement-uint.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-06-26 16:55:40 +01:00
José Fonseca
4bde1ba7fb st/wgl: Add a few more comments.
2012-06-26 10:15:36 +01:00
Marek Olšák
cc2cd8b356 r600g: don't disable streamout if it hasn't been started
2012-06-26 03:37:24 +02:00
Marek Olšák
496399d8e9 u_blitter: disable streamout before rendering
This fixes piglit EXT_transform_feedback tests:
- intervening-read output
- intervening-read prims_written
2012-06-26 03:37:23 +02:00
Brian Paul
345ee593e9 st/wgl: 80-column wrapping
2012-06-25 16:10:01 -06:00
Marek Olšák
4891c5dc64 r600g: inline r600_blit_push_depth and use resource_copy_region
We are going to have a separate resource for depth texturing and transfers
and this is just a transfer thing.
2012-06-25 23:53:49 +02:00
Marek Olšák
da98bb6fc1 r600g: split flushed depth texture creation and flushing
2012-06-25 23:53:49 +02:00
Brian Paul
45df3eb1db llvmpipe: fix the LP_NO_RAST debug option
It was only no-oping the clear() function, not actual triangle
rasterization.  Move the no_rast field from lp_context down into
lp_rasterizer so it's accessible where it's needed.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-06-25 08:14:33 -06:00
Brian Paul
fe68af6e0d svga: init pointer to NULL to silence MSVC warning
2012-06-22 17:24:37 -06:00
Tom Stellard
ea76f03310 clover: Add --with-clang-libdir option and verify CLANG_RESOURCE_DIR
$CLANG_RESOURCE_DIR is the directory that contains all resources
needed by clang to compile programs.  When clover uses clang to
compile kernels it needs to specify a resource dir, so that clang
can find its internal headers (e.g. stddef.h).

clang defines $CLANG_RESOURCE_DIR as $CLANG_LIBDIR/clang/$CLANG_VERSION

This patch adds the --with-clang-libdir option in order to accommodate
clang installs to non-standard locations, and it also adds a check
to the configure script to verify that $CLANG_RESOURCE_DIR/include
contains the necessary header files.
2012-06-22 16:59:24 -04:00