Commit graph

32625 commits

Author SHA1 Message Date
Benjamin Segovia
5a38e70d59 prog_optimize: Only merge writes to temporary registers
In one optimization pass, register files may have been mixed up, which
led to merging instructions that use the same index in two different
register files.
2010-08-17 14:57:18 -07:00
Jerome Glisse
608f749ec3 r600g: fix fake pixel output
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-08-17 17:25:08 -04:00
Eric Anholt
147ca9f3fc i965: Add support for DP2 in the VS.
Fixes glsl-vs-dot-vec2.
2010-08-17 13:59:08 -07:00
Eric Anholt
0e6066df63 glsl: When doing algebraic simplification, make sure the type still matches.
When simplifying (vec4(1.0) / (float(x))) to rcp(float(x)), we forgot
to produce a vec4, angering ir_validate when starting alien-arena.

Fixes:
glsl-algebraic-add-zero-2
glsl-algebraic-div-one-2
glsl-algebraic-mul-one-2
glsl-algebraic-sub-zero-3
glsl-algebraic-rcp-sqrt-2
2010-08-17 13:50:45 -07:00
Eric Anholt
f166d94fac glsl: Make ir_algebraic new expressions allocate out of the parent.
This could reduce the amount of memory used by a shader tree after
optimization, and increases consistency with other passes.
2010-08-17 13:47:15 -07:00
Ian Romanick
664364052f ir_constant: Don't assert on out-of-bounds array accesses
Several optimization paths, including constant folding, can lead to
accessing an ir_constant array with an out of bounds index.  The GLSL
spec lets us produce "undefined" results, but it does not let us
crash.

Fixes piglit test cases glsl-array-bounds-01 and glsl-array-bounds-03.
2010-08-17 13:00:03 -07:00
Eric Anholt
1b708d8f4d mesa: Dump shader source before validating the shader.
This will make extracting source to produce minimal testcases for
shader compile issues easier.
2010-08-17 12:39:03 -07:00
Alex Deucher
6cee1d6adf r600c: fix dword miscount in blit emit code
2010-08-17 10:42:06 -04:00
Chia-I Wu
7f36b2980b targets/egl: Link with DRI_LIB_DEPS.
Use DRI_LIB_DEPS when linking GL/GLES state trackers.  This fixes
missing talloc symbol errors, and is hopefully more future proof.
2010-08-17 19:30:41 +08:00
nobled
37e5f78422 gallivm: Fix and re-enable MMX-disabling code
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2010-08-17 12:25:10 +01:00
Dave Airlie
0aa41e1d96 mesa: fix es1/2 build hopefully
needed to add cpp rules and includes properly for es1/es2
2010-08-17 20:54:45 +10:00
Dave Airlie
1c2a44e445 r300g: fix context destroy under hyperz
we were destroying the mm before unrefing all the objects, so segfault.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-08-17 19:01:18 +10:00
Dave Airlie
6043ee6e62 r600g: kill event type magic number in winsys
these events have names, use them.
2010-08-17 16:07:48 +10:00
Dave Airlie
3e58007892 r600g: add user clip plane support.
Apart from the fact that the radeon.h/r600_states.h editing is a nightmare, this
wasn't so bad.

Passes the piglit user-clip test now, and also the trivial tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-08-17 14:19:09 +10:00
Eric Anholt
00ce188eb8 i965: Use the implied move available in most brw_wm_emit brw_math() calls.
This saves an extra message reg move in the program, though I'm not
clear on whether it will have any performance impact other than cache
footprint.  It will also fix those math calls on Sandybridge, where
the brw_eu_emit.c brw_math() support relies on the implied move being
used.
2010-08-16 20:09:53 -07:00
Eric Anholt
62383ae6fe i965: Add disasm for Compr4 instruction compression.
2010-08-16 20:06:55 -07:00
Ian Romanick
6c03c576cc Merge branch 'glsl2'
Conflicts:
	src/mesa/program/prog_optimize.c
2010-08-16 19:08:53 -07:00
Vinson Lee
15a3b42e13 util: Remove check_os_katmai_support.
check_os_katmai_support checks that the operating system running on a
SSE-capable processor supports SSE. This is necessary for unpatched
2.2.x and earlier kernels. 2.4.x and later kernels support SSE.

check_os_katmai_support will disable SSE capabilities for 32-bit x86
operating systems for which there is no code path. Currently, this
function handles Linux, Windows, and several BSDs. Mac OS, Cygwin, and
Solaris are several operating systems with no code paths.

Rather than add code for the unhandled operating systems, remove this
function altogether. This will fix SSE detection on all recent 32-bit
x86 operating systems. This completely breaks functionality on unpatched
2.2.x and earlier kernels, although there are likely no Gallium3D users
on such operating systems.
2010-08-16 18:52:37 -07:00
Dave Airlie
f50df65fcc r600g: drop libdrm_radeon link
2010-08-17 10:56:58 +10:00
Kenneth Graunke
a433cd286c glsl2: Refresh autogenerated file builtin_function.cpp.
2010-08-16 15:18:44 -07:00
Kenneth Graunke
2f9ecc818d glsl2: Add builtins profile for GLSL 1.30.
Many functions are currently wrapped with #if 0 since we haven't
implemented them yet.
2010-08-16 15:18:44 -07:00
Ian Romanick
45d97dd6d5 linker: Include compiler.h to avoid spurious warnings about INLINE
2010-08-16 13:59:34 -07:00
Ian Romanick
d0a9cbd20e glsl2: Silence unused variable warning
2010-08-16 13:59:01 -07:00
Vinson Lee
f437ee85f4 translate: Move loop variable declaration outside for loop.
Fixes MSVC build.
2010-08-16 13:52:57 -07:00
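A minimal sketch of the kind of change f437ee85f4 describes, assuming the MSVC build failed because its C compiler did not accept C99-style declarations inside a for statement. The helper `sum_elements` is hypothetical and is not taken from the translate module.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only. */
static unsigned sum_elements(const unsigned *values, size_t count)
{
   size_t i;             /* declared at the top of the block, as C89 requires */
   unsigned total = 0;

   /* C99 would also accept: for (size_t i = 0; i < count; i++) */
   for (i = 0; i < count; i++)
      total += values[i];

   return total;
}
```

Both forms behave the same; only the placement of the declaration differs.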
Kenneth Graunke
2e26145862 glcpp: Refresh autogenerated lexer and parser.
2010-08-16 13:43:43 -07:00
Kenneth Graunke
6be3a8b70a glcpp: Remove spurious newline generated by #version handling.
This was causing line numbering to be off by one.  The newline comes
from the NEWLINE token at the end of the line; there's no need to
insert one.
2010-08-16 13:43:35 -07:00
Marek Olšák
ecec6df9cf r300g: fix assert in the rasterizer block for r3xx-r4xx
Reported-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2010-08-16 19:19:17 +02:00
Ian Romanick
fc63e37b97 ir_to_mesa: Silence unused variable warnings
2010-08-16 09:45:01 -07:00
Ian Romanick
68772031e6 ir_to_mesa: Clean up assertions in ir_to_mesa_visitor::visit(ir_texture *)
2010-08-16 09:43:00 -07:00
Ian Romanick
0bf63733e5 ir_to_mesa: Support texture rectangle targets
2010-08-16 09:39:58 -07:00
José Fonseca
b421cb9546 translate: Remove unused temporary register.
Assuming the side-effect of x86_make_reg is also unnecessary.
2010-08-16 17:21:14 +01:00
José Fonseca
ded92e5dd8 translate: Eliminate void pointer arithmetic.
Non-portable.
2010-08-16 17:21:14 +01:00
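The portability issue named in ded92e5dd8 can be sketched as follows. Arithmetic on `void *` is a GNU extension and is not defined by standard C, so a byte offset is usually applied through a `char *` cast. The helper name `ptr_add_bytes` is an assumption for illustration and does not come from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a generic pointer by a byte offset.
 * Writing `base + offset` with base of type void * only compiles as a
 * GNU extension; casting to char * first is valid standard C. */
static void *ptr_add_bytes(void *base, size_t offset)
{
   assert(base != NULL);
   return (char *)base + offset;
}
```

For example, `ptr_add_bytes(buf, 3)` yields the address of `buf[3]` when `buf` is an array of `char`.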
Chia-I Wu
f945cb6515 st/egl: Add support for EGL_KHR_fence_sync.
The extension is implemented by pipe_fence_handle.
2010-08-17 00:06:19 +08:00
Chia-I Wu
2b2c5c4f5c egl: Add support for EGL_KHR_fence_sync.
Individual drivers still need to support and enable the extension.
2010-08-17 00:06:19 +08:00
Chia-I Wu
4b2495661f st/egl: Add support for EGL_KHR_reusable_sync.
The extension is implemented by pipe_condvar.
2010-08-17 00:06:19 +08:00
Chia-I Wu
4eebea74a8 egl: Add support for EGL_KHR_reusable_sync.
Individual drivers still need to support and enable the extension.
2010-08-17 00:06:19 +08:00
Alex Deucher
5ff769b21d r600c: blit emit updates
- set VGT_MAX_VTX_INDX to a larger value
- emit PA_SC_AA_CONFIG.  The command checker in 2.6.36+
  requires this reg.
2010-08-16 11:34:17 -04:00
Luca Barbieri
f201217c1d draw_llvm: fix segfaults on non-SSE2 CPUs where it is disabled (v2)
Changes in v2:
- Change function name

Currently draw_llvm refuses to create itself on non-SSE2 CPUs due to
an alleged LLVM bug.

However, this is implemented improperly, because other parts of draw
still attempt to access draw->llvm, resulting in segfaults.

Instead, put the check in debug_get_option_draw_use_llvm, check that
before calling draw_llvm_create, and then check whether draw->llvm is
non-null everywhere else.
2010-08-16 17:13:17 +02:00
Luca Barbieri
c2da8e7702 translate_sse: major rewrite (v5)
NOTE: Win64 is untested, and is thus currently disabled.
If you have such a system, please enable it and report whether it works.
To enable it, change src/gallium/auxiliary/translate/translate.c

Changes in v5:
- On Win64, preserve %xmm6 and %xmm7 as required by the ABI
- Use _WIN64 instead of WIN64

Changes in v4:
- Use x86_target() and x86_target_caps()
- Enable translate_sse in x86-64, but not in Win64

Changes in v3:
- Win64 support (untested)
- Use u_cpu_detect.h constants instead of #ifs

Changes in v2:
- Minimize #ifs
- Give a name to magic number CHANNELS_0001
- Add support for CPUs without SSE (only memcpy and swizzles, like non SSE2)
- Fixed comments

translate_sse is currently very limited to the point of
being useless in essentially all cases.

In particular, it only supports some float32 and unorm8
formats and doesn't work on x86-64.

This commit rewrites it to support:
1. Dumb memory copy for any pair of identical formats
2. All formats that are swizzles of each other
3. Converting 32/64-bit floats and all 8/16/32-bit integers to 32-bit float
4. Converting unorm8/snorm8 to snorm16 and uscaled8/sscaled8 to sscaled16
5. Support for x86-64 (doesn't take advantage of it in any way though)

This new translate can even be useful to translate index buffers for
cards that lack 8-bit index support.

It passes the testsuite I wrote, but note that this is a major change, and more
testing would be great.
2010-08-16 17:09:24 +02:00
Luca Barbieri
a3e6e50544 rtasm: add minimal x86-64 support and new instructions (v5)
Changes in v5:
- Add sse2_movdqa

Changes in v4:
- Use _WIN64 instead of WIN64

Changes in v3:
- Add target and target caps functions, so that they could be different in
  principle from the current CPU and they don't need #ifs to check

Changes in v2:
- Win64 support (untested)
- Use u_cpu_detect.h constants instead of #ifs

This commit adds minimal x86-64 support: only movs between registers
are supported for r8-r15, and x64_rexw() must be used to ask for 64-bit
operations.

It also adds several new instructions for the new translate_sse code.

movdqa
2010-08-16 16:57:05 +02:00
Luca Barbieri
4a4e29a9ab translate: add support for 8/16-bit indices
Currently, only 32-bit indices are supported, but for some use cases
translate needs to support all index types.
2010-08-16 16:57:05 +02:00
Luca Barbieri
68e74f1b01 translate_sse: remove useless generated function wrappers
Currently translate_sse puts two trivial wrappers in the translate vtable.

These slow it down and enlarge the source code for no gain, except perhaps
the ability to set a breakpoint there, so remove them.

Breakpoints can be set on the caller of the translate functions, with no
loss of functionality.
2010-08-16 16:57:05 +02:00
Luca Barbieri
1cb92fb92e translate_generic: factor out common code between linear and indexed
This moves the common code into a separate ALWAYS_INLINE function.
2010-08-16 16:57:05 +02:00
Luca Barbieri
ddcf028aa0 translate_generic: use memcpy if possible (v3)
Changes in v3:
- If we can do a copy, don't try to get an emit func, as that can assert(0)

Changes in v2:
- Add comment regarding copy_size

When used in GPU drivers, translate can be used to simultaneously
perform a gather operation, and convert away from unsupported formats.

In this use case, input and output formats will often be identical: clearly
it would make sense to use a memcpy in this case.

Instead, translate insists on converting to and from 32-bit floating point
numbers.

This is not only extremely expensive, but it also loses precision for
32/64-bit integers and 64-bit floating point numbers.

This patch changes translate_generic to just use memcpy if the formats are
identical, non-blocked, and with an integral number of bytes per pixel (note
that all sensible vertex formats are like this).
2010-08-16 16:57:05 +02:00
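The memcpy condition described in ddcf028aa0 (identical formats, non-blocked, whole number of bytes per element) might be sketched like this. The `fmt_desc` struct and both function names are hypothetical stand-ins, not the actual translate_generic code or the Gallium format description API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical format description, for illustration only. */
struct fmt_desc {
   int id;                 /* format identifier */
   unsigned block_width;   /* 1 for non-blocked formats */
   unsigned block_height;  /* 1 for non-blocked formats */
   unsigned block_bits;    /* bits per element */
};

/* Returns the number of bytes to copy per element, or 0 when a plain
 * copy is not possible and a real conversion is needed. */
static size_t copy_size_for(const struct fmt_desc *in,
                            const struct fmt_desc *out)
{
   if (in->id != out->id)
      return 0;                           /* formats differ */
   if (in->block_width != 1 || in->block_height != 1)
      return 0;                           /* blocked format */
   if (in->block_bits % 8 != 0)
      return 0;                           /* not a whole number of bytes */
   return in->block_bits / 8;
}

/* Copies one element if possible; returns false when the caller must
 * take the conversion path instead (not shown here). */
static bool translate_element(void *dst, const void *src,
                              const struct fmt_desc *in,
                              const struct fmt_desc *out)
{
   size_t size = copy_size_for(in, out);

   if (size == 0)
      return false;
   memcpy(dst, src, size);
   return true;
}
```

Because the copy moves raw bytes, 32/64-bit integers and 64-bit floats pass through without the precision loss that a round trip through 32-bit floats would cause.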
Chia-I Wu
9271059b36 draw: Fix polygon edge flags.
Fix a copy-and-paste error introduced by
f141abdc8f.
2010-08-16 22:01:57 +08:00
Chia-I Wu
aaf51ed7c2 draw: No need to make max_vertices even.
Triangle strip alternates the front/back orientation of its triangles.
max_vertices was made even so that varray never split a triangle
strip at the wrong positions.

It did not work with triangle strips with adjacencies.  And it is no
longer relevant with vsplit.
2010-08-16 21:04:24 +08:00
Chia-I Wu
c3fee80f2b draw: Remove DRAW_PIPE_MAX_VERTICES and DRAW_PIPE_FLAG_MASK.
The higher bits of draw elements are no longer used for the stipple or
edge flags.
2010-08-16 20:57:08 +08:00
Chia-I Wu
a072f0e186 draw: Add PRIMITIVE macro to vsplit.
PRIMITIVE is used by the indexed path to flush the entire primitive with
custom vertex count checks.  It replaces the existing fast path.
2010-08-16 20:46:29 +08:00
Chia-I Wu
7b3beb2240 draw: last_vertex_last is always true for GS and SO.
That is, the OpenGL decomposition rule is assumed.  There should be a
pipe_context state to specify the rules.
2010-08-16 20:46:29 +08:00
Chia-I Wu
a97419a3ba draw: Remove varray and vcache.
They have been deprecated by vsplit.
2010-08-16 20:46:29 +08:00