Commit graph

18282 commits

Author SHA1 Message Date
Zack Rusin
d48054ff22 draw: don't crash if GS doesn't emit anything
Technically it's legal for geometry shader to not emit any
vertices. It's silly, but perfectly legal, so lets make draw
stop crashing if it happens.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-27 17:28:04 -04:00
Vadim Girlin
fb1eed9ec5 r600g/sb: remove unused code
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3f18dd818f r600g/sb: collect shader statistics
Collects various statistical information for each shader
and total stats for contexts.

Printed with R600_DEBUG=sb,sbstat

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
6ba7a162b6 r600g/sb: don't propagate dead values in GVN pass
In some cases we use value::gvn_source field to link values that
are known to be equal before gvn pass (e.g. results of DOT4 in different
slots of the same alu group), but then source value may become dead later
and this confuses further passes.

This patch resets value::gvn_source to NULL in the dce_cleanup pass
if it points to dead value.

Fixes segfault during shader optimization with ETQW.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3e476c311f r600g/sb: use simple heuristic to limit register pressure
It's not a complete register pressure tracking, yet it helps to prevent
register allocation problems in some cases where they were observed.

The problems are uncovered by false dependencies between fetch instructions
introduced by some recent changes in TGSI and/or default backend.
Sometimes we have code like this:

...
SAMPLE R5.xyzw, R5.xyzw
... store R5.xyzw somewhere
MOV R5.x, <next x coord>
MOV R5.y, <next y coord>
SAMPLE R5.xyzw, R5.xyzw
... <may be repeated a lot of times>

With 2D resources, z and w in SAMPLE src reg aren't used and can be simply
masked, but shader backend doesn't have this information, so it's
considered as data dependency by optimization algorithms.
2013-04-30 21:50:48 +04:00
Vadim Girlin
6d6c8c88a3 r600g/sb: improve error checking in ra_coalesce pass 2013-04-30 21:50:47 +04:00
Vadim Girlin
188c893e65 r600g/sb: use source bytecode in case of optimization errors 2013-04-30 21:50:47 +04:00
Vadim Girlin
ad1df471d0 r600g: plug in optimizing backend
Optimization is enabled with "R600_DEBUG=sb".

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Vadim Girlin
2cd7691793 r600g/sb: initial commit of the optimizing shader backend 2013-04-30 21:50:47 +04:00
Vadim Girlin
fbb065d629 r600g: use enum type for domains field in struct r600_resource
This prevents the problems when the header is included in C++ code.
2013-04-30 21:50:47 +04:00
Vadim Girlin
d5b30fd036 r600g: add new flags to isa instruction tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
a919424215 r600g: always create reverse lookup isa tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
7d555f2f4c r600g: mask unused source components for SAMPLE
This results in more clean shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Vincent Lejeune
a6a4b70e2d r600g/llvm: Fix opencl build 2013-04-30 16:38:47 +02:00
Alexander von Gluck IV
f1361ed084 Gallium: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Vincent Lejeune
51e9bfdc48 r600g/llvm: get use_kill from compiler shader 2013-04-30 02:17:18 +02:00
Zack Rusin
a6e7c22664 draw/so: fix overflow calculation
only report overflow for missing targets if they're actually being
used. if the targets are missing but are not being used by any
slot in the stream output declaration we should correctly just
ignore them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 03:48:36 -04:00
José Fonseca
220ef8295c llvmpipe: Fix queries when screen->num_threads == 0.
That is, when llvmpipe is run in single-threaded mode.

Trivial.

Tested with

  LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
2013-04-29 15:40:06 +01:00
Jerome Glisse
c7a13dc5f5 r600g: force full cache for hyperz
Seems that in some case allowing half cache usage confuse the gpu
and trigger lockup. Force full cache use.

Should fix :
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-29 10:06:29 -04:00
Rob Clark
3900a0e4df freedreno: fix rebase screw-up
Add back 2nd arg to emit_vertexbufs() which got lost in rebase.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-29 07:36:27 -04:00
Zack Rusin
3bba787879 llvmpipe: stop crashing when one of the so targets is null
Fixes a crash when one of the so targets is null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:12 -04:00
Zack Rusin
0031cde1e1 draw/so: indicate overflow when buffer is missing
We were crashing if one of the buffers wasn't set, we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:07 -04:00
Zack Rusin
f9f57312de gallivm: fix indirect addressing of temps in soa mode
we weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:18:51 -04:00
Zack Rusin
3093ac6f4f tgsi/ureg: Add a function to return the number of outputs
We already hold the variable, just weren't providing access
to it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-26 23:05:45 -04:00
Zack Rusin
53d36d5fb0 draw/so: Fix overflow calculations
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:04:26 -04:00
Zack Rusin
d996622cfa draw/llvm: fix viewport transformations
This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputing untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:46 -04:00
Zack Rusin
5d9ef5b365 gallium: increase the number of available stream output decls
There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:23 -04:00
Zack Rusin
562835bcdf llvmpipe: implement so_overflow query
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 22:58:54 -04:00
Brian Paul
ff74cf62b1 llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:24 -06:00
Brian Paul
38a751cbe8 llvmpipe: bump LP_MAX_THREADS to 16
On the mesa-users list, Burlen Loring reported a speed-up with 16 cores
and his test/app.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:12 -06:00
Christian König
e3ac293daa r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2
That is just not supported by the hardware.

v2: fix compare

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-26 15:35:36 +02:00
Christian König
2c2c54b819 radeon/uvd: stop using anonymous unions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-26 15:35:36 +02:00
José Fonseca
c5e8573762 Revert "draw: Yield zeros for LLVM fetches of non-existing vertex elements."
After more thought/discussion, it seems it is better to handle this sort
of stuff in the state tracker.

So this reverts commit 12096f334b, except the
variant->key -> key shorthands.
2013-04-26 12:15:39 +01:00
Chia-I Wu
5816a471af ilo: add the driver to the build system
Add ilo to targets/egl-static and add a new target dri-ilo.  Update autoconf
and automake rules.
2013-04-26 16:20:52 +08:00
Chia-I Wu
825aa60707 ilo: compile VS/GS/FS with the toy compiler 2013-04-26 16:20:52 +08:00
Chia-I Wu
7118ff8bb0 ilo: add a toy shader compiler
This is a simple shader compiler that performs almost zero optimizations.  The
generated code is usually much larger comparing to that generated by i965.
The generated code also requires many more registers.

Function-wise, it lacks register spilling and does not support most TGSI
indirections.  Other than those, it works alright.
2013-04-26 16:20:52 +08:00
Chia-I Wu
0fa2d0e98a ilo: hook up pipe context GPGPU functions
This just adds a stub.
2013-04-26 16:16:43 +08:00
Chia-I Wu
cf8f3dd373 ilo: hook up pipe context video functions
This just hooks them up with auxiliary/vl layer.
2013-04-26 16:16:43 +08:00
Chia-I Wu
12dd397d0c ilo: add support for time/occlusion/primitive queries 2013-04-26 16:16:43 +08:00
Chia-I Wu
e6186b0769 ilo: hook up pipe context 3D functions 2013-04-26 16:16:43 +08:00
Chia-I Wu
5b310f6230 ilo: add GEN7 support for 3D pipeline 2013-04-26 16:16:43 +08:00
Chia-I Wu
91ce766c35 ilo: add 3D pipeline for GEN6
The 3D pipeline is a high-level interface to emit 3D commands and states.  It
uses GEN6 GPE to do the real work.
2013-04-26 16:16:43 +08:00
Chia-I Wu
67233b56d6 ilo: add GEN7 GPE 2013-04-26 16:16:43 +08:00
Chia-I Wu
d3602dfac6 ilo: add GEN6 GPE
GEN6 GPE (Graphics Processing Engine) is a low-level interface to emit 3D
commands and states.
2013-04-26 16:16:43 +08:00
Chia-I Wu
72357cf3bb ilo: hook up pipe context query functions
None of the query types are supported yet.
2013-04-26 16:16:43 +08:00
Chia-I Wu
8f949bc1da ilo: hook up pipe context transfer functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
0754ff33e3 ilo: hook up pipe context blit functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
89d1702b9b ilo: hook up pipe context state functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
520af66797 ilo: add functions to manage shaders
This commits add shader cache, shader state, shader variant, and etc.  It does
not add the shader compiler though.
2013-04-26 16:16:42 +08:00
Chia-I Wu
86940bf41c ilo: hook up pipe context flush function 2013-04-26 16:16:42 +08:00