Commit graph

18295 commits

Author SHA1 Message Date
Chia-I Wu
440557db4e ilo: allow one-off flags to be specified for CP
It will be used for SOL_RESET on GEN7.
2013-05-01 16:03:44 +08:00
Chia-I Wu
dd62e7bc02 ilo: fix tiling/size for special-purpose resources
We do not allocate such resources yet though.
2013-05-01 12:00:32 +08:00
Chia-I Wu
7726e9500c ilo: use UMS layout for render targets
As we do not advertise MSAA support, this change should not make any
difference yet.
2013-05-01 11:56:43 +08:00
Chia-I Wu
334abed828 ilo: support and prefer compact array spacing
There is no reason to waste the memory when the HW can support compact array
spacing (ARYSPC_LOD0).
2013-05-01 11:31:15 +08:00
Chia-I Wu
ce188bb252 ilo: move device limits to ilo_dev_info or to GPEs
It seems a bit weird to have device limits in a context.
2013-05-01 11:23:11 +08:00
Chia-I Wu
bef98f9c3a ilo: use ilo_dev_info in toy compiler
We need only dev->gen, but it makes sense to expose other information to the
compiler.
2013-05-01 11:22:57 +08:00
Chia-I Wu
51d749e7e2 ilo: use ilo_dev_info in GPE and 3D pipeline
We need only dev->gen and dev->gt, but it makes sense to expose other
information to the pipeline.
2013-05-01 11:22:20 +08:00
Chia-I Wu
bb1f635dcc ilo: add ilo_dev_info shared by the screen and contexts
The struct is used to describe the device information, such as PCI ID, GEN,
GT, etc.
2013-05-01 11:20:41 +08:00
Chia-I Wu
355f3f7ab5 ilo: fix indentation of ilo_gpe_gen*.h 2013-05-01 11:20:32 +08:00
Matt Turner
460996b937 build: Remove libpipe_loader.la from GALLIUM_PIPE_LOADER_LIBS.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
538e10f3ea build: Remove HAVE_PIPE_LOADER_SW.
It guarded the function prototype of pipe_loader_sw_probe, whose use (in
pipe_loader.c) and definition (in pipe_loader_sw.c) were not guarded.
Both are built into libpipe_loader.la if HAVE_LOADER_GALLIUM, which is
enable_gallium_loader in configure.ac.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
ea6caf4cdf build: Remove libws_null.la from GALLIUM_PIPE_LOADER_LIBS.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Matt Turner
242809942f build: Rename PIPE_LOADER_HAVE_XCB to HAVE_PIPE_LOADER_XCB.
For consistency, since we already have HAVE_PIPE_LOADER_{SW,DRM}.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 14:03:32 -07:00
Zack Rusin
d48054ff22 draw: don't crash if GS doesn't emit anything
Technically it's legal for a geometry shader to not emit any
vertices. It's silly, but perfectly legal, so let's make draw
stop crashing if it happens.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-27 17:28:04 -04:00
Vadim Girlin
fb1eed9ec5 r600g/sb: remove unused code
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
3f18dd818f r600g/sb: collect shader statistics
Collects various statistical information for each shader
and total stats for contexts.

Printed with R600_DEBUG=sb,sbstat

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
Vadim Girlin
6ba7a162b6 r600g/sb: don't propagate dead values in GVN pass
In some cases we use the value::gvn_source field to link values that
are known to be equal before the GVN pass (e.g. results of DOT4 in different
slots of the same ALU group), but the source value may become dead later,
which confuses further passes.

This patch resets value::gvn_source to NULL in the dce_cleanup pass
if it points to a dead value.

Fixes segfault during shader optimization with ETQW.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:48 +04:00
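The cleanup described in this commit message can be sketched roughly as follows. This is a minimal C illustration only: the real backend is C++, and `struct value`, its `dead` flag, and `reset_dead_gvn_source()` are hypothetical stand-ins rather than the driver's actual types or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the backend's value class. */
struct value {
   bool dead;                 /* set once the value has no remaining uses */
   struct value *gvn_source;  /* value known to be equal before GVN, or NULL */
};

/* Drop the link if it points to a value that has been marked dead,
 * so that later passes never follow it. */
static void reset_dead_gvn_source(struct value *v)
{
   if (v->gvn_source != NULL && v->gvn_source->dead)
      v->gvn_source = NULL;
}
```

A link to a live value is left untouched; only links to dead values are cleared.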
Vadim Girlin
3e476c311f r600g/sb: use simple heuristic to limit register pressure
It's not complete register pressure tracking, yet it helps to prevent
register allocation problems in some cases where they were observed.

The problems are uncovered by false dependencies between fetch instructions
introduced by some recent changes in TGSI and/or default backend.
Sometimes we have code like this:

...
SAMPLE R5.xyzw, R5.xyzw
... store R5.xyzw somewhere
MOV R5.x, <next x coord>
MOV R5.y, <next y coord>
SAMPLE R5.xyzw, R5.xyzw
... <may be repeated a lot of times>

With 2D resources, z and w in the SAMPLE src reg aren't used and can simply be
masked, but the shader backend doesn't have this information, so the
optimization algorithms treat it as a data dependency.
2013-04-30 21:50:48 +04:00
Vadim Girlin
6d6c8c88a3 r600g/sb: improve error checking in ra_coalesce pass 2013-04-30 21:50:47 +04:00
Vadim Girlin
188c893e65 r600g/sb: use source bytecode in case of optimization errors 2013-04-30 21:50:47 +04:00
Vadim Girlin
ad1df471d0 r600g: plug in optimizing backend
Optimization is enabled with "R600_DEBUG=sb".

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Vadim Girlin
2cd7691793 r600g/sb: initial commit of the optimizing shader backend 2013-04-30 21:50:47 +04:00
Vadim Girlin
fbb065d629 r600g: use enum type for domains field in struct r600_resource
This prevents problems when the header is included in C++ code.
2013-04-30 21:50:47 +04:00
Vadim Girlin
d5b30fd036 r600g: add new flags to isa instruction tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
a919424215 r600g: always create reverse lookup isa tables 2013-04-30 21:50:47 +04:00
Vadim Girlin
7d555f2f4c r600g: mask unused source components for SAMPLE
This results in cleaner shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
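As a rough illustration of the masking idea, the sketch below computes which source coordinate components a plain sample would read. It assumes, as another commit message in this log notes for 2D resources, that a 2D sample reads only x and y. The `COMP_*` bits and `sample_src_mask()` are hypothetical names, and shadow, array, cube, and projective variants are deliberately not covered.

```c
#include <stdint.h>

/* Hypothetical component bits; not the driver's real encoding. */
enum {
   COMP_X = 1u << 0,
   COMP_Y = 1u << 1,
   COMP_Z = 1u << 2,
   COMP_W = 1u << 3
};

/* Coordinate components read by a plain (non-shadow, non-array,
 * non-projective) sample from a texture with `dims` dimensions. */
static uint8_t sample_src_mask(unsigned dims)
{
   switch (dims) {
   case 1:  return COMP_X;
   case 2:  return COMP_X | COMP_Y;
   case 3:  return COMP_X | COMP_Y | COMP_Z;
   default: return COMP_X | COMP_Y | COMP_Z | COMP_W; /* unknown: keep all */
   }
}
```

Components outside the returned mask carry no data dependency, so an optimizer is free to ignore whatever was last written to them.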
Vincent Lejeune
a6a4b70e2d r600g/llvm: Fix opencl build 2013-04-30 16:38:47 +02:00
Alexander von Gluck IV
f1361ed084 Gallium: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Vincent Lejeune
51e9bfdc48 r600g/llvm: get use_kill from compiler shader 2013-04-30 02:17:18 +02:00
Zack Rusin
a6e7c22664 draw/so: fix overflow calculation
Only report overflow for missing targets if they're actually being
used. If the targets are missing but are not being used by any
slot in the stream output declaration, we should correctly just
ignore them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 03:48:36 -04:00
José Fonseca
220ef8295c llvmpipe: Fix queries when screen->num_threads == 0.
That is, when llvmpipe is run in single-threaded mode.

Trivial.

Tested with

  LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
2013-04-29 15:40:06 +01:00
Jerome Glisse
c7a13dc5f5 r600g: force full cache for hyperz
It seems that in some cases allowing half cache usage confuses the GPU
and triggers a lockup. Force full cache use.

Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-29 10:06:29 -04:00
Rob Clark
3900a0e4df freedreno: fix rebase screw-up
Add back 2nd arg to emit_vertexbufs() which got lost in rebase.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-29 07:36:27 -04:00
Zack Rusin
3bba787879 llvmpipe: stop crashing when one of the so targets is null
Fixes a crash when one of the so targets is null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:12 -04:00
Zack Rusin
0031cde1e1 draw/so: indicate overflow when buffer is missing
We were crashing if one of the buffers wasn't set; we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:07 -04:00
Zack Rusin
f9f57312de gallivm: fix indirect addressing of temps in soa mode
We weren't adding the SoA offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:18:51 -04:00
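The indexing problem described in this commit message can be illustrated with a scalar sketch. It assumes a simplified layout in which each register holds one value per lane, stored contiguously. `LANES` and `gather_indirect()` are hypothetical names, and the real code builds LLVM IR rather than running a loop like this.

```c
#define LANES 4  /* hypothetical SoA vector width */

/* storage holds registers in SoA form: element (reg, lane) lives at
 * storage[reg * LANES + lane]. Each lane supplies its own register index. */
static void gather_indirect(const float *storage, const int idx[LANES],
                            float out[LANES])
{
   for (int lane = 0; lane < LANES; lane++) {
      /* The "+ lane" term is the per-lane SoA offset. Without it every
       * lane would read the first lane's copy of the register. */
      out[lane] = storage[idx[lane] * LANES + lane];
   }
}
```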
Zack Rusin
3093ac6f4f tgsi/ureg: Add a function to return the number of outputs
We already hold the variable, just weren't providing access
to it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-26 23:05:45 -04:00
Zack Rusin
53d36d5fb0 draw/so: Fix overflow calculations
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:04:26 -04:00
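A bounds check that accounts for all three quantities named in this commit message might look like the sketch below. `so_write_overflows()` and its parameters are hypothetical and do not mirror the draw module's actual code. All sizes are in bytes, and the multiplication is assumed not to wrap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if writing `out_size` bytes of one output for vertex
 * `vertex_index` would run past the end of the buffer.
 *   buffer_offset: where the target starts inside the buffer
 *   stride:        bytes between consecutive vertices
 *   dst_offset:    position of this output within one vertex */
static bool so_write_overflows(size_t buffer_size, size_t buffer_offset,
                               size_t stride, size_t vertex_index,
                               size_t dst_offset, size_t out_size)
{
   size_t start = buffer_offset + vertex_index * stride + dst_offset;

   if (start > buffer_size)
      return true;
   /* Compare against the remaining space to avoid unsigned wrap-around. */
   return out_size > buffer_size - start;
}
```

For a 64-byte buffer with a 16-byte stride, an 8-byte output at destination offset 4 for vertex 3 fits when the buffer offset is 0, but overflows when the buffer offset is 16. A check that ignores the buffer offset would miss that case.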
Zack Rusin
d996622cfa draw/llvm: fix viewport transformations
This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputting untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:46 -04:00
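The fix amounts to applying the transform to whichever output slot holds the position rather than to slot 0. A simplified sketch follows, assuming the perspective divide has already been applied. `struct viewport` and `viewport_transform()` are hypothetical names and not the draw module's API.

```c
/* Hypothetical viewport state: window = ndc * scale + translate. */
struct viewport {
   float scale[3];
   float translate[3];
};

/* Transform x, y and z of the position output in place. `pos_slot` is the
 * index of the output that holds the vertex position, which is not
 * necessarily 0. Other outputs, and the w component, are left untouched. */
static void viewport_transform(float (*outputs)[4], unsigned pos_slot,
                               const struct viewport *vp)
{
   float *pos = outputs[pos_slot];

   for (int i = 0; i < 3; i++)
      pos[i] = pos[i] * vp->scale[i] + vp->translate[i];
}
```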
Zack Rusin
5d9ef5b365 gallium: increase the number of available stream output decls
There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:01:23 -04:00
Zack Rusin
562835bcdf llvmpipe: implement so_overflow query
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 22:58:54 -04:00
Brian Paul
ff74cf62b1 llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:24 -06:00
Brian Paul
38a751cbe8 llvmpipe: bump LP_MAX_THREADS to 16
On the mesa-users list, Burlen Loring reported a speed-up with 16 cores
and his test/app.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 16:22:12 -06:00
Christian König
e3ac293daa r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2
That is just not supported by the hardware.

v2: fix compare

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-26 15:35:36 +02:00
Christian König
2c2c54b819 radeon/uvd: stop using anonymous unions
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-04-26 15:35:36 +02:00
José Fonseca
c5e8573762 Revert "draw: Yield zeros for LLVM fetches of non-existing vertex elements."
After more thought/discussion, it seems it is better to handle this sort
of stuff in the state tracker.

So this reverts commit 12096f334b, except the
variant->key -> key shorthands.
2013-04-26 12:15:39 +01:00
Chia-I Wu
5816a471af ilo: add the driver to the build system
Add ilo to targets/egl-static and add a new target dri-ilo.  Update autoconf
and automake rules.
2013-04-26 16:20:52 +08:00
Chia-I Wu
825aa60707 ilo: compile VS/GS/FS with the toy compiler 2013-04-26 16:20:52 +08:00
Chia-I Wu
7118ff8bb0 ilo: add a toy shader compiler
This is a simple shader compiler that performs almost zero optimizations.  The
generated code is usually much larger than that generated by i965.
The generated code also requires many more registers.

Function-wise, it lacks register spilling and does not support most TGSI
indirections.  Other than those, it works alright.
2013-04-26 16:20:52 +08:00
Chia-I Wu
0fa2d0e98a ilo: hook up pipe context GPGPU functions
This just adds a stub.
2013-04-26 16:16:43 +08:00