We were missing the implementation of PIPE_QUERY_SO_STATISTICS
query, this change implements it on top of the existing
facilities.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
v2: fix instrinsic name as well
v3: LLVM revision incremented as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.
This fixed hardlocks in the EXT_transform_feedback/order tests.
NOTE: This is a candidate for the stable branches.
(might not be easy to cherry-pick though)
Signed-off-by: Marek Olšák <maraeo@gmail.com>
v2:
- Only dump shaders when env variable is set.
v3:
- Don't emit VGT registers
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com
This function is a holdover from r600g and is identical to
si_pm4_inval_texture_cache(), so it is not needed.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com
This target string now contains four values instead of three. The old
processor field (which was really being interpreted as arch) has been split
into two fields: processor and arch. This allows drivers to pass a
more a more detailed description of the hardware to compiler frontends.
v2:
- Adapt to libclc changes
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Switch to use the envytools generated headers for register/bitfield
definitions. This is the first step in preparing to add a3xx support,
since it avoids having conflicting names for a3xx and a2xx registers.
And since I'm using envytools for a3xx it is simpler to just use it for
everything.
This shouldn't cause any functional change, it is really just a lot of
renaming.
Signed-off-by: Rob Clark <robdclark@gmail.com>
At least on llvm 3.2 this appears to work fine. Tested on an Athlon XP
2600+, which has sse and 3dnow but not sse2.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
When geometry shaders are present, one needs to be able to create
an empty geometry shader with stream output that needs to be
resolved later and attached to the currently bound vertex shader.
Lets add support for it to llvmpipe and draw. draw allows attaching
independent stream output info to any vertex shader and llvmpipe
resolves at draw time which vertex shader the given empty geometry
shader should be linked to.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
We need to reset the internal state of the so buffers or we'll
keep appending even though we're not supposed to.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Ported from r600g commit:
8891b2f9c9
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
NOTE: This is a candidate for the 9.1 branch.
The fallbacks count is the number of drawing calls that use a "draw"
module fallback, such as polygon stipple.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This still isn't optimal, since the fence will signal a bit late,
but better than checking on the bo, which may never be ready if it
is shared (which is likely).
Reduced stack size allows to run more threads in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.
v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
v2: reduce key size, don't copy key around to much.
v3: remove key size reduction
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This works different than on R600, we need to add the start instance manually.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
This should be used by both SI and R600.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes mplayer -vo vdpau OSD.
NOTE: This is a candidate for the 9.1 branch.
Reported-by: Igor Vagulin <igor.vagulin@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
If we're drawing to a surface that's 2048 x 2048 pixels or larger there's
danger of fixed-point overflow in the triangle rasterization code. That
leads to various rendering glitches.
Rather than implement some intricate changes to the rasterization code,
simply subdivide triangles into smaller subtriangles to avoid the issue.
Only do this when the drawing surface is larger than 2048 by 2048.
Reviewed-by: José Fonseca <jfonseca@vmware.com>