Commit graph

11465 commits

Author SHA1 Message Date
Zack Rusin
c1cd19c3b8 llvmpipe: implement PIPE_QUERY_SO_STATISTICS
We were missing the implementation of PIPE_QUERY_SO_STATISTICS
query, this change implements it on top of the existing
facilities.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-10 12:32:56 -07:00
Christian König
ccf3e8fc9b radeonsi: remove sampler writemask v3
v2: fix intrinsic name as well
v3: LLVM revision incremented as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-10 10:41:29 +02:00
Brian Paul
4ad360133c softpipe: misc updates to image dumping in softpipe_flush() 2013-04-09 08:27:53 -06:00
Martin Andersson
a8246927e3 r600g: Fix UMAD on Cayman
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.

This fixed hardlocks in the EXT_transform_feedback/order tests.

NOTE: This is a candidate for the stable branches.
(might not be easy to cherry-pick though)

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-04-09 03:09:37 +02:00
Vincent Lejeune
5019af2145 r600g/llvm: Add support for native isa for pre EG
This fixes bug 62756:
https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
2013-04-08 15:11:59 +02:00
Tom Stellard
302f53dc20 radeonsi: Add compute support v3
v2:
  - Only dump shaders when env variable is set.

v3:
  - Don't emit VGT registers

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
4f7fe2cf2c radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
0ccf82c557 radeonsi: Remove si_pm4_inval_vertex_cache()
This function is a holdover from r600g and is identical to
si_pm4_inval_texture_cache(), so it is not needed.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-05 18:43:34 -04:00
Tom Stellard
c5e5b3401c gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2
This target string now contains four values instead of three.  The old
processor field (which was really being interpreted as arch) has been split
into two fields: processor and arch.  This allows drivers to pass a
more detailed description of the hardware to compiler frontends.

v2:
  - Adapt to libclc changes

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-04-05 18:43:34 -04:00
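
As a rough illustration of the four-value target string described in the commit above, the following C sketch splits such a string into its fields. Only the field count and the fact that the first two fields are processor and arch come from the commit message. The `-` separator, the struct, the function name, and the fixed field length are assumptions made for illustration and are not taken from the Mesa sources.

```c
#include <assert.h>
#include <string.h>

#define TARGET_FIELDS 4   /* four values, per the commit message */
#define FIELD_LEN     32  /* hypothetical upper bound on a field's length */

/* Hypothetical container: field[0] is the processor, field[1] the arch;
 * the meaning of the remaining two fields is not stated in the commit. */
struct ir_target {
    char field[TARGET_FIELDS][FIELD_LEN];
};

/* Split a target string on '-' (assumed separator).
 * Returns 0 on success, -1 unless there are exactly four non-empty
 * fields that each fit in FIELD_LEN. */
int parse_ir_target(const char *s, struct ir_target *out)
{
    const char *start = s;
    int count = 0;

    assert(s != NULL && out != NULL);

    for (;;) {
        const char *end = strchr(start, '-');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (count >= TARGET_FIELDS || len == 0 || len >= FIELD_LEN)
            return -1;

        memcpy(out->field[count], start, len);
        out->field[count][len] = '\0';
        count++;

        if (!end)
            break;
        start = end + 1;
    }
    return count == TARGET_FIELDS ? 0 : -1;
}
```

A three-field string in the old layout is rejected by this sketch, which is one simple way a frontend could detect a driver that has not been updated.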
Rob Clark
aac7f06ad8 freedreno: use autogenerated register defs
Switch to use the envytools generated headers for register/bitfield
definitions.  This is the first step in preparing to add a3xx support,
since it avoids having conflicting names for a3xx and a2xx registers.
And since I'm using envytools for a3xx it is simpler to just use it for
everything.

This shouldn't cause any functional change, it is really just a lot of
renaming.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-04-05 14:33:16 -04:00
Adam Jackson
ca70de9bd2 llvmpipe: Work without sse2 if llvm is new enough
At least on llvm 3.2 this appears to work fine.  Tested on an Athlon XP
2600+, which has sse and 3dnow but not sse2.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-05 11:32:53 -04:00
Vincent Lejeune
9276961223 r600g/llvm: Workaround for wrong tex.offset_* 2013-04-04 16:03:04 +02:00
Zack Rusin
302df7cc85 draw/llvmpipe: allow independent so attachments to the vs
When geometry shaders are present, one needs to be able to create
an empty geometry shader with stream output that needs to be
resolved later and attached to the currently bound vertex shader.
Let's add support for it to llvmpipe and draw. draw allows attaching
independent stream output info to any vertex shader and llvmpipe
resolves at draw time which vertex shader the given empty geometry
shader should be linked to.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin
246e68735f llvmpipe: reset so buffers when not appending
We need to reset the internal state of the so buffers or we'll
keep appending even though we're not supposed to.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Brian Paul
ac114c6824 svga: add new memory-used HUD query
To track the amount of memory used by all pipe_resources (textures
and buffers).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Vincent Lejeune
159d934066 r600g/llvm: Do not override llvm provided stack_size 2013-04-03 18:39:49 +02:00
Vincent Lejeune
097a6ecdfe r600g/llvm: Do not change cf_alu inst when adding alus 2013-04-03 18:22:40 +02:00
Marek Olšák
ff01e0db0e radeonsi: add more cases for copying unsupported formats to resource_copy_region
Ported from r600g commit:

8891b2f9c9

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 9.1 branch.
2013-04-03 10:58:33 -04:00
Brian Paul
3838edaf5d svga: add HUD queries for number of draw calls, number of fallbacks
The fallbacks count is the number of drawing calls that use a "draw"
module fallback, such as polygon stipple.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:08 -06:00
Brian Paul
49ed1f3cb3 svga: refactor occlusion query code
This is in preparation for adding new query types for the HUD.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:07 -06:00
Brian Paul
0289ebaa0f svga: add case for PIPE_CAP_QUERY_PIPELINE_STATISTICS 2013-04-03 08:19:44 -06:00
Christoph Bumiller
80eef069f0 nv50,nvc0: remove MS resolve formats hack
Mesa now allows BlitFramebuffer resolve between RGBA and BGRA.
2013-04-03 13:19:15 +02:00
Christoph Bumiller
4de70bf43c nvc0: fix 128 bit compressed storage type selection 2013-04-03 12:54:44 +02:00
Christoph Bumiller
8e1dd58a7e nvc0: place staging textures in GART and map them directly 2013-04-03 12:54:44 +02:00
Christoph Bumiller
ba9b0b682f nv50: account for pesky prefetch in size calculation of linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller
f0a0d59f0f nvc0: honour scaled coordinates setting for linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller
d801545964 nvc0: fix for 2d engine R source formats writing RRR1 and not R001 2013-04-03 12:54:43 +02:00
Christoph Bumiller
6417d56c19 nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit
We send position.z == 0, DEPTH_RANGE may be some arbitrary range
not including 0 (for example in piglit's hiz tests).
2013-04-03 12:54:43 +02:00
Christoph Bumiller
2a8145d36b nouveau: accelerate buffer copies in resource_copy_region 2013-04-03 12:54:43 +02:00
Christoph Bumiller
3ed4bbd769 nvc0: demagic some of the NVE4_COMPUTE_UPLOAD methods
It's actually the same as P2MF.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
fb0334adb3 nvc0: read PM counters for each warp scheduler separately 2013-04-03 12:54:43 +02:00
Christoph Bumiller
7bac075f25 nvc0: add some metrics to driver specific queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller
198f514aa6 nvc0: add some driver statistics queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller
7628cc247f nvc0: disable compressed storage type 0xdb for now
Single-sample color compression doesn't seem that useful anyway.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
ea12fc3f6c nvc0: use correct hw query for PRIMITIVES_GENERATED
It was the same as SO_STATISTICS[1] before.
2013-04-03 12:54:43 +02:00
Christoph Bumiller
6bca4e7085 nvc0: use fence to check state of queries that don't write sequence
This still isn't optimal, since the fence will signal a bit late,
but better than checking on the bo, which may never be ready if it
is shared (which is likely).
2013-04-03 12:54:43 +02:00
Christoph Bumiller
f35e96d973 gallium: add PIPE_CAP_QUERY_PIPELINE_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Vadim Girlin
9be624b3ef r600g: don't reserve more stack space than required v5
Reduced stack size allows more threads to run in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.

v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Vadim Girlin
7e04227f39 r600g: fix range handling for tgsi input declarations v2
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Christian König
a0dca4409a radeonsi: add instance divisor support v3
v2: reduce key size, don't copy key around too much.
v3: remove key size reduction

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
cf9b31f78a radeonsi: add start instance support
This works differently than on R600, we need to add the start instance manually.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
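
To make "add the start instance manually" concrete, here is a minimal C sketch of the usual OpenGL base-instance rule for per-instance vertex attributes: the instance ID is divided by the attribute's divisor and the start instance is then added. This is not the radeonsi code, which emits the equivalent arithmetic in the shader; the function name and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Index used to fetch a per-instance attribute.
 * divisor must be non-zero; a divisor of 0 means the attribute is
 * per-vertex and this rule does not apply. */
uint32_t instanced_fetch_index(uint32_t instance_id,
                               uint32_t divisor,
                               uint32_t start_instance)
{
    assert(divisor != 0);
    return instance_id / divisor + start_instance;
}
```

Note that the start instance is added after the division, so it offsets the fetched element directly rather than the instance counter.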
Christian König
e4ed58763a radeonsi: add instanceid support
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König
83df955ca9 radeon/llvm: move system value fetching to common code
This should be used by both SI and R600.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:42 +02:00
Michel Dänzer
c6efb4870b radeonsi: Handle arbitrary 2-byte formats in resource_copy_region
Fixes mplayer -vo vdpau OSD.

NOTE: This is a candidate for the 9.1 branch.

Reported-by: Igor Vagulin <igor.vagulin@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
2013-04-02 11:42:35 +02:00
Maarten Lankhorst
6d20c646d6 nvc0: Fix fd leak in nvc0_create_decoder
NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-04-02 10:25:26 +02:00
Vincent Lejeune
50fd9c4544 r600g/llvm: Update LLVM_REVISION.txt 2013-04-01 23:50:20 +02:00
Vincent Lejeune
8c8c4e3977 r600g/llvm: Use stack_size provided from llvm. 2013-04-01 23:43:57 +02:00
Vincent Lejeune
4ac0d85ca6 r600g/llvm: uses function attribute to pass shader type 2013-04-01 23:43:42 +02:00
Vincent Lejeune
af38695f51 r600g/llvm: Add support for cf_alu native encode 2013-04-01 23:43:27 +02:00
Brian Paul
1165ff1af1 llvmpipe: use triangle subdivision to avoid fixed-point overflow issues
If we're drawing to a surface that's 2048 x 2048 pixels or larger there's
danger of fixed-point overflow in the triangle rasterization code.  That
leads to various rendering glitches.

Rather than implement some intricate changes to the rasterization code,
simply subdivide triangles into smaller subtriangles to avoid the issue.
Only do this when the drawing surface is larger than 2048 by 2048.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
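
The subdivision idea in the commit above can be sketched in C as follows. This is an illustrative sketch, not the llvmpipe implementation: it assumes a split at the edge midpoints into four sub-triangles, a size test on the triangle's bounding box, and a recursion depth limit. All names here (`vec2`, `subdivide_tri`, `count_tri`, the `max_extent` parameter) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct vec2 { float x, y; };

/* Callback that receives each triangle small enough to rasterize. */
typedef void (*emit_tri_fn)(const struct vec2 v[3], void *user);

static float min3(float a, float b, float c)
{
    float m = a < b ? a : b;
    return m < c ? m : c;
}

static float max3(float a, float b, float c)
{
    float m = a > b ? a : b;
    return m > c ? m : c;
}

/* Larger side of the triangle's axis-aligned bounding box. */
static float tri_extent(const struct vec2 v[3])
{
    float w = max3(v[0].x, v[1].x, v[2].x) - min3(v[0].x, v[1].x, v[2].x);
    float h = max3(v[0].y, v[1].y, v[2].y) - min3(v[0].y, v[1].y, v[2].y);
    return w > h ? w : h;
}

static struct vec2 midpoint(struct vec2 a, struct vec2 b)
{
    struct vec2 m = { (a.x + b.x) * 0.5f, (a.y + b.y) * 0.5f };
    return m;
}

/* Emit v unchanged if it is small enough (or depth is exhausted);
 * otherwise split it into four triangles with the same winding. */
void subdivide_tri(const struct vec2 v[3], float max_extent, int depth,
                   emit_tri_fn emit, void *user)
{
    assert(emit != NULL);

    if (depth <= 0 || tri_extent(v) <= max_extent) {
        emit(v, user);
        return;
    }

    struct vec2 m01 = midpoint(v[0], v[1]);
    struct vec2 m12 = midpoint(v[1], v[2]);
    struct vec2 m20 = midpoint(v[2], v[0]);

    struct vec2 sub[4][3] = {
        { v[0], m01,  m20  },
        { m01,  v[1], m12  },
        { m20,  m12,  v[2] },
        { m01,  m12,  m20  },
    };

    for (int i = 0; i < 4; i++)
        subdivide_tri(sub[i], max_extent, depth - 1, emit, user);
}

/* Example callback: counts emitted triangles in an int pointed to by user. */
void count_tri(const struct vec2 v[3], void *user)
{
    (void)v;
    (*(int *)user)++;
}
```

Each split halves the bounding-box extent, so a small depth limit is enough in practice. The limit only guards against runaway recursion on degenerate input.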