Commit graph

19043 commits

Author SHA1 Message Date
Ilia Mirkin
a2061eea0f nv50: add vp3/vp4 support for mpeg2/vc1
h264/mpeg4 remain disabled for pre-nvc0, there's some minor
bug/difference which causes the decoding to hang after some frames.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-16 09:48:47 +02:00
Ilia Mirkin
b3f6f127f2 nv50: separate video logic from noalloc
The upcoming vp3 logic will want the video layout, but allocated by the
miptree.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-16 09:48:26 +02:00
Ilia Mirkin
c1a6f59b20 nv30: remove no-longer-used formats from table
Commit 14ee790df7 removed the formats from the vtxfmt_table but forgot
to also update the info_table.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-08-16 09:48:09 +02:00
Zack Rusin
7115bc3940 draw: handle nan clipdistance
If clipdistance for one of the vertices is nan (or inf) then the
entire primitive should be discarded.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-15 16:26:32 -04:00
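
The rule described in 7115bc3940 can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from the draw module: the helper name `prim_has_invalid_clipdist` and the flat array of per-vertex clip distances are hypothetical.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a Mesa function): returns true when any
 * per-vertex clip distance of a primitive is NaN or infinite, in which
 * case the whole primitive should be discarded rather than clipped. */
static bool prim_has_invalid_clipdist(const float *clipdist, size_t num_verts)
{
    for (size_t i = 0; i < num_verts; i++) {
        if (isnan(clipdist[i]) || isinf(clipdist[i]))
            return true;
    }
    return false;
}
```

A caller would run this check once per primitive before clipping and skip the primitive entirely when it returns true.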
Roland Scheidegger
6ca18e06ae gallivm: revert accidentally committed hunk
That magic wasn't meant to be committed, need to work on some proper fix.
2013-08-15 19:26:39 +02:00
Roland Scheidegger
5626a84a00 gallivm: do per-sample depth comparison instead of doing it post-filter
Doing the comparisons pre-filter is highly recommended by OpenGL (and d3d9)
and definitely required by d3d10.
This actually doesn't do it pre-filter but more "in-filter" as otherwise
need to push the comparisons even further down into fetch code and this
also trivially allows using a somewhat cheaper lerp.
Doing it pre-filter would actually have some performance advantage for UNORM
formats (because the comparisons should be done in texture format, we'd only
need to convert the shadow ref coord to texture format once, but in turn would
save converting the per-sample texture values to floats) but this gets a bit
messy as this has implications for border color handling as well (which needs
to be done prior to depth comparisons, hence would also need to convert border
color to texture format too or use some other tricks like doing separate border
color / shadow ref comparison and simply using that result directly when doing
border replacement).
Should make no difference for nearest filtering, and performance for linear
filtering should be mostly the same too (essentially have one more comparison
instruction per sample, and replace the sub/mul/add lerp with a sub/and/and/add
special "lerp" which all in all shouldn't be much of a difference).

v2: get rid of old code completely

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 18:42:20 +02:00
Michel Dänzer
3b2f3f90ac radeonsi: Pixel shaders pre-load one more SGPR
Acked-by: Marek Olšák <maraeo@gmail.com>
2013-08-15 17:55:00 +02:00
Michel Dänzer
f0753a3cd4 radeonsi: TGSI_SEMANTIC_CLIPVERTEX doesn't use any parameters
2013-08-15 17:54:40 +02:00
Michel Dänzer
2f98dc223f radeonsi: Don't export unused clip distance vectors from vertex shader
E.g. the Source engine seems to always write to gl_ClipVertex, but normally
doesn't enable any GL_CLIP_DISTANCEn states. This change removes some
irrelevant parts from the generated vertex shader code in such cases.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-15 17:53:50 +02:00
Michel Dänzer
b00269aa58 radeonsi: Don't leave gaps between position exports from vertex shader
If the vertex shader exports clip distances but not point size, use
position exports 1/2 instead of 2/3 for the clip distances. Fixes
geometry corruption in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-15 17:42:26 +02:00
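
The slot assignment described in b00269aa58 can be illustrated with a small C sketch. It assumes position export 0 always carries the vertex position and that clip distances occupy two consecutive exports. The struct and function names are hypothetical and do not come from the radeonsi driver.

```c
#include <stdbool.h>

/* Hypothetical result type: export index for each optional output,
 * or -1 when the vertex shader does not write it. */
struct pos_export_slots {
    int psize;       /* point size export */
    int clipdist_lo; /* first clip distance vector */
    int clipdist_hi; /* second clip distance vector */
};

/* Assign position exports without gaps. Export 0 is assumed to be the
 * vertex position, so optional outputs start at index 1. */
static struct pos_export_slots
assign_pos_exports(bool writes_psize, bool writes_clipdist)
{
    struct pos_export_slots s = { -1, -1, -1 };
    int next = 1;

    if (writes_psize)
        s.psize = next++;
    if (writes_clipdist) {
        s.clipdist_lo = next++;
        s.clipdist_hi = next++;
    }
    return s;
}
```

With clip distances but no point size, the clip distances land in exports 1/2. With both, point size takes export 1 and the clip distances take 2/3.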
Roland Scheidegger
abdd32dcd5 llvmpipe: fix stencil bug if we have both stencil and depth tests
This is a very well hidden bug found by accident (only the fixed glean
tstencil2 test so far seems to hit it).
We must use new mask with combined s_pass values and orig_mask values
for zpass/zfail stencil ops, otherwise both the sfail op and one of
zpass/zfail op are applied (probably not hit in most tests because
some of the ops tend to be KEEP usually).

Note: this is a candidate for the 9.2 branch.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 17:30:07 +02:00
Ilia Mirkin
4ea191fb2d nvc0: move video param and format support functions to nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
9255019a53 nvc0: move firmware loading functions to nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
9d8c076803 nvc0: move some of the simpler decoder functions into nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
73f4499a02 nvc0: move vp param filling logic into nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
e1cd987bb6 nvc0: move bsp param-filling logic into nouveau
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:48 +02:00
Ilia Mirkin
d6a82a7747 nvc0: move nvc0_decoder into nouveau, rename to nouveau_vp3_decoder
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:47 +02:00
Ilia Mirkin
86e5c3c97b nvc0: standardize on using #if for NVC0_DEBUG_FENCE
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:47 +02:00
Ilia Mirkin
b57875bbb3 nvc0: refactor video buffer management logic into nouveau_vp3
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:19:47 +02:00
Ilia Mirkin
940f7cec77 nv50: allow forcing PMPEG use, for ease of testing
This also allows people who don't want to install the binary blobs
required for VP2 to still get MPEG decoding.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:23 +02:00
Ilia Mirkin
ee3ca3614e nv30: hook up PMPEG support via nouveau_video, enables XvMC to work
Force the format to be the reasonable format that doesn't require an
inverse z-scan.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:12 +02:00
Ilia Mirkin
6010c683d0 nouveau: set buffer format of video buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:04 +02:00
Ilia Mirkin
8975f83402 nouveau: fix number of surfaces in video buffer, use defines
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-08-15 15:15:02 +02:00
Ilia Mirkin
14ee790df7 nv30: U8_USCALED only works for size 4
See https://bugs.freedesktop.org/show_bug.cgi?id=61635 for a sample
program. Changing it to use a vec4 makes it work. Remove the unsupported
formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>
2013-08-15 15:14:25 +02:00
Chia-I Wu
a453eb6f86 ilo: fix fragment shaders that use PCB on GEN7+
Missed this commit when preparing PCB changes for upstreaming.
2013-08-15 11:35:46 +08:00
Vinson Lee
ae645b83fc nouveau: Fix variable name.
Fixes build error introduced with commit
d1ba1055d9.

  CC     nouveau_video.lo
nouveau_video.c: In function 'nouveau_screen_get_video_param':
nouveau_video.c:866:33: error: 'screen' undeclared (first use in this function)
nouveau_video.c:866:33: note: each undeclared identifier is reported only once for each function it appear

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-08-14 17:35:31 -07:00
Marek Olšák
3d1b01662b radeonsi: unduplicate code in create_context
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
e801b78aa0 radeonsi: initialize the radeon_surface structure
this fixes valgrind warnings

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
731c6aa52d radeonsi: correct sampler function names
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
0469171159 radeonsi: rename r600_texture::dirty_db_mask to dirty_level_mask
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:03 +02:00
Marek Olšák
363b2805f7 radeonsi: rename r600_resource_texture to r600_texture
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:02 +02:00
Marek Olšák
128819d394 tgsi: add info about MSAA samplers to tgsi_shader_info
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:02 +02:00
Marek Olšák
0ee4bae70d tgsi: fix the location of sample index
The sample index is always in W.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-08-15 02:03:02 +02:00
Roland Scheidegger
7727fbb7c5 r600/radeonsi: implement new float comparison instructions
Also use ordered comparisons for old cmp instructions.

Tested-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Tom Stellard <tom@stellard.net>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
72874d2352 nv50: implement new float comparison instructions
untested.

Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
e858921d52 ilo: implement new float comparison instructions
untested.

Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
e58c2310b8 gallivm: already pass coords in the right place in the sampler interface
This makes things a bit nicer, and more importantly it fixes an issue
where a "downgraded" array texture (due to view reduced to 1 layer and
addressed with (non-array) samplec instruction) would use the wrong
coord as shadow reference value. (This could also be fixed by passing
target through the sampler interface much the same way as is done for
size queries, might do this eventually anyway.)
And if we'd ever want to support (shadow) cube map arrays, we'd need
5 coords in any case.

v2: fix bugs (texel fetch using wrong layer coord for 1d, shadow tex
using wrong shadow coord for 2d...). Plus need to project the shadow
coord, and just for fun keep projecting the layer coord too.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
d4b43cedb6 gallivm: change coordinate handling throughout functions
Instead of passing s,t,r coordinates pass a coord array - the reason is that
I need to pass more coords (in particular for shadow "coord", future will also
need another one for cube map arrays) so just pass them as an array.
Also, to simplify things, use a fixed location for the shadow reference value; I
want to get rid of the silly "where is the right coord value" game.
Keep old-style however for aos sampling (which is not going to need shadow
coord, though for cube map arrays it still would need fixing).
(Next patch will pass those through using the new arrangement directly from
sampler interface.)

v2: fix up soa split path (unreachable currently but still...)

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 00:40:14 +02:00
Roland Scheidegger
c6c55ad3e9 gallivm: fix border color with normalized texture formats
We need to put border color into texture format color space which
essentially means clamping for non-float, normalized formats (not entirely
sure if we're also meant to quantize the float but it's probably ok not to
do it thankfully).
For OpenGL we could do this easily outside generated code due to the
1:1 sampler/texture correspondence but not for d3d10 which is terrible
(as we recalculate a constant over and over again per shader invocation).
Fortunately border color should be rare enough that we don't care THAT much.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-15 00:40:14 +02:00
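
The clamping mentioned in c6c55ad3e9 might look like the following scalar C sketch. The real change operates on generated LLVM IR, so this only shows the arithmetic. The enum and function names are hypothetical, and the ranges [0, 1] for unsigned normalized and [-1, 1] for signed normalized formats are the usual definitions, assumed here.

```c
/* Hypothetical classification of a texture channel format. */
enum channel_kind {
    CHANNEL_FLOAT, /* float formats: border color left untouched */
    CHANNEL_UNORM, /* unsigned normalized: representable range [0, 1] */
    CHANNEL_SNORM  /* signed normalized: representable range [-1, 1] */
};

/* Bring one border color channel into the range the texture format
 * can represent. No quantization is done, only clamping. */
static float clamp_border_channel(float c, enum channel_kind kind)
{
    switch (kind) {
    case CHANNEL_UNORM:
        return c < 0.0f ? 0.0f : (c > 1.0f ? 1.0f : c);
    case CHANNEL_SNORM:
        return c < -1.0f ? -1.0f : (c > 1.0f ? 1.0f : c);
    default:
        return c;
    }
}
```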
Zack Rusin
27cedd8aec llvmpipe: fix pipeline statistics with a null ps
If the fragment shader is null then pixel shader invocations have
to be equal to zero. And if we're running a null ps then clipper
invocations and primitives should be equal to zero but only
if both stencil and depth testing are disabled.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-14 18:23:36 -04:00
Zack Rusin
a3ae5dc7dd draw: make sure that the stages setup outputs
Calling the prepare outputs cleans up the slot assignments
for outputs, unfortunately aapoint and aaline didn't have
code to reset their slots after the initial setup, this
was messing up our slot assignments. The unfilled stage
was just missing the initial assignment of the face slot.
This fixes all of the reported piglit failures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-14 18:23:35 -04:00
Rico Schüller
d1ba1055d9 vl: Add support for max level query v2
This patch adds the level query support to the video decoders
and uses some more reasonable defaults.

v2: (ck) add commit message

Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-14 13:20:01 +02:00
Jon Severinsson
9298f537a7 radeon/llvm: Add missing "%s" format string to fprintf.
This fixes a compilation warning with -Wformat-security.

CC: "9.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-13 19:18:14 -07:00
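
The class of warning fixed in 9298f537a7 can be shown with a short C sketch. The wrapper name `log_message` is hypothetical and is not the function touched by the commit. The point is only that a non-literal string should be passed as an argument to "%s" rather than as the format itself.

```c
#include <stdio.h>

/* Hypothetical wrapper: print a message that is not a string literal.
 * Writing fprintf(out, msg) instead would interpret any '%' sequence
 * inside msg as a conversion, which -Wformat-security warns about. */
static int log_message(FILE *out, const char *msg)
{
    return fprintf(out, "%s", msg);
}
```

Because the message is printed verbatim, the return value equals the length of the message whenever the write succeeds, even if the message itself contains text such as "%d".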
Vadim Girlin
17bb96b03d r600g/sb: use MULADD workaround on R7xx for MULADD_IEEE
Looks like the same issue that was seen with MULADD in trans slot on
R7xx also affects MULADD_IEEE (maybe all OP3 instructions, and MULADD is
just the most frequently used?). So the workaround is to not allow affected
instructions to be placed into the trans slot.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-14 01:03:18 +04:00
Roland Scheidegger
6991f86945 gallivm: implement new float comparison instructions returning integer masks
FSEQ/FSGE/FSLT/FSNE work just the same as SEQ/SGE/SLT/SNE except skip the
select.
And just for consistency use the same appropriate ordered/unordered comparisons
for the old opcodes as well.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Roland Scheidegger
0930082ffd tgsi: implement new float comparison instructions returning integer masks
Also while here add a bunch of other forgotten (integer) instructions to
tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing
away unused input components), though it may still be incomplete.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Roland Scheidegger
e7a5bf7a34 gallium: add new float comparison instructions returning integer masks
Newer graphic languages don't want messy float mask results but instead true
"boolean" mask results for float comparisons. Otherwise just need to convert
the floats back to integers. Need to keep the old opcodes however due to both
legacy (gl and d3d9) needing them and because older hw can't really deal with
integers. These new FSEQ/FSGE/FSLT/FSNE opcodes are part of integer API and
hence must be supported if a driver claims to support glsl 1.30 (or
PIPE_SHADER_CAP_INTEGERS).

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
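
The difference between the old and new opcodes from e7a5bf7a34 can be sketched per channel in C. The function names are hypothetical scalar stand-ins for the TGSI opcodes SLT and FSLT, and `select_bits` is an illustrative consumer of the mask, not part of gallium.

```c
#include <stdint.h>

/* Old-style SLT: the comparison result is a float, 1.0 or 0.0. */
static float op_slt(float a, float b)
{
    return a < b ? 1.0f : 0.0f;
}

/* New-style FSLT: the comparison result is an integer mask,
 * all bits set for true and zero for false. */
static uint32_t op_fslt(float a, float b)
{
    return a < b ? UINT32_MAX : 0u;
}

/* An integer mask can drive a bitwise select directly, with no
 * float-to-integer conversion in between. */
static uint32_t select_bits(uint32_t mask, uint32_t if_true, uint32_t if_false)
{
    return (mask & if_true) | (~mask & if_false);
}
```

For example, `select_bits(op_fslt(1.0f, 2.0f), 7, 9)` picks 7, whereas the float result of `op_slt` would first have to be converted back to an integer condition.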
Chia-I Wu
3b6cee1634 ilo: enable dumping of WM PCB
It was disabled because it wasn't supported.
2013-08-13 16:28:24 +08:00
Chia-I Wu
0f8a86682f ilo: no binding table change when constants are pushed
When constants can be pushed, and nothing else requires new SURFACE_STATEs,
there is no need to emit BINDING_TABLE_STATE.
2013-08-13 16:26:03 +08:00
Chia-I Wu
c6e1e0157b ilo: support push constant model in shaders
Source constants from URB constant data when the constant data can fit in the
PCB.
2013-08-13 16:04:35 +08:00