Commit graph

21646 commits

Author SHA1 Message Date
Francisco Jerez
923c72982e clover/util: Implement compat::string using aggregation instead of inheritance. 2014-09-05 09:27:20 +03:00
Francisco Jerez
7c1e6d582c clover/util: Have compat::vector track separate size and capacity.
In order to make the behaviour of resize() and reserve() closer to the
standard.

Reported-by: EdB <edb+mesa@sigluy.net>
2014-09-05 09:27:20 +03:00
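
The size/capacity split described in 7c1e6d582c can be sketched in plain C. This is only an analogue, not clover's C++ compat::vector: the names demo_vec, demo_vec_reserve and demo_vec_resize are hypothetical, and the element type is fixed to int for brevity. The point it illustrates is that reserving storage grows the capacity without changing the element count, while resizing changes the element count.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable int array that tracks size and capacity separately. */
struct demo_vec {
    int *data;
    size_t size;      /* number of elements in use */
    size_t capacity;  /* number of elements allocated */
};

/* Grow the storage to hold at least n elements; the element count is
 * left unchanged. Returns 0 on success, -1 on failure. */
int demo_vec_reserve(struct demo_vec *v, size_t n)
{
    int *p;

    if (n <= v->capacity)
        return 0;
    if (n > (size_t)-1 / sizeof(int))
        return -1;                      /* n * sizeof(int) would overflow */
    p = realloc(v->data, n * sizeof(int));
    if (p == NULL)
        return -1;
    v->data = p;
    v->capacity = n;
    return 0;
}

/* Set the element count to n, zero-filling any newly exposed elements.
 * Returns 0 on success, -1 on failure. */
int demo_vec_resize(struct demo_vec *v, size_t n)
{
    if (demo_vec_reserve(v, n) != 0)
        return -1;
    if (n > v->size)
        memset(v->data + v->size, 0, (n - v->size) * sizeof(int));
    v->size = n;
    return 0;
}
```

Starting from an empty demo_vec, reserving 8 elements leaves size at 0 and capacity at 8; a later resize to 3 sets size to 3 and keeps the capacity at 8.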
Francisco Jerez
995f7b37da clover: Use conversion operator to initialize build log from compat::string.
Fixes binary garbage in the compilation logs caused by
compat::string::c_str() not being null-terminated (which is a bug on
its own that will be fixed in another commit).

Reported-by: EdB <edb+mesa@sigluy.net>
2014-09-05 09:27:20 +03:00
Rob Clark
5d8f40a53a freedreno/ir3: fix constlen with relative addressing
We can't rely on the value from the assembler if relative addressing is
used.  So instead use the max of declared-consts (which does not include
compiler immediates) and what we get from the assembler (which does).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-04 22:28:50 -04:00
Rob Clark
73ff4c5f70 freedreno/ir3: fix error in bail logic
all_delayed will also be true if we didn't attempt to schedule anything
due to no more instructions using current addr/pred.  We rely on coming
in to block_sched_undelayed() to detect and clean up when there are no
more uses of the current addr/pred, which isn't necessarily an error.

This fixes a regression introduced in b823abed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-04 22:28:50 -04:00
Rob Clark
08ee0488e6 freedreno/ir3: bit of debug
Make it easier to figure out which compiler stage failed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-04 22:28:50 -04:00
Eric Anholt
4bca922878 vc4: Merge qcompile and tgsi_to_qir
The split between these two didn't make much sense.  I'm going to want the
chance to look at uniform contents in optimization passes, and the QPU
emit I think is going to end up rewriting the uniforms stream.
2014-09-04 17:00:54 -07:00
Eric Anholt
55d2a16262 vc4: Add a CSE optimization pass.
Debugging a regression in discard support was just too full of duplicate
instructions, so I decided to remove them instead of re-analyzing each of
them as I dumped their outputs in simulation.
2014-09-04 11:39:51 -07:00
Eric Anholt
80b27ca2cd vc4: Switch to using native integers.
There were troubles with bools without using native integers
(st_glsl_to_tgsi seemed to think bool true was 1.0f sometimes, when as a
uniform it's stored as ~0), and since I've got native integers other than
divide, I might as well just support them.
2014-09-04 11:39:51 -07:00
Eric Anholt
874dfa8b2e vc4: Expose compares at a lower level in QIR.
Before, we had some special opcodes like CMP and SNE that emitted multiple
instructions.  Now, we reduce those operations significantly, giving
optimization more to look at for reducing redundant operations.

The downside is that QOP_SF is pretty special -- we're going to have to
track it separately when we're doing instruction scheduling, and we want
to peephole it into the instruction generating the destination write in
most cases (and not allocate the destination reg, probably.  Unless it's
used for some other purpose, as well).
2014-09-04 11:39:51 -07:00
Eric Anholt
3972a6f057 vc4: Stop being so clever in CMP handling.
This kind of cleverness should be in a general merging-of-ADD-and-MUL
instruction scheduler, rather than individual opcodes.
2014-09-04 11:39:51 -07:00
Marek Olšák
3dbf55c1be r600g,radeonsi: make sure there's enough CS space before resuming queries
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83432

Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-04 16:15:21 +02:00
Marek Olšák
8bd6723179 Revert "r600g,radeonsi: initialize HTILE to fully-expanded state"
This reverts commit f05fe294e7.

Apparently the hw doesn't like this. Revert to the "cleared" state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83418
2014-09-04 15:48:38 +02:00
Thomas Hellstrom
2d6206140a winsys/svga: Fix incorrect type usage in IOCTL v2
While similar in layout, the size of the SVGA3dSize type may be smaller than
the struct drm_vmw_size type that is part of the ioctl interface. The kernel
driver could accordingly overwrite a memory area following the size variable
on the stack. Typically that would be another local variable, causing
breakage in, for example, ubuntu 12.04.5 where the handle local variable
becomes overwritten.

v2: Fix whitespace errors

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: "10.1 10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-04 14:31:52 +02:00
Carl Worth
c35f14f368 Eliminate several cases of multiplication in arguments to calloc
In commit 32f2fd1c5d, several calls to
_mesa_calloc(x) were replaced with calls to calloc(1, x). This is strictly
equivalent to what the code was doing previously.

But for cases where "x" involves multiplication, now that we are explicitly
using the two-argument calloc, we can do one step better and replace:

	calloc(1, A * B);

with:

	calloc(A, B);

The advantage of the latter is that calloc will detect any overflow that would
have resulted from the multiplication and will fail the allocation (whereas
the former would return a small allocation). So this fix can change
potentially exploitable buffer overruns into segmentation faults.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-09-03 18:37:02 -07:00
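
The difference described in c35f14f368 can be shown with a small sketch. The helper names alloc_array_checked and wrapped_product are hypothetical and do not come from Mesa; the sketch assumes a conforming calloc that rejects a count/size pair whose product does not fit in size_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zeroed array of count elements of
 * size bytes each. Passing both factors lets calloc check the product. */
void *alloc_array_checked(size_t count, size_t size)
{
    return calloc(count, size);
}

/* The value that calloc(1, count * size) would have been asked for.
 * Unsigned arithmetic wraps modulo SIZE_MAX + 1, so a huge request can
 * silently turn into a small one. */
size_t wrapped_product(size_t count, size_t size)
{
    return count * size;
}
```

With count = SIZE_MAX / 2 + 1 and size = 4, the product wraps to 0, so the one-argument multiplication form would request a tiny allocation, whereas the two-argument form is expected to return NULL.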
Michel Dänzer
58b386dce4 gallivm: Fix build against LLVM SVN >= r216982
Only MCJIT is available anymore.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-03 09:15:01 -07:00
Marek Olšák
8abdc3c4a9 r600g: fix alpha-test with HyperZ enabled, fixing L4D2 tree corruption
*_update_db_shader_control depends on the alpha test state. The problem was
it was in a block which is only entered if the pixel shader is changed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74863

Cc: mesa-stable@lists.freedesktop.org
Tested-by: Benjamin Bellec <b.bellec@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-03 11:50:21 +02:00
Michel Dänzer
2adf7ee92e r600g,radeonsi: Preserve existing buffer flags
The default case was accidentally clearing RADEON_FLAG_CPU_ACCESS from the
previous fall-through cases.

Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-03 12:49:59 +09:00
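
The kind of bug fixed in 2adf7ee92e can be sketched as follows. This is not the radeon code: the enum values, flag names and demo_buffer_flags are hypothetical, and the sketch only assumes that one case sets a CPU-access flag and then falls through to the default case.

```c
/* Hypothetical flag bits and usage hints, for illustration only. */
enum {
    DEMO_FLAG_CPU_ACCESS     = 1u << 0,
    DEMO_FLAG_WRITE_COMBINED = 1u << 1
};

enum demo_usage {
    DEMO_USAGE_DEFAULT,
    DEMO_USAGE_STREAM
};

unsigned demo_buffer_flags(enum demo_usage usage)
{
    unsigned flags = 0;

    switch (usage) {
    case DEMO_USAGE_STREAM:
        flags |= DEMO_FLAG_CPU_ACCESS;
        /* fall through */
    default:
        /* OR the bit in: a plain assignment ("flags = ...") here would
         * discard the CPU-access bit set by the case above. */
        flags |= DEMO_FLAG_WRITE_COMBINED;
        break;
    }
    return flags;
}
```

Because the default case uses |= rather than =, a DEMO_USAGE_STREAM buffer ends up with both bits set.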
Dave Airlie
8380b894ad r300g: pointless assignment of info.indexed
Did this code mean to do something else, you tell me!

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-03 07:59:09 +10:00
Dave Airlie
2b24e58310 omx/h264: remove stray semicolon after if
Coverity reported this, looks wrong to me.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-03 07:58:58 +10:00
Dave Airlie
f4ccf687a6 vdpau: unlock the mutex on error paths in attribute setting.
Coverity pointed out we never dropped the lock here, so fix
it by using a common exit path.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-03 07:58:50 +10:00
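
The "common exit path" pattern mentioned in f4ccf687a6 might look like the sketch below. It is not the VDPAU state tracker code: demo_mixer, demo_status and demo_set_brightness are hypothetical names, and the valid range of [-1, 1] is an assumption made for the example.

```c
#include <pthread.h>

/* Hypothetical status codes and object, for illustration only. */
enum demo_status { DEMO_OK = 0, DEMO_INVALID_VALUE = 1 };

struct demo_mixer {
    pthread_mutex_t lock;
    float brightness;   /* assumed valid range: [-1, 1] */
};

enum demo_status demo_set_brightness(struct demo_mixer *m, float value)
{
    enum demo_status ret = DEMO_OK;

    pthread_mutex_lock(&m->lock);

    if (value < -1.0f || value > 1.0f) {
        ret = DEMO_INVALID_VALUE;
        goto out;       /* the error path still reaches the unlock */
    }
    m->brightness = value;

out:
    /* Single exit point: every path above releases the mutex. */
    pthread_mutex_unlock(&m->lock);
    return ret;
}
```

Returning directly from the error branch instead of jumping to out would leave the mutex held, and the next call on the same object would block.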
Eric Anholt
2da9118852 u_primconvert: Use u_upload_mgr for our little IB allocations.
tex-miplevel-selection was hammering my memory manager with primconverts
on individual quads.  This gets all those converted IBs packed into larger
IBs.

Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-09-02 13:55:15 -07:00
Eric Anholt
6720d1573a u_primconvert: Shut up compiler warning.
gcc isn't detecting that src is set before used, since both are under if
(info->indexed).

Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-09-02 13:55:15 -07:00
Eric Anholt
1718ba30e5 gbm: Fix gallium build when X11 is in a non-system directory
pipe-loader.h will include Xlib.h when HAVE_PIPE_LOADER_XLIB is set in the
build.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-09-02 13:48:54 -07:00
Eric Anholt
d71a9b7d9d vc4: Handle a couple of the transfer map flags.
This is part of fixing extremely long runtimes on some piglit tests that
involve streaming vertex reuploads due to format conversions, and will
similarly be important for X performance, which relies on these flags.
2014-09-02 12:10:56 -07:00
Michel Dänzer
a75fee78c6 radeonsi: Compile dummy pixel shader on demand
It's never used under normal circumstances.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-02 15:24:07 +09:00
Michel Dänzer
b84b9eae20 u_blitter: Create all shaders on demand
Not all of these are used in every context, so this can make a
significant difference for short-lived contexts such as in piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-02 15:24:07 +09:00
Michel Dänzer
51131c423c r600g,radeonsi: Inform the kernel if a BO will likely be accessed by the CPU
This allows the kernel to prevent such BOs from ever being stored in the
CPU inaccessible part of VRAM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-09-02 15:24:07 +09:00
Dave Airlie
19f6e80a1e nouveau: don't leak dec struct on error
This one path doesn't goto fail, so it seems to leak dec.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 10:08:58 +10:00
Dave Airlie
32a8b2cf54 xvmc/tests: %C isn't a valid printf specifier.
Reported-by: Coverity scanner.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 10:07:54 +10:00
Dave Airlie
ea88b1de2f nouveau/nv40: quieten coverity warning in unused vertex texture code.
This fixes the code, but we never run it anyway, so silence coverity.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-02 10:04:29 +10:00
Ilia Mirkin
d0cd86686d nv50: remove unused variables
Recent code changes have caused these to no longer be used. Remove them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-09-01 18:47:42 -04:00
Ilia Mirkin
2c44043313 nv50: attach the buffer bo to the miptree structures
The current code... makes no sense. Use nouveau_bo_ref to attach the bo
to the exposed resource so as to have the proper lifetime guarantees.

Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
9d52e551a5 nv50: mt address may not be the underlying bo's start address
With VP2, nv50_miptree is faked because the underlying bo's have to be
laid out in a certain way. This is done by adjusting the address. Make
sure that blits (and everything else for consistency) use the mt address
rather than the bo address as a base.

This fixes retrieving chroma plane with VDPAU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
2528d402b9 nv50: set the miptree address when clearing bo's in vp2 init
The mt address is about to be used more, make sure it's set
appropriately.

Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
6c2b079231 nv50/ir: avoid creating instructions that can't be emitted
When constant folding a MAD operation, we first fold the multiply and
generate an ADD. However we do so without making sure that the immediate
can be handled in the saturate case. If it can't, load the immediate in
a separate instruction.

Reported-by: Tiziano Bacocco <tizbac2@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
115d9a5525 nvc0: don't make 1d staging textures linear
Experimentally, the sampler doesn't appear to like these, neither as
buffer nor as rect textures. So remove 1D from the list of texture types
to make linear when used for staging.

This fixes the OSD in mplayer for VDPAU.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
362cd26960 nv50: zero out unbound samplers
Samplers are only defined up to num_samplers, so set all samplers above
nr to NULL so that we don't try to read them again later.

Tested-by: Christian Ruppert <idl0r@qasl.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
Ilia Mirkin
c4bb436f76 nvc0/ir: avoid infinite recursion when finding first uses of tex
In certain circumstances, findFirstUses could end up doubling back on
instructions it had already processed, resulting in an infinite
recursion. Avoid this by keeping track of already-visited instructions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
2014-09-01 18:38:02 -04:00
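
The idea in c4bb436f76 of tracking already-visited instructions can be sketched with a simplified traversal. This is not the nvc0 findFirstUses algorithm: demo_insn, demo_count_uses and the fixed-size users array are hypothetical, and the sketch only counts reachable instructions that read the texture result.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_USERS 4

/* Hypothetical instruction node; users may form cycles (e.g. loops). */
struct demo_insn {
    int id;                                  /* index into visited[] */
    bool reads_tex;                          /* reads the texture result */
    size_t num_users;
    struct demo_insn *users[DEMO_MAX_USERS];
};

/* Count the instructions reachable from insn that read the texture
 * result. visited[] needs one zero-initialised entry per instruction. */
int demo_count_uses(const struct demo_insn *insn, bool *visited)
{
    int count;
    size_t i;

    if (visited[insn->id])
        return 0;               /* already processed: stop doubling back */
    visited[insn->id] = true;

    count = insn->reads_tex ? 1 : 0;
    for (i = 0; i < insn->num_users; i++)
        count += demo_count_uses(insn->users[i], visited);
    return count;
}
```

Without the visited check, two instructions that list each other as users would make the recursion never terminate; with it, each instruction is examined at most once.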
Rob Clark
ef858ac770 freedreno/ir3: add DDX/DDY
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-01 18:08:21 -04:00
Rob Clark
5e5604cc28 freedreno/ir3: don't keep IR around
Once we've assembled the shader, no need to keep the intermediate
around.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-01 18:08:21 -04:00
Marek Olšák
a10c8db715 radeonsi: implement EXPCLEAR optimization for depth
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:52 +02:00
Marek Olšák
f05fe294e7 r600g,radeonsi: initialize HTILE to fully-expanded state
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:52 +02:00
Marek Olšák
573313c94e radeonsi: implement fast depth clear
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
63cb4077e6 radeonsi: move DB_RENDER_CONTROL into draw_vbo
So that I can add fast depth clear.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
78aa717601 radeonsi: disable occlusion queries if they are not needed
We always left them enabled, which turned off HiZ in some cases.
This should improve performance with Hyper-Z.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
ab9ad91779 r600g,radeonsi: force fast stencil and HTILE stencil off, fixing a Hyper-Z hang
This should be as fast as no HTILE for stencil. I think we can still get full
performance with depth-only rendering even if stencil is present in the buffer
but not used, but I'm not 100% sure. This may be revisited when HiS and fast
stencil clear are implemented.

This fixes a hang in Brutal Legend.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák
ba14d4910c r600g: set VGT_ENHANCE=4 on R7xx
This is a golden setting on RV740, but there is a hw bug which recommends
setting it on all R7xx chipsets.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:49 +02:00
Marek Olšák
13b93596da r600g: expose AMD_vertex_shader_layer and *_viewport_index on R600-R700
already implemented

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:45 +02:00
Marek Olšák
d159c5e3e0 r600g: fix layered clear
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:42 +02:00