Commit graph

27608 commits

Author SHA1 Message Date
Dave Airlie
3a26ef23e7 gallivm: convert size query to using a set of parameters.
This isn't currently that easy to expand, so fix it up
before expanding it later to include dynamic samplers.

[airlied: use some local variables (Roland)]

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-04-19 07:33:39 +10:00
Tim Rowley
3227c10270 swr: dereference cbuf/zbuf/views on context destroy
Fixes resource memory leaks.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-18 15:52:26 -05:00
Rob Clark
77a9107bf2 freedreno/ir3: fix grouping issue w/ reverse swizzles
When we have something like:

   MOV OUT[n], IN[m].wzyx

the existing grouping code was missing a potential conflict.  Due to
input needing to be sequential scalar regs, we have:

 IN:  x <-> y <-> z <-> w

which would be grouped to:

 OUT: w <-> z2 <-> y2 <-> x  (where the 2 denotes a copy/mov)

but that can't actually work.  We need to realize that x and w are
already in the same chain, not just that they aren't both already in
new chain being built.

With this fixed, we probably no longer need the hack from f68f6c0.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-18 15:41:32 -04:00
Marek Olšák
ed66c75784 radeonsi: use enums in si_shader.h
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
0c52caf7b7 gallium/radeon: use enums in r600_query.h
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
dd9ca77cb9 radeonsi: always use PFP_SYNC_ME when doing flushes and waits
This is typically used by the closed driver before SURFACE_SYNC.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
1db5678688 radeonsi: don't do VS/PS partial flushes if SURFACE_SYNC waits too
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
58494b42b5 radeonsi: add safety assertions for meta cache flushes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
78f58a4e6f radeonsi: don't use ACQUIRE_MEM on the graphics ring
It's only required on the compute ring. This matches the closed driver.

The compute flag is removed to prevent confusion and Bas's compute shader
patches remove it in the whole function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
3faecdd4e1 radeonsi: remove TODO and correct a comment in si_emit_cache_flush
Yes, that flag is really needed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:25 +02:00
Marek Olšák
28c2573b4f radeonsi: don't flush CB/DB caches for performance counters
I'm not sure about this. This will make the engines go idle, but the caches
will be unflushed. This should match app behavior without performance
counters, which can be a good thing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák
97c328b2a3 gallium/radeon: don't flush CB/DB caches for timestamp queries
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák
6dc21b1962 gallium/util: fix undefined shift to the last bit in u_bit_scan
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-18 19:51:24 +02:00
Marek Olšák
9434aa8103 gallium/util: fix u_bit_scan_consecutive_range for mask == 0xffffffff
The second ffs returns 0, yielding count == -1.

v2: change 1 to 1u

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-18 19:51:24 +02:00
Marek Olšák
e50e1f86b0 gallium/radeon: fix Nine with its slightly shifted viewports
just need to do the calculation in floating-point and then round things
properly

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2016-04-18 19:51:24 +02:00
Eric Anholt
48fe53bbb9 vc4: Add support for rendering to cube map surfaces.
We need to fix up the offset to point at the face of the cube.  Fixes
piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test.

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-18 10:10:44 -07:00
Eric Anholt
21a9ed6207 vc4: Don't flush on read-only access of buffers read by the CL.
Fixes piglit mixed-immediate-and-vbo, and may significantly improve
performance of applications that store a 4-byte IB in the same VBO as
vertex data.
2016-04-18 10:10:44 -07:00
Eric Anholt
9e8a8b0c8b vc4: Sanity check that flushes don't happen between state emit and draw.
Catches the cause of failure in
arb_vertex_buffer_object-mixed-immediate-and-vbo, I've had this class of
failure before, and it probably won't be the last time.
2016-04-18 10:10:44 -07:00
Eric Anholt
56b14adf85 vc4: Sanity check strides for imported BOs.
If we're going to sample from or render to them at some particular size,
we'd better make sure that they actually are that size.  Causes some tests
under simulation to generate appropriate error messages instead of
failures.
2016-04-18 10:10:44 -07:00
Oded Gabbay
d3c98c73dc r600g: Move R600_BIG_ENDIAN to r600_pipe_common.h
I need to do this so I could use R600_BIG_ENDIAN in files which include
r600_pipe_common.h but not r600_pipe.h

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-18 09:50:08 +03:00
Oded Gabbay
72d0d2ba59 r600g: fix code indentation
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-18 09:50:08 +03:00
Roland Scheidegger
d11111a551 gallivm: don't use vector selects with llvm 3.7
llvm 3.7 sometimes simply miscompiles vector selects.
See https://bugs.freedesktop.org/show_bug.cgi?id=94972

This was fixed in llvm r249669
(https://llvm.org/bugs/show_bug.cgi?id=24532).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-18 00:23:34 +02:00
Marek Olšák
c1a2fe7fd1 gallium/radeon: handle vertex shaders that disable clipping & viewport
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-16 00:21:15 +02:00
Roland Scheidegger
64d3ae09b7 llvmpipe: (trivial) initialize src1_alpha var to NULL
The blend code would do a conditional assignment based on it, causing valgrind
to complain. Since that variable was actually unused in this case, this
doesn't fix anything but the warning.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94955
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-04-15 22:51:28 +02:00
Tim Rowley
082f6d75ae gallium/swr: confine c++11 flag to swr driver
On the philosophy that a driver shouldn't change the compile flags
for the entire tree, take the clove approach of moving the c++11 flag
to the swr driver directory.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-04-15 14:43:01 -05:00
Tim Rowley
ee72fec9cf gallium/swr: allow swr use as a swrast dri driver
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-15 14:21:50 -05:00
Eric Anholt
f6d21bcd6b vc4: Fix subimage accesses to LT textures.
This code started out like the T case, iterating over utile offsets, but I
had partially switched it to iterating over pixel offsets.  I hadn't
caught this before because it's unusual to do piecemeal uploads to small
textures.

Fixes bad text rendering in QT5 apps, which use a 256x16 glyph cache.
Also fixes 6 piglit tests related to glTexSubImage() and
glGetTexSubImage().

Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-15 11:57:17 -07:00
Leo Liu
8f4340c5e6 radeon/uvd: fix tonga feedback buffer size
This only applies to tonga

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-04-14 19:33:44 -04:00
Nicolai Hähnle
a17911ceb1 gallium/radeon: handle failure when mapping staging buffer
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-14 16:29:23 -05:00
Nicolai Hähnle
8bd0f0df50 radeonsi: mark ssbo and images descriptor pointers dirty at beginning of CS
Without this, we were getting non-deterministic VM faults under high pressure.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-14 16:29:23 -05:00
Samuel Pitoiset
bb4cdee9a4 nvc0: do not break the universe on GK110+
I removed that return 0 by mistake. Ooops.

Fixes: 6e23fd4 ("nvc0: allow to use compute support on GM200")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-14 21:57:21 +02:00
Samuel Pitoiset
6e23fd420d nvc0: allow to use compute support on GM200
This works like a charm but please not that NVF0_COMPUTE have to be set
because compute support is still not enabled by default on GK110+. This
will require more testing to make sure it won't break the 3D state.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-04-14 21:01:51 +02:00
Emil Velikov
bb949e262c gallium/swr: fold the almost identical Makefiles
Rather than having two almost identical Makefiles, with various VPATH
hacks just fold them, using COMMON_* variables and actually getting
things buildable/shipable.

v2: whitespace fixes, remove Makefile.sources-arch

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-04-14 16:30:57 +01:00
Marek Olšák
112291964e radeonsi: don't overwrite the scratch offset in shader prologs
Prologs only look at num_input_sgprs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák
ffe44d0283 radeonsi: fold num_user_sgprs where it is possible
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák
51c4034f9b radeonsi: fix SGPRS calculation once more
This fixes GS piglit failures after adding SI_PARAM_SHADER_BUFFERS,
which bumped NUM_USER_SGPRS and uncovered this bug on SI.

If this was fixed in LLVM, these workarounds wouldn't be needed.

LLVM would have to look at the calling convention to know how many SGPR
inputs are declared, and add VCC and the scratch wave offset (which is
enabled even if we spill SGPRs but not VGPRs, oh well).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-04-14 17:00:14 +02:00
Marek Olšák
aaf5be4a29 radeonsi: disable hw ETC2 on Polaris
not supported by hw directly, but it's still fully supported by the driver

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-14 16:58:59 +02:00
Jose Fonseca
50ddf03ada scons: Add a "check" target to run all unit tests.
Except:
- u_cache_test -- too long
- translate_test -- unreliable (it's probably testing corner cases that
  translate module doesn't care about.)

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Jose Fonseca
9ae0e8ee3c test/unit: Make translate_test invoke translate_create by default.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Jose Fonseca
f8a51034bd test/unit: Make pipe_barrier_test actually check correct bahavior.
So it can run unattended.

Also make it silent by default.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-04-14 07:19:04 +01:00
Michel Dänzer
171a570f38 clover: Fix build against LLVM SVN >= r266163
createInternalizePass now takes a callback instead of a StringSet.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-04-14 11:53:41 +09:00
Jason Ekstrand
b63a98b121 nir/dead_variables: Configurably work with any variable mode
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-13 15:45:10 -07:00
George Kyriazis
f69a61b1aa gallium/swr: Make flat shading tris work.
- Incorporate flatshade flag into the shader generation
- Use provoking vertex (vc) in shader when flat shading.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-04-13 13:46:37 -05:00
Rob Clark
c53a12fedc Revert "freedreno/a4xx: better occlusion/sample counting"
This reverts commit 62fa868728.

dEQP-GLES3.functional.occlusion_query.* was unhappy about that change.
Still not really sure *what* the other slots in the sample results
buffer are.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:40 -04:00
Rob Clark
46e9bbc918 freedreno/a4xx: rasterizer_discard support
This one is slightly annoying, since trying to write RBRC from draw
would clobber values set in the tiling/gmem code.  We could do command-
stream patching for RBRC, as is done on a3xx.  Although since it seems
to be a rarely used feature, it is easier just to do RMW to set/clear
the bit.

Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles
and related tests.

a3xx still needs the same feature, although there it probably makes more
sense to take advantage of the existing cmdstream patching which is
required for RBRC for other reasons.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:21 -04:00
Rob Clark
216225ce57 freedreno/ir3: fix array textures on a4xx
Seems like a4xx needs offset added to array index for all arrays,
whereas a3xx only for cubemap arrays.  Fixes a whole swath of dEQP fails
(roughly *sampler2darray*).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:14 -04:00
Rob Clark
7e93b26b5d freedreno: fix stream-out offset handling for lines/tris
We need to increment offset by # of vertices, not by # of prims.  Fixes
a bunch of dEQP fails involving prims other than points.  For example,
dEQP-GLES3.functional.transform_feedback.position.lines_separate

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:16:02 -04:00
Rob Clark
6ca6e80f61 freedreno: fix handling for stream-out offsets
If changed && append, we shouldn't be resetting the internal offset back
to zero.  This fixes issues w/ sequences like:

   glBeginTransformFeedback()
   glDraw()
   glPauseTransformFeedback()
   glDraw()
   glResumeTransformFeedback()
   glDraw()
   glEndTransformFeedback()

Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3
and related tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:54 -04:00
Rob Clark
0a4b0fc315 freedreno: fix prims-emitted query
This should only count when TF is not paused.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:47 -04:00
Rob Clark
a7eb12d089 freedreno: fix max-line-width
dEQP noticed that we were advertising completely bogus values.  The
actual maximum is 127.0f.

*But* we have to use an artifically low maximum to work around a bug
in the dEQP test, which gets confused when the max line width is too
large and lines start going off-screen.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-13 14:15:31 -04:00