Commit graph

25313 commits

Author SHA1 Message Date
Ilia Mirkin
c2be35d309 nv50,nvc0: fix crash when increasing bsp bo size for h264
H264 doesn't have a bitplane bo. We just need a device reference, so use
the one from the client.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b5f2f7073f)
2016-01-08 12:05:27 +02:00
Ilia Mirkin
132131af6b nv50,nvc0: make sure there's pushbuf space and that we ref the bo early
First off, we can't flush in the middle of a command. Secondly
requesting the extra push space might cause a flush to happen. If that
flush happens, we'd have to do the PUSH_REFN again. So instead do
PUSH_REFN after the push space request. This helps avoid rare crashes
with supertuxkart in libdrm due to assertion failures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c1d14c6817)
[Emil Velikov: resolve trivial conflict]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
2016-01-08 12:05:27 +02:00
Kenneth Graunke
f4977656c1 nvc0: Set winding order regardless of domain.
Quads need to respect winding order, too - not just triangles.

Fixes rendering in GFXBench 4.0's tessellation benchmark.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 65d3f85eb3)
2016-01-08 12:05:27 +02:00
Ilia Mirkin
1415b6b0ae nv50/ir: float(s32 & 0xff) = float(u8), not s8
Make sure to make conversion unsigned when we're ANDing the high bits
away. Fixes corruption in dolphin.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 724134f683)
2016-01-08 12:05:26 +02:00
Grazvydas Ignotas
e468d4b1d6 r600: fix constant buffer size programming
When buffer size is less than 16, zero ends up being programmed as
size, which prevents the hardware from fetching the correct values.
Fix it by combining shift and align so that the value is always
rounded up.

Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92229
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit da0e216e06)
2016-01-08 12:05:26 +02:00
Ilia Mirkin
b95dc1a5c8 nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion
Also release the scratch allocation if any.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 109c348284)
2016-01-08 12:05:26 +02:00
Nicolai Hähnle
a8a7af68e8 gallium/radeon: fix regression in a number of driver queries
This rather silly mistake was introduced by commit 01910676.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ea8c0b16ec)
2016-01-08 12:05:26 +02:00
Eric Anholt
96e931fea1 vc4: Keep sample mask writes from being reordered after TLB writes
Fixes a regression I noticed after introducing scheduling on the QIR.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 960f48809f)
2016-01-08 12:05:26 +02:00
Rob Herring
48b580d1cc freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabled
Android builds with -Werror=pointer-to-int-cast causing an error on 32-bit
builds.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit b201a6ed9f)
2016-01-08 12:05:26 +02:00
Nicolai Hähnle
d4d2315d65 gallium/radeon: only dispose locally created target machine in radeon_llvm_compile
Unify the cleanup paths of the function rather than duplicating code.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit 0a6a17b9d7)
2016-01-08 12:05:26 +02:00
Jonathan Gray
a1843802f8 configure.ac: use pkg-config for libelf
Use PKG_CHECK_MODULES to get the flags to link libelf

v2: keep AC_CHECK_LIB as a fallback for elfutils provided
libelf that doesn't install a pkg-config file.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 7f585a6a98)
2016-01-08 12:05:25 +02:00
Samuel Pitoiset
259734ae3f nv50: free memory allocated by the prog which reads MP perf counters
This fixes a memory leak introduced in 6a9c151
("nv50: add compute-related MP perf counters on G84+")

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 695ae816da)
2016-01-08 12:05:25 +02:00
Samuel Pitoiset
10ec6685d0 nv50,nvc0: free memory allocated by performance metrics
The destroy_query() helper was actually never called. This fixes
a memory leak while monitoring performance metrics.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit aeee7f2a4d)
2016-01-08 12:05:25 +02:00
Samuel Pitoiset
89d585357a nvc0: free memory allocated by the prog which reads MP perf counters
This fixes a long time ago memory leak (even before all my query
related changes).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9aca60bfb0)
2016-01-08 12:05:25 +02:00
Eric Anholt
ab0b96a939 vc4: Warn instead of abort()ing on exec ioctl failures.
It's really harsh to abort() the X Server because of a momentary failure
(particularly -ENOMEM).  I don't see a way to pass an -ENOMEM up the stack
from here, but we can at least log to stderr before proceeding on.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 02bcb443ee)
2016-01-08 12:05:25 +02:00
Ilia Mirkin
55d9c80d21 gk104/ir: simplify and fool-proof texbar algorithm
With the current algorithm, we only look at tex uses. However there's a
write-after-write hazard where we might decide to, on some path, not use
a texture's output at all, but instead to write a different value to
that register. However without the barrier, the texture might complete
later and overwrite that value.

This fixes Unreal Elemental demo on GK110/GK208, flightgear on GK10x,
and likely other random-looking failures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7752bbc44e)
2016-01-08 11:54:09 +02:00
Ilia Mirkin
070dbfa810 nv50/ir: can't have predication and immediates
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6aca7fecb7)
2016-01-08 11:54:09 +02:00
Marek Olšák
ceb00fb1b9 gallium/radeon: fix Hyper-Z hangs by programming PA_SC_MODE_CNTL_1 correctly
This is the recommended setting according to hw people and it makes Hyper-Z
stable. Just the two magic states.

This fixes Evergreen, Cayman, SI, CI, VI (using the Cayman code).

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d3c08309ab)
2016-01-08 11:54:09 +02:00
Marek Olšák
408fcfedee radeonsi: apply the streamout workaround to Fiji as well
Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 787ada6bf6)
2016-01-08 11:54:08 +02:00
Marek Olšák
f9dd8468b1 radeonsi: don't call of u_prims_for_vertices for patches and rectangles
Both caused a crash due to a division by zero in that function.
This is an alternative fix.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
(cherry picked from commit 0f9519b938)
2016-01-08 11:54:08 +02:00
Marek Olšák
2b526fc2d9 r600g: write all MRTs only if there is exactly one output (fixes a hang)
This fixes a hang in
piglit/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 on REDWOOD.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b5b87c4ed1)
2016-01-08 11:54:08 +02:00
Marek Olšák
4974996545 tgsi/scan: add flag colors_written
This is a prerequisite for the following r600g fix.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eb4813a952)
2016-01-08 11:54:08 +02:00
Patrick Rudolph
86100d4ca9 gallium/util: return correct number of bound vertex buffers
In case a state tracker unbinds every slot by a seperate
pipe->set_vertex_buffers() call, starting from slot zero, the number
of bound buffers would not reach zero at all.
The current algorithm does not account for pre-existing holes in the
buffer list.

Unbinding all buffers at once or starting at the top-most slot results
in correct behaviour.

Calculating the correct number of bound buffers fixes a NULL pointer
dereference in nvc0_validate_vertex_buffers_shared().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 79bff488bc)
2016-01-07 12:40:14 +02:00
Patrick Rudolph
47fa06c839 nv50,nvc0: fix use-after-free when vertex buffers are unbound
Always reset the vertex bufctx to make sure there's no pointer to
an already freed pipe_resource left after unbinding buffers.
Fixes use after free crash in nvc0_bufctx_fence().

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
[imirkin: simplify nvc0 fix, apply to nv50]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>

(cherry picked from commit 432a798cf5)
2016-01-07 12:25:19 +02:00
Dave Airlie
ce914d941d radeonsi: handle loading doubles as geometry shader inputs.
This adds the double code to the geometry shader input handling.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e307cfa7d9)
2015-12-12 19:39:03 +00:00
Dave Airlie
300f807649 radeonsi: handle doubles in lds load path.
This handles loading doubles from LDS properly.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.fedoraproject.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8c9e40ac22)
2015-12-12 19:39:03 +00:00
Dave Airlie
61a275b789 r600: handle geometry dynamic input array index
This fixes:
glsl-1.50/execution/geometry/dynamic_input_array_index.shader_test
my profanity.

We need to load the AR register with the value from the index reg

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit cce3864046)
2015-12-12 19:39:03 +00:00
Dave Airlie
0f3892ed9d r600g: fix geom shader input indirect indexing.
This fixes:
gs-input-array-vec4-index-rd

The others run out of gprs unfortunately.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 38542921c7)
2015-12-12 19:39:03 +00:00
Dave Airlie
3d942ee4e5 r600/shader: add utility functions to do single slot arithmatic
These utilities are to be used to do things like integer adds and
multiplies to be used in calculating the LDS offsets etc.

It handles CAYMAN MULLO differences as well.

Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0696ebc899)
2015-12-12 19:39:03 +00:00
Dave Airlie
efdf841238 r600/shader: split address get out to a function.
This will be used in the tess shaders.

Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 4d64459a92)
2015-12-12 19:39:02 +00:00
Dave Airlie
5913a8c9ec r600g: fix outputing to non-0 buffers for stream 0.
This fixes:
arb_transform_feedback3-ext_interleaved_two_bufs_gs
arb_transform_feedback3-ext_interleaved_two_bufs_gs_max
transform-feedback-builtins

If we are only emitting one ring, then emit all output
buffers on it.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e97ac006d7)
[Emil Velikov: squash trivial conflicts]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/gallium/drivers/r600/r600_shader.c
2015-12-12 19:39:02 +00:00
Ilia Mirkin
3c9e76fc24 nv50/ir: fix cutoff for using r63 vs r127 when replacing zero
The only effect here is a space savings - 822 programs in shader-db
affected with the following overall change:

total bytes used in shared programs   : 44154976 -> 44139880 (-0.03%)

Fixes: 641eda0c (nv50/ir: r63 is only 0 if we are using less than 63 registers)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f920f8eb02)
2015-12-12 19:39:02 +00:00
Nicolai Hähnle
9908d19699 radeonsi: last_gfx_fence is a winsys fence
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit d5a5dbd71f)
2015-12-12 19:39:02 +00:00
Ilia Mirkin
a500109aad gk110/ir: fix imad sat/hi flag emission for immediate args
According to nvdisasm both the immediate and non-imm cases use the same
bits. Both of these flags are quite rarely set though.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1d708aacb7)
2015-12-12 19:39:01 +00:00
Ilia Mirkin
0e78a67709 gk104/ir: sampler doesn't matter for txf
We actually leave the sampler unset for OP_TXF, which caused the GK104+
logic to treat some texel fetches as indirect. While this works, it's
incredibly wasteful. This only happened when the texture was > 0 (since
sampler remained == 0).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 63b850403c)
2015-12-12 19:39:01 +00:00
Marek Olšák
4bb16d712a radeonsi: disable DCC on Stoney
Cc: 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 32f05fadbb)
2015-12-12 19:39:01 +00:00
Christian König
950e9886d0 st/va: disable MPEG4 by default v2
The workarounds are too hacky to enable them by default
and otherwise MPEG4 doesn't work reliably.

v2: add docs/envvars.html, CC stable and fix typos

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
Cc: "11.1.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a2c5200a4b)
2015-12-12 19:39:01 +00:00
Ilia Mirkin
dff89432d8 gk110/ir: fix imul hi emission with limm arg
The elemental demo hits this case.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit db072d2086)
2015-12-12 19:39:01 +00:00
Marek Olšák
e0b11bcc87 radeonsi: fix occlusion queries on Fiji
Tested.

(cherry picked from commit bfc14796b0)
2015-12-12 19:39:01 +00:00
Eric Anholt
1df00e17d3 vc4: When doing algebraic optimization into a MOV, use the right MOV.
If there were src unpacks, changing to the integer MOV instead of float
(for example) would change the unpack operation.

(cherry picked from commit e3efc4b023)
2015-12-11 17:04:11 -08:00
Eric Anholt
ad3df9d168 vc4: Fix handling of src packs on in qir_follow_movs().
The caller isn't going to expect it from a return, so it would probably
get misinterpreted.  If the caller had an unpack in its reg, that's fine,
but don't lose track of it.

(cherry picked from commit 2591beef89)
2015-12-11 17:04:08 -08:00
Eric Anholt
e4cf550501 vc4: Add missing progress note in opt_algebraic.
(cherry picked from commit b70a2f4d81)
2015-12-11 17:04:00 -08:00
Eric Anholt
ecf2885d7f vc4: Fix handling of sample_mask output.
I apparently broke this in a late refactor, in such a way that I decided
its tests were some of those interminable ones that I should just
blacklist from my testing.  As a result, the refactors related to it were
totally wrong.

(cherry picked from commit 53b2523c6e)
2015-12-11 17:03:51 -08:00
Eric Anholt
fc59ca4064 vc4: Enable MSAA.
We still have several failures in the newly enabled tests in simulation:
sRGB downsampling is done as if it was just linear, stencil blits are not
supported on MSAA either, and derivatives are still not supported
(breaking some MSAA simulation shaders).  So, other than sRGB downsampling
quality, things seem to be in good shape.

(cherry picked from commit f61ceeb3fd)
2015-12-11 17:03:44 -08:00
Eric Anholt
396fbdc721 vc4: Add support for mapping of MSAA resources.
The pipe_transfer_map API requires that we do an implicit
downsample/upsample and return a mapping of that.

(cherry picked from commit fc4a1bfb88)
2015-12-11 17:03:40 -08:00
Eric Anholt
50ac2100df vc4: Add support for texel fetches from MSAA resources.
This is the core of ARB_texture_multisample.  Most of the piglit tests for
GL_ARB_texture_multisample require GL 3.0, but exposing support for this
lets us use the gallium blitter for multisample resolves.  We can
sometimes multisample resolve using just the RCL, but that requires that
the blit is 1:1, unflipped, and aligned to tile boundaries.

(cherry picked from commit 6b4dfd53ae)
2015-12-11 17:03:36 -08:00
Eric Anholt
08cf0f8529 vc4: Add support for multisample framebuffer operations.
This includes GL_SAMPLE_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, and
GL_SAMPLE_ALPHA_TO_COVAGE.

I haven't implemented a dithering function yet, and gallium doesn't give
me a good chance to do so for GL_SAMPLE_COVERAGE.

(cherry picked from commit a97b40dca4)
2015-12-11 17:03:31 -08:00
Eric Anholt
ba51596b1d vc4: Add a workaround for HW-2905, and additional failure I saw with MSAA.
I only stumbled on this while experimenting due to reading about HW-2905.
I don't know if the EZ disable in the Z-clear is actually necessary, but
go with it for now.

(cherry picked from commit edc3305de7)
2015-12-11 17:03:03 -08:00
Eric Anholt
3d13bb8851 vc4: Add support for drawing in MSAA.
(cherry picked from commit edfd4d853a)
2015-12-11 17:03:03 -08:00
Eric Anholt
3bf2c6b96a vc4: Add kernel RCL support for MSAA rendering.
(cherry picked from commit e7c8ad0a6c)
2015-12-11 17:03:03 -08:00