19806 commits

Author SHA1 Message Date
Brian Paul
4d2b21a326 svga: replace gotos with conditionals in array drawing code
No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-08-21 19:04:51 -06:00
Brian Paul
d50b8b91d7 llvmpipe: add some whitespace between functions in lp_texture.c
Trivial.
2017-08-21 19:04:51 -06:00
Brian Paul
196a0b28a0 svga: whitespace clean-up in svga_draw_private.h
Trivial.
2017-08-21 19:04:51 -06:00
Marek Olšák
db039d67aa radeonsi: don't prefetch VBO descriptors if vertex elements == NULL
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-21 23:06:42 +02:00
Marek Olšák
ea1b97714d r600g: don't set up and don't call the fetch shader if there are no VS inputs
2017-08-21 23:06:42 +02:00
Rob Herring
4734bfc02a Android: Fix LLVM duplicated symbols linking for N and M
Both statically linking libLLVMCore and dynamically linking libLLVM causes
duplicated symbols in gallium_dri.so and it fails to dlopen. We don't
really need to link libLLVMCore, but just need generated headers to be
built first. Dynamically linking to libLLVM instead is enough to do
that. Thanks to Qiang Yu for finding the root cause.

With this change, we can align all versions and just have libLLVM as a
shared lib dependency.

This also requires changes in the M and N versions of LLVM to export the
include paths for libLLVM. AOSP master is okay.

Fixes: 26aee6f4d5 ("Android: rework LLVM build support")
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-21 10:46:21 -05:00
Leo Liu
7319ff8787 radeon/uvd: add YUYV format support for target buffer
Make chroma plane optional for YUYV support

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
fceb52a230 radeon/video: MJPEG does not support stacked video buffers
So we have to detect it for reallocation of de-interlaced buffers

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
130d1f456b radeon/uvd: reconstruct MJPEG bitstream
The current tier 1 MJPEG firmware only provides support at the bitstream
level; the later tier 2 support, with newer hardware, will be at the
buffer level.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
15f3335577 radeon/video: add MJPEG support
v2: add ASIC and Kernel version check

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
3fe713ce3d radeon/uvd: add MJPEG support
The MJPEG codec does not need a DPB buffer.

v2: check dpb_size instead of format

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
b26cfdaebd radeon/uvd: add MJPEG stream type
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
2b1eacabfa radeon/uvd: get the target buffer pitch correct for different format
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Samuel Pitoiset
2843c5d15c radeonsi: update non-resident bindless descriptors if needed
Only resident bindless descriptors are currently updated and
re-uploaded; this makes sure that the non-resident ones are
also updated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-21 15:23:56 +02:00
Marek Olšák
57fb1bb585 gallium/radeon: remove old_fence parameter from r600_gfx_write_event_eop
just use the new scratch buffer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 16:06:21 +02:00
Marek Olšák
41e053954d radeonsi/gfx9: prevent a GPU hang after a timestamp event
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 16:06:18 +02:00
Marek Olšák
13aa8d3da9 radeonsi: don't use CLEAR_STATE on SI
This fixes random hangs with Unigine Valley.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102201

Fixes: 064550238e ("radeonsi: use CLEAR_STATE to initialize some registers")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 15:59:22 +02:00
Roland Scheidegger
3e96231457 llvmpipe: enable PIPE_CAP_QUERY_SO_OVERFLOW
The driver supported this since way before the GL spec for it existed.
Just need to support both the per-stream and all-streams variants
(which are identical due to only supporting 1 stream).
Passes piglit arb_transform_feedback_overflow_query-basic.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-17 18:46:44 +02:00
Roland Scheidegger
26d46b94b4 softpipe: enable PIPE_CAP_QUERY_SO_OVERFLOW
The driver was supposed to support this since way before the GL spec for it
existed, albeit it was apparently broken, so fix and enable it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-17 18:46:44 +02:00
Ilia Mirkin
934511d1f3 nv50/ir: fix TXQ srcMask
src0.x is always read for the LOD, irrespective of which outputs are
read.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-16 22:39:22 -04:00
Ilia Mirkin
054c54d1be nv50/ir: fix srcMask computation for TG4 and TXF
This affects which inputs are marked as used. In a situation where only
the texture instruction uses an input, it might have been ignored as
unused due to input masks.

Affects subtests of KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-16 22:39:21 -04:00
Tim Rowley
b333bc753e swr/rast: Fix invalid casting for calls to Interlocked* functions
CID: 1416243, 1416244, 1416255
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-16 14:20:22 -05:00
Boyuan Zhang
a44b334e48 radeon/vce: support all firmwares with major ver 53
The VCE firmware interface should now be stable, so all firmwares with
major version 53 are supported.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig at amd.com>
2017-08-16 14:42:41 -04:00
Ilia Mirkin
f96f210239 a2xx: only update rasterizer settings when they're there
The rasterizer being empty can happen e.g. during clears

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-15 22:54:40 -04:00
Ilia Mirkin
08f72a8944 a2xx: add logicop support
This passes both gl-1.0-logicop and gl-1.1-xor piglits.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-15 22:54:40 -04:00
Jonas Pfeil
494f86bbe5 broadcom/vc4: Port NEON-code to ARM64
Changed all register and instruction names, works the same.

v2: Rebase on build system changes (by anholt)
v3: Fix build on clang (by anholt, reported by Rob)

Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Tested-by: Rob Herring <robh@kernel.org>
2017-08-15 13:23:54 -07:00
Eric Anholt
bd5efbd70b broadcom/vc4: Build the vc4_tiling_lt_neon.c with -mfpu=neon on ARM.
If you don't pass this, the compiler refuses to compile the assembly for
pre-v7 CPUs.  This also keeps us from building identical, non-NEON code on
aarch64 and x86.

Fixes: a373f77662 ("vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.")

v2: Fix Android build by just appending NEON_C_SOURCES when
    ARCH_ARM_HAVE_NEON.

Tested-by: Rob Herring <robh@kernel.org>
2017-08-15 13:23:54 -07:00
Marek Olšák
1ab7fed707 radeonsi: disable CE by default
It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Christian König <christian.koenig@amd.com>
2017-08-15 15:03:43 +02:00
Dave Airlie
e0edfadec8 radeonsi: initialise imported surface to 0.
For memobj imports we weren't setting the surface to 0, which
meant sometimes we'd end up with tile_swizzle garbage, which
would corrupt rendering.

This seems to fix the image corruption on the imported memory
objects in vrdashboard for me.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-15 01:35:58 +01:00
Ilia Mirkin
165e18dd21 nv50/ir: clean up saturated values immediately
Since we don't iterate to a fixed point, we can end up in situations
where we have a SAT instruction + a long immediate. This is not legal.
However since it's immediately computable, just run unary straight away
to handle the situation.

Fixes: 24a799ad35 ("nv50/ir: fix ConstantFolding with saturation")
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-12 14:49:08 -04:00
Ilia Mirkin
ea22ac23e0 nvc0/ir: unlink values pre- and post-call to division function
While technically correct, this can lead to e.g. getImmediate assuming
that it can walk up the value chain. It could be fixed to not do this,
but it seems easier and less error-prone to just not link the two values
to save on one LValue object.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-12 14:49:08 -04:00
Marek Olšák
b420680ede gallium/radeon: only pass shader-specific debug flags to the disk shader cache
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-11 20:38:29 +02:00
Marek Olšák
d1285a7103 radeonsi/gfx9: fix the scissor bug workaround
otherwise there is corruption in most apps.

Fixes: 0fe0320dc0 ("radeonsi: use optimal packet order when doing a pipeline sync")

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-11 20:38:29 +02:00
Marek Olšák
27fef5d52d radeonsi/gfx9: use the VI codepath for clamping Z
This fixes corrupted shadows in Unigine Valley.
The corruption disappeared when I stopped setting IMG_DATA_FORMAT_24_8
for depth.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-11 20:38:29 +02:00
Marek Olšák
4630ede102 ac: fail shader compilation if libelf is replaced by an incompatible version
UE4Editor has this issue.

This commit prevents hangs (release build) or assertion failures (debug
build). It doesn't fix the editor, but catastrophic scenarios are
prevented.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-10 13:24:23 +02:00
Karol Herbst
24a799ad35 nv50/ir: fix ConstantFolding with saturation
For mul(a, +-1) codegen can generate OP_MOV with a saturation flag
set which is ignored at emission. The same can happen with add(a, 0),
and others.

Adding an assert for detecting more of such issues.

Fixes wrongly rendered water in Hitman Absolution running under wine.
Also a few shaders in Mad Max and Alien Isolation produce such MOVs.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: generalize the fix for other cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-09 10:25:26 -04:00
Samuel Pitoiset
bbfad34606 radeonsi: drop two unused variables in create_function()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-09 12:56:00 +02:00
Marek Olšák
a2703fc119 radeonsi: fix a compile failure due to disabled asserts
2017-08-07 22:51:45 +02:00
Marek Olšák
0fe0320dc0 radeonsi: use optimal packet order when doing a pipeline sync
Process most new SET packets in parallel with previous draw calls, then
flush caches and wait, start the draw, and do L2 prefetches last.

This decreases the [CP busy / SPI busy] ratio (verified with GRBM perf
counters). In other words, the time window when shaders are idle (between
the wait and the draw) is much shorter now.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
895de1d03d radeonsi: expose the number of decompress calls to the HUD
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
ca440bc651 gallium/radeon: rename GPU-dma-busy -> GPU-cp-dma-busy
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
c093821cee radeonsi: rename shader_userdata -> shader_pointers where appropriate
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
c441999b7a radeonsi: prefetch VBO descriptors after the first VGT shader
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
e887c68bd2 radeonsi: add a separate dirty mask for prefetches
so that we don't rely on si_pm4_state_enabled_and_changed, allowing us
to move prefetches after draw calls.

v2: clear the dirty mask after unbinding shaders

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-08-07 21:12:24 +02:00
Marek Olšák
a7b0014d1a radeonsi: add and use si_pm4_state_enabled_and_changed
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
58d062b87d radeonsi: de-atomize L2 prefetch
I'd like to be able to move the prefetch call site around.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
4e629ca7c7 radeonsi: align all CE dumps to L2 cache line size
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
01fed67608 radeonsi: remove a tautology sctx->framebuffer.nr_samples >= 1
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
1694a8ba8d gallium/radeon: print all members of radeon_info with R600_DEBUG=info
also set max_alignment on amdgpu.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Wladimir J. van der Laan
948bb2caba etnaviv: Add support for R8_UNORM textures
R8_UNORM textures can be emulated by means of L8 and a swizzle.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-08-06 20:45:24 +02:00