Commit graph

82375 commits

Author SHA1 Message Date
Marek Olšák
7c6e88b643 radeonsi: allow MSAA resolving into a texture that has DCC enabled
Since DCC is enabled almost everywhere now, it's important not to disable
this fast path.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
9a472a3e0b gallium/radeon: move DCC clearing into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
ffd54d1936 radeonsi: allow direct hw MSAA resolve for scanout surfaces
No idea why this was disabled, but it works fine.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
4be46c7d9d radeonsi: don't allocate DCC for the temporary MSAA resolve surface
Allocating it has no effect, but it adds overhead (useless DCC clear).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
c06246501e radeonsi: don't enable DCC in the sampler if first_level doesn't have it
If first_level > 0 and DCC is disabled for that level, let's skip DCC
reads entirely.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
00389100b6 winsys/amdgpu: enable DCC for mipmapped textures
Also add dcc_fast_clear_size for clearing only the necessary subset
of DCC. For no AA, it's equal to the size of the whole DCC level.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
c65361763c gallium/radeon: don't disable DCC because of SDMA
We want to keep DCC enabled to save bandwidth. It was a bad idea to disable
it here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
2fd74a05bb radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
aa7fe70443 radeonsi: add per-level dcc_enabled flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
60e93ddd06 radeonsi: compute DCC register parameters in si_emit_framebuffer_state
This will get more complicated with mipmapped DCC or when DCC is enabled
after allocation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
a01536a29f gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUT
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Marek Olšák
d4d733e39d gallium/radeon: don't allocate DCC for non-renderable texture formats
R9G9B9E5 is the only uncompressed one hopefully.

This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-08 00:22:45 +02:00
Nicolai Hähnle
b42bc90b6a radeonsi: enable WQM in PS prolog when needed
WQM is needed when the PS prolog computes a VGPR that is consumed by a shader
with (implicit or explicit) derivatives.

Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be
effective (otherwise it's just a no-op).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
Cc: 12.0 <mesa-dev@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 23:46:02 +02:00
Nicolai Hähnle
d3a584defe tgsi/scan: add uses_derivatives (v2)
v2:
- TG4 does not calculate derivatives (Ilia)
- also handle SAMPLE* instructions (Roland)

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-07 23:45:17 +02:00
Nanley Chery
b7a0c0ec7f docs/devinfo: Expound on helpful extension tips
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-07 11:16:23 -07:00
Nanley Chery
9e7de50cab docs/devinfo: Update bullet in stale extension guide
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-07 11:16:23 -07:00
Nanley Chery
26b0f023d7 docs/devinfo: Add closing paragraph tag
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-06-07 11:16:23 -07:00
Tim Rowley
87f0a0448f swr: fix provoking vertex
Use rasterizer provoking vertex API.

Fix rasterizer provoking vertex for tristrips and quad list/strips.

v2: make provoking vertex tables static const

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-06-07 11:47:52 -05:00
Ilia Mirkin
c81b090c92 st/mesa: revalidate image atoms when a texture is updated
A texture may be redefined with _NEW_TEXTURE, which might have been
bound to a shader image slot. We have to revalidate the image atoms to
pick up on the new resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-07 10:18:34 -04:00
Ilia Mirkin
71ad8a173f gk104/ir: fix conditions for adding a texbar
Sometimes a register source can actually be double- or even quad-wide.
We must make sure that the inserted texbars take that width into
account.

Based on an earlier patch by Samuel Pitoiset.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
2016-06-07 10:18:13 -04:00
Nicolai Hähnle
8239da28e8 radeonsi: keep track of dirty descriptor sets
Reduces CPU load for draw calls that change none or few of the descriptors.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:10 +02:00
Nicolai Hähnle
d152c73712 radeonsi: move si_descriptors into a per-context array
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:07 +02:00
Nicolai Hähnle
a29c4f9ebd radeonsi: pass shader stage to si_disable_shader_image
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:05 +02:00
Nicolai Hähnle
4e0fb72786 radeonsi: access descriptor sets via local variables
This will simplify moving them to a per-context array.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:18:02 +02:00
Nicolai Hähnle
ba4a2840c7 radeonsi: add si_set_rw_buffer to be used for internal descriptors
So that callers outside of si_descriptors.c need to worry less about the
details of descriptor handling.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:59 +02:00
Nicolai Hähnle
c615a055f4 radeonsi: pass shader stage to si_set_shader_image
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:57 +02:00
Nicolai Hähnle
e6612a3e68 radeonsi: pass shader stage to si_set_sampler_view
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:55 +02:00
Nicolai Hähnle
c32cd4b78d radeonsi: move descriptor set begin_new_cs handling into a separate function
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:39 +02:00
Nicolai Hähnle
031b57bc2f radeonsi: move enabled_mask out of si_descriptors
This mask is irrelevant for the generic descriptor set handling, and having it
outside simplifies subsequent changes slightly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-07 15:17:23 +02:00
Jason Ekstrand
d1e141a661 anv/entrypoints: Stop using the C preprocessor
Now that we emit guards for everything, we can just generate the files and
trust build flags to keep us safe.  This should also fix the tarball
problems.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:25 +01:00
Jason Ekstrand
d1a53f91ee anv/entrypoints: Emit #if guards for all platforms
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:25 +01:00
Haixia Shi
1ea233c6f3 platform_android: prevent deadlock in droid_swap_buffers
To avoid blocking other EGL calls, release the display mutex before
we enqueue buffer to android frameworks and re-acquire the mutex
upon return.

v2: moved lock/unlock inside droid_window_enqueue_buffer().

TEST=verify pinch zoom in Photos app no longer causes hangs

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:25 +01:00
Emil Velikov
b7f7ec7843 mesa: automake: distclean git_sha1.h when building OOT
In the case of out-of-tree (OOT) builds, in particular when building
from tarball, we'll end up with the file in both srcdir and builddir.

We want the former to remain intact (since we need it on rebuild) while
the latter should be removed otherwise `make distclean' gets angry at
us.

Ideally there'll be a solution that feels a bit less of a hack. Until
then this does the job exactly as expected.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:30:23 +01:00
Emil Velikov
2c424e00c3 mesa: automake: ensure that git_sha1.h.tmp has the right attributes
... when copied from git_sha1.h.

As the latter file can we lacking the write attribute, one should set it
explicitly. Otherwise we'll get a warning/failure at cleanup stage.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:21:46 +01:00
Emil Velikov
359d9dfec3 mesa: automake: add directory prefix for git_sha1.h
Otherwise the build will assume that we've talking about builddir, which
is not the case in the else statement.

Here the file is already generated and is part of the tarball.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-07 12:21:45 +01:00
Emil Velikov
1816c837c1 egl: android: don't add the image loader extension for !render_node
With earlier commit we introduced support for render_node devices, which
was couples with the use of the image loader extension.

As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.

That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.

Fixes: 34ddef39ce ("egl: android: add dma-buf fd support")

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-06-07 12:21:45 +01:00
Marek Olšák
095803a37a gallium/radeon: add support for sharing textures with DCC between processes
v2: use a function for calculating WORD1 of bo metadata

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-07 11:12:26 +02:00
Marek Olšák
9e5b5fbde0 gallium/radeon: don't discard DCC if an external user can write to it
We don't import textures with DCC now, but soon we will.

v2: if we can't disable DCC for image writes, at least decompress DCC
    at bind time

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-07 11:12:26 +02:00
Dave Airlie
c6b14bafa4 i915: fix typo CAP.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-07 18:31:14 +10:00
Jakob Sinclair
b450f29073 glsl: initialise pointer to NULL
Could cause issues if you tried to read from an uninitialised pointer.
This just initalises the pointer to null to avoid that being a problem.
Discovered by Coverity.

CID: 1343616

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-06-07 08:13:25 +02:00
Dave Airlie
c295923d13 i965/gen8: fix cull distance emission for tessellation shaders.
This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-06-07 11:52:17 +10:00
Ilia Mirkin
704bc0f0e9 nvc0: add support for VOTE tgsi opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-06 20:49:29 -04:00
Ilia Mirkin
f64c36e2d7 st/mesa: expose GL_ARB_shader_group_vote when supported by backend
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:29 -04:00
Ilia Mirkin
edfa7a4b25 gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:29 -04:00
Ilia Mirkin
30684b50d7 gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:49:28 -04:00
Ilia Mirkin
5189f0243a mesa: hook up core bits of GL_ARB_shader_group_vote
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-06-06 20:48:46 -04:00
Kenneth Graunke
13b859de04 glsl: Make opt_copy_propagation_elements actually propagate into loops.
We've had a FINISHME here since Eric originally wrote the code in 2011.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are basically a wash:

   No change in instruction counts.

   total cycles in shared programs: 78685980 -> 78680730 (-0.01%)
   cycles in affected programs: 2102646 -> 2097396 (-0.25%)
   helped: 48
   HURT: 83

I figured if we're going to do this for one copy propagation pass,
we may as well do it in both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-06 14:14:31 -07:00
Kenneth Graunke
0756e3a25c glsl: Make opt_copy_propagation actually propagate into loops.
We've had a FINISHME here since Eric originally wrote the code in 2010.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are not terribly impressive:

   total instructions in shared programs: 9008589 -> 9008613 (0.00%)
   instructions in affected programs: 4293 -> 4317 (0.56%)
   helped: 0
   HURT: 10

   total cycles in shared programs: 78550978 -> 78575760 (0.03%)
   cycles in affected programs: 655426 -> 680208 (3.78%)
   helped: 75
   HURT: 88

   GAINED: 2

Most of the "regressions" appear to be us successfully copy propagating
uniforms, which i965 is loading as pull constants instead of push, so we
occasionally have two pulls instead of one.  That doesn't seem like this
pass's job - it's propagating correctly, and we should be smarter about
pull loads in the backend.

This patch is also useful for a couple of reasons:

1. It can clean up copies created by varying packing (previously, we
   couldn't if the uses were inside a loop).

   This fixes a bug when interpolateAt*() is used on a packed varying
   inside a loop: glsl_to_nir struggles to see through the extra copy
   and mistakenly believed the variable was not an input.

2. It will help propagate uniform array access created by
   lower_const_array_to_uniforms().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-06 14:14:31 -07:00
Samuel Pitoiset
08ddfe7b2f nv50/ir: use round toward 0 when converting doubles to integers
Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.

This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
2016-06-06 22:56:04 +02:00
Marek Olšák
00e6899ae5 gallium/radeon: don't re-set BO metadata after CMASK deallocation
CMASK has no effect on metadata, because it's not sharable.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-06 22:50:55 +02:00