Commit graph

31712 commits

Author SHA1 Message Date
Marek Olšák
465bb47d6f radeonsi: expose ARB_timer_query unconditionally
clock_crystal_freq is always non-zero now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:17 -04:00
Marek Olšák
d0963ef084 radeonsi/gfx9: don't read back non-existent register SRBM_STATUS2
It looks like there is no way to monitor SDMA busyness on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:56 -04:00
Marek Olšák
5fb80a1e84 radeonsi: prevent a crash with DBG_CHECK_VM and u_threaded_context
by setting PIPE_CONTEXT_DEBUG in the caller

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:51 -04:00
Marek Olšák
ffa7ec9e22 radeonsi: simplify computation of tessellation offchip buffers
This is overly cautious, but better safe than sorry.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:55:07 -04:00
Marek Olšák
facfab28fe radeonsi/gfx9: add workarounds to avoid VGPR indexing completely
For inputs and outputs, indirect indexing is lowered by the GLSL compiler.
For temporaries, use alloca and disable the "promote-alloca" pass.

In the future, we could switch all codepaths to alloca permanently and
just rely on the "promote-alloca" pass.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
93391ac478 radeonsi: emit param exports after position exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
9d9ffc8475 radeonsi: move building parameter exports into a separate function
Both loops now look simple.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
4e30fb4ecc radeonsi: don't use info.num_inputs when it's unused
For clarity. It's only used by color interpolation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
f8d6dd9b3d radeonsi: add si_build_fs_interp helper
This is much simpler.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
4560f2b90a radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
c351037d6c gallivm: inline gallivm_init_llvm_targets
there is only one user.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
ece0c0439f radeonsi: don't call gallivm_init_llvm_targets
It's for initializing the native (x86) target.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
d308460586 gallium/radeon: reallocate suballocated buffers when exported
This should fix exports of suballocated buffers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
5b555854cc gallium/radeon: flush the context after in-place texture realloc before export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Mark Thompson
63dcfed81f st/va: Fix scaling list ordering for H.265
Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order.  Therefore, rearrange the
elements when copying.

v2: Move scan tables to vl_zscan.c.
    Fix type in size assertion.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-07-17 15:24:56 +01:00
Marek Olšák
f9d5611617 gallium/u_blitter: don't use TXF for scaled blits
There seems to be a rounding difference with F2I vs nearest filtering.
The precise problem in the rounding is unknown.

This fixes an incorrect output with OpenMAX encoding.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 15:47:30 +02:00
Samuel Pitoiset
c745beaf10 ddebug: fix parsing of the pipelined mode
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-17 10:28:45 +02:00
Tim Rowley
818209118c swr: JitManager runtime determination of architecture
Fixes performance regression from f50aa21456 - was forcing internal
code generation to target AVX (no gather, etc).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-14 15:09:22 -05:00
Grigori Goronzy
8d980bf920 st/mesa: Add KHR_no_error toggle to driconf
Allows applications to be whitelisted.

v2: Remove misguided DRI common part.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:44 +02:00
Grigori Goronzy
2bbe235053 st/mesa: Add support for KHR_no_error flag
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:40 +02:00
Grigori Goronzy
7299e82fa4 dri: Add KHR_no_error DRI extension
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.

v2: Move to common DRI code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:20:31 +02:00
Christoph Haag
98514e9959 gallium/hud: use double values for all graphs
The fps graph for example calculates the fps as double with small
variations based on when query_new_value() is called, which causes
many values to be truncated on the cast to uint64_t.

The HUD internally stores the values as double, so just use double
everywhere instead of fixing this with rounding. Using doubles also
allows the hud to show small variations instead of being clamped to
discrete values.

v2: Don't print decimals in the dump file when not necessary
Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 17:34:39 +02:00
Lucas Stach
7e426ef6ec Revert "etnaviv: add support for snorm textures"
This reverts commit d8b2ccdb88, which causes priglit regressions on GPUs
with SNORM support. We'll have another try at enabling this feature after
the 17.2 branchpoint.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:21:50 +02:00
Wladimir J. van der Laan
1d05cec205 etnaviv: reset indexed rendering information when not rendering indexed
A dangling bo object would result in memory corruption while loading a
level in ioquake3_opengl2.

Fixes: 330d0607ed (gallium: remove pipe_index_buffer and set_index_buffer)
Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:19:42 +02:00
Wladimir J. van der Laan
bb2498a7f6 etnaviv: Use the correct LOG instruction on GC3000
GC3000 has a new LOG instruction, similar to the new SIN and COS instructions.

Generate the new instruction sequence when appropriate; there are
two occasions, as part of LIT and the generator for the LG2
instruction itself.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:15:41 +02:00
Lucas Stach
bccd21ee88 etnaviv: flush source TS before resolve
If we blit from a rendertarget or a depthstencil buffer there might still
be dirty data in the TS buffer which needs to be flushed out.

Fixes missing shadow tiles in glmark2 shadow.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-07-14 17:13:12 +02:00
Philipp Zabel
e9b3381715 etnaviv: flush color cache and depth cache together before resolves
Before resolving a rendertarget or a depth/stencil resource into a
texture, flush both the color cache and the depth cache together.

It is unclear whether this is necessary for the following stall to
work properly, or whether the depth flush just adds enough time
for the color cache flush to finish before the resolver is started,
but this change removes artifacts that otherwise appear if a texture
is sampled directly after rendering into it.

The test case is a simple QML scene graph with a QtWebEngine based
WebView rendered on top of a blue background:

	import QtQuick 2.0
	import QtQuick.Window 2.2
	import QtWebView 1.1

	Window {
		Rectangle {
			id: background
			anchors.fill: parent
			color: "blue"
		}

		WebView {
			id: webView
			anchors.fill: parent
		}

		Component.onCompleted: {
			webView.url = "<some animated website>"
		}
	}

If the website is animated, the WebView renders the site contents into
texture tiles and immediately afterwards samples from them to draw the
tiles into the Qt renderbuffer. Without this patch, a small irregular
triangle in the lower right of each browser tile appears solid blue, as
if the texture sampler samples zeroes instead of the website contents,
and the previously rendered blue Rectangle shows through.

Other attempts such as adding a pipeline stall before the color flush or
a TS cache flush afterwards or flushing multiple times, with stalls
before and after each flush, have shown no effect.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-07-14 17:12:36 +02:00
Marek Olšák
f33d8af7aa st/dri: add 32-bit RGBX/RGBA formats
Add support for 32-bit RGBX/RGBA formats which are required for Android.

The original patch (commit ccdcf91104) was reverted (commit
c0c6ca40a2) in mesa as it broke GLX resulting in swapped colors. Based
on further investigation by Chad Versace, moving the RGBX/RGBA configs
to the end is enough to prevent breaking GLX.

The handling of RGBA/RGBX in dri_fill_st_visual is a fix from Marek
Olšák.

Cc: Eric Anholt <eric@anholt.net>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-13 14:36:47 -05:00
Tim Rowley
254fa3dbf5 swr/rast: Fix use of KNL-only intrinsics in SKX build
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
4c185dd3b3 swr/rast: Fix build warnings when using the Intel compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
bbc3b5c0dc swr/rast: SIMD16 Frontend - Fix USE_SIMD16_FRONTEND build
Previous check-ins without testing with USE_SIMD16_FRONTEND have
introduced regressions. This fixes the build, not the regressions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
640ea4d9a1 swr/rast: Removing unneeded MSVC warning pragma
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
185b37f641 swr/rast: Add support for read-only render targets
Core will ensure hot tiles are loaded for read and write render targets,
and will skip all output merger for read-only render targets.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
d8ebcad540 swr/rast: Support render target mask instead of render target count
WIP to support read-only render targets.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Nicolai Hähnle
c22e3c5373 radeonsi/gfx9: fix crash building monolithic merged ES-GS shader
Forwarding from the ES prolog to the ES just barely exceeds the current
maximum array size when 16 vertex attributes are used. Give it a decent
bump to account for merged shaders having up to 32 user SGPRs.

Fixes a crash in GL45-CTS.multi_bind.draw_bind_vertex_buffers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-13 13:01:15 +02:00
Ilia Mirkin
3645268748 nv50/ir: fix threads calculation for non-compute shaders
We were using the "cp" union fields, which are only valid for compute
shaders. The threads calculation affects the available GPRs, so just
pick a small number for other shader types to avoid limiting available
registers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 22:09:59 -04:00
Ilia Mirkin
87028f8639 freedreno/ir3: fix load_front_face conversion
The comments are correct - we get -1 and 0. However by adding 1, we
convert this into 0,1. This mostly works for conditionals, but when
negated, this will yield the wrong result. Instead just negate the
values (as they are backwards -- -1 means back instead of front).

Fixes tests/shaders/glsl-fs-frontfacing-not.shader_test and
dEQP-GLES3.functional.shaders.builtin_variable.frontfacing on A530.

The latter also tested on A306 by Rob Clark.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-12 19:30:46 -04:00
Bruce Cherniak
02735e6cf8 swr: Add path to draw directly from client memory without copy.
If size of client memory copy is too large, don't copy. The draw will
access user-buffer directly and then block.  This is faster and more
efficient than queuing many large client draws.

Applications that still use large client arrays benefit from this.  VMD
is an example.

The threshold for this path defaults to 32KB.  This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.

v2: Use #define for default value, rather than hard-coded constant.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-12 16:56:40 -05:00
Bruce Cherniak
1520a06607 swr: Move environment config options into separate function.
Moved reading of environment config options out of
swr_create_screen_internal, into a separate swr_validate_env_options.
This is to keep from cluttering create_screen.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-12 16:56:40 -05:00
Bruce Cherniak
5bd9554f3d swr: Remove hard-coded constant and "todo" comment.
Removed the hard-coded constant in favor of a #define.  Also removed
TODO comment.  The constant value doesn't need an environment
configurable option.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-12 16:56:40 -05:00
Rob Herring
7a7a84c8db Android: Fix vc4 build since XML changes.
Since commit 7f80a9ff13 ("vc4: Introduce XML-based packet header
generation like Intel's."), the vc4 build on Android is broken:

out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10: fatal error: 'v3d_packet_helpers.h' file not found
external/mesa3d/src/gallium/drivers/vc4/vc4_cl_dump.c:28:10: fatal error: 'vc4_packet.h' file not found

The path of the generated header needs to be fixed since we build out of
tree.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-12 16:47:10 -05:00
Charmaine Lee
147d7fb772 st/mesa: add a winsys buffers list in st_context
Commit a5e733c6b5 fixes the dangling
framebuffer object by unreferencing the window system draw/read buffers
when context is released. However this can prematurely destroy the
resources associated with these window system buffers. The problem is
reproducible with Turbine Demo running with VMware driver. In this case,
the depth buffer content was lost when the context is rebound to a
drawable.

To prevent premature destroy of the resources associated with
window system buffers, this patch maintains a list of these buffers in
the context, making sure the reference counts of these buffers will not
reach zero until the associated framebuffer interface objects no
longer exist. This also helps to avoid unnecessary destruction and
re-construction of the resources associated with the framebuffer.

Fixes VMware bug 1909807.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-11 19:40:17 -07:00
Eric Anholt
84ed8b67c5 vc4: Set shareable BOs as T tiled if possible
X11 and GL compositor performance on VC4 has been terrible because of our
SHARED-usage buffers all being forced to linear.  This swaps SHARED &&
!LINEAR buffers over to being tiled.

This is an expected win for all GL compositors during rendering (a full
copy of each shared texture per draw call), allows X11 to be used with
decent performance without a GL compositor, and improves X11 windowed
swapbuffers performance as well.  It also halves the memory usage of
shared buffers that get textured from.  The only cost should be idle
systems with a scanout-only buffer that isn't flagged as LINEAR, in which
case the memory bandwidth cost of scanout goes up ~25%.

This implements the EGL_EXT_image_dma_buf_import_modifiers extension,
supporting the VC4 T_TILED modifier.

v2: Added modifier support to resource creation/import, and
    advertisement (by daniels).
v3: Fix old-kernel fallback path, fix compiler error and warnings, and
    comment touchups (by anholt).

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
bb466a996f vc4: Use vc4_setup_slices for resource import
Rather than open-coding populating the first slice inside resource
import, use vc4_setup_slices to do it for us.

v2: Rebase on VC4_DEBUG=surf change

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
111b6b77cb vc4: Make the miptree debug code available under VC4_DEBUG=surf
I kept flipping the bool on for debug, so let's just make it available.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
a2d87a0019 vc4: Switch back to using a local copy of vc4_drm.h.
Needing to get our uapi header from libdrm has only complicated things.
Follow intel's lead and drop our requirement for it.

Generated from the same commit mentioned in the README.

v2: Update Android.mk as well, move vc4_drm.h reference for distcheck.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
2aec62a45b vc4: Remove a stale comment.
The kernel hasn't been synchronous in a couple of years, plus there was
synchronization code right there.
2017-07-12 10:58:33 -07:00
Brian Paul
5e5f251db1 svga: whitespace, formatting fixes in svga_swtnl_backend.c 2017-07-12 10:58:14 -06:00
Brian Paul
f2b59f6c02 svga: whitespace, formatting fixes in svga_swtnl_draw.c 2017-07-12 10:58:14 -06:00
Brian Paul
183d4193b8 svga: whitespace, formatting fixes in svga_swtnl_state.c 2017-07-12 10:58:13 -06:00