31726 commits

Author SHA1 Message Date
Kenneth Graunke
2412c4c81e util: Make CLAMP turn NaN into MIN.
The previous implementation of CLAMP() allowed NaN to pass through
unscathed, by failing both comparisons.  NaN isn't exactly a value
between MIN and MAX, which can break the assumptions of many callers.

This patch changes CLAMP to convert NaN to MIN, arbitrarily.  Callers
that need NaN to be handled in a specific manner should probably open
code something, or use a macro specifically designed to do that.

Section 2.3.4.1 of the OpenGL 4.5 spec says:

   "Any representable floating-point value is legal as input to a GL
    command that requires floating-point data. The result of providing a
    value that is not a floating-point number to such a command is
    unspecified, but must not lead to GL interruption or termination.
    In IEEE arithmetic, for example, providing a negative zero or a
    denormalized number to a GL command yields predictable results,
    while providing a NaN or an infinity yields unspecified results."

While CLAMP may apply to more than just GL inputs, it seems reasonable
to follow those rules, and allow MIN as an "unspecified result".

This prevents assertion failures in i965 when running the games
"XCOM: Enemy Unknown" and "XCOM: Enemy Within", which call

   glTexEnv(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT,
            -nan(0x7ffff3));

presumably unintentionally.  i965 clamps the LOD bias to be in range,
and asserts that it's in the proper range when converting to fixed
point.  NaN is not, so it crashed.  We'd like to at least avoid that.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-18 23:48:46 -07:00
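
The NaN behavior described in 2412c4c81e comes down to which comparison is evaluated first. The following is a minimal sketch, not a copy of the patch: the macro names `CLAMP_OLD` and `CLAMP_NAN_TO_MIN`, the helper `lod_bias_to_fixed`, and the range [-16, 15] with 8 fractional bits are all assumptions made for illustration. Every ordered comparison involving NaN is false, so a clamp written as "X > MIN ? ... : MIN" falls through to MIN for NaN input.

```c
#include <assert.h>
#include <math.h>

/* Clamp in the style the message describes as "previous": a NaN input
 * fails both comparisons and is returned unchanged. */
#define CLAMP_OLD(X, MIN, MAX) \
   ((X) < (MIN) ? (MIN) : ((X) > (MAX) ? (MAX) : (X)))

/* One way to map NaN to MIN: test X > MIN first. The test is false for
 * NaN, so the MIN branch is taken. */
#define CLAMP_NAN_TO_MIN(X, MIN, MAX) \
   ((X) > (MIN) ? ((X) > (MAX) ? (MAX) : (X)) : (MIN))

/* Hypothetical fixed-point conversion that asserts its input range,
 * in the spirit of the i965 LOD bias code mentioned above. The range
 * and the scale factor are illustrative assumptions. */
static int lod_bias_to_fixed(float bias)
{
   float clamped = CLAMP_NAN_TO_MIN(bias, -16.0f, 15.0f);
   assert(clamped >= -16.0f && clamped <= 15.0f);
   return (int)(clamped * 256.0f);
}
```

With `CLAMP_OLD` in place of `CLAMP_NAN_TO_MIN`, the assertion in `lod_bias_to_fixed` would fail for a NaN bias, which is the kind of crash the commit message describes.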
Marek Olšák
ecec21add2 radeonsi: add back the USE_MINIMUM_PRIORITY flag to the low-prio compiler queue
Accidentally removed in 9f320e0a38.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-18 13:13:34 -04:00
Sinclair Yeh
ed45e8db3c winsys/svga/drm: Enable import/export fence FD
Enable the capability if the DRM supports it.

Hook up mechanism to send and receive fence FD from the DRM.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
d554f72c41 winsys/svga/drm: Connect winsys-side fence_* functions
Connect fence_get_fd, fence_create_fd, and fence_server_sync.

Implement the required functions in vmw_fence module.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
56a6e890f3 drivers/svga: Connect driver-side fence_* functions
Connect fence_get_fd, fence_create_fd, and fence_server_sync.
Return PIPE_CAP_NATIVE_FENCE_FD capability based on what the
winsys reports.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
4da543e30a winsys/svga/drm: Create winsys interface for Fence FD
The new interfaces will be used to enable
EGL_ANDROID_native_fence_sync.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
2431cccad1 winsys/svga/drm: Prepare to support fence fd
Make the fields and flags available.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
65175df601 drivers/svga, winsys/svga/drm: Thread through timeout for fence_finish
The timeout parameter is required to implement
EGL_ANDROID_native_fence_sync.

v2
* Changed the default timeout from 0 to PIPE_TIMEOUT_INFINITE
* Add more documentation to the new timeout parameter

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Brian Paul
9ee86d6db7 svga: whitespace clean-up in svga_winsys.h
Trivial.
2017-07-17 10:09:25 -06:00
Brian Paul
6f4923bd38 svga: add some const qualifiers
Trivial.
2017-07-17 10:06:01 -06:00
Brian Paul
589f546256 svga: add comment about 'extra' constant locations
Trivial.
2017-07-17 10:06:00 -06:00
Marek Olšák
c62809171c radeonsi/gfx9: add VM fault dmesg parser support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:34 -04:00
Marek Olšák
9f320e0a38 radeonsi: automatically resize shader compiler thread queues when they are full
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:29 -04:00
Marek Olšák
4cae274116 radeonsi: prevent a deadlock in util_queue_add_job with too many GL contexts
If the queue is full, util_queue_add_job will wait while bo_fence_lock is
held.

If pb_slab wants to reuse a buffer, it will lock the pb_slab mutex and
try to check BO fence busyness, but it has to wait for bo_fence_lock to get
released. Both bo_fence_lock and pb_slab mutex are locked now.

When the CS thread unreferences and releases a suballocated buffer,
it will try to lock the pb_slab mutex and has to wait. The CS thread
can't finish its job in order to free a queue slot and unblock
util_queue_add_job ==> deadlock.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:25 -04:00
Marek Olšák
465bb47d6f radeonsi: expose ARB_timer_query unconditionally
clock_crystal_freq is always non-zero now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:17 -04:00
Marek Olšák
d0963ef084 radeonsi/gfx9: don't read back non-existent register SRBM_STATUS2
It looks like there is no way to monitor SDMA busyness on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:56 -04:00
Marek Olšák
5fb80a1e84 radeonsi: prevent a crash with DBG_CHECK_VM and u_threaded_context
by setting PIPE_CONTEXT_DEBUG in the caller

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:51 -04:00
Marek Olšák
ffa7ec9e22 radeonsi: simplify computation of tessellation offchip buffers
This is overly cautious, but better safe than sorry.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:55:07 -04:00
Marek Olšák
facfab28fe radeonsi/gfx9: add workarounds to avoid VGPR indexing completely
For inputs and outputs, indirect indexing is lowered by the GLSL compiler.
For temporaries, use alloca and disable the "promote-alloca" pass.

In the future, we could switch all codepaths to alloca permanently and
just rely on the "promote-alloca" pass.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
93391ac478 radeonsi: emit param exports after position exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
9d9ffc8475 radeonsi: move building parameter exports into a separate function
Both loops now look simple.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
4e30fb4ecc radeonsi: don't use info.num_inputs when it's unused
For clarity. It's only used by color interpolation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
f8d6dd9b3d radeonsi: add si_build_fs_interp helper
This is much simpler.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
4560f2b90a radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
c351037d6c gallivm: inline gallivm_init_llvm_targets
There is only one user.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
ece0c0439f radeonsi: don't call gallivm_init_llvm_targets
It's for initializing the native (x86) target.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
d308460586 gallium/radeon: reallocate suballocated buffers when exported
This should fix exports of suballocated buffers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
5b555854cc gallium/radeon: flush the context after in-place texture realloc before export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Mark Thompson
63dcfed81f st/va: Fix scaling list ordering for H.265
Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order.  Therefore, rearrange the
elements when copying.

v2: Move scan tables to vl_zscan.c.
    Fix type in size assertion.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-07-17 15:24:56 +01:00
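
The reordering described in 63dcfed81f can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the function names `build_diag_scan` and `raster_to_diag` are hypothetical, and the table is generated at run time here rather than stored as a constant. It assumes the up-right diagonal scan used by H.265, where each anti-diagonal is walked from bottom-left to top-right.

```c
#include <assert.h>
#include <stdint.h>

/* Fill scan[] so that scan[i] is the raster index (y * size + x) of the
 * i-th element in up-right diagonal order. */
static void build_diag_scan(unsigned size, uint8_t *scan)
{
   unsigned i = 0;

   /* Raster indices must fit in uint8_t, so 16x16 is the largest block. */
   assert(size > 0 && size <= 16);

   for (unsigned d = 0; d < 2 * size - 1; d++) {
      /* Along diagonal d, x grows while y = d - x shrinks. */
      for (unsigned x = 0; x <= d; x++) {
         unsigned y = d - x;
         if (x < size && y < size)
            scan[i++] = (uint8_t)(y * size + x);
      }
   }
}

/* Copy a raster-order list into diagonal-scan order. */
static void raster_to_diag(unsigned size, const uint8_t *raster,
                           uint8_t *diag)
{
   uint8_t scan[16 * 16];

   build_diag_scan(size, scan);
   for (unsigned i = 0; i < size * size; i++)
      diag[i] = raster[scan[i]];
}
```

For a 4x4 list the generated table starts 0, 4, 1, 8, 5, 2, so the second element copied is the first element of the second raster row.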
Marek Olšák
f9d5611617 gallium/u_blitter: don't use TXF for scaled blits
There seems to be a rounding difference with F2I vs nearest filtering.
The precise problem in the rounding is unknown.

This fixes an incorrect output with OpenMAX encoding.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 15:47:30 +02:00
Samuel Pitoiset
c745beaf10 ddebug: fix parsing of the pipelined mode
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-17 10:28:45 +02:00
Tim Rowley
818209118c swr: JitManager runtime determination of architecture
Fixes performance regression from f50aa21456 - was forcing internal
code generation to target AVX (no gather, etc).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-14 15:09:22 -05:00
Grigori Goronzy
8d980bf920 st/mesa: Add KHR_no_error toggle to driconf
Allows applications to be whitelisted.

v2: Remove misguided DRI common part.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:44 +02:00
Grigori Goronzy
2bbe235053 st/mesa: Add support for KHR_no_error flag
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:40 +02:00
Grigori Goronzy
7299e82fa4 dri: Add KHR_no_error DRI extension
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.

v2: Move to common DRI code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:20:31 +02:00
Christoph Haag
98514e9959 gallium/hud: use double values for all graphs
The fps graph for example calculates the fps as double with small
variations based on when query_new_value() is called, which causes
many values to be truncated on the cast to uint64_t.

The HUD internally stores the values as double, so just use double
everywhere instead of fixing this with rounding. Using doubles also
allows the hud to show small variations instead of being clamped to
discrete values.

v2: Don't print decimals in the dump file when not necessary
Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 17:34:39 +02:00
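
The truncation that 98514e9959 describes is easy to reproduce in isolation. The sketch below is not HUD code: `fps_from_interval` and `as_integer_sample` are hypothetical helpers, and the sampling intervals used with them are made-up values. It only shows that a rate computed as a double loses its fraction when cast to uint64_t, so a tiny timing variation can move the stored value by a whole unit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: frames per second from a frame count and the
 * elapsed time in microseconds. */
static double fps_from_interval(uint64_t frames, uint64_t elapsed_us)
{
   assert(elapsed_us > 0);
   return (double)frames * 1000000.0 / (double)elapsed_us;
}

/* What an integer-valued graph would store: the cast discards the
 * fractional part instead of rounding. */
static uint64_t as_integer_sample(double value)
{
   return (uint64_t)value;
}
```

With 60 frames counted over exactly 1000000 us the integer sample is 60, but over 1000001 us the rate is just under 60 and the integer sample drops to 59. Keeping the value as a double avoids that jump.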
Lucas Stach
7e426ef6ec Revert "etnaviv: add support for snorm textures"
This reverts commit d8b2ccdb88, which causes piglit regressions on GPUs
with SNORM support. We'll have another try at enabling this feature after
the 17.2 branchpoint.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:21:50 +02:00
Wladimir J. van der Laan
1d05cec205 etnaviv: reset indexed rendering information when not rendering indexed
A dangling bo object would result in memory corruption while loading a
level in ioquake3_opengl2.

Fixes: 330d0607ed (gallium: remove pipe_index_buffer and set_index_buffer)
Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:19:42 +02:00
Wladimir J. van der Laan
bb2498a7f6 etnaviv: Use the correct LOG instruction on GC3000
GC3000 has a new LOG instruction, similar to the new SIN and COS instructions.

Generate the new instruction sequence when appropriate; there are
two occasions, as part of LIT and the generator for the LG2
instruction itself.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:15:41 +02:00
Lucas Stach
bccd21ee88 etnaviv: flush source TS before resolve
If we blit from a rendertarget or a depthstencil buffer there might still
be dirty data in the TS buffer which needs to be flushed out.

Fixes missing shadow tiles in glmark2 shadow.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-07-14 17:13:12 +02:00
Philipp Zabel
e9b3381715 etnaviv: flush color cache and depth cache together before resolves
Before resolving a rendertarget or a depth/stencil resource into a
texture, flush both the color cache and the depth cache together.

It is unclear whether this is necessary for the following stall to
work properly, or whether the depth flush just adds enough time
for the color cache flush to finish before the resolver is started,
but this change removes artifacts that otherwise appear if a texture
is sampled directly after rendering into it.

The test case is a simple QML scene graph with a QtWebEngine based
WebView rendered on top of a blue background:

	import QtQuick 2.0
	import QtQuick.Window 2.2
	import QtWebView 1.1

	Window {
		Rectangle {
			id: background
			anchors.fill: parent
			color: "blue"
		}

		WebView {
			id: webView
			anchors.fill: parent
		}

		Component.onCompleted: {
			webView.url = "<some animated website>"
		}
	}

If the website is animated, the WebView renders the site contents into
texture tiles and immediately afterwards samples from them to draw the
tiles into the Qt renderbuffer. Without this patch, a small irregular
triangle in the lower right of each browser tile appears solid blue, as
if the texture sampler samples zeroes instead of the website contents,
and the previously rendered blue Rectangle shows through.

Other attempts such as adding a pipeline stall before the color flush or
a TS cache flush afterwards or flushing multiple times, with stalls
before and after each flush, have shown no effect.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-07-14 17:12:36 +02:00
Marek Olšák
f33d8af7aa st/dri: add 32-bit RGBX/RGBA formats
Add support for 32-bit RGBX/RGBA formats which are required for Android.

The original patch (commit ccdcf91104) was reverted (commit
c0c6ca40a2) in mesa as it broke GLX resulting in swapped colors. Based
on further investigation by Chad Versace, moving the RGBX/RGBA configs
to the end is enough to prevent breaking GLX.

The handling of RGBA/RGBX in dri_fill_st_visual is a fix from Marek
Olšák.

Cc: Eric Anholt <eric@anholt.net>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-13 14:36:47 -05:00
Tim Rowley
254fa3dbf5 swr/rast: Fix use of KNL-only intrinsics in SKX build
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
4c185dd3b3 swr/rast: Fix build warnings when using the Intel compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
bbc3b5c0dc swr/rast: SIMD16 Frontend - Fix USE_SIMD16_FRONTEND build
Previous check-ins without testing with USE_SIMD16_FRONTEND have
introduced regressions. This fixes the build, not the regressions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
640ea4d9a1 swr/rast: Removing unneeded MSVC warning pragma
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
185b37f641 swr/rast: Add support for read-only render targets
Core will ensure hot tiles are loaded for read and write render targets,
and will skip all output merger for read-only render targets.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
d8ebcad540 swr/rast: Support render target mask instead of render target count
WIP to support read-only render targets.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
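
The message for d8ebcad540 does not say why a mask is preferred over a count, so the following is only a guess at the general idea: a count can describe targets 0..count-1, while a bitmask can describe any subset. All names here (`MAX_RENDER_TARGETS`, `collect_render_targets`, `mask_from_count`) and the limit of 8 targets are assumptions for illustration, not identifiers from swr.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RENDER_TARGETS 8 /* assumed limit, for illustration only */

/* Store the indices of all render targets whose bit is set in rt_mask,
 * in ascending order, and return how many there are. */
static unsigned collect_render_targets(uint32_t rt_mask, unsigned *visited)
{
   unsigned n = 0;

   for (unsigned rt = 0; rt < MAX_RENDER_TARGETS; rt++) {
      if (rt_mask & (1u << rt))
         visited[n++] = rt;
   }
   return n;
}

/* The mask equivalent of "the first count targets are active". */
static uint32_t mask_from_count(unsigned count)
{
   assert(count <= MAX_RENDER_TARGETS);
   return (1u << count) - 1u;
}
```

A mask such as 0x5 selects targets 0 and 2 while skipping target 1, which a plain count cannot express.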
Nicolai Hähnle
c22e3c5373 radeonsi/gfx9: fix crash building monolithic merged ES-GS shader
Forwarding from the ES prolog to the ES just barely exceeds the current
maximum array size when 16 vertex attributes are used. Give it a decent
bump to account for merged shaders having up to 32 user SGPRs.

Fixes a crash in GL45-CTS.multi_bind.draw_bind_vertex_buffers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-13 13:01:15 +02:00
Ilia Mirkin
3645268748 nv50/ir: fix threads calculation for non-compute shaders
We were using the "cp" union fields, which are only valid for compute
shaders. The threads calculation affects the available GPRs, so just
pick a small number for other shader types to avoid limiting available
registers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 22:09:59 -04:00