Commit graph

29173 commits

Author SHA1 Message Date
Charmaine Lee
0035f7f136 svga: add guest statistic gathering interface
This file was supposed to be added with the previous "svga: add guest
statistic gathering interface" patch but went MIA for some reason.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 08:04:02 -06:00
Marek Olšák
49c798e902 radeonsi: disable CE on SI + AMDGPU
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
281f1a5980 winsys/amdgpu: disable IB chaining on SI
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
a6869e7c06 winsys/amdgpu: finish up SI addrlib integration
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-08-26 15:50:10 +02:00
Ronie Salgado
97b55243fb winsys/amdgpu: initial SI support
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2016-08-26 15:50:10 +02:00
Marek Olšák
971ef7518f gallium/radeon: add a driver query for AMDGPU_INFO_NUM_EVICTIONS
If the kernel driver doesn't support it, it returns 0.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
7172906c0c radeonsi: fix printing shaders and states on a VM fault
This was missed while rewriting the PIPE_DUMP flags.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
5ee3cac138 radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI
SDMA is much faster for tiled->linear blits from VRAM to GTT.
I have Bonaire in my second PCIe slot.

$ glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD TONGA ...

$ DRI_PRIME=1 glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ...

Without SDMA:
$ DRI_PRIME=1 glxgears
8796 frames in 5.0 seconds = 1759.074 FPS
8899 frames in 5.0 seconds = 1779.672 FPS

With SDMA:
$ DRI_PRIME=1 glxgears
12765 frames in 5.0 seconds = 2552.788 FPS
12888 frames in 5.0 seconds = 2577.495 FPS

The 1st GPU is irrelevant. The improvement should be much lower at 60 fps,
but definitely measurable.

SI will get this once we add SDMA blit support for it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
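
The glxgears figures quoted in 5ee3cac138 can be turned into a speedup ratio and a per-frame time with a little arithmetic. The helpers below are a hypothetical sketch, not part of Mesa; only the FPS numbers come from the commit message.

```c
#include <assert.h>

/* Ratio of two frame rates; a value above 1.0 means "after" is faster. */
double fps_speedup(double fps_before, double fps_after)
{
    assert(fps_before > 0.0);
    return fps_after / fps_before;
}

/* Time spent on one frame, in milliseconds, at a given frame rate. */
double frame_time_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

With the first pair of readings (1759.074 and 2552.788 FPS) the ratio is about 1.45, and the per-frame saving is roughly 0.18 ms. If one assumes that saving is a roughly fixed cost per frame, it would be about 1% of a 16.7 ms frame at 60 fps, which is consistent with the remark that the gain is much lower there.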
Marek Olšák
0241d8300f radeonsi: enable SDMA on CIK
It passes R600_DEBUG=testdma on Bonaire/radeon.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
bcfd49e511 gallium/radeon: increase priority for shader binaries
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Marek Olšák
c3f716fe67 gallium/radeon: merge USER_SHADER and INTERNAL_SHADER priority flags
There's no reason to separate these.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-08-26 15:50:10 +02:00
Neha Bhende
10f6e08549 svga: fix regression related to srgb
This regression was introduced by commit 3190c7ee97. It stems from Mesa
following the OpenGL 4.4 spec rules related to GL_FRAMEBUFFER_SRGB.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:52 -06:00
Neha Bhende
3b7341d547 svga: use local variable blit instead of pointer
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:52 -06:00
Brian Paul
b09e4ab13c svga: s/INDEX_0D/INDEX_IMMEDIATE32/
Both are zero, but the latter is the right token.
2016-08-26 06:19:52 -06:00
Brian Paul
93779b87a1 svga: add comment about unsupported blend modes
2016-08-26 06:19:52 -06:00
Charmaine Lee
b1772651b7 svga: fix ordering of mksstats counter strings
String for SVGA_STATS_COUNT_TEXREADBACK was swapped
with the string for SVGA_STATS_COUNT_SURFACEWRITEFLUSH.

Trivial fix.
2016-08-26 06:19:52 -06:00
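
A swapped pair of counter strings, as fixed in b1772651b7, is the usual risk when an enum and its name table are maintained by hand. One common way to keep them in step is an X-macro list. The sketch below is an illustration under that assumption, not how the svga driver is written: the two counter names come from the commit message, while the list order and every `sketch_`/`SKETCH_` identifier are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical counter list; only the two names below come from the log. */
#define SKETCH_STATS_COUNTERS(X)          \
    X(SVGA_STATS_COUNT_SURFACEWRITEFLUSH) \
    X(SVGA_STATS_COUNT_TEXREADBACK)

#define SKETCH_AS_ENUM(name) name,
#define SKETCH_AS_STRING(name) #name,

/* The enum and the string table are both generated from the same list. */
enum sketch_stats_count {
    SKETCH_STATS_COUNTERS(SKETCH_AS_ENUM)
    SKETCH_STATS_COUNT_MAX
};

static const char *const sketch_stats_names[] = {
    SKETCH_STATS_COUNTERS(SKETCH_AS_STRING)
};

/* Map a counter to its string; the two cannot drift out of order. */
const char *sketch_stats_name(enum sketch_stats_count c)
{
    assert((unsigned)c < SKETCH_STATS_COUNT_MAX);
    return sketch_stats_names[c];
}

/* Reverse lookup: index of a counter name, or -1 if it is unknown. */
int sketch_stats_index(const char *name)
{
    for (int i = 0; i < SKETCH_STATS_COUNT_MAX; i++) {
        if (strcmp(sketch_stats_names[i], name) == 0)
            return i;
    }
    return -1;
}
```

Adding a counter then means touching one list only, so the ordering bug cannot recur by construction.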
Charmaine Lee
2781d60375 svga: avoid emitting redundant SetShaderResource command
Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf, conform.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 06:19:52 -06:00
Charmaine Lee
5313b294e6 svga: add a cleanup function to clean up sampler state
This patch adds a cleanup function to clean up sampler state at
context destruction time.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 06:19:52 -06:00
Brian Paul
e292f38c6c svga: loosen the condition to flush in get_query_result_vgpu10()
Fixes piglit spec/ext_transform_feedback/overflow-edge-cases segfaults
because the query's fence pointer was null.

Tested with Piglit, Sauerbraten, ETQW.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:52 -06:00
Brian Paul
99d8fe20ab svga: fix vgpu10 query fencing
We don't want to flush the command buffer or sync on the fence when ending
a query (that kind of defeats the whole purpose of async queries).  Do that
instead in get_query_result().

Tested with Piglit, arbocclude, Sauerbraten game, Nobel Clinician Viewer,
ETQW.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:52 -06:00
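
The idea in 99d8fe20ab, ending a query without flushing and deferring the flush and fence wait until the result is requested, can be sketched roughly as follows. All types and functions are hypothetical stand-ins rather than svga code, and the "GPU" is simulated by marking the fence as signalled at flush time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real svga structures. */
struct sketch_fence { bool signalled; };

struct sketch_context {
    struct sketch_fence fence_storage; /* fence produced by the last flush */
    int flush_count;                   /* number of flushes performed */
};

struct sketch_query {
    struct sketch_fence *fence;        /* NULL until a flush covers the query */
    unsigned long result;
};

/* Flushing submits the queued commands and yields a fence for them. */
struct sketch_fence *sketch_flush(struct sketch_context *ctx)
{
    ctx->flush_count++;
    ctx->fence_storage.signalled = true; /* simulation: work is done at once */
    return &ctx->fence_storage;
}

/* Ending a query only records it; there is no flush and no sync here. */
void sketch_end_query(struct sketch_context *ctx, struct sketch_query *q)
{
    (void)ctx;
    q->fence = NULL;
    q->result = 42; /* placeholder for the value the GPU would write */
}

/* The flush and the fence check happen only when the result is wanted. */
bool sketch_get_query_result(struct sketch_context *ctx,
                             struct sketch_query *q, unsigned long *out)
{
    if (q->fence == NULL)
        q->fence = sketch_flush(ctx);
    if (!q->fence->signalled)
        return false;
    *out = q->result;
    return true;
}
```

Treating a NULL fence as "not flushed yet" is this sketch's own simplification; it also avoids dereferencing a null fence pointer when fetching the result.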
Charmaine Lee
3f51a3f6ac svga: avoid emitting redundant DXSetSamplers command
This patch avoids emitting redundant DXSetSamplers commands.

Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 06:19:52 -06:00
Neha Bhende
6a43148e20 svga: enable ARB_clear_texture extension in the driver.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:52 -06:00
Neha Bhende
2111795d51 svga: define svga_clear() in svga_init_clear_functions()
Put all the clearing related functions in svga_init_clear_functions()

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:51 -06:00
Neha Bhende
40557ae07c svga: add svga_init_clear_functions()
Define svga_init_clear_functions() and set svga_clear_texture as
svga->pipe.clear_texture. This is part of the ARB_clear_texture
extension.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:51 -06:00
Neha Bhende
52d88b67be svga: add new function svga_clear_texture()
This function can be used to clear a texture. It is part of the
ARB_clear_texture extension, which allows a texture to be cleared to
given color values.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:51 -06:00
Neha Bhende
1da538f85b svga: add new begin_blit()
Saving all blitter states will be done in begin_blit() so that
begin_blit() can be used before performing any blit operation.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2016-08-26 06:19:51 -06:00
Charmaine Lee
a5fd54f8bf svga: add opt to the list of valid build types
For opt build, add VMX86_STATS to the list of cpp defines.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 06:19:51 -06:00
Charmaine Lee
2e1cfcc431 svga: add guest statistic gathering interface
This patch adds a guest statistic gathering interface to the svga
winsys interface, which can be used to gather svga driver statistics.
The winsys module can then share the statistic info with the VMX host
via the mksstats interface.

The statistic enums used in the svga driver are defined in
svga_stats_count and svga_stats_time in svga_winsys.h

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 06:19:51 -06:00
Charmaine Lee
4791991808 svga: fix indirect non-indexable temp access
If the shader has indirect access to non-indexable temporaries,
convert these non-indexable temporaries to indexable temporary array.
This works around a bug in the GLSL->TGSI translator.

Fixes glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test
on DX11Renderer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-26 06:19:51 -06:00
Brian Paul
d221a6545c gallium/hud: move signo declaration inside PIPE_OS_UNIX block
To silence unused var warning with MSVC, MinGW.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-26 06:19:51 -06:00
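
The pattern behind d221a6545c is to declare a variable inside the same preprocessor block as its only use, so builds that compile the block out never see an unused variable. The sketch below is a minimal illustration: `SKETCH_OS_UNIX` is a hypothetical stand-in for Mesa's `PIPE_OS_UNIX`, and the function is invented for the example.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical stand-in for PIPE_OS_UNIX, derived from compiler macros. */
#if defined(__unix__) || defined(__APPLE__)
#define SKETCH_OS_UNIX 1
#endif

/* Returns the signal number the sketch would use, or 0 if unsupported. */
int sketch_toggle_signal(void)
{
#ifdef SKETCH_OS_UNIX
    /* Declared inside the block: other platforms never see this variable. */
    int signo = SIGINT;
    assert(signo > 0);
    return signo;
#else
    return 0;
#endif
}
```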
Kenneth Graunke
93bfa1d7a2 nir: Change nir_shader_get_entrypoint to return an impl.
Jason suggested adding an assert(function->impl) here.  All callers
of this function actually want ->impl, so I decided just to change
the API.

We also change the nir_lower_io_to_temporaries API here.  All but one
caller passed nir_shader_get_entrypoint(), and with the previous commit,
it now uses a nir_function_impl internally.  Folding this change in
avoids the need to change it and change it back.

v2: Fix one call I missed in ir3_compiler (caught by Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-08-25 19:18:24 -07:00
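
The API change described in 93bfa1d7a2 can be pictured with simplified types. The structs and function names below are hypothetical stand-ins for the NIR ones and only illustrate the shape of the change: the helper checks the impl once and returns it, instead of leaving every caller to dereference `->impl`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the NIR types named in the log. */
struct sketch_function_impl { int num_blocks; };

struct sketch_function {
    const char *name;
    struct sketch_function_impl *impl;
};

struct sketch_shader { struct sketch_function *entrypoint; };

/* Old shape: callers received the function and read ->impl themselves. */
struct sketch_function *sketch_get_entrypoint_old(struct sketch_shader *s)
{
    return s->entrypoint;
}

/* New shape: the impl is asserted once here and returned directly. */
struct sketch_function_impl *sketch_get_entrypoint(struct sketch_shader *s)
{
    struct sketch_function *f = s->entrypoint;
    assert(f != NULL);
    assert(f->impl != NULL);
    return f->impl;
}
```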
Eric Anholt
00c72acba5 vc4: Add support for fddx/fddy
Based vaguely on a patch by jonasarrow on github.
2016-08-25 17:24:11 -07:00
Eric Anholt
e763e19808 vc4: Add register allocation support for MUL output rotation.
We need the source to be in r0-r3, so make a new register class for it.
It will be up to the surrounding passes to make sure that the r0-r3
allocation of its source won't conflict with any other class
requirements on that temp.
2016-08-25 17:24:11 -07:00
Eric Anholt
8ce6526178 vc4: Add support for MUL output rotation.
Extracted from a patch by jonasarrow on github.
2016-08-25 17:24:11 -07:00
Eric Anholt
074f1f3c0c vc4: Add support for the 2-bit LOAD_IMM variants.
Extracted and fixed up from a patch by jonasarrow on github.  This ended
up not getting used for ddx/ddy, but seems like it might still be useful.
2016-08-25 17:24:11 -07:00
Eric Anholt
3da4e38f48 vc4: Add QPU scheduling to handle MUL rotate sources.
We need MUL rotates to do ddx/ddy support.
2016-08-25 17:24:11 -07:00
Eric Anholt
b0b99a7952 vc4: Add disassembly for constant MUL rotates
2016-08-25 17:24:11 -07:00
Eric Anholt
b160708e03 vc4: Add real validation for MUL rotation.
Caught problems in the upcoming DDX/DDY implementation.
2016-08-25 17:24:11 -07:00
Eric Anholt
31da39ddc9 vc4: Add a QIR value for the QPU element register.
This will be used in the ddx/ddy support for "Am I the top half?" or "Am I
the left half?" checks.
2016-08-25 17:24:11 -07:00
Marek Olšák
a491b9e945 radeonsi: don't use allocas for arrays with LLVM 3.8
It crashes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97413
2016-08-25 21:19:17 +02:00
Marek Olšák
fe91ae06d3 gallium/radeon: unify and simplify checking for an empty gfx IB
We can take advantage of the fact that multi_fence does the obvious thing
with NULL fences.

This fixes unflushed fences that can get stuck due to empty IBs.
2016-08-25 21:19:17 +02:00
Marek Olšák
3ff0b67e1b radeonsi: disable SDMA texture copying on Carrizo
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2016-08-25 14:51:08 +02:00
Marek Olšák
1276316d67 gallium/noop: use 3-space indentation
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-25 14:09:48 +02:00
Marek Olšák
9daaa6f5a6 gallium: add a pipe_context parameter to resource_get_handle
radeonsi needs to do some operations (DCC decompression) for OpenGL-OpenCL
interop and this is the only way to make it coherent with the current
context. It can optionally be set to NULL.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-08-25 14:09:48 +02:00
Samuel Pitoiset
a227b0a4f1 nvc0: invalidate textures/samplers on GK104+
Like Fermi, textures and samplers are aliased between 3D and compute,
especially the TIC_FLUSH/TSC_FLUSH methods and we have to re-validate
these resources when switching between the two pipelines.

This fixes a GPU hang with Elemental (and most likely with other UE4 demos).

Tested on GK107 and GM107.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
CC: <mesa-stable@lists.freedesktop.org>
2016-08-24 22:26:36 +02:00
Rhys Kidd
c9c989763a gallium/ttn: Remove duplicated TGSI_OPCODE_DP2A initialization
The duplicate is currently on line 1535.

Identified by Clang, when run through Eric Anholt's Travis harness.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-08-24 11:54:50 -07:00
Eric Anholt
87a88f2daa vc4: Fix GPU hangs with >16 varying values.
Fixes glsl-routing in piglit and hangs in glbenchmark 2.0.2.
2016-08-24 10:43:22 -07:00
Leo Liu
5277f25480 vl/rbsp: fix another three byte not detected
This happens when the three-byte sequence "00 00 03" is only partly
loaded into vlc->buffer: the valid bits at the bottom of the buffer
end in "00" or "00 00", and the remaining "00 03" or "03" is still in
the data, so the three-byte emulation check does not detect it. The
reason is that the escaped bit was set to 0 by the rbsp init.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-08-24 11:17:16 -04:00
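
The boundary problem described in 5277f25480 arises whenever "00 00 03" detection forgets what it saw at the end of the previous load. A generic way to handle it is to carry the count of trailing zero bytes across chunks. The sketch below shows that technique on whole bytes; it is not the vl/rbsp implementation, which works on a bit buffer, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser state: consecutive zero bytes seen so far. */
struct sketch_rbsp { unsigned zeros; };

void sketch_rbsp_init(struct sketch_rbsp *r)
{
    r->zeros = 0;
}

/*
 * Copy one chunk from in to out, dropping each 0x03 that follows two or
 * more 0x00 bytes. The zero count persists in *r between calls, so a
 * sequence split across chunks is still caught. Returns the number of
 * bytes written; out must have room for at least len bytes.
 */
size_t sketch_rbsp_unescape(struct sketch_rbsp *r, const uint8_t *in,
                            size_t len, uint8_t *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        if (r->zeros >= 2 && b == 0x03) {
            r->zeros = 0; /* emulation prevention byte: skip it */
            continue;
        }
        r->zeros = (b == 0x00) ? r->zeros + 1 : 0;
        out[n++] = b;
    }
    return n;
}
```

Feeding "00" and then "00 03 01" as two separate chunks yields "00 00 01", the same result as feeding the four bytes at once.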
Marek Olšák
2c13abb491 radeonsi: fix VM faults due to NULL internal const buffers on CIK
They are harmless, but the interrupts do decrease performance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
2016-08-24 15:39:57 +02:00
Tomasz Figa
577f85e2bb gallium/winsys/kms: Look up the GEM handle after importing a prime FD
drmPrimeFDToHandle() will return the same GEM handle every time the same
buffer is imported, even from a different prime FD. Since GEM handles
are not reference counted, we need to make sure that each GEM handle is
referenced only by one display target struct, by looking it up in
kms_sw->bo_list first and bumping the refcount of the found dt on hit
and falling back to creating a new dt only on miss.

v2: Split into separate function.
    Use helper function for lookup.

v3 [Emil Velikov]:
    Rename kms_sw_displaytarget_{lookup,find_and_ref} (Jordan)

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com> (v2)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-08-24 14:39:23 +01:00
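
The lookup-then-create scheme described in 577f85e2bb can be sketched as a small reference-counted list keyed by GEM handle. This is a minimal illustration, not the kms winsys code: every `sketch_` name is hypothetical, the handle is passed in directly instead of being obtained from a prime FD import, and closing the GEM handle on the last unref is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical display target: one per GEM handle, reference counted. */
struct sketch_dt {
    uint32_t handle;
    int ref_count;
    struct sketch_dt *next;
};

/* Hypothetical winsys holding the list of live display targets. */
struct sketch_winsys { struct sketch_dt *bo_list; };

/* Return the existing dt for this handle with its refcount bumped, or NULL. */
struct sketch_dt *sketch_dt_find_and_ref(struct sketch_winsys *ws,
                                         uint32_t handle)
{
    for (struct sketch_dt *dt = ws->bo_list; dt != NULL; dt = dt->next) {
        if (dt->handle == handle) {
            dt->ref_count++;
            return dt;
        }
    }
    return NULL;
}

/* Import path: reuse the dt on a hit, create a new one only on a miss. */
struct sketch_dt *sketch_dt_from_handle(struct sketch_winsys *ws,
                                        uint32_t handle)
{
    struct sketch_dt *dt = sketch_dt_find_and_ref(ws, handle);
    if (dt != NULL)
        return dt;

    dt = calloc(1, sizeof(*dt));
    if (dt == NULL)
        return NULL;
    dt->handle = handle;
    dt->ref_count = 1;
    dt->next = ws->bo_list;
    ws->bo_list = dt;
    return dt;
}

/* Drop one reference; unlink and free the dt when the last one is gone. */
void sketch_dt_unref(struct sketch_winsys *ws, struct sketch_dt *dt)
{
    assert(dt->ref_count > 0);
    if (--dt->ref_count > 0)
        return;
    for (struct sketch_dt **p = &ws->bo_list; *p != NULL; p = &(*p)->next) {
        if (*p == dt) {
            *p = dt->next;
            break;
        }
    }
    free(dt);
}
```

Because the handle itself is not reference counted, keeping exactly one dt per handle means the handle is released once, when that dt's count reaches zero.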