Commit graph

24743 commits

Author SHA1 Message Date
Ilia Mirkin
9fe458335f nv50,nvc0: don't base decisions on available pushbuf space
We still have to push everything out, might as well kick earlier and
flip pushbufs when we know we'll need it. This resolves some issues with
the new policy of making sure that we always leave a bit of room at the
end for fences.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
2015-10-11 17:57:04 -04:00
Ilia Mirkin
8053c9208f nouveau: avoid emitting new fences unnecessarily
Right now we emit on every kick, but this is only necessary if something
will ever be able to observe that the fence completed. If there are no
refs, leave the fence alone and emit it another day.

This also happens to work around an issue for the kick handler -- a kick
can be a result of e.g. nouveau_bo_wait or explicit kick, or it can be
due to lack of space in the pushbuf. We want the emit to happen in the
current batch, so we want there to always be enough space. However an
explicit kick could take the reserved space for the implicitly-triggered
kick's fence emission if it happened right after. With the new mechanism,
hopefully there's no way to cause two fences to be emitted into the same
reserved space.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
2015-10-11 17:57:04 -04:00
Samuel Pitoiset
06abd1a25e nvc0: make use of NVC0_COMPUTE_CLASS for GF110
In theory, GF110+ should also support NVC8_COMPUTE_CLASS but, in practice,
a ILLEGAL_CLASS dmesg fail appears when using it.

This fixes compute support and MP performance counters on GF110.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-10 22:11:03 +02:00
Roland Scheidegger
4c4ba5a8c3 tgsi: (trivial) kill c99-ism. 2015-10-09 23:12:14 +02:00
Marek Olšák
c80c19a9d5 tgsi/scan: add info about declared samplers (v2)
v2: get it from declarations, not instructions
2015-10-09 22:02:18 +02:00
Marek Olšák
417927ebde tgsi: add a utility for emulating some GL features
st/mesa will use this, but drivers can use it too.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2015-10-09 22:02:18 +02:00
Rob Clark
c9b982b72d glsl: move shader_enums into nir
First step towards inverting the dependency between glsl and nir (so nir
can be used without glsl).  Also solves this issue with 'make distclean'

  Making distclean in mesa
  make[2]: Entering directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa'
  Makefile:2486: ../glsl/.deps/shader_enums.Plo: No such file or directory
  make[2]: *** No rule to make target '../glsl/.deps/shader_enums.Plo'. Stop.
  make[2]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa'
  Makefile:684: recipe for target 'distclean-recursive' failed
  make[1]: *** [distclean-recursive] Error 1
  make[1]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src'
  Makefile:615: recipe for target 'distclean-recursive' failed
  make: *** [distclean-recursive] Error 1

Reported-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-10-09 15:03:28 -04:00
Samuel Pitoiset
7129cbf5f4 nvc0: move HW SM queries to nvc0_query_hw_sm.c/h files
Global performance counters (PCOUNTER) will be added to
nvc0_query_hw_pm.c/h files.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-09 14:09:57 +02:00
Samuel Pitoiset
224fec05ea nvc0: move HW queries to nvc0_query_hw.c/h files
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-09 14:09:57 +02:00
Samuel Pitoiset
77b6990d14 nvc0: move SW queries to nvc0_query_sw.c/h files
Loosely based on freedreno driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-09 14:09:57 +02:00
Samuel Pitoiset
0678530b9e nvc0: move nvc0_so_target_save_offset() to its correct location
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-09 14:09:57 +02:00
Samuel Pitoiset
0644196ab1 nvc0: add a header file for nvc0_query
This will allow to split SW and HW queries in an upcoming patch.

While we are at it, make use of nvc0_query struct instead of pipe_query.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-09 14:09:57 +02:00
Emil Velikov
1fda56cdb2 gallium/ddebug: add missing dd_util.h to sources list
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-08 18:13:24 +01:00
Emil Velikov
62741ff052 gallium/ddebug: automake: sort sources alphabetically
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-10-08 18:13:24 +01:00
Marek Olšák
13e69805ea radeonsi: fix a GS hang on VI
Broken by one of the cleanups: 0d46c3bc9d
Not applicable to stable.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-10-07 19:18:50 +02:00
Marek Olšák
5749676d03 radeonsi: remove TC L2 cache flush for index buffers on VI
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-10-07 19:18:50 +02:00
Brian Paul
6ed8fd3d67 svga: whitespace fixes in svga_sampler_view.c 2015-10-07 08:45:56 -06:00
Brian Paul
70c4cde453 svga: whitespace fixes in svga_resource_buffer.c 2015-10-07 08:45:56 -06:00
Brian Paul
2bad030ac9 svga: round UBO constant buffer size up/down to multiple of 16 bytes
The svga3d device requires constant buffers to be a multiple of 16 bytes
in size.  OpenGL UBOs may not fit that restriction.  As a work-around,
round the size up if possible, else round down.

Note that this patch only effects UBO constant buffers (index 1 or higher),
not the 0th/default constant buffer.

Fixes the game Grim Fandango Remastered.  VMware bug 1510130.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-10-07 08:45:56 -06:00
Ilia Mirkin
47d11990b2 nouveau: make sure there's always room to emit a fence
I started seeing a lot of situations on nv30 where fence emission
wouldn't fit into the previous buffer (causing assertions). This ensures
that whenever checking for space, we always leave a bit of extra room
for the fence emission commands. Adjusts the nv30 and nvc0 fence
emission logic to bypass the space checking as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2015-10-07 04:30:05 -04:00
Boyan Ding
64d9d4b730 vc4: use nir two-sided-color lowering
Similar to 9ffc1049ca (freedreno/ir3: use nir two-sided-color lowering).
No piglit regression.

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-10-06 16:34:07 -07:00
Eric Anholt
b6cd39fc47 vc4: Fix a leak of the last color read/write surface on context destroy. 2015-10-06 16:32:03 -07:00
Eric Anholt
922e0680f9 vc4: Fix a memory leak in the simulator case.
We validate per draw call, and need to free the shader per draw call, too.
2015-10-06 16:29:14 -07:00
Brian Paul
3801fa65c1 tgsi: add const qualifier to silence warning
Trivial.
2015-10-06 08:51:33 -06:00
Ilia Mirkin
78ec9e28ec nv30: always go through translate module on big-endian
It seems like things are either coming in slighly wrong, or perhaps
uploaded incorrectly, but either way passing them through the translate
module seems to fix everything. Eventually we should figure out what's
going wrong and fix it "for real", but this should do for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-10-04 21:50:41 -04:00
Ilia Mirkin
1fec05d114 nv30: pretend to have packed texture/surface formats
This puts us in line with what the DDX/DRI2 st are expecting. It also
happens to work... no idea why, but seems better to have it work than to
ask lots of questions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-10-04 21:50:41 -04:00
Michel Dänzer
87c3c9acd2 st/dri: Use packed RGB formats
Fixes Gallium based DRI drivers failing to load on big endian hosts
because they can't find any matching fbconfigs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-04 21:50:31 -04:00
Marek Olšák
814b7d1ab9 radeonsi: enable PIPE_CAP_FORCE_PERSAMPLE_INTERP
Now st/mesa won't generate 2 variants for this state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
b3c55fc669 radeonsi: do force_persample_interp in shaders for non-trivial cases
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
9652bfcf2d radeonsi: implement the simple case of force_persample_interp
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
214de2d815 radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state
This will be a derived state used for changing center->sample and
centroid->sample at runtime.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
55d406b71e tgsi/scan: add interpolation info into tgsi_shader_info
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
f3b37e321f gallium: add per-sample interpolation control into rasterizer statOAe
Required by ARB_sample_shading for drivers that don't want a shader variant
in st/mesa.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
b78336085b st/dri: don't use _ctx in client_wait_sync
Not needed and it can be NULL.

v2: fix dri2_get_fence_from_cl_event - thanks Albert

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
2015-10-03 22:06:08 +02:00
Marek Olšák
27b102e7fd r600g: only do depth-only or stencil-only in-place decompression
instead of always doing both.
Usually, only depth is needed, so stencil decompression is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:08 +02:00
Marek Olšák
c23c92c965 radeonsi: only do depth-only or stencil-only in-place decompression
instead of always doing both.
Usually, only depth is needed, so stencil decompression is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:08 +02:00
Marek Olšák
5804c6adf8 gallium/radeon: add separate stencil level dirty flags
We will only do depth-only or stencil-only decompress blits, whichever is
needed by textures, instead of always doing both.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:08 +02:00
Marek Olšák
cc92b90375 radeonsi: dump buffer lists while debugging
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:08 +02:00
Marek Olšák
eb55610c89 winsys/radeon: implement cs_get_buffer_list
This is more complicated, because tracking priority_usage needed changing
the relocs_bo type.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák
6f48e2bee1 winsys/amdgpu: add winsys function cs_get_buffer_list
For debugging.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák
93641f4341 gallium/radeon: stop using "reloc" in a few places
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák
2edb060639 gallium/radeon: tell the winsys the exact resource binding types
Use the priority flags and expand them.
This information will be used for debugging.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák
9bd7928a35 radeonsi: add an option for debugging VM faults
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák
4502d0bf88 radeonsi: move dumping the last IB into its own function
v2: indentation fix

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Marek Olšák
89f73827d0 ddebug: separate creation of debug files
This will be used by radeonsi for logging.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:07 +02:00
Tom Stellard
a2e1e3d325 radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
This fixes a race condition in the glx-multithreaded-shader-compile
test.

v2:
  - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-10-02 23:41:27 +00:00
Tom Stellard
76cfd6f1da gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2
Drivers and state trackers that use LLVM for generating code, must
register the targets they use with LLVM's global TargetRegistry.
The TargetRegistry is not thread-safe, so all targets must be added
to the registry before it can be queried for target information.

When drivers and state trackers initialize their own targets, they need
a way to force gallivm to initialize its targets at the same time.
Otherwise, there can be a race condition in some multi-threaded
applications (e.g. glx-multihreaded-shader-compile in piglit),
when one thread creates a context for a driver that uses LLVM (e.g.
radeonsi) and another thread creates a gallivm context (glxContextCreate
does this).

The race happens when the driver thread initializes its LLVM targets and
then starts using the registry before the gallivm thread has a chance to
register its targets.

This patch allows users to force gallivm to register its targets by
calling the gallivm_init_llvm_targets() function.

v2:
  - Use call_once and remove mutexes and static initializations.
  - Replace gallivm_init_llvm_{begin,end}() with
    gallivm_init_llvm_targets().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-10-02 23:41:26 +00:00
Tom Stellard
3219b48ae5 gallium/radeon: Use call_once() when initailizing LLVM targets
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-10-02 23:19:01 +00:00
Ilia Mirkin
1d8cba9b51 nouveau: wait to unref the transfer's bo until it's no longer used
The bo will often come from a slab in which case it doesn't matter. But
for larger allocations this will be in its own bo, and we have to make
sure to wait until it's no longer used in order for it to be freed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
2015-09-28 17:28:54 -04:00
Ilia Mirkin
3a6b9a7830 nouveau: delay deleting buffer with unflushed fence
If there is an unflushed fence on the bo, then the resource may still be
used in commands built up in the local pushbuf. Flushing can cause all
sorts of unwanted effects, so just free the bo when the relevant fence
is hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
2015-09-28 17:28:54 -04:00