Commit graph

27982 commits

Author SHA1 Message Date
Nicolai Hähnle
c95175581e radeonsi: fix calculation of valid RB mask per SE
The old calculation treated too many RBs as disabled.

Cc: 11.0 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:28:31 +02:00
Nicolai Hähnle
6c2e636982 radeonsi: raise SI_PM4_MAX_DW
The old limit, introduced in commit afa752d3f0,
was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-20 18:28:17 +02:00
Roland Scheidegger
b0cf99165a gallivm: don't use integer min/max sse intrinsics with llvm >= 3.9
Apparently, these are deprecated. There's some AutoUpgrade feature which
is supposed to promote these to cmp/select, which apparently doesn't work
with jit code. It is possible it's not actually even meant to work (see
the bug filed against llvm which couldn't provide an answer neither)
but in any case this is meant to be only temporary unless the intrinsics
are really illegal. So, just use the fallback code (which should be cmp/select,
we're actually doing cmp/sext/trunc/select, but in any case llvm 3.9 manages
to optimize this back to pmin/pmax in the end).

This addresses https://llvm.org/bugs/show_bug.cgi?id=28176

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Aaron Watry <awatry@gmail.com>
2016-06-20 17:19:03 +02:00
Ilia Mirkin
154c0a42a2 nvc0: don't make use of push hint if there are no non-const user vbos
This makes the check match up what we do on nv50 as well - there's no
point in switching over the push path if everything's in managed
buffers. This can happen when a shader uses a vertex without an enabled
array - we end up passing it a constant attribute.

This also has the effect of "fixing" some flickering in Talos. I have no
idea why. I've stared at the push logic forwards, backwards, and
sideways. By always forcing the push path (which is slow), the
flickering also goes away, but other rendering is still wrong
(specifically draw 383068 as identified in the bug). However by not
switching over to the push path, draw 383068 is correct.

Note that other flickering remains in Talos, like the red/green
walls/floors. This takes care of the shadow flickering though.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-19 10:14:57 -04:00
Ilia Mirkin
1804aa0b80 gk104/ir: fix tex use generation to be more careful about eliding uses
If we have a loop, instructions before the tex might be added as tex
uses, and those may in fact dominate all other uses of the tex results.
This however doesn't mean that we don't need a texbar after the tex.
Only check if uses dominate each other they are dominated by the tex.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565
Fixes: 7752bbc44 (gk104/ir: simplify and fool-proof texbar algorithm)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-19 10:14:46 -04:00
Ilia Mirkin
194bcb49d1 nv50: add support for GL_EXT_window_rectangles
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-18 13:38:30 -04:00
Ilia Mirkin
b21a00d129 nvc0: add support for GL_EXT_window_rectangles
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-18 13:38:30 -04:00
Ilia Mirkin
07fcb06fe0 gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all drivers
This says how many window rectangles are supported by the
implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-18 13:38:29 -04:00
Ilia Mirkin
82fab73246 gallium: add API for setting window rectangles
Window rectangles apply to all framebuffer operations, either in
inclusive or exclusive mode. They may also be specified as part of a
blit operation.

In exclusive mode, any fragment inside any of the specified rectangles
will be discarded.

In inclusive mode, any fragment outside every rectangle will be
discarded.

The no-op state is to have 0 rectangles in exclusive mode.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-18 12:59:12 -04:00
Samuel Pitoiset
b214e0d2fb nv50/ir: add missing strings for some recent sysvals
This is pretty useful for debugging purposes and those should
not be omitted.

Fixes: 517a93b3 ("nvc0: add ARB_shader_draw_parameters support")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-18 18:34:50 +02:00
Bruce Cherniak
6b0ac95c28 swr: Update screen->context pointer with multiple contexts.
A pipe pointer in the screen allows for access to current device context
 in flush_frontbuffer and resource_destroy.  This wasn't tracking current
context in multi-context situations.

v2: More caffeine.  Corrected compare, removed unnecessary set of
screen-pipe in create_context, and added a few comments.
2016-06-17 13:56:03 -05:00
Tim Rowley
5a64549f54 swr: switch from overriding -march to selecting features
Acked-by: Chuck Atkins <chuck.atkins@kitware.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
2016-06-17 10:34:17 -05:00
Christian König
6d877d7121 st/vdpau: we support lumakeying now
Signed-off-by: Christian König <christian.koenig@amd.com>
2016-06-16 09:41:13 +02:00
Christian König
bf89e672cf vl: support luma keying for interlaced surfaces as well
We had the CSC code twice in there, factor it out into a separate function.

Signed-off-by: Christian König <christian.koenig@amd.com>
2016-06-16 09:41:12 +02:00
Brian Paul
bb1292e226 auxilary/os: allow appending to GALLIUM_LOG_FILE
If the log file specified by the GALLIUM_LOG_FILE begins with '+', open
the file in append mode.  This is useful to log all gallium output for
an entire piglit run, for example.

v2: put GALLIUM_LOG_FILE support inside an #ifdef DEBUG block.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-15 17:16:42 -06:00
Rob Herring
067c5b10b6 vc4: fix vc4_resource_from_handle() stride calculation
The expected stride calculation is completely wrong. It should
ultimately be multiplying cpp and width rather than dividing. The width
also needs to be aligned to the tiling width first before converting to
stride bytes.

The whole stride check here is possibly pointless. Any buffers which
were allocated outside of vc4 may have strides with larger alignment
requirements.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-06-15 14:54:38 -07:00
Marek Olšák
d794072b3e winsys/radeon: use the common job queue for multithreaded command submission v2
v2: fixup after renaming to util_queue_fence

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-15 21:07:34 +02:00
Marek Olšák
562cb03d76 gallium/util: import the multithreaded job queue from amdgpu winsys (v2)
v2: rename the event to util_queue_fence

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-15 21:07:34 +02:00
Nicolai Hähnle
44e0c0e6ec radeonsi: fix undefined left-shift into sign bit
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-15 09:27:56 +02:00
Marek Olšák
6ef50efc10 gallium/radeon: num-cs-flushes query should display per-frame average
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-14 20:22:16 +02:00
Marek Olšák
4140afd04b gallium/radeon: add driver queries for compute/dma call stats and spills
also print the average count per frame

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-14 20:22:16 +02:00
Marek Olšák
8fc688c303 radeonsi: don't generate "ret void undef"
Use LLVMBuildRetVoid in epilogs and the GS copy shader and
si_llvm_build_ret otherwise.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-06-14 20:22:16 +02:00
Marek Olšák
4eea710b0d radeonsi: try to hit direct hw MSAA resolve by changing micro mode in clear
We could also do MSAA resolve in a compute shader like Vulkan and remove
these workarounds.

v2: comment the magic numbers

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-14 20:22:16 +02:00
Marek Olšák
373060652c radeonsi: clarify the MSAA resolve limitation with scanout
this is the correct hw requirement

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-14 20:22:16 +02:00
Marek Olšák
789618e3b4 gallium/radeon: add micro_tile_mode to radeon_surf
for easier access

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-14 20:22:16 +02:00
Roland Scheidegger
afbf5888f5 gallium/util: don't use blocksize for minify for assertions
The previous assertions required for texture sizes smaller than block_size
that src_box.x + src_box.width still be block size.
(e.g. for a texture with width 3, and src_box.x = 0, src_box.width would
have to be 4 to not assert.)
This caused some assertions with some other state tracker.
It looks though like callers aren't expected to round up widths to block sizes
(for sizes larger than block size the assertion would still have verified it
wouldn't have been rounded up) so we simply shouldn't use a minify which
rounds up to block size.
(No piglit change with llvmpipe.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-06-14 17:03:34 +02:00
Roland Scheidegger
f4184d5450 llvmpipe: hack-fix bugs due to bogus bind flags
The gallium contract would be that bind flags must indicate all possible
bindings a resource might get used, but fact is the mesa state tracker does
not set bind flags correctly, and this is more or less unfixable due to GL.

This caused a bug with piglit arb_uniform_buffer_object-rendering-dsa
since 6e6fd911da - the commit is correct,
but it caused us to miss updates to fs UBOs completely, since the
corresponding buffer didn't have the appropriate bind flag set (thus we
wouldn't check if it is indeed currently bound).
See the discussion about this starting here:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/119829.html

So, update the bind flags when we detect such usage.
Note we update this value for now only in places which matter for us - that
is creating sampler/surface view, or binding constant buffer. There's plenty
more places (setting streamout buffers, vertex/index buffers, ...) where
things can be set with the wrong bind flags, but the bind flags there never
matter.

While here also make sure we only set dirty constant bit when it's a fs
constant buffer - totally doesn't matter if it's vs/gs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-14 17:03:34 +02:00
Rob Clark
243417810b freedreno: support start param for sampler views/states
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-14 11:00:59 -04:00
Rob Clark
b8eb1493a9 freedreno: only do extra vertex-buffer state logic on a2xx
Possibly this should move into an fd2 wrapper fxn, similar to the
texture state tracking done for fd3/fd4 (clamp emulation, etc)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-14 11:00:59 -04:00
Rob Clark
26d0efa9ce freedreno: use util_copy_constant_buffer() helper
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-14 11:00:59 -04:00
Nayan Deshmukh
fdec8f9e42 st/vdpau: replace 0.f and 1.f with 0.0f and 1.0f respectively
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-06-14 15:32:04 +01:00
Michel Dänzer
9ee3f097b6 st/dri: Clear drawable texture_mask in dri2_invalidate_drawable
This makes sure that dri_set_tex_buffer2 -> dri_drawable_validate_att
will re-create the front left attachment buffer after the drawable got
invalidated.

Fixes window contents not updating until the window is resized when
using DRI2 PRIME.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-06-14 18:16:54 +09:00
Julien Isorce
1cdb4da1d6 st/va: ensure linear memory for dmabuf
In order to do zero-copy between two different devices
the memory should not be tiled.

Tested with GStreamer on a laptop that has 2 GPUs:
1- gstvaapidecode:
   HW decoding and dmabuf export with nouveau driver on Nvidia GPU.
2- glimagesink:
   EGLImage imports dmabuf on Intel GPU.

TEST: DRI_PRIME=1 gst-launch vaapidecodebin ! glimagesink

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-14 08:40:33 +01:00
Mathias Fröhlich
c3b6656676 mesa/gallium: Move u_bit_scan{,64} from gallium to util.
The functions are also useful for mesa.
Introduce src/util/bitscan.{h,c}. Move ffs function
implementations from src/mesa/main/imports.{h,c}.
Move bit scan related functions from
src/gallium/auxiliary/util/u_math.h. Merge platform
handling with what is available from within mesa.

v2: Try to fix MSVC compile.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2016-06-14 05:19:10 +02:00
Aaron Watry
fafe026dbe clover: Include generated sources in AM_CPPFLAGS
git_sha1.c is generated in $(top_builddir)/src.

Fixes out-of-tree builds since 4825264f75.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96516
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2016-06-14 12:04:42 +09:00
Stephan Bergmann
0140938b26 nv50/ir: make Graph destructor virtual
Avoid ASan new-delete-type-mismatch when Function::domTree is created as
DominatorTree in Function::convertToSSA but destroyed only as base
Graph in ~Function.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-06-13 22:55:11 -04:00
Vedran Miletić
4825264f75 clover: Update OpenCL version string to match OpenGL
Change MESA into Mesa in CL_PLATFORM_VERSION and CL_DEVICE_VERSION. For
both, always append git version suffix from git_sha1.h.

v5: move semicolon to same line as MESA_GIT_SHA1.
v4: drop #ifdef guards.
v3: add missing include.
v2: change CL_DEVICE_VERSION as well.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-06-13 15:55:59 -07:00
Brian Paul
cf9bb9acac util: update some assertions in util_resource_copy_region()
To cope with copies of compressed images which are not multiples of
the block size.  Suggested by Jose.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@sroland@vmware.com>
2016-06-13 13:30:19 -06:00
Samuel Pitoiset
7f257abc1b nvc0/ir: clamp the UBO index for compute on Kepler
We already check that the address is not "too far", but we should also
clamp the UBO index in order to avoid looking at the wrong place in the
driver cb. This is a pretty rare situation though.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-13 20:12:48 +02:00
Marek Olšák
6e1b12c788 radeonsi: enable scratch coalescing
This makes one particular compute shader 8x faster.

Latest LLVM git is required.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-06-13 18:13:51 +02:00
Jimmy Berry
0c0f841e5d st/va: hardlink driver instances to gallium_drv_video.so
Removes the need to set LIBVA_DRIVER_NAME=gallium for supported targets and is
consistent with vdpau and general gallium drivers.

Note: some versions of libva can detect the gallium name and use the
backend. Although that behaviour seems inconsistent since it only works
for some platforms/backends.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Jan Vesely
1fb4179f92 vl: Fix trivial sign compare warnings
v2: add whitepace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
[Emil Velikov: squash a few more whitespace issues]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Rob Herring
112e988329 Android: move libdrm settings to top-level Android.common.mk
Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:

external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
typedef struct drm_clip_rect drm_clip_rect_t;

HAVE_LIBDRM needs to be set project wide to fix this. This change also
harmlessly links libdrm with everything, but simplifies the makefiles a
bit.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:29 +01:00
Emil Velikov
15bc7856bf gallium: remove st_api::get_proc_address hook
It has been unused for a long time, plus makes the gallium dri modules
require an extra glapi symbol relative to their classic counterparts.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:28 +01:00
Emil Velikov
fcb5a75a66 swr: automake: add missing -I flag
When building from a release tarball (where the generated/built files
are in srcdir) in an OOT fashion we need to have both builddir and
srcdir in the includes list.

Otherwise we'll error out, as the file (header gen_knobs.h in this case)
won't be in the location where we are looking.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:31:24 +01:00
Chuck Atkins
c86fcaca72 swr: Add missing headers for package inclusion
CC: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-06-13 15:24:44 +01:00
Jan Vesely
ace70aedcf gallivm: Fix trivial sign warnings
v2: include whitespace fixes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-06-13 09:23:09 -04:00
Julien Isorce
a04804746f st/va: use proper temp pipe_video_buffer template
Instead of changing the format on the existing template
which makes error handling not nice and confuses coverity.

CoverityID: 1337953

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-13 09:14:32 +01:00
Julien Isorce
6c43e0016e st/va: it is valid to release the VABuffer of an exported resource
pipe_resource_reference(&res, NULL) will decrement reference counting,
i.e. p_atomic_dec(res->count). But the va surface still has the initial
reference since it has created the resource. So calling vaDestroyImage
on a derived image calls VaDestroyBuffer but the decrementation won't
reach 0. It is just wrong for vlVaDestroyBuffer to rely on the
export_refcount flag. Finally the vaapi intel driver has the same logic.

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-06-13 09:14:32 +01:00
Ilia Mirkin
3f48548a6f nv50: reinstate dedicated constbuf push path
This was disabled due to occasionally incorrect behavior when trying to
upload data. It later became apparent that nvc0 also had a similar but
slightly different issue, which was resolved in commit e50c01d5. This
takes the same logic as nvc0 and applies it to nv50 (which has somewhat
different interfaces).

Unfortunately I did not note down precisely what was broken with UBOs
when removing the support from nv50, but I've tested a bunch of local
traces, and none of them appear to regress. This should hopefully
improve performance when UBOs are used, but this was not directly
verified.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-06-11 12:18:43 -04:00