Commit graph

31510 commits

Author SHA1 Message Date
Constantine Charlamov
469e2ed473 r600g: get rid of trailing whitespace
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:30:10 +10:00
Dave Airlie
27380d6b3e r600/asm: add support for other GDS operations.
This adds support for the GDS operations needed to do atomic
counters.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:27:51 +10:00
Dave Airlie
ccab3f7e1b r600: don't merge GDS into VTX
We don't want vtx/tex instructions ending up in GDS sections.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:21 +10:00
Dave Airlie
043f16eba1 r600: for memory instructions dump index gpr for read indirects also.
This just makes sure we can see the index gpr in the asm dumps.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:21 +10:00
Dave Airlie
ac8fb9800a r600: add support for vertex fetches via texture cache
On evergreen we can route vertex fetches via the texture cache,
and this is required for some images support. So add support
to the asm builder for it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:20 +10:00
Dave Airlie
b050b91e33 r600: route indirect address register correctly for vtx fetches.
This was found during writing the images code, we need to
make sure we route the correct index register.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:20 +10:00
Marek Olšák
09f6915bf8 gallium/hud: add glthread counters
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
8f4bc8a324 gallium/hud: add API-thread-busy for monitoring the thread load
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
11cf079b67 gallium/hud: add hud_pane::hud pointer
for later use

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
5fa69be3c8 mesa/glthread: add glthread "perf" counters and pass them to gallium HUD
for HUD integration in following commits. This valuable profiling data
will allow us to see on the HUD how well glthread is able to utilize
parallelism. This is better than benchmarking, because you can see
exactly what's happening and you don't have to be CPU-bound.

u_threaded_context has the same counters.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
833f3c1c31 gallium/hud: move struct hud_context to hud_private.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
7492201c4e gallium/hud: rename API-thread-busy to main-thread-busy
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
6884c95ab4 util: move pipe_thread_is_self from gallium to src/util
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Pierre Moreau
afb8f2d4a3 nv50/ir: Properly fold constants in SPLIT operation
Fixes: b7d9677d ("nv50/ir: constant fold OP_SPLIT")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-25 15:23:46 +02:00
Marek Olšák
e25950808f radeonsi/gfx9: don't overallocate shader binaries
It's not needed. The hw doesn't fetch ahead over page boundaries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-24 23:04:37 +02:00
Lucas Stach
d6b9ba36a4 st/dri2: implement image offset query
This trivially adds support for the image offset query, which is needed
for the zwp_linux_dmabuf based EGL platform wayland implementation.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-06-24 16:57:55 +01:00
Roland Scheidegger
8bfe451ed3 llvmpipe: initialize default fb correctly in setup
If lp_setup_bind_framebuffer() is never called, then setup fb x1/y1 was not
correctly initialized. This can happen if there's never a fb set - both
cso and llvmpipe would consider setting this with no cbufs and no zsbuf a
redundant change and therefore it would never get set.
We rely on this setup fb rect being initialized correctly for the tri intersect
tests, throwing away tris which don't intersect. Not initializing it meant
we'd then say it intersected, and we'd try to bin that despite that we have
no actual tiles to bin it to, leading to assertion failures (pretty harmless
since tile 0/0 always exists nevertheless as tiles are statically allocated,
albeit that should change at some point).
(Note probably not an issue with gl state tracker)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-24 00:18:43 +02:00
Marek Olšák
f6e98e99e3 radeonsi: unreference vertex buffers when destroying the context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-23 19:53:54 +02:00
Marek Olšák
ee16796d54 radeonsi: implement the workaround for Rocket League - postponed TGSI kill
Do KILL at the end of shaders so as not to break WQM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
a98a04ec80 gallium/radeon: pass create_screen flags to r600_common_screen_init
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
118b2008ba st/dri: add a drirc workaround for Rocket League
This needs to be passed to gallium drivers.

No game fix is planned at this time.

The addition of glsl_correct_derivatives_after_discard is
generally a good thing for mesa compatibility with the broader GL
driver ecosystem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
6b0f6e693b st/dri: get drirc options before creating pipe_screen
dri_init_options_get_screen_flags will return the flags for create_screen().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
76f379330a gallium: allow passing 'unsigned flags' to create_screen()
for drirc options

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Roland Scheidegger
c7688d2de5 llvmpipe:fix using 32bit rasterization mistakenly, causing overflows
We use the bounding box (triangle extents) to figure out if 32bit rasterization
could potentially overflow. However, we used the bounding box which already got
rounded up to 0 for negative coords for this, which is incorrect, leading to
overflows and hence bogus rendering in some of our private use.

It might be possible to simplify this somehow (we're now using 3 different
boxes for binning) but I don't quite see how.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-23 19:39:29 +02:00
Roland Scheidegger
672d245ffe llvmpipe: fill in debug vertex info for tri rasterization
This is pretty useful for debugging rasterization issues, so turn it on
based on DEBUG (the actual existence of the fields is also conditionalized
on DEBUG, lines fill it out the same too).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-23 19:39:29 +02:00
Marek Olšák
c2f82fc1d3 Revert "radeonsi: don't emit partial flushes at the end of IBs (v2)"
This reverts commit c9040dc9e7.

People have reported it causes corruption on VI, and I see GPU hangs
on GFX9.
2017-06-23 19:13:55 +02:00
Brian Paul
9e57a2cbcf svga: minor whitespace fixes in svga_pipe_vertex.c 2017-06-22 13:33:48 -06:00
Brian Paul
041f8ae9f6 svga: check return value from svga_set_shader( SVGA3D_SHADERTYPE_GS, NULL)
If the call fails we need to flush the command buffer and retry.  In this
case, we were failing to unbind the GS which led to subsequent errors.

This fixes a bug replaying a Cinebench R15 apitrace in a Linux guest.
VMware bug 1894451

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-22 13:33:48 -06:00
Charmaine Lee
3fbdab8778 svga: fix pre-mature flushing of the command buffer
When surface_invalidate is called to invalidate a newly created surface
in svga_validate_surface_view(), it is possible that the command
buffer is already full, and in this case, currently, the associated wddm
winsys function will flush the command buffer and resend the invalidate
surface command. However, this can pre-maturely flush the command buffer
if there is still pending image updates to be patched.

To fix the problem, this patch will add a return status to the
surface_invalidate interface and if it returns FALSE, the caller will
call svga_context_flush() to do the proper context flush.
Note, we don't call svga_context_flush() if surface_invalidate()
fails when flushing the screen surface cache though, because it is
already in the process of context flush, all the image updates are already
patched, calling svga_context_flush() can trigger a deadlock.
So in this case, we call the winsys context flush interface directly
to flush the command buffer.

Fixes driver errors and graphics corruption running Tropics. VMware bug 1891975.

Also tested with MTT glretrace, piglit and various OpenGL apps such as
Heaven, CinebenchR15, NobelClinicianViewer, Lightsmark, GoogleEarth.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-22 13:33:48 -06:00
George Kyriazis
08cb8cf256 swr: invalidate attachment on transition change
Consider the following RT attachment order:
1. Attach surfaces attachments 0 & 1, and render with them
2. Detach 0 & 1
3. Re-attach 0 & 1 to different surfaces
4. Render with the new attachment

The definition of a tile being resolved is that local changes have been
flushed out to the surface, hence there is no need to reload the tile before
it's written to.  For an invalid tile, the tile has to be reloaded from
the surface before rendering.

Stage (2) was marking hot tiles for attachements 0 & 1 as RESOLVED,
which means that the hot tiles can be written out to memory with no
need to read them back in (they are "clean").  They need to be marked as
resolved here, because a surface may be destroyed after a detach, and we
don't want to have un-resolved tiles that may force a readback from a
NULL (destroyed) surface.  (Part of a destroy is detach all attachments first)

Stage (3), during the no att -> att transition, we  need to realize that the
"new" surface tiles need to be fetched fresh from the new surface, instead
of using the resolved tiles, that belong to a stale attachment.

This is done by marking the hot tiles as invalid in stage (3), when we realize
that a new attachment is being made, so that they are re-fetched during
rendering in stage (4).

Also note that hot tiles are indexed by attachment.

- Fixes VTK dual depth-peeling tests.
- No piglit changes

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-22 11:51:08 -05:00
Marek Olšák
bcd67b1711 radeonsi/gfx9: enable DCC fast clear
It seems to work now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
db37c0be13 radeonsi/gfx9: don't ever flush the TC metadata cache
The closed Vulkan driver doesn't do it either.

Also remove some old comments that aren't useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
920f20f039 radeonsi/gfx9: use TC L2 for fast color clear with CP DMA
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
c1754b69dc radeonsi: fix DCC fast clear for luminance and alpha formats
I reproduced this bug on Polaris11 and Raven.

I can't get this bug on Fiji. The reason might be that Fiji doesn't use
2D tiling for the test due to higher 2D tiling alignment requirements.

Fixes piglit: spec@ext_framebuffer_object@fbo-fast-clear

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
c9040dc9e7 radeonsi: don't emit partial flushes at the end of IBs (v2)
The kernel sort of does the same thing with fences.

v2: do emit partial flushes on SI

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Chandu Babu N
1d4cbcdf28 change va max_entrypoints
As encode support is added along with decode, increase max_entrypoints to two.
vaMaxNumEntrypoints was returning incorrect value and causing
memory corruption before this commit

v2: assert when max_entrypoints needs to be bigger

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-06-22 12:10:57 +02:00
Chandu Babu N
b1a359b7d8 st/va: Fix leak in VAAPI subpictures
sampler view allocated in vaAssociateSubpicture is not cleared
in vaiDeassociateSubpicture.

Reviewed-by: Christian König <christian.koenig@amd.com>
2017-06-22 12:09:43 +02:00
Nicolai Hähnle
da2e52b382 radeonsi: use the correct LLVMTargetMachineRef in si_build_shader_variant
si_build_shader_variant can actually be called directly from one of
normal-priority compiler threads. In that case, the thread_index is
only valid for the normal tm array.

v2:
- use the correct sel/shader->compiler_ctx_state

Fixes: 86cc809726 ("radeonsi: use a compiler queue with a low priority for optimized shaders")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-22 09:45:23 +02:00
Marek Olšák
79bd1d4f8b radeonsi/gfx9: keep reusing the same buffer/address for the gfx9 flush fence
instead of using a monotonic suballocator

v2: initialize the memory at context creation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c66fc618cc radeonsi/gfx9: enable the constant engine
I think this kernel commit fixes it:
     drm/amdgpu:use FRAME_CNTL for new GFX ucode

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
d7141d8bc0 radeonsi/gfx9: indirect buffers and all CP packets use TC L2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
2638250fec radeonsi: flush CB after MSAA only when transitioning from CB to textures
The main flush before texturing is done after the FMASK decompress pass.

CB after MSAA rendering is not flushed in set_framebuffer_state and also
not in memory_barrier if the current color buffer is MSAA. We fully rely
on the FMASK decompress pass for the flushing.

Some CB decompress and resolve passes need an explicit flush before and
after.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
51c219739c radeonsi: unify CB_RESOLVE blitter invocation code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
2263610827 radeonsi: flush DB caches only when transitioning from DB to texturing
Use the mechanism of si_decompress_textures, but instead of doing
the actual decompression, just flag the DB cache flush there.

This removes a lot of unnecessary DB cache flushes.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
fdca690e91 radeonsi: add separate HUD counters for CB and DB cache flushes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c9a16fde80 cso: inline a few frequently-used functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
0e0fc1ce71 cso: don't return errors from sampler functions
No code checks the errors.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
4d6fab245e cso: don't track the number of sampler states bound
This removes 2 loops from hot codepaths and adds 1 loop to a rare codepath
(restore_sampler_states), and makes sanitize_hash() slightly worse.

Sampler states, when bound, are not unbound for draw calls that don't need
them. That's OK, because bound sampler states don't add any overhead.

This results in lower CPU overhead in most cases.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Eric Engestrom
4a1238a452 egl: turn one more boolean int into a bool
Same as the previous commit, but this one was split out because it's
a bit more complicated: this field is given as a pointer to a function,
so the function had to be changed as well, and the function was use in
a bunch of places, which needed updating as well.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-21 21:42:14 +01:00
Lucas Stach
629003b5b8 etnaviv: fix blend color for RB swapped rendertargets
Same as with the colormasks, the blend color needs to be swizzled according
to the rendertarget format.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-21 07:45:15 +02:00