fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 00:48:07 +02:00

Author	SHA1	Message	Date
Marek Olšák	3a71eac783	st/dri: fix deadlock when waiting on android fences Android fences can't be deferred, because st/dri calls fence_finish with ctx = NULL, so the driver can't flush u_threaded_context. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-11 04:12:53 +01:00
Rob Clark	881f6e741f	meson: Guard freedreno build with with_gallium_freedreno. This prevents build failures when libdrm_freedreno is unavailable, which started happening after the ir3_compiler build was enabled. (Patch by Rob, commit message by Ken). Fixes: `fecd04a66a` ("freedreno/ir3: fix standalone compiler meson build") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-10 17:11:48 -08:00
Dylan Baker	ad9c2f5469	meson: build gallium-xlib based glx Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 13:00:01 -08:00
Dylan Baker	140b688c57	meson: add nir_builder_opcodes_h to gallium_auxiliary This creates a dependency on this header being generated before trying to compile any of these targets, as well as passing the correct -I to the compiler to ensure it's included correctly. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 12:59:54 -08:00
Dylan Baker	7210d0096a	gallium/xlib: remove GL_{MAJOR,MINOR,TINY} These variables were removed from autotools in 2008 (sha: `80f68e1b6a`), but they have lived on here. The Scons build meanwhile doesn't set a patch/tiny version at all, just major and minor. This patch removes the unused variables and simply sets the version, leaving patch/tiny as 0 since that's what the autotools build has been doing forever. This shouldn't change any behavior. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-10 12:40:08 -08:00
Timothy Arceri	f9e5216f71	radeonsi: get llvm types from ac Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-11 06:54:25 +11:00
Marek Olšák	e456d4def5	st/dri: fix android fence regression Fixes piglit - egl_khr_fence_sync/android_native tests. Broken by `884a0b2a9e`. Introduce state-tracker flush flags, analogous to the pipe ones. Use the former with stapi->flush(). Fixes: `884a0b2a9e` ("st/dri: use stapi flush instead of pipe flush when creating fences") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-10 17:17:13 +01:00
Nicolai Hähnle	ee880e91cc	gallium/u_threaded: fix end_query regression Ouch... Fixes: `244536d3d6` ("gallium/u_threaded: avoid syncs for get_query_result") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103653 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-10 16:37:37 +01:00
Bruce Cherniak	d473f91758	swr: Fixed an uncommon freed-memory access during state validation State validation is performed during clear and draw calls. Validation during clear was still accessing vertex buffer state. When the currently set vertex buffers are client arrays, this could lead to accessing freed memory. Such is the case with the VMD application. Previously, vertex buffer validation depended on a dirty bit or the draw info indicating an indexed draw. This required special handling for clears. But, vertex buffer validation still occurred which was unnecessary and wrong. Now, only minimal validation is performed during clear, deferring the remainder to the next draw. And, by setting the dirty bit in swr_draw_vbo for indexed draws, vertex buffer validation is only dependent upon a single dirty bit. This fixes a bug exposed by the VMD application when changing models. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2017-11-10 08:55:42 -06:00
Rob Clark	fecd04a66a	freedreno/ir3: fix standalone compiler meson build Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-10 08:57:33 -05:00
Rob Clark	86154acb57	freedreno/ir3: correct # of dest components for intrinsics Don't rely on intr->num_components having a valid value. It doesn't seem to anymore for non-vectorized intrinsics. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-10 08:57:33 -05:00
Rob Clark	3fcf18634c	freedreno/ir3: remove bogus assert The ssbo atomic instructions are not vectorized. So num_components is not expected to be valid. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-10 08:57:33 -05:00
Eric Anholt	62deeaa23a	broadcom/vc4: Fix simulator mode for the MADVISE usage.	2017-11-09 15:51:56 -08:00
Dave Airlie	06993e4ee3	r600: add support for hw atomic counters. (v3) This adds support for the evergreen/cayman atomic counters. These are implemented using GDS append/consume counters. The values for each counter are loaded before drawing and saved after each draw using special CP packets. v2: move hw atomic assignment into driver. v3: fix messing up caps (Gert Wollny), only store ranges in driver, drop buffers. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com>	2017-11-10 08:39:36 +10:00
Dave Airlie	cca5617348	gallium: add hw atomic buffer binding API. This API binds atomic buffers for all bound shaders (as per the GL semantics). This is needed to support cross shader hw atomic counters. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	4b0b82770a	gallium/tgsi: start adding hw atomics (v3.2) This adds support for hw atomic counters to TGSI. A new register file for storing atomic counters is added, along with a new atomic counter semantic, along with docs for both. v2: drop semantic, move hw counter to backend, Ilia pointed out SSO would have busted my plan, and he was right. v3: drop BUFFER decls. (Marek) v3.1: minor fixups for whitespace, set ureg error if we overflow the hw atomic limits. (nha) v3.2: fix some docs inconsistencies (Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	2a06423c00	gallium: add CAPs to support HW atomic counters. (v3) This looks like an evergreen specific feature, but with atomic counters AMD have hw specific counters they use instead of operating on buffers directly. These are separate to the buffer atomics, so require different limits and code paths. I've left the CAP for atomic type extensible in case someone else has a variant on this sort of thing (freedreno maybe?) and needs to change it. This adds all the CAPs required to add support for those atomic counters, along with a related CAP for limiting the number of output resources. I'd like to land this and the st patch then I can start to upstream the evergreen support for these and other GL4.x features. v2: drop the ATOMIC_COUNTER_MODE cap, just use the return from the HW counters. If 0 we use the current mode. v3: fix some rebase errors (Gert Wollny) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:34 +10:00
Dave Airlie	24baca6e75	r600/query: drop rest of vi workaround code. This isn't needed in r600 anymore. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:16 +10:00
Boris Brezillon	359a8f6ae5	broadcom/vc4: Mark BOs as purgeable when they enter the BO cache This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all BOs placed in the mesa BO cache as purgeable so that the system can reclaim this memory under memory pressure. v2: - Removed BOs from the cache when they've been purged by the kernel - Check whether the madvise ioctl is supported or not before using it v3: Don't walk the whole list when we find a busy BO (by anholt, acked by Boris) Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-09 10:57:17 -08:00
Eric Anholt	ebcb4c2156	meson: Enable VC4's NEON assembly support. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-09 09:40:30 -08:00
Eric Anholt	9c9fd8ff37	meson: Always link libgallium_dri.so against dep_thread. Somehow on my cross build the -pthread is getting lost. All the other deps seem to work out fine. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-09 09:40:27 -08:00
Marek Olšák	9ceb057ebf	radeonsi: pack r600_surface better 160 -> 136 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	169525684f	radeonsi: pack r600_texture better 1752 -> 1736 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	f8a4b606a2	radeonsi: clean up r600_surface 216 -> 160 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	6916ee7e17	radeonsi: remove r600_texture::non_disp_tiling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	a06fe75eac	radeonsi: remove DBG_NO_DISCARD_RANGE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Nicolai Hähnle	884a0b2a9e	st/dri: use stapi flush instead of pipe flush when creating fences There may be pending operations (e.g. vertices) that need to be flushed by the state tracker. Found by inspection. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:20:58 +01:00
Nicolai Hähnle	b921da3b74	radeonsi: use a threaded context even for debug contexts Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:04 +01:00
Nicolai Hähnle	1a6d9e087a	radeonsi: record and dump time of flush Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:04 +01:00
Nicolai Hähnle	b07569ad8b	ddebug: optionally handle transfer commands like draws Transfer commands can have associated GPU operations. Enabled by passing GALLIUM_DDEBUG=transfers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	18fd2a859d	ddebug: dump context and before/after times of draws Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	ba2f2b6f2a	ddebug: generalize print_named_xxx via a PRINT_NAMED macro Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	c9fefa062b	ddebug: rewrite to always use a threaded approach This patch has multiple goals:
1. Off-load the writing of records in 'always' mode to another thread for performance.
2. Allow using ddebug with threaded contexts. This really forces us to move some of the "after_draw" handling into another thread.
3. Simplify the different modes of ddebug, both in the code and in the user interface, i.e. GALLIUM_DDEBUG. In particular, there's no 'pipelined' anymore, since we're always pipelined; and 'noflush' is replaced by 'flush', since we no longer flush by default.
4. Fix the fences in pipelining mode. They previously relied on writes via pipe_context::clear_buffer. However, on radeonsi, those could (quite reasonably) end up in the SDMA buffer. So we use the newly added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead.
5. Improve pipelined mode overall, using the finer grained information provided by the new fences.
Overall, the result is that pipelined mode should be more useful, and using ddebug in default mode is much less invasive, in the sense that it changes the overall driver behavior less (which is kind of crucial for a driver debugging tool).
An example of the new hang debug output:
    Gallium debugger active.
    Hang detection timeout is 1000ms.
    GPU hang detected, collecting information...

    Draw #   driver  prev BOP  TOP  BOP  dump file
    -------------------------------------------------------------
    2        YES     YES       YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000000
    3        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000001
    4        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000002
    5        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000003

    Done.
We can see that there were almost certainly 4 draws in flight when the hang happened: the top-of-pipe fence was signaled for all 4 draws, the bottom-of-pipe fence for none of them.
In virtually all cases, we'd expect the first draw in the list to be at fault, but due to the GPU parallelism, it's possible (though highly unlikely) that one of the later draws causes a component to get stuck in a way that prevents the earlier draws from making progress as well. (In the above example, there were actually only 3 draws truly in flight: the last draw is a blit that waits for the earlier draws; however, its top-of-pipe fence is emitted before the cache flush and wait, and so the fact that the draw hasn't truly started yet can only be seen from a closer inspection of GPU state.) Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	e8bb8758dd	ddebug: use an atomic increment when numbering files Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	d6710fe874	dd/util: extract dd_get_debug_filename_and_mkdir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	8491fcafab	gallium/u_dump: add and use util_dump_transfer_usage Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:02 +01:00
Nicolai Hähnle	9b8033a4a7	gallium/u_dump: add util_dump_ns Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:02 +01:00
Nicolai Hähnle	6f4a03b08a	gallium/u_dump: export util_dump_ptr Change format to %p while we're at it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:02 +01:00
Nicolai Hähnle	125a915052	radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE v2: use uncached system memory for the fence, and use the CPU to clear it so we never read garbage when checking the fence Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:55 +01:00
Nicolai Hähnle	e4627ac8fb	radeonsi: document some subtle details of fence_finish & fence_server_sync v2: remove the change to si_fence_server_sync, we'll handle that more robustly Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:50 +01:00
Nicolai Hähnle	14b9fa75e4	gallium: add pipe_context::callback For running post-draw operations inside the driver thread. ddebug will use it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:50 +01:00
Nicolai Hähnle	2bdfbb0e53	gallium/u_threaded: implement pipe_context::set_log_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:49 +01:00
Nicolai Hähnle	244536d3d6	gallium/u_threaded: avoid syncs for get_query_result Queries should still get marked as flushed when flushes are executed asynchronously in the driver thread. To this end, the management of the unflushed_queries list is moved into the driver thread. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:49 +01:00
Nicolai Hähnle	609a230375	gallium/u_threaded: implement asynchronous flushes This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag. v2: - remove an incorrect assertion - handle fence_server_sync for unsubmitted fences by relying on the improved cs_add_fence_dependency - only implement asynchronous flushes on amdgpu Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	11b380ed0c	gallium/u_threaded: mark queries flushed only for non-deferred flushes The driver uses (and must use) the flushed flag of queries as a hint that it does not have to check for synchronization with currently queued up commands. Deferred flushes do not actually flush queued up commands, so we must not set the flushed flag for them. Found by inspection. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	78a4750d91	radeonsi: move fence functions to si_fence.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	e6dbc804a8	winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences The idea is to fix the following interleaving of operations that can arise from deferred fences:
    Thread 1 / Context 1          Thread 2 / Context 2
    --------------------          --------------------
    f = deferred flush
       <------- application-side synchronization ------->
                                  fence_server_sync(f)
                                  ...
                                  flush()
    flush()
We will now stall in fence_server_sync until the flush of context 1 has completed. This scenario was unlikely to occur previously, because applications seem to be doing
    Thread 1 / Context 1          Thread 2 / Context 2
    --------------------          --------------------
    f = glFenceSync()
    glFlush()
       <------- application-side synchronization ------->
                                  glWaitSync(f)
    ...
and indeed they probably have to use this ordering to avoid deadlocks in the GLX model, where all GL operations conceptually go through a single connection to the X server. However, it's less clear whether applications have to do this with other WSI (i.e. EGL). Besides, even this sequence of GL commands can be translated into the Gallium-level sequence outlined above when Gallium threading and asynchronous flushes are used. So it makes sense to be more robust. As a side effect, we no longer busy-wait on submission_in_progress. We won't enable asynchronous flushes on radeon, but add a cs_add_fence_dependency stub anyway to document the potential issue. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:22 +01:00
Nicolai Hähnle	1e5c9cf590	gallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits These bits are intended to be used by the ddebug hang detection and are named in analogy to the Vulkan stage bits (and the corresponding Radeon pipeline event). Hang detection needs fences on the granularity of individual commands, which nothing else really covers. The closest alternative would have been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object and we really want a per-screen object, (b) queries don't offer a wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is meant to imply that GPU caches are flushed, which the new bits explicitly aren't. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 13:58:16 +01:00
Nicolai Hähnle	ea6df1ce37	gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH Also document some subtleties of pipe_context::flush. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 13:58:16 +01:00
Nicolai Hähnle	c50743f61c	gallium: remove unused and deprecated u_time.h Cc: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:57:22 +01:00


32753 commits