fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 13:08:09 +02:00

Author	SHA1	Message	Date
Topi Pohjolainen	46b346899d	i965/gen6: Issue direct depth stall and flush after depth clear instead of calling unconditionally brw_emit_mi_flush() which does: brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_CACHE_FLUSH \| PIPE_CONTROL_RENDER_TARGET_FLUSH \| PIPE_CONTROL_CS_STALL); brw_emit_pipe_control_flush(brw, PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE \| PIPE_CONTROL_CONST_CACHE_INVALIDATE); Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Topi Pohjolainen	e6da6943fe	i965: Make depth clear flushing more explicit Current blorp logic issues unconditional "flush everything" (see brw_emit_mi_flush()) after each render. For example, all blits issue this unconditionally which shouldn't be needed if they set render cache properly so that subsequent renders do necessary flushing before drawing. In case of piglit: ext_framebuffer_multisample-accuracy all_samples depth_draw small intel_hiz_exec() is always preceded by blorb blit and the unconditional flush looks to hide the lack of stall and flushes in depth clears. By removing the brw_emit_mi_flush() I get gpu hangs. This patch adds the stalls and flushes mandated by the spec and gets rid of those hangs. v2 (Jason, Ken): Document the rational for separating depth cache flush and stall on Gen7. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Topi Pohjolainen	4840a53e90	i965/blorp: Use the render cache mechanism instead of explicit flushing by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush(). The latter splits the flush in two: brw_emit_pipe_control_flush(brw, PIPE_CONTROL_DEPTH_CACHE_FLUSH \| PIPE_CONTROL_RENDER_TARGET_FLUSH \| PIPE_CONTROL_CS_STALL); brw_emit_pipe_control_flush(brw, PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE \| PIPE_CONTROL_CONST_CACHE_INVALIDATE); instead of int flags = PIPE_CONTROL_NO_WRITE \| PIPE_CONTROL_RENDER_TARGET_FLUSH; if (brw->gen >= 6) { flags \|= PIPE_CONTROL_INSTRUCTION_INVALIDATE \| PIPE_CONTROL_CONST_CACHE_INVALIDATE \| PIPE_CONTROL_DEPTH_CACHE_FLUSH \| PIPE_CONTROL_VF_CACHE_INVALIDATE \| PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE \| PIPE_CONTROL_CS_STALL; } brw_emit_pipe_control_flush(brw, flags); v2 (Jason): Check that destination exists before trying to add to render cache. Depth clears and resolves don't have it. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-18 22:42:47 +02:00
Emil Velikov	ea8b2624c8	utils: really remove the __END_DECLS macro Fixes: `d1efa09d34` "util: import sha1 implementation from OpenBSD" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 20:09:57 +00:00
Emil Velikov	9f8dc3bf03	utils: build sha1/disk cache only with Android/Autoconf Earlier commit imported a SHA1 implementation and relaxed the SHA1 and disk cache handling, broking the Windows builds. Restrict things for now until we get to a proper fix. Fixes: `d1efa09d34` "util: import sha1 implementation from OpenBSD" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 20:09:01 +00:00
Emil Velikov	d1efa09d34	util: import sha1 implementation from OpenBSD At the moment we support 5+ different implementations each with varying amount of bugs - from thread safely problems [1], to outright broken implementation(s) [2] In order to accommodate these we have 150+ lines of configure script and extra two configure toggles. Whist an actual implementation being ~200loc and our current compat wrapping ~250. Let's not forget that different people use different code paths, thus effectively makes it harder to test and debug since the default implementation is automatically detected. To minimise all these lovely experiences, import the "100% Public Domain" OpenBSD sha1 implementation. Clearly document any changes needed to get building correctly, since many/most of those can be upstreamed making future syncs easier. As an added bonus this will avoid all the 'fun' experiences trying to integrate it with the Android and SCons builds. v2: Manually expand __BEGIN_DECLS/__END_DECLS and document (Tapani). Furthermore it seems that some games (or surrounding runtime) static link against OpenSSL resulting in conflicts. For more information see the discussion thread [3] Bugzilla [1]: https://bugs.freedesktop.org/show_bug.cgi?id=94904 Bugzilla [2]: https://bugs.freedesktop.org/show_bug.cgi?id=97967 [3] https://lists.freedesktop.org/archives/mesa-dev/2017-January/140748.html Cc: Mark Janes <mark.a.janes@intel.com> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jonathan Gray <jsg@jsg.id.au> Tested-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> (v1) Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2017-01-18 19:07:23 +00:00
Kenneth Graunke	5b4a531207	i965: Make brw_cache_item structure private to brw_program_cache.c. struct brw_cache_item is an implementation detail of the program cache. We don't need to make those internals available to the entire driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-18 10:53:14 -08:00
Marek Olšák	c67a2793b3	radeonsi: determine in advance which VBOs should be added to the buffer list v2: now it should be correct Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	1db2bf8d2b	radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	b9b9540a60	radeonsi: reject invalid vertex buffer indices at state creation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	cf248929bf	radeonsi: use a global dirty mask for shader pointers Only vertex buffers use a separate bool flag. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	861d7af1cb	radeonsi: use a bitmask-based loop in si_decompress_textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	4bde7d3d3c	radeonsi: skip an unnecessary mutex lock for L2 prefetches the mutex lock is inside util_range_add. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	d93b0eacb7	radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	395c49849d	radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE the next commit will use it in a clever way, because the CP DMA prefetch doesn't need this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	35cd7551a4	radeonsi: use the correct target machine when building shader variants If the shader selector is created with a different context than the shader variant, we should use the calling context's target machine for the shader variant. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	3ae3be6dd4	radeonsi: move shader pipe context state into a separate structure Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Ben Widawsky	b0cc55f298	i965: Fix SURFACE_STATE to handle non-zero aux offsets Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Stone <daniels@collabora.com>	2017-01-18 09:38:18 -08:00
Christian Gmeiner	65a44a76fc	Revert "etnaviv: Fake occlusion query capability" This reverts commit `b7ac0f5671`. This is a half baked solution needs some rework to fixes issues with reported counter bits (GL_QUERY_COUNTER_BITS_ARB). Also it enables PIPE_CAP_QUERY_TIME_ELAPSED accidently. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 17:39:28 +01:00
Mauro Rossi	730574c58e	android: ac/debug: move sid_tables.h generation and IB decode to amd/common This patch is the porting to android of the following commits: `b838f64` "ac/debug: Move sid_tables.h generation to common code." `0ef1b4d` "ac/debug: Move IB decode to common code." Fixes android building errors due to sid_tables.h and ac_debug.c, ac_debug.h moved to amd/common Tested by building nougat-x86 Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:28:59 +00:00
Mauro Rossi	02185a1c9b	android: gallium/auxiliary: fix building error in Android 7.0 Conditional libLLVMCore static library dependency is added, for the case when MESA_ENABLE_LLVM is true Fixes the following building error with Android 7.0: In file included from external/mesa/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp:62: ... external/llvm/include/llvm/IR/Attributes.h:68:14: fatal error: 'llvm/IR/Attributes.inc' file not found #include "llvm/IR/Attributes.inc" ^ 1 error generated. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:24:19 +00:00
Mauro Rossi	f93f7cae14	android: amd/common: fix LLVMInitializeAMDGPU* functions declaration LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available. A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose, the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes. This avoids the following building errors: external/mesa/src/amd/common/ac_llvm_util.c:43:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa/src/amd/common/ac_llvm_util.c:44:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa/src/amd/common/ac_llvm_util.c:45:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa/src/amd/common/ac_llvm_util.c:46:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:21:40 +00:00
Mauro Rossi	db3aaa3137	android: radeonsi: fix LLVMInitializeAMDGPU* functions declaration LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available. A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose, the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes. This avoids the following building errors: external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:129:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:130:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:131:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:132:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:21:35 +00:00
Mauro Rossi	a2a63ad262	android: radeon: fix LLVMInitializeAMDGPU* functions declaration LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available. A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose, the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes. This avoids the following building errors: external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:121:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:122:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:123:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:124:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' [-Werror=implicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:21:28 +00:00
Emil Velikov	9c5003996c	nouveau: remove always false argument in nouveau_fence_new() No point in having the extra argument considering that it's effectively unused since the function was introduced. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-18 16:01:15 +00:00
Emil Velikov	af4a298719	egl/wayland: resolve quirky try_damage_buffer() implementation The implementation was added with commit `d085a5dff5` and effectively provided a hidden dependency. Namely: the codepath used was determined solely during build time. Thus if we built again new wayland and then run against older (yet still within the requirements, as per the configure) one will get undefined symbols. As of earlier commit `36b9976e1f` "egl/wayland: Avoid race conditions when on non-main thread" the required version was bumped to one which provides the API, thus we can drop the quirky solution. Cc: Derek Foreman <derekf@osg.samsung.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Derek Foreman <derekf@osg.samsung.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	e8044dd434	configure: remove HAVE_EGL_DRIVER_DRI[23] We have them for local purposes in configure, where we can use their direct dependency. With the only remaining instance in the makefile(s) being always true, as it can be seen in the configure snippet. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:15 +00:00
Emil Velikov	4380a2098b	gallium: correctly manage libsensors link flags We should be using LIBS rather than the LDFLAGS variable. Furthermore try to keep the linking to the final stage, rather than intermetent static library. Cc: Steven Toth <stoth@kernellabs.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	cb5e799448	egl/wayland: unify dri2_wl_create_surface implementations Rather than having two almost identical codepaths (one for HW/wl_drm and another for SW/wl_shm), just factorise and reuse in both places. v2: Rebase v3: Rebase Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> (v2)	2017-01-18 16:01:14 +00:00
Emil Velikov	bfd6314350	egl/wayland: use the destroy_window_callback for swrast As described in commit `690ead4a13` ("egl/wayland-egl: Fix for segfault in dri2_wl_destroy_surface.") if we attempt to destroy a EGL surface attached to already destroyed Wayland window we'll get a segfault. v2: set the correct callback alongside the window->private. (Dan) Cc: Daniel Stone <daniels@collabora.com> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	3ecd6c6abd	glx: unify GLX_SGIX_pbuffer aliased declarations No point in having an identical code in two places. Not to mention that the Apple one incorrectly uses GLXDrawable as pbuf type. This change is both API and ABI safe since the header uses the correct GLXPbufferSGIX and both types are a typedef of the same primitive XID. Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	9898bcf3f4	glx: use GLX_ALIAS for glXGetProcAddress Use the macro, rather than open-coding it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	dfc84c2296	mesa: make use of HAVE_FUNC_ATTRIBUTE_ALIAS macro We must make sure that xserver has an equivalent one-line change to its configure.ac as the glx/glapi headers get copied over. Then again, xserver does _not_ seem to set HAVE_ALIAS to begin with so one might want to look into that first. Cc: Adam Jackson <ajax@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Emil Velikov	f121ac68b0	glx: remove always false ifdef GLX_NO_STATIC_EXTENSION_FUNCTIONS Quick search through git history (of both mesa and xserver) hows no instances where this was ever set. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 16:01:14 +00:00
Wladimir J. van der Laan	b7ac0f5671	etnaviv: Fake occlusion query capability This enables the PIPE_CAP_OCCLUSION_QUERY capability without adding an occlusion query type. This is necessary to get Mesa to report desktop GL 2.0 support (to run exciting things such as ioq3's OpenGL 2 renderer), and should be valid because exposing the capability does not guarantee that any counters are actually implemented. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 16:58:18 +01:00
Christian Gmeiner	103c363e0a	etnaviv: add flags parameter to texture barrier Fixes compile warning introduced by commit a1c848. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 16:58:11 +01:00
Christian Gmeiner	3ef916c128	etnaviv: handle PIPE_CAP_TGSI_FS_FBFETCH Fixes compile warning introduced by commit ee3ebe. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-18 16:58:05 +01:00
Roland Scheidegger	56441708cf	gallivm: (trivial) fix copy/paste bug with big endian code `8bd67a35c5` introduced using undefined variable on big endian archs due to copy/paste bug. (compile hack tested only)	2017-01-18 16:30:50 +01:00
Jose Fonseca	34041968f8	configure.ac: Revert recent HAVE_LLVM changes. This reverts changes 903eb09b5fb78d47d0f8a4bdf826a113ca2aff40..1a0aa468f354f0ee94dd383cd40ae915584624aa: Tobias Droste (5): configure.ac: Rename MESA_LLVM to FOUND_LLVM configure.ac: Only set LLVM_LIBS if LLVM is used configure.ac: Only define HAVE_LLVM if LLVM is used configure.ac: Set and use HAVE_GALLIUM_LLVM define configure.ac: Don't check LLVM version in gallium_require_llvm They break scons build, and I'm not convinced this is the right fix. In particular changing HAVE_LLVM in the C code is something I'd rather avoid no matter what. So it's better to discuss without the pressure of broken builds.	2017-01-18 14:46:54 +00:00
Emil Velikov	8d1712a065	vulkan: automake: do not use EXTRA_DIST in a conditional Otherwise the file might not end up in the tarball. Fixes: `dbd677efb4` "vulkan: add API registry" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 13:41:32 +00:00
Tobias Droste	4d0efb9683	configure.ac: Set and use HAVE_GALLIUM_LLVM define Gallium code used HAVE_LLVM to check if it needs to compile code for LLVM in header and source files. With the new logic HAVE_LLVM is always set. Use extra define to figure out if LLVM is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010 Signed-off-by: Tobias Droste <tdroste@gmx.de>	2017-01-18 13:23:01 +00:00
Jose Fonseca	903eb09b5f	gallivm: Cleanup USE_MCJIT. Split USE_MCJIT macro dual nature into a separate constant time define and a run-time variable. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-18 12:35:01 +00:00
Kenneth Graunke	aa291c3ba9	i965: Don't map/unmap in brw_print_program_cache on LLC platforms. We have a persistent mapping. Don't map it a second time or try to unmap it. Just use the pointer. This most likely would wreak havoc except that this code is unused (it's only called from an if (0) debug block). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-17 21:47:38 -08:00
Kenneth Graunke	ce89239294	i965: Move program cache printing to brw_program_cache.c. It makes sense to put a function which prints out the entire contents of the program cache in the file that implements the program cache. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-17 21:47:36 -08:00
Kenneth Graunke	f9edc550b2	i965: Make a helper for finding an existing shader variant. We had five copies of the same "walk the cache and look for an existing shader variant for this program" code. Now we have one helper function that returns the key. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-01-17 21:47:10 -08:00
Kenneth Graunke	e7d4008ebf	glsl: Make copy propagation not panic when it sees an intrinsic. A number of games have large arrays of constants, which we promote to uniforms. This introduces copies from the uniform array to the original temporary array. Normally, copy propagation eliminates those copies, making everything refer to the uniform array directly. A number of shaders in "Deus Ex: Mankind Divided" recently exposed a limitation of copy propagation - if we had any intrinsics (i.e. image access in a compute shader), we weren't able to get rid of these copies. That meant that any variable indexing remained on the temporary array rather being moved to the uniform array. i965's scalar backend currently doesn't support indirect addressing of temporary arrays, which meant lowering it to if-ladders. This was horrible. According to Marek, on radeonsi/GCN, "F1 2015" uses 64% less spilled-temp-array memory. On i965/Skylake: total instructions in shared programs: 13362954 -> 13329878 (-0.25%) instructions in affected programs: 43745 -> 10669 (-75.61%) helped: 12 HURT: 0 total cycles in shared programs: 248081010 -> 245949178 (-0.86%) cycles in affected programs: 4597930 -> 2466098 (-46.37%) helped: 12 HURT: 0 total spills in shared programs: 9493 -> 9507 (0.15%) spills in affected programs: 25 -> 39 (56.00%) helped: 0 HURT: 1 total fills in shared programs: 12127 -> 12197 (0.58%) fills in affected programs: 110 -> 180 (63.64%) helped: 0 HURT: 1 Helps Deus Ex: Mankind Divided. The one shader with hurt spills/fills is from Tomb Raider at Ultra settings, but that same shader has a -39.55% reduction in instructions and -14.09% reduction in cycle counts, so it seems like a win there as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:45:22 -08:00
Kenneth Graunke	9919542f1c	i965: Make DCE set null destinations on messages with side effects. (Co-authored by Matt Turner.) Image atomics, for example, return a value - but the shader may not want to use it. We assigned a useless VGRF destination. This seemed harmless, but it can actually be quite harmful. The register allocator has to assign that VGRF to a real register. It may assign the same actual GRF to the destination of an instruction that follows soon after. This results in a write-after-write (WAW) dependency, and stall. A number of "Deus Ex: Mankind Divided" shaders use image atomics, but don't use the return value. Several of these were hitting WAW stalls for nearly 14,000 (poorly estimated) cycles a pop. Making dead code elimination null out the destination avoids this issue. This patch cuts one shader's estimated cycles by -98.39%! Removing the message response should also help with data cluster bandwidth. On Skylake: (instruction counts remain identical) total cycles in shared programs: 255413890 -> 248081010 (-2.87%) cycles in affected programs: 12019948 -> 4687068 (-61.01%) helped: 24 HURT: 10 v2: Make can_omit_write independent of can_eliminate (Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:45:04 -08:00
Kenneth Graunke	90bf39cd2b	i965: Combine some dead code elimination NOP'ing code. In theory we might have incorrectly NOP'd instructions that write the flag, but where that flag value isn't used, and yet the instruction either writes the accumulator or has side effects. I don't believe any such instructions exist, so this is mostly a code cleanup. Curro pointed out that FS_OPCODE_FB_WRITE has a null destination and actually writes the flag on Gen4-5 to dynamically decide whether to write some payload data. The hunk removed in this patch might have NOP'd it, except that we don't actually mark flags_written() in the IR, so it doesn't think the flag is touched at all. That's sketchy, but it means it wouldn't hit this today (though there are likely other problems!). v2: Properly replace the inst->regs_written() check in the second hunk with the flag being live (mistake caught by Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:45:00 -08:00
Kenneth Graunke	be5f53e769	i965: Make DCE explicitly not eliminate any control flow instructions. According to Matt, the dead code pass explicitly avoided IF and WHILE because on Sandybridge, these could have conditional modifiers and null destination registers. Normally, those instructions use BAD_FILE for the destination register. Nowadays, we don't do that anymore, so we could technically drop these checks. However, it's clearer to explicitly leave control flow instructions alone, so change it to the more generic !inst->is_control_flow(). This should have no actual change. [This patch implements review feedback from Curro and Matt.] Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-01-17 21:44:29 -08:00
Dave Airlie	aac562f112	radv: disable vertex reuse when writing viewport index This fixes some issues we'd hit later if using viewport indexes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-01-18 08:04:11 +10:00

1 2 3 4 5 ...

80950 commits