fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-07 07:50:35 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	ea7a6fa980	vulkan/overlay: add pipeline statistic & timestamps support v2: switch to VkBase{In,Out}Structure v3: Add timestamps at begin/end of primary command buffers to estimate gpu time spent per submission (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)	2019-05-02 17:02:06 +01:00
Lionel Landwerlin	4438188f49	vulkan/overlay: record stats in command buffers and accumulate on exec/submit This significantly reworks how numbers displayed are computed. We accumulate operations written into command buffers and add those to the device when submitted to a queue. These collected values are then used to compute per frame overlay data. We also accumulate the data over the sampling fps period to produce numbers for that period of time. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-02 17:02:06 +01:00
Lionel Landwerlin	9eddceef44	vulkan/overlay: update help printout Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 17:02:06 +01:00
Lionel Landwerlin	a1e6b5e9be	vulkan/util: generate a helper function to return pNext struct sizes This will be used to copy chains of structures so that we can alter some of them. v2: Drop vk_util.h include (Eric) Use VkBaseInStructure directly (Eric) v3: Drop --platforms= param to generator script, instead produce a file with #ifdef based on what platforms are compiled. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 17:02:02 +01:00
Tomeu Vizoso	ad7c9ba0ec	panfrost/midgard: Skip liveness analysis for instructions without dest [Alyssa: Add comment explanation] Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-02 15:29:48 +00:00
Tomeu Vizoso	a5dddc2d42	panfrost/midgard: Skip register allocation if there's no work to do Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-02 15:29:41 +00:00
Eric Engestrom	a34ee4dec7	egl: hard-code destroy function instead of passing it around as a pointer Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-02 14:44:16 +00:00
Connor Abbott	6ec4ed48fc	nir/search: Add debugging code to dump the pattern matched This was useful while debugging the previous commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-02 16:14:06 +02:00
Connor Abbott	7ce86e6938	nir/search: Add automaton-based pre-searching nir_opt_algebraic is currently one of the most expensive NIR passes, because of the many different patterns we've added over the years. Even though patterns are already sorted by opcode, there are still way too many patterns for common opcodes like bcsel and fadd, which means that many patterns are tried but only a few actually match. One way to fix this is to add a pre-pass over the code that scans it using an automaton constructed beforehand, similar to the automatons produced by lex and yacc for parsing source code. This automaton has to walk the SSA graph and recognize possible pattern matches. It turns out that the theory to do this is quite mature already, having been developed for instruction selection as well as other non-compiler things. I followed the presentation in the dissertation cited in the code, "Tree algorithms: Two Taxonomies and a Toolkit," trying to keep the naming similar. To create the automaton, we have to perform something like the classical NFA to DFA subset construction used by lex, but it turns out that actually computing the transition table for all possible states would be way too expensive, with the dissertation reporting times of almost half an hour for an example of size similar to nir_opt_algebraic. Instead, we adopt one of the "filter" approaches explained in the dissertation, which trade much faster table generation and table size for a few more table lookups per instruction at runtime. I chose the filter which resulted in the fastest table generation time, with medium table size. Right now, the table generation takes around .5 seconds, despite being implemented in pure Python, which I think is good enough. Based on the numbers in the dissertation, the other choice might make table compilation time 25x slower to get 4x smaller table size, but I don't think that's worth it. As of now, we get the following binary size before and after this patch: text data bss dec hex filename 11979455 464720 730864 13175039 c908ff before i965_dri.so text data bss dec hex filename 12037835 616244 791792 13445871 cd2aef after i965_dri.so There are a number of places where I've simplified the automaton by getting rid of details in the LHS patterns rather than complicate things to deal with them. For example, right now the automaton doesn't distinguish between constants with different values. This means that it isn't as precise as it could be, but the decrease in compile time is still worth it -- these are the compilation time numbers for a shader-db run with my (admittedly old) database on Intel skylake: Difference at 95.0% confidence -42.3485 +/- 1.375 -7.20383% +/- 0.229926% (Student's t, pooled s = 1.69843) We can always experiment with making it more precise later. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-02 16:14:06 +02:00
Samuel Pitoiset	08be23bfde	radv: set WD_SWITCH_ON_EOP=1 when drawing primitives from a stream output buffer According to RadeonSI, this seems to be required by the hardware to avoid GPU hangs. I think I just forgot to set that bit when I implemented VK_EXT_transform_feedback. This fixes a GPU hang with Space Engineers and DXVK. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110291 Fixes: `b4eb029062` ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 15:55:46 +02:00
Brian Paul	48107b5a2b	glsl: fix typo in #warning message Trivial. Spotted by Eric Engestrom.	2019-05-02 06:32:57 -06:00
Brian Paul	f0f7c3b03a	svga: add SVGA_NO_LOGGING env var (v2) valgrind crashes when we try to initialize host logging. This env var can be used to disable logging. v2: rebase onto "svga: move host logging to winsys". Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-02 06:09:35 -06:00
Charmaine Lee	9c5f407b0b	svga: move host logging to winsys This patch adds a host_log interface to svga_winsys and moves the host logging code to the winsys layer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-02 06:09:35 -06:00
Eric Engestrom	da8d9e2d88	wsi/wayland: document lack of vkAcquireNextImageKHR timeout support Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:51:03 +00:00
Daniel Stone	9826e04eca	vulkan/wsi/wayland: Respect non-blocking AcquireNextImage If the client has requested that AcquireNextImage not block at all, with a timeout of 0, then don't make any blocking calls. This will still potentially block infinitely given a non-infinite timeout, but the fix for that is much more involved. Signed-off-by: Daniel Stone <daniels@collabora.com> Cc: mesa-stable@lists.freedesktop.org Cc: Chad Versace <chadversary@chromium.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108540 Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-02 11:51:03 +00:00
Rhys Perry	13c423629e	radv: fix set_output_usage_mask() with composite and 64-bit types It previously used var->type instead of deref_instr->type and didn't handle 64-bit outputs. This fixes lots of transform feedback CTS tests involving transform feedback and geometry shaders (mostly dEQP-VK.transform_feedback.fuzz.random_geometry.*) v2: fix writemask widening when comp != 0 v3: fix 64-bit variables when comp != 0, again Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 10:24:20 +01:00
Thomas Hellstrom	20b7839392	winsys/svga: Don't abort on EBUSY errors from execbuffer This error code typically indicated that a buffer object that was referenced by the command stream was being used for CPU access by another client. The correct action here is to retry after a while. Use usleep() until we have proper kernel support for this wait. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:51:15 +02:00
Thomas Hellstrom	c69557c4a2	winsys/svga: Update the drm interface file The file vmwgfx_drm.h was a bit outdated. Update to a recent version, including defines supporting coherent memory. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:51:07 +02:00
Thomas Hellstrom	978d66e4d5	svga: Avoid bouncing buffer data in malloced buffers Some constant- and texture upload buffer data may bounce in malloced buffers before being transferred to hardware buffers. In the case of texture upload buffers this seems to be an oversight. In the case of constant buffers, code comments indicate that we want to avoid mapping hardware buffers for reading when copying out of buffers that need modification before being passed to hardware. In this case we avoid data bouncing for upload manager buffers but make sure buffers that we read out from stay in malloced memory. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:51:00 +02:00
Thomas Hellstrom	5961189f4e	winsys/svga: Enable the transfer_from_buffer GPU command for vgpu10 We didn't have the path using this command enabled as typically we take an alternate path using DMA uploads. Enable it so that we can exercise that code-path by turning off the DMA path. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:50:52 +02:00
Thomas Hellstrom	50e58966fa	winsys/svga: Add an environment variable to force host-backed operation The vmwgfx kernel module has a compatibility mode for user-space that is not guest-backed resource aware. Add an environment variable to facilitate testing of this mode on guest-backed aware kernels: if the environment variable SVGA_FORCE_HOST_BACKED is defined, the driver will use host-backed operation. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-02 09:50:22 +02:00
Samuel Pitoiset	492e828848	ac: tidy up ac_build_llvm8_tbuffer_{load,store} For consistency with ac_build_llvm8_buffer_{load,store}_common helpers and that will help a bit for removing the vec3 restriction. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Samuel Pitoiset	6ac10e07c2	radv: implement a workaround for VK_EXT_conditional_rendering Per the Vulkan spec 1.1.107, the predicate is a 32-bit value. Though the AMD hardware treats it as a 64-bit value which means it might fail to discard. I don't know why this extension has been drafted like that but this definitely does not fit with AMD. The hardware doesn't seem to support a 32-bit value for the predicate, so we need to implement a workaround. This fixes an issue when DXVK enables conditional rendering with RADV, this also fixes the Sasha conditionalrender demo. Fixes: `e45ba51ea4` ("radv: add support for VK_EXT_conditional_rendering") Reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Samuel Pitoiset	e03e7c510f	radv: fix color conversions for normalized uint/sint formats The hardware actually rounds before conversion. This now matches what values are used when performing fast clears vs slow clears. This fixes a rendering issue with Far Cry 3&4. This also fixes a bunch of CTS tests that use a 8-bit UNORM format (only when the 512*512 image size hint is manually disabled). Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Samuel Pitoiset	6162543999	radv: do not need to force emit the TCS regs on Vega20 This chip doesn't need the fixup. This fixes a bunch of dEQP-VK.tessellation tests and avoid random GPU hangs. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-02 09:24:05 +02:00
Jason Ekstrand	bf774b56be	util/bitset: Return an actual bool from test macros I want to be able to do BITSET_TEST() != BITSET_TEST() and this isn't currently possible because BITSET_TEST() returns a random bit. Compare to zero to get an actual Boolean. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-02 03:12:54 +00:00
Brian Paul	413e55b5b9	glsl: work around MinGW 7.x compiler bug I'm not sure what triggered this, but building with scons platform=windows toolchain=crossmingw machine=x86 build=profile with MinGW g++ 7.3 or 7.4 causes an internal compiler error. We can work around it by forcing -O1 optimization. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-01 20:06:54 -06:00
Brian Paul	96540e4f0a	llvmpipe: init some vars to NULL to silence MinGW compiler warnings Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-05-01 20:06:54 -06:00
Marek Olšák	2d48a6959f	radeonsi: set sampler state and view functions for compute-only contexts	2019-05-01 21:16:13 -04:00
Marek Olšák	bfd3d50487	radeonsi: use new atomic LLVM helpers This depends on "ac,ac/nir: use a better sync scope for shared atomics"	2019-05-01 21:16:13 -04:00
Marek Olšák	181dcf0792	st/mesa: don't flush the front buffer if it's a pbuffer This is the best guess I can make here. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-01 21:15:33 -04:00
Marek Olšák	35294f2eca	mesa: fix pbuffers because internally they are front buffers This fixes the egl_ext_device_base piglit test, which uses EGL pbuffers. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-01 21:15:33 -04:00
Marek Olšák	f753f913f5	mesa: rework error handling in glDrawBuffers It's needed by the next pbuffer fix, which changes the behavior of draw_buffer_enum_to_bitmask, so it can't be used to help with error checking. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-05-01 21:15:33 -04:00
Bas Nieuwenhuizen	0c99b5ace8	radv: Restrict YUVY formats to 1 layer. Fixes: `8bb3cec7c9` "radv: Expose VK_EXT_ycbcr_image_arrays." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 02:29:51 +02:00
Bas Nieuwenhuizen	aab201635e	radv: Set is_array in lowered ycbcr tex instructions. Fixes array tests. Fixes: `91702374d5` "radv: Add ycbcr lowering pass." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 02:29:51 +02:00
Bas Nieuwenhuizen	2c57d3361a	radv: Fix hang with YCBCR array textures. Forgot to apply the width/height divisor for CB writes resulting in the CB using larger than expected slice sizes. Fixes: `42d159f276` "radv: Add multiple planes to images." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110530 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110526 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-02 02:29:51 +02:00
Erico Nunes	257a9b0a94	lima/gpir: add limit of max 512 instructions It has been noted that the lima GP has a limit of 512 instructions, after which the shaders don't work and fail silently. This commit adds a check to make the shader compilation abort when the shader exceeds this limit, so that we get a clear reason for why the program will not work. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-02 00:02:58 +00:00
Alyssa Rosenzweig	09c669260f	panfrost: Fix blend shader upload Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-01 23:20:51 +00:00
Alyssa Rosenzweig	910608b29a	panfrost/decode: Hit MRT blend shader enable bits Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-01 23:20:50 +00:00
Alyssa Rosenzweig	b304b30f2c	panfrost: Remove shader dump Redundant via the midgard shader dump. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-01 23:20:48 +00:00
David Riley	dec68e32ea	virgl: Re-use and extend queue transfers for intersecting buffer subdatas. Small buffer subdatas which are essentially doing a memcpy were getting bogged down by all the overhead of creating new transfers. Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-01 15:48:51 -07:00
David Riley	a54c231b56	virgl: Allow transfer queue entries to be found and extended. Intersecting transfer queue entries allow for the possibility of extending an existing transfer instead of creating a new one (and all the associated mapping/unmapping). Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-01 15:48:46 -07:00
David Riley	e94a9a7f38	virgl: Store mapped hw resource with transfer object. Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-01 15:48:28 -07:00
Kenneth Graunke	ebbb05b3c9	iris: Fix imageBuffer and PBO download. Recently we added checks to try and deny multisampled shader images. Unfortunately, this messed up imageBuffers, which have sample_count = 0, which are also used in PBO download, causing us to hit CPU map fallbacks. Fixes: `b15f5cfd20` iris: Do not advertise multisampled image load/store. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-01 14:37:46 -07:00
Dave Airlie	e2fecf57e3	r600: reset tex array override even when no view bound If no view is bound we still should reset the override to 0 and array mode. This should fix misrendering in firefox WebRender since the pbo sampler was removed. Fixes: `1250383e36` (st/mesa: remove sampler associated with buffer texture in pbo logic)	2019-05-02 07:34:32 +10:00
Ian Romanick	85e6865ff6	nir: Saturating integer arithmetic is not associative In 8-bits, iadd_sat(iadd_sat(0x7f, 0x7f), -1) = iadd_sat(0x7f, -1) = 0x7e but, iadd_sat(0x7f, iadd_sat(0x7f, -1)) = iadd_sat(0x7f, 0x7e) = 0x7f Fixes: `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-01 09:07:47 -07:00
Eric Engestrom	70da00ffd6	util: move #include out of #if linux This #include is needed for `NULL`, which is used on all OSes, not just Linux. Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `316964709e` "util: add os_read_file() helper" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-05-01 15:45:47 +00:00
Alok Hota	a44420d9cc	swr/rast: Add general SWTag statistics Update Archrast parser to use stats, used with an internal tool Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
Alok Hota	b8adb540a0	swr/rast: Add string handling to AR event framework For use by an internal tool Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
Alok Hota	f355f03388	swr/rast: Add initial SWTag proto definitions Update gen_archrast.py to properly generate event IDs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-05-01 15:11:30 +00:00
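
The arithmetic in commit 85e6865ff6 (nir: Saturating integer arithmetic is not associative) can be checked with a few lines of C. This is only a sketch: `iadd_sat8` and `check_iadd_sat_not_associative` are hypothetical names standing in for an 8-bit signed saturating add, not NIR's own implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit signed saturating add (an assumption for
 * illustration, not the NIR opcode itself): the sum is computed in a
 * wider type and then clamped to [INT8_MIN, INT8_MAX]. */
static int8_t iadd_sat8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;  /* widen so the add itself cannot overflow */
    if (sum > INT8_MAX)
        return INT8_MAX;
    if (sum < INT8_MIN)
        return INT8_MIN;
    return (int8_t)sum;
}

/* Evaluates both groupings quoted in the commit message. */
static void check_iadd_sat_not_associative(void)
{
    /* (0x7f + 0x7f) clamps to 0x7f, then adding -1 gives 0x7e. */
    int8_t left = iadd_sat8(iadd_sat8(0x7f, 0x7f), -1);
    /* (0x7f + -1) is 0x7e, then 0x7f + 0x7e clamps to 0x7f. */
    int8_t right = iadd_sat8(0x7f, iadd_sat8(0x7f, -1));

    assert(left == 0x7e);
    assert(right == 0x7f);
    assert(left != right);  /* regrouping changed the result */
}
```

Because the clamp discards the overflow from the inner add, an optimizer that regroups operands of a saturating add could change the result, which is presumably why the commit marks the operation as not associative.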

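Commit bf774b56be (util/bitset: Return an actual bool from test macros) describes why comparing two bit tests goes wrong when the macro returns the masked word. The sketch below illustrates the idea with hypothetical macros (`TEST_RAW`, `TEST_BOOL`) and a hypothetical `demo_set`; these are assumptions for illustration and not the definitions in Mesa's bitset header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 32u
#define BIT_WORD(b) ((b) / WORD_BITS)
#define BIT_MASK(b) (UINT32_C(1) << ((b) % WORD_BITS))

/* Hypothetical "before" form: yields the masked word, so a set bit
 * comes back as 1, 2, 4, ... depending on its position. */
#define TEST_RAW(x, b)  ((x)[BIT_WORD(b)] & BIT_MASK(b))

/* Hypothetical "after" form: comparing against zero yields 0 or 1. */
#define TEST_BOOL(x, b) (((x)[BIT_WORD(b)] & BIT_MASK(b)) != 0)

/* Example set with bits 1 and 2 set (an assumption for the demo). */
static const uint32_t demo_set[1] = { 0x6 };

/* Compares two raw test results; reports a difference whenever the
 * two set bits sit at different positions. */
static bool raw_results_differ(const uint32_t *set, unsigned a, unsigned b)
{
    return TEST_RAW(set, a) != TEST_RAW(set, b);
}

/* Compares two normalized test results; differs only when exactly
 * one of the two bits is set. */
static bool bool_results_differ(const uint32_t *set, unsigned a, unsigned b)
{
    return TEST_BOOL(set, a) != TEST_BOOL(set, b);
}
```

With `demo_set`, bits 1 and 2 are both set, yet `raw_results_differ(demo_set, 1, 2)` reports a difference because it compares 2 against 4, while `bool_results_differ(demo_set, 1, 2)` correctly reports none.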
102055 commits