fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-08 13:28:06 +02:00

Author	SHA1	Message	Date
Juan A. Suarez Romero	cc7fc7e319	docs: Add SHA256 sums for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `2a5b4e2b9f`)	2019-06-11 15:26:42 +00:00
Juan A. Suarez Romero	7e8e49475c	docs: Add release notes for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `1517811f4f`)	2019-06-11 15:26:38 +00:00
Samuel Iglesias Gonsálvez	32e1d85cb6	radv: assert on inline uniform blocks in radv_CmdPushDescriptorSetKHR() According to the Vulkan spec, inline uniform blocks are not allowed to be updated through vkCmdPushDescriptorSetKHR(). These are the spec quotes from "13.2.1. Descriptor Set Layout" that are relevant for this case: "VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR." "If flags contains VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all elements of pBindings must not have a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT". There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid this case but it is implied in the creation of the descriptor set layout as aforementioned. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-11 16:32:27 +02:00
Samuel Iglesias Gonsálvez	d0c52ff610	anv: ignore inline uniform blocks in anv_CmdPushDescriptorSetKHR() According to the Vulkan spec, inline uniform blocks are not allowed to be updated through vkCmdPushDescriptorSetKHR(). These are the spec quotes from "13.2.1. Descriptor Set Layout" that are relevant for this case: "VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR." "If flags contains VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all elements of pBindings must not have a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT". There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid this case but it is implied in the creation of the descriptor set layout as aforementioned. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-11 16:25:53 +02:00
Eric Engestrom	773ff93bc4	egl: compare the whole list of attributes `memcmp()` compares a given number of bytes, but `EGLAttrib` is larger than a byte. Fixes: `8e991ce539` "egl: handle the full attrib list in display::options" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-11 12:18:09 +00:00
Eduardo Lima Mitev	3fb7b1fd35	freedreno/a5xx: Fix indirect draw max_indices calculation The number of elements to draw should not be affected by the offset. A similar fix was submitted for a6xx at `79180a05`. Fixes these dEQP tests on a5xx: dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-11 08:28:45 +02:00
Samuel Pitoiset	40699f74b8	radv: remove extra assignment in radv_decompress_resolve_subpass_src() baseArrayLayer is defined twice, trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-11 08:17:22 +02:00
Samuel Pitoiset	c39a1611ab	radv: add radv_get_resolve_pipeline() helper in the graphics path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:42 +02:00
Samuel Pitoiset	b06d1f029d	radv: do not decompress all image layers before resolving inside a subpass When decompressing resolve source images, we should rely on the framebuffer layer count instead of resolving all image layers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:39 +02:00
Samuel Pitoiset	4efbd963ec	radv: initialize the aspect mask when decompressing resolve source images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:35 +02:00
Samuel Pitoiset	c31a07fa85	radv: perform proper layout transitions before resolving Use an explicit pipeline barrier for doing layout transitions instead of duplicating some code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:32 +02:00
Samuel Pitoiset	92fa6264cb	radv: do not resolve all image layers with compute inside a subpass When resolving inside a subpass, we should rely on the framebuffer layer count instead of resolving all image layers. This should improve performance of layered resolves a bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:28 +02:00
Kenneth Graunke	a8588f512b	iris: Bypass half-float pack/unpack lowering. This skips GLSL IR lowering of pack/unpackHalf operations, allowing the NIR optimizer to see them. Improves performance in Synmark2's OglCSDof by about 2x, by cutting about 90% of the cycles from one of the compute shaders. shader-db statistics on Skylake: 4 compute shaders went from SIMD8 to SIMD16. total instructions in shared programs: 15598871 -> 15542568 (-0.36%) instructions in affected programs: 143016 -> 86713 (-39.37%) helped: 144 HURT: 0 helped stats (abs) min: 17 max: 4669 x̄: 390.99 x̃: 164 helped stats (rel) min: 7.48% max: 85.28% x̄: 30.17% x̃: 24.22% 95% mean confidence interval for instructions value: -510.50 -271.49 95% mean confidence interval for instructions %-change: -32.70% -27.65% Instructions are helped. total cycles in shared programs: 371973958 -> 368902103 (-0.83%) cycles in affected programs: 5557722 -> 2485867 (-55.27%) helped: 144 HURT: 0 helped stats (abs) min: 106 max: 1026600 x̄: 21332.33 x̃: 1697 helped stats (rel) min: 0.53% max: 88.98% x̄: 36.12% x̃: 34.67% 95% mean confidence interval for cycles value: -41570.02 -1094.64 95% mean confidence interval for cycles %-change: -38.44% -33.80% Cycles are helped. total spills in shared programs: 11936 -> 11903 (-0.28%) spills in affected programs: 110 -> 77 (-30.00%) helped: 3 HURT: 2 total fills in shared programs: 25644 -> 25178 (-1.82%) fills in affected programs: 677 -> 211 (-68.83%) helped: 5 HURT: 0 total loops in shared programs: 4830 -> 4829 (-0.02%) loops in affected programs: 1 -> 0 helped: 1 HURT: 0	2019-06-10 16:01:36 -07:00
Bas Nieuwenhuizen	e0d12f79c5	radv: Handle UNDEFINED format in image format list. Was watching a presentation on YT where this was used and it turns out it is not invalid. The only case it is actually valid as format in the creation of an image or image view is with Android Hardware Buffers which have their format specified externally. So we can just ignore all entries with VK_FORMAT_UNDEFINED. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-10 22:21:16 +00:00
Bas Nieuwenhuizen	39c71e0025	radv: Prevent out of bound shift on 32-bit builds. uintptr_t is 32-bits then and shifting it by 32 bits results in undefined behavior IIRC. Fixes: `b3c8de1c55` "radv: save all descriptor pointers into the trace BO" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-10 22:18:51 +00:00
Caio Marcelo de Oliveira Filho	2cb5907508	glsl: Check order and uniqueness of interlock functions With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:32 -07:00
Caio Marcelo de Oliveira Filho	b7c9fc72fd	glsl: Make interlock builtins follow same compiler rules as barriers Generalize the barrier code to provide correct error messages for other builtins. Fixes most of piglit compilation tests for ARB_fragment_shader_interlock. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:26 -07:00
Eduardo Lima Mitev	fb2169040a	nir/opt_algebraic: Fix rules for imadsh_mix16 The rules added in patch `3addd7c` are inverted: It should be: (al * bh) << 16 + c instead of: (ah * bl) << 16 + c Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-10 22:27:46 +02:00
Alyssa Rosenzweig	e9703fb416	panfrost: Ignore discards in dead branch analysis Fixes regressions in dEQP-GLES2.functional.shaders.discard.dynamic_loop_* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 08:23:08 -07:00
Samuel Pitoiset	e9316fdfd4	radv: fix setting CB_SHADER_MASK for dual source blending CB_SHADER_MASK was computed without the second color buffer format which looks totally wrong to me. While we are at it, copy a comment from RadeonSI. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-10 17:21:56 +02:00
Alyssa Rosenzweig	50ffaaff3b	panfrost/midgard: Disambiguate register mode We postfix instructions by their size if a destination override is in place (a la AT&T assembly), disambiguating instruction sizes. Previously, "16-bit instruction, 16-bit dest, 16-bit sources" disassembled identically to "32-bit instruction, 16-bit dest, 16-bit sources", which is semantically distinct due to the lessened opportunity for parallelism but (potentially) greater precision. Adding a postfix removes the ambiguity and relieves mental gymnastics reading weird disassemblies even in some cases that are not ambiguous. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:12 -07:00
Alyssa Rosenzweig	8027cc9975	panfrost/midgard: Expose vec8/vec16 modes Midgard ALUs can operate in one of four modes: vec2 64-bit, vec4 32-bit, vec8 16-bit, or vec16 8-bit. Our compiler (and indeed, any OpenGL ES shader) only uses 32-bit (and eventually vec4 16-bit) modes in normal circumstances. Nevertheless, the other modes do exist and are easily accessible through OpenCL; they also come up in cases like blend shaders. While we have had minimal support for decoding 8-bit/64-bit modes, we did so pretending they were vec4 in each case; 16-bit registers had a synthetically duplicated register file to separate lo/hi halves, etc. This works for GL, but it doesn't map to what the hardware is -actually- doing, which can cause some headscratchingly bizarre disassemblies from OpenCL. So, we dive in the deep end and support these other modes natively in the disassembler, using absurdly long masks/swizzles, since the hardware is considerably more flexible than what was exposed before. Outside of some fixed routines for blending, none of the above is supported in the compiler yet. But it's better to have it in the ISA definitions and disassembler than not, for future use if nothing else. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	2d0bda0885	panfrost/midgard: Add shifting int modifiers As a source modifier, shift allows shifting a value left by the bit size, useful in conjunction with a greater register mode, for instance to implement `upsample`. As a concrete example, the following OpenCL: ushort hr0 = /* ... */, uint r1 = /* ... */; uint r2 = (convert_uint(hr0) << 16) ^ b; compiles to the following Midgard assembly: ixor r, (hr0) << 16, b In reverse, the ".hi" output modifier shifts the value right by the bit size, leaving just the carry/overflow at the bottom. To implement _hi functions in OpenCL (for <64-bit), we do arithmetic in the 2x higher mode with the .hi modifier. (For 64-bit, things are hairier, since there is not a 128-bit int mode). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	6780481a3f	panfrost/midgard: Add integer outmods For floats, output modifiers determine clamping behaviour. For integers, they determine wrapping/saturation behaviour (or shifting -- see next commit). These are very different; they are conceptually two unrelated enums union'ed together; the distinction is responsible for many-a-bug. While clamping behaviour for floats was clear from GL, the int behaviour is only known from OpenCL contortion with convert_*_sat() functions. With the underlying functions known, clean up the codebase, likely fixing outmod type related bugs in the process. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	215b8844ee	panfrost/midgard: Note floating compares type convert OP_TYPE_CONVERTS denotes an opcode that returns a different type than its source (going from int-domain to float-domain or vice versa), named after the f2i/i2f family of opcodes it covers. We care because source mods are determined by the source type (i/f) but output modifiers are determined by the output type (equals the source type, unless the op type converts, in which case it's the opposite). The upshot is that floating-point compares (feq/fne/etc) actually do type-convert. That is, they take in floating-point values and output in integer space (a boolean), so we mark them off this way to ensure the correct output modifiers are used. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	d48d991ce2	panfrost: Align linear renderable resources It's just -easier- to render to aligned framebuffers. For winsys targets, we already align, but even for an internal linear FBO we ought to align everything nicely. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:48:07 -07:00
Alyssa Rosenzweig	d89e0716a1	panfrost: Fix stride check when mipmapping Now that we support custom strides on mipmapped textures (theoretically, at least), extend the stride check to support mipmaps. Fixes incorrect strides of linear windows in Weston. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-10 06:47:18 -07:00
Alyssa Rosenzweig	416fc3b5ef	panfrost: Refactor texture/sampler upload We move some code packing the texture/sampler descriptors into dedicated functions (out of the terrifyingly long emit_for_draw monolith), cleaning them up as we go. The discovery triggering the cleanup is the format for including manual strides in the presence of mipmaps/cubemaps. Rather than being placed at the end as previously assumed, they are interleaved after each address. This difference is relevant when handling NPOT linear mipmaps. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:33 -07:00
Alyssa Rosenzweig	a35069a7b5	panfrost: Refactor blitting code We refactor the wallpaper rendering code to separate the wallpaper-specific bits from the general blitting capabilities. In the (hopefully near) future, we'll turn this on to implement real Gallium blits, e.g. for automatic mipmap generation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:25 -07:00
Alyssa Rosenzweig	d878753efa	panfrost: Refactor AFBC code This patch does a substantial cleanup of the code for handling AFBC, moving various disparate misplaced functions into a new central pan_afbc.c file. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:14 -07:00
Alyssa Rosenzweig	b4763984ac	panfrost: Move pan_screen() to pan_screen.h Trivial. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:05 -07:00
Alyssa Rosenzweig	a38583e352	panfrost: Always align strides to cache line (64) (Performance tweak.) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:44:56 -07:00
Emil Velikov	0534fcf57d	docs: fixup 19.0.5 <> 19.0.6 confusion The title of the release notes says 19.0.5 while the rest of the file (correctly) says 19.0.6 Fixes: `fe79d75ccf` ("docs: Add relnotes for 19.0.6") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan at pnwbakers.com>	2019-06-10 14:04:39 +01:00
Emil Velikov	a379b1c0ee	mapi: correctly handle the full offset table Earlier commit converted ES1 and ES2 to a new, much simpler, dispatch generator. At the same time, GL/glapi and the driver side are still using the old code. There is a hidden ABI between GL*.so and glapi.so, former referencing entry-points by offset in the _glapi_table. Hence earlier commit added the full table of entry-points, alongside a marker for other cases like indirect GL(X) and driver-size remapping. Yet the patches did not handle things fully, thus it was possible to get different interpretations of the dispatch table after the marker. This commit fixes that, adding an indicative error message to catch future bugs. While here correct the marker (MAX_OFFSETS) comment. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302 Fixes: `cf317bf093` ("mapi: add all _glapi_table entrypoints to static_data.py") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-10 14:04:30 +01:00
Emil Velikov	497de977bd	mapi: add static_date offset to EXT_dsa As elaborated in the next patch, there is some hidden ABI that effectively requires most entrypoints to be listed in the file. Cc: Marek Olšák <marek.olsak@amd.com> Fixes: `d2906293c4` ("mesa: EXT_dsa add selectorless matrix stack functions") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-10 14:04:25 +01:00
Emil Velikov	61960547df	mapi: add static_date offset to MaxShaderCompilerThreadsKHR As elaborated in the next patch, there is some hidden ABI that effectively requires most entrypoints to be listed in the file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302 Cc: Marek Olšák <maraeo@gmail.com> Fixes: `c5c38e831e` ("mesa: implement ARB/KHR_parallel_shader_compile") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-10 14:04:18 +01:00
Mathias Fröhlich	a7ecf78b90	egl: Let the caller of dri2_create_drawable decide about loaderPrivate. In the call arguments to dri2_create_drawable decouple loaderPrivate from dri2_surf. For all callers of dri2_create_drawable the two pointers are the same with the exception of the gbm backed platform. Let the calling code of dri2_create_drawable decide what loaderPrivate shall be. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-06-10 11:06:48 +02:00
Samuel Pitoiset	91aa25f462	radv: fix alpha-to-coverage when there is unused color attachments When alphaToCoverage is enabled, we should always write the alpha channel of MRT0 if it's unused. This now matches RadeonSI. This fixes the new CTS: dEQP-VK.pipeline.multisample.alpha_to_coverage_unused_attachment.samples_*.alpha_invisible Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-10 09:23:41 +02:00
Tomeu Vizoso	2fe7f9f2ae	panfrost: ci: Switch from direct Docker use to buildah Use the infrastructure in wayland/ci-templates to build the container images. This prevents us from getting into some situations in which the images wouldn't be rebuilt, and allows us to share some infrastructure with other projects in freedesktop.org. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Suggested-by: Michel Dänzer <michel@daenzer.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 08:09:23 +02:00
Kenneth Graunke	81582e9366	gallium/u_transfer_helper: Free the staging buffer on unmap. u_transfer_helper sometimes mallocs a staging buffer, and leaked it. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-09 15:16:10 -07:00
Lionel Landwerlin	17898a9b7e	intel/gpu_dump: fix argument passing We were dropping "/' around arguments grouped together. This was triggering failures with : $ ./framemetrics -g "Memory Writes Distribution Gen9" -o /tmp/output.csv -f ./my.trace 10 11 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-06-09 19:45:13 +00:00
Eric Engestrom	93349d7118	util/os_file: suppress sign comparison warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	fd5c18de88	util/os_file: fix error being sign-cast back and forth Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	341ba406fd	util/os_file: avoid shadowing read() with a local variable Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	7e35f20d44	util/os_file: actually return the error read() gave us Fixes: `316964709e` "util: add os_read_file() helper" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Alexandros Frantzis	f8f222ea36	virgl: Work around possible memory exhaustion Since we don't normally flush before performing copy transfers, it's possible in some scenarios to use too much memory for staging resources and start failing. This can happen either because we exhaust the total available memory (including system memory virtio-gpu swaps out to), or, more commonly, because the total size of resources in a command buffer doesn't fit in virtio-gpu video memory. To reduce the chances of this happening, force a flush before a copy transfer if the total size of queued staging resources exceeds a certain limit. Since after a flush any queued staging resources will be eventually released, this ensures both that each command buffer doesn't require too much video memory, and that we don't end up consuming too much memory for staging resources in total. Fixes kernel errors reported when running texture_upload tests in glbench. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:45 -07:00
Alexandros Frantzis	e34f79c918	virgl: Remove incorrect resource wait condition Now that we have copy transfers in place, we can remove the incorrect resource wait condition. Copy transfers and other optimizations minimize the performance impact of this removal, while providing the correct behavior. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:43 -07:00
Alexandros Frantzis	236c55f650	virgl: Use copy transfers for textures Extend copy transfers to also be used for busy textures. Performance results: Unigine Valley, qemu before: 22.7 FPS after: 23.1 FPS Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:42 -07:00
Alexandros Frantzis	a22c5df079	virgl: Use buffer copy transfers to avoid waiting when mapping We typically need to wait for a buffer to become ready before mapping, so that we don't write new contents while the host is still using the old contents. However, if we are allowed to discard the contents of the mapped buffer range, then we can avoid waiting by using a staging buffer range which we guarantee to never be busy, copying from the staging buffer range to the target buffer in the host. This commit implements this optimization by utilizing a dedicated u_upload_mgr for the staging buffer. Performance results: Twilight Struggle (Steam/Proton), qemu before: 7 FPS after: 25 FPS glmark2 ubo, qemu before: 38 FPS after: 331 FPS Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Suggested-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:39 -07:00
Alexandros Frantzis	6e7726e50c	virgl: Support copy transfers Support transfers that use a different resource as the source of data to transfer. This will be used in upcoming commits to send data to host buffers through a transfer upload buffer, in order to avoid waiting when the buffer resource is busy. Note that we don't support queueing copy transfers in the transfer queue. Copy transfers should be emitted directly in the command queue, which allows us to avoid flushes before them and leads to better performance. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:36 -07:00
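
Note on 773ff93bc4 ("egl: compare the whole list of attributes"): the message says that `memcmp()` counts bytes while `EGLAttrib` is wider than a byte. The sketch below is not the Mesa code. It is a minimal illustration of that class of bug, assuming a stand-in type `attrib_t` (EGL defines `EGLAttrib` as `intptr_t`) and two hypothetical helpers, `attribs_equal_bytes()` and `attribs_equal_elems()`, named here only for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for EGLAttrib, which the EGL headers define as intptr_t. */
typedef intptr_t attrib_t;

/* Hypothetical helper showing the bug pattern: `count` is a number of
 * elements, but memcmp() interprets its third argument as a number of
 * bytes, so only the first `count` bytes are compared. */
int attribs_equal_bytes(const attrib_t *a, const attrib_t *b, size_t count)
{
    return memcmp(a, b, count) == 0;
}

/* Hypothetical helper showing the fix pattern: scale the element count
 * by the element size so that the whole list is compared. */
int attribs_equal_elems(const attrib_t *a, const attrib_t *b, size_t count)
{
    return memcmp(a, b, count * sizeof(*a)) == 0;
}
```

With two three-element lists that differ only in the last element, the byte-count variant inspects just the first three bytes of the first element and reports the lists as equal, while the scaled variant detects the difference.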


111667 commits