fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-21 21:40:22 +01:00

Author	SHA1	Message	Date
Rob Clark	dac3bc9862	freedreno/a6xx: handle non-UBWC-compatible texture views Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	fe5c7b2b75	freedreno: add helper to uncompress UBWC resource We'll need this for a few edge cases, like image/sampler view that uses a format that UBWC does not support with a resource originally created in a format that UBWC does support. NOTE we could in some cases do an in-place uncompress. But that has a couple potential sharp edges: 1) the uncompressed buffer could have different layout, ie. a5xx with meta and pixel data of layers/levels interleaved. 2) if it comes mid-batch, it would force flush, or somehow fixing up cmdstream for draws already emitted. But with the resource shadowing approach we can rely on batch re-ordering to avoid splitting things.. older draws see the older compressed version, newer draws see the new uncompressed version of the rsc. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	846b8a76bd	freedreno: handle images in rebind_resource() Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	c6ae354299	freedreno: allow null discard box in shadow path When uncompressing a UBWC buffer, we don't want to discard anything. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	12201d7a8b	freedreno: swap UBWC state in shadow path It doesn't come up yet, as so far we only hit this path with linear buffers. But it will when we start re-using the shadow path for uncompressing UBWC buffers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	3c9a31eb50	freedreno: add modifier param to fd_try_shadow_resource() To uncompress UBWC, I want to re-use the shadow path, but we'll need a way to request that the new buffer is not compressed. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	3b05a120a3	freedreno: correct modifier for UBWC buffers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Chia-I Wu	15323c14fd	virgl: consider newly created resources idle A newly created resource can be regarded as idle. We don't care if the RESOURCE_CREATE command has been retired, unless it is used for fencing. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Chia-I Wu	9e4452cfd9	virgl: make resource_wait/resource_is_busy cheaper The round trip to the kernel is expensive. Add a local cache to avoid it when possible. There is a race condition when two contexts access the same resource at the same time (e.g., ctx1 submits a cmdbuf that accesses a resource while ctx2 maps the resource). But that is probably an app bug in the first place. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Chia-I Wu	ddc90be907	virgl: add virgl_drm_{alloc,free,clear}_res_list Helpers to work with resource list. virgl_drm_release_all_res is removed. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Chia-I Wu	71465fe569	virgl: do not cache external resources We should not reuse a resource for other purposes when it can still be accessed by another process or device. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Alyssa Rosenzweig	7d43999e63	panfrost: Enable AFBC on depth/stencil This seems to be a performance win, but more rigorous testing is necessary to figure out the exact circumstances when this is good/bad. Incidentally, this fixes non-aligned ZS. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:46:43 -07:00
Alyssa Rosenzweig	15f62b8e7c	panfrost: Linear depth/stencil should be aligned We might render to it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:46:43 -07:00
Alyssa Rosenzweig	d7ad29ce25	panfrost/midgard: Decode LOD/bias registers For constant LODs/biases, we can use an immediate embedded in the texture (already decoded); for non-constant, we have to use a register squeezed into the usual immediate field, which is decoded here. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	b4a3296e77	panfrost/midgard: Decode texture offset register swizzle Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	4e9e42cc56	panfrost/midgard/disasm: include textureGather() Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	6c18ae33bc	panfrost/midgard: Support negative immediate offsets It's not at all clear why this work for texelFetch but not texture. Maybe the top bits are dual-purpose on other texturing ops...? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	4d8157f12d	panfrost/midgard: Fix redunant mask redundancy Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	3dee556c4e	panfrost/midgard/disasm: Print LOD for texelFetch Its encoding differs slightly from the LOD used in normal texture calls. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	cda9f32909	panfrost/midgard: Identify the in_reg_full field This is clear for texelFetch, hence the confusion with Bifrost's filter field, but it's much more general in reality. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	445a7b523f	panfrost/midgard/disasm: Correctly dump bias/LOD Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	873a3ed342	panfrost/midgard/disasm: Cleanup texture op code Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	289405392d	panfrost/midgard/disasm: Add missing space Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	f4ee8d055c	panfrost/midgard/disasm: LOD immediate/register select Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	59fa7c95c8	panfrost/midgard/disasm: Use texture op name bare This allows us to show a call to textureLod in a reasonable way. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	109460f03a	panfrost/midgard/disasm: Varying perspective divides With an extra flag, we're able to do a perspective division "for free" while loading a varying. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	fc472007e7	panfrost/midgard: Add perspective division opcodes ...on the load/store unit, not the ALUs. Looks goofy but hey. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	b0396d6dda	panfrost/midgard: Print texture offsets This patch identifies the two modes of offsets in a texture instruction (immediate and register, disambiguated by the bit-once-known-as "has_offset") and implements disassembly for both. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	ed1c48e91d	panfrost/midgard: Expand texture to 4-channel swizzle This eliminates some unknowns, clarifies 3D textures, and will maybe help with array/shadow textures? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Juan A. Suarez Romero	b586ed51f3	docs: update calendar, add news item and link release notes for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-11 17:38:22 +02:00
Juan A. Suarez Romero	cc7fc7e319	docs: Add SHA256 sums for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `2a5b4e2b9f`)	2019-06-11 15:26:42 +00:00
Juan A. Suarez Romero	7e8e49475c	docs: Add release notes for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `1517811f4f`)	2019-06-11 15:26:38 +00:00
Samuel Iglesias Gonsálvez	32e1d85cb6	radv: assert on inline uniform blocks in radv_CmdPushDescriptorSetKHR() According to the Vulkan spec, inline uniform blocks are not allowed to be updated through vkCmdPushDescriptorSetKHR(). These are the spec quotes from "13.2.1. Descriptor Set Layout" that are relevant for this case: "VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR." "If flags contains VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all elements of pBindings must not have a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT". There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid this case but it is implied in the creation of the descriptor set layout as aforementioned. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-11 16:32:27 +02:00
Samuel Iglesias Gonsálvez	d0c52ff610	anv: ignore inline uniform blocks in anv_CmdPushDescriptorSetKHR() According to the Vulkan spec, inline uniform blocks are not allowed to be updated through vkCmdPushDescriptorSetKHR(). These are the spec quotes from "13.2.1. Descriptor Set Layout" that are relevant for this case: "VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR." "If flags contains VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all elements of pBindings must not have a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT". There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid this case but it is implied in the creation of the descriptor set layout as aforementioned. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-11 16:25:53 +02:00
Eric Engestrom	773ff93bc4	egl: compare the whole list of attributes `memcmp()` compares a given number of bytes, but `EGLAttrib` is larger than a byte. Fixes: `8e991ce539` "egl: handle the full attrib list in display::options" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-11 12:18:09 +00:00
Eduardo Lima Mitev	3fb7b1fd35	freedreno/a5xx: Fix indirect draw max_indices calculation The number of elements to draw should not be affected by the offset. A similar fix was submitted for a6xx at `79180a05`. Fixes these dEQP tests on a5xx: dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-11 08:28:45 +02:00
Samuel Pitoiset	40699f74b8	radv: remove extra assignment in radv_decompress_resolve_subpass_src() baseArrayLayer is defined twice, trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-11 08:17:22 +02:00
Samuel Pitoiset	c39a1611ab	radv: add radv_get_resolve_pipeline() helper in the graphics path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:42 +02:00
Samuel Pitoiset	b06d1f029d	radv: do not decompress all image layers before resolving inside a subpass When decompressing resolve source images, we should rely on the framebuffer layer count instead of resolving all images layers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:39 +02:00
Samuel Pitoiset	4efbd963ec	radv: initialize the aspect mask when decompressing resolve source images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:35 +02:00
Samuel Pitoiset	c31a07fa85	radv: perform proper layout transitions before resolving Use an explicit pipeline barrier for doing layout transitions instead of duplicating some code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:32 +02:00
Samuel Pitoiset	92fa6264cb	radv: do not resolve all image layers with compute inside a subpass When resolving inside a subpass, we should rely on the framebuffer layer count instead of resolving all images layers. This should improve performance of layered resolves a bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:28 +02:00
Kenneth Graunke	a8588f512b	iris: Bypass half-float pack/unpack lowering. This skips GLSL IR lowering of pack/unpackHalf operations, allowing the NIR optimizer to see them Improves performance in Synmark2's OglCSDof by about 2x, by cutting about 90% of the cycles from one of the compute shaders. shader-db statistics on Skylake: 4 compute shaders went from SIMD8 to SIMD16. total instructions in shared programs: 15598871 -> 15542568 (-0.36%) instructions in affected programs: 143016 -> 86713 (-39.37%) helped: 144 HURT: 0 helped stats (abs) min: 17 max: 4669 x̄: 390.99 x̃: 164 helped stats (rel) min: 7.48% max: 85.28% x̄: 30.17% x̃: 24.22% 95% mean confidence interval for instructions value: -510.50 -271.49 95% mean confidence interval for instructions %-change: -32.70% -27.65% Instructions are helped. total cycles in shared programs: 371973958 -> 368902103 (-0.83%) cycles in affected programs: 5557722 -> 2485867 (-55.27%) helped: 144 HURT: 0 helped stats (abs) min: 106 max: 1026600 x̄: 21332.33 x̃: 1697 helped stats (rel) min: 0.53% max: 88.98% x̄: 36.12% x̃: 34.67% 95% mean confidence interval for cycles value: -41570.02 -1094.64 95% mean confidence interval for cycles %-change: -38.44% -33.80% Cycles are helped. total spills in shared programs: 11936 -> 11903 (-0.28%) spills in affected programs: 110 -> 77 (-30.00%) helped: 3 HURT: 2 total fills in shared programs: 25644 -> 25178 (-1.82%) fills in affected programs: 677 -> 211 (-68.83%) helped: 5 HURT: 0 total loops in shared programs: 4830 -> 4829 (-0.02%) loops in affected programs: 1 -> 0 helped: 1 HURT: 0	2019-06-10 16:01:36 -07:00
Bas Nieuwenhuizen	e0d12f79c5	radv: Handle UNDEFINED format in image format list. Was watching a presentation on YT where this was used and it turns out it is not invalid. The only case it is actually valid as format in the creation of an image or image view is with Android Hardware Buffers which have their format specified externally. So we can just ignore all entries with VK_FORMAT_UNDEFINED. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-10 22:21:16 +00:00
Bas Nieuwenhuizen	39c71e0025	radv: Prevent out of bound shift on 32-bit builds. uintptr_t is 32-bits then and shifting it by 32 bits results in undefined behavior IIRC. Fixes: `b3c8de1c55` "radv: save all descriptor pointers into the trace BO" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-10 22:18:51 +00:00
Caio Marcelo de Oliveira Filho	2cb5907508	glsl: Check order and uniqueness of interlock functions With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:32 -07:00
Caio Marcelo de Oliveira Filho	b7c9fc72fd	glsl: Make interlock builtins follow same compiler rules as barriers Generalize the barrier code to provide correct error messages for other builtins. Fixes most of piglit compilation tests for ARB_fragment_shader_interlock. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:26 -07:00
Eduardo Lima Mitev	fb2169040a	nir/opt_algebraic: Fix rules for imadsh_mix16 The rules added in patch `3addd7c` are inverted: It should be: (al * bh) << 16 + c instead of: (ah * bl) << 16 + c Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-10 22:27:46 +02:00
Alyssa Rosenzweig	e9703fb416	panfrost: Ignore discards in dead branch analysis Fixes regressions in dEQP-GLES2.functional.shaders.discard.dynamic_loop_* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 08:23:08 -07:00
Samuel Pitoiset	e9316fdfd4	radv: fix setting CB_SHADER_MASK for dual source blending CB_SHADER_MASK was computed without the second color buffer format which looks totally wrong to me. While we are at it, copy a comment from RadeonSI. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-10 17:21:56 +02:00
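
Note on `773ff93bc4` (egl: compare the whole list of attributes): the message points out that `memcmp()` takes a byte count while `EGLAttrib` is wider than one byte. The sketch below illustrates that class of bug in isolation and is not the actual Mesa code. `attrib_t` is a stand-in for `EGLAttrib` (which the EGL headers define as `intptr_t`), and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for EGLAttrib, which the EGL headers typedef as intptr_t. */
typedef intptr_t attrib_t;

/* Buggy variant: passes the element count where memcmp() expects a
 * byte count, so only the first `count` bytes are compared. */
static bool attribs_equal_buggy(const attrib_t *a, const attrib_t *b,
                                size_t count)
{
    return memcmp(a, b, count) == 0;
}

/* Fixed variant: scales the element count by the element size so the
 * whole list is compared. */
static bool attribs_equal(const attrib_t *a, const attrib_t *b,
                          size_t count)
{
    return memcmp(a, b, count * sizeof(*a)) == 0;
}
```

With two four-element lists that differ only in the last element, the buggy variant reports them as equal because it stops after four bytes, while the fixed variant sees the difference.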

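Note on `39c71e0025` (radv: Prevent out of bound shift on 32-bit builds): in C, shifting a value by an amount greater than or equal to the width of its promoted type is undefined behavior, so `>> 32` on a 32-bit `uintptr_t` is not safe. A minimal sketch of one common way to avoid this, assuming the goal is to split an address into two 32-bit words; `split_address` is a hypothetical helper, not a radv function.

```c
#include <stdint.h>

/* Split an address into low and high 32-bit words.
 * Widening to uint64_t first keeps the shift by 32 well defined even
 * when uintptr_t itself is only 32 bits wide. */
static void split_address(uintptr_t addr, uint32_t *lo, uint32_t *hi)
{
    uint64_t va = (uint64_t)addr;

    *lo = (uint32_t)(va & 0xffffffffu);
    *hi = (uint32_t)(va >> 32);
}
```

On a 32-bit build the high word is simply zero; on a 64-bit build the two words recombine to the original address.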

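Note on `fb2169040a` (nir/opt_algebraic: Fix rules for imadsh_mix16): the message gives the intended form as (al * bh) << 16 + c. The sketch below models that expression in plain C and shows how two such steps supply the cross terms of a 32-bit multiply built from 16-bit halves. Reading "al" as the low 16 bits of a and "bh" as the high 16 bits of b is an assumption taken from the commit wording. All function names are hypothetical, and this is not the NIR rule itself. The parentheses are explicit because C's `+` binds tighter than `<<`.

```c
#include <stdint.h>

static uint32_t lo16(uint32_t x) { return x & 0xffffu; }
static uint32_t hi16(uint32_t x) { return x >> 16; }

/* Hypothetical model of the form quoted in the commit message:
 * ((al * bh) << 16) + c, wrapping modulo 2^32. */
static uint32_t mix16_sketch(uint32_t a, uint32_t b, uint32_t c)
{
    return ((lo16(a) * hi16(b)) << 16) + c;
}

/* 32-bit multiply from 16-bit pieces:
 * a * b = al*bl + ((al*bh + ah*bl) << 16)  (mod 2^32).
 * The ah*bh term is shifted out entirely and can be dropped. */
static uint32_t mul32_from_halves(uint32_t a, uint32_t b)
{
    uint32_t low = lo16(a) * lo16(b);
    uint32_t acc = mix16_sketch(a, b, low); /* adds (al * bh) << 16 */

    return mix16_sketch(b, a, acc);         /* adds (bl * ah) << 16 */
}
```

Swapping which operand contributes its high half changes the result in general. For example, a = 0x00010002 and b = 0x00030004 give 0x00060000 in one order and 0x00040000 in the other (with c = 0), so inverted rules of this kind would produce wrong products.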
115447 commits