fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 13:38:19 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	f793c57cc5	intel/isl: Tighten up restrictions for CCS on gen7 It may technically be possible to enable some sort of fast-clear support for at least the base slice of a 2D array texture on gen7. However, it's not documented to work, we've never tried to do it in GL, and we have no idea what the hardware does if you turn on CCS_D with arrayed rendering. Let's just play it safe and disallow it for now. If someone really cares that much about gen7 performance, they can come along and try to get it working later.	2017-07-22 20:12:07 -07:00
Jason Ekstrand	20533e0da7	anv/blorp: Assert isl_surf_init success in do_buffer_copy Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 08:21:27 -07:00
Jason Ekstrand	cf39fb06e3	anv/blorp: Explicitly set row_pitch in do_buffer_copy We have a very specific row pitch that we want and we don't want ISL to be changing it on us so just be explicit about it. Fixes: `a40f043034` Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 08:20:07 -07:00
Kenneth Graunke	30d6bc470a	i965: Set lower_vote_trivial in vector_nir_options_gen6 too. There's a second struct for Gen6+. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-21 18:09:01 -07:00
Topi Pohjolainen	fbfc6a2f67	intel/isl/gen7: Don't allow multisampled surfaces with valign2 There is the same constraint later on as an assert in isl_gen7_choose_image_alignment_el() so catch it earlier in order to return an error instead of crashing. Needed to avoid crashes with piglits on IVB and HSW: arb_internalformat_query2.image_format_compatibility_type pname checks arb_internalformat_query2.all internalformat_<x>_type pname checks arb_internalformat_query2.max dimensions related pname checks arb_copy_image.arb_copy_image-formats --samples=2/4/6/8 arb_texture_float.multisample-fast-clear gl_arb_texture_float Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	df9bb8dc05	intel/isl/gen7: Allow msaa with signed integer formats These formats are already allowed by the i965 GL driver, and the feature seems to work just fine. There are tests for multisampled rendering in piglit: tests/spec/ext_framebuffer_multisample which can be patched to try 16I/32I in addition to GL_RGBA8I. IvyBridge passed all tests with all sample numbers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	abb84e3f2d	intel/isl/gen7: Allow msaa with 128-bit formats These formats are already allowed by the i965 GL driver, and the feature seems to work just fine. There are tests for multisampled rendering in piglit: tests/spec/ext_framebuffer_multisample which can be patched to try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I. IvyBridge passed all tests with all sample numbers and even with 128-bit formats. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	514d68576d	intel/isl: Allow 1D surfaces with compressed formats Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	a40f043034	intel/isl: Align non-tiled horizontally by cache line in order to support blit engine. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Matt Turner	069bf7c907	i965/fs: Match destination type to size for ballot No use in taking a 64-bit value when we know the high 32-bits are zero.	2017-07-20 16:56:50 -07:00
Matt Turner	1038d385a9	nir: Reduce destination size of ballot intrinsic when possible Some hardware, like i965, doesn't support group sizes greater than 32. In that case, we can reduce the destination size of the ballot intrinsic, which will simplify our code generation. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Matt Turner	782ef30451	i965/fs: Implement ARB_shader_ballot operations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Matt Turner	8238930510	i965/fs: Do not move MOVs writing the flag outside of control flow The implementation of ballotARB() will start by zeroing the flags register. So, doing something like if (gl_SubGroupInvocationARB % 2u == 0u) { ... = ballotARB(true); [...] } else { ... = ballotARB(true); [...] } (like fs-ballot-if-else.shader_test does) would generate identical MOVs to the same destination (the flag register!), and we definitely do not want to pull that out of the control flow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Francisco Jerez	f1b7c47913	i965/fs: Handle explicit flag sources in flags_read() The implementations of the ARB_shader_ballot intrinsics will explicitly read the flag as a source register. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-20 16:56:49 -07:00
Matt Turner	43ef75b394	nir: Add system values from ARB_shader_ballot We already had a channel_num system value, which I'm renaming to subgroup_invocation to match the rest of the new system values. Note that while ballotARB(true) will return zeros in the high 32-bits on systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB variables do not consider whether channels are enabled. See issue (1) of ARB_shader_ballot. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Matt Turner	ee9fa4ac18	i965/fs: Implement ARB_shader_group_vote operations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Francisco Jerez	93dc736f4e	i965/fs: Handle explicit flag destinations in flags_written() The implementations of the ARB_shader_group_vote intrinsics will explicitly write the flag as the destination register. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-20 16:56:49 -07:00
Matt Turner	30b72f4126	i965/vec4: Lower ARB_shader_group_vote intrinsics I don't expect anyone is going to care about using this in vec4 programs (vertex/tessellation/geometry on Gen6/7), no one has come up with a good way to implement it much less test it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Matt Turner	d4c9d6a3b2	nir: Add pass to optimize intrinsics Specifically, constant fold intrinsics from ARB_shader_group_vote, but I suspect it'll be useful for other things in the future. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
Topi Pohjolainen	c4ac0d4949	intel/isl/gen4: Represent cube maps with 3D layout v2 (Jason): Check for !ISL_SURF_DIM_3D instead of CUBE_BIT. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-20 11:32:21 +03:00
Topi Pohjolainen	171b72542c	intel/isl: Add i915 to isl_tiling converter v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/ Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-20 11:32:21 +03:00
Chad Versace	5d69052113	anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT. We looked for the bit in VkImageCreateInfo::usage, but it's actually in VkImageCreateInfo::flags. Found by assertion failures while enabling VK_ANDROID_native_buffer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-19 11:25:50 -07:00
Topi Pohjolainen	0926fb69a4	intel/blorp/gen4: Drop cube map flag for single face copy This will falsely trigger an assert on number of layers once isl is used for 3D layouts of Gen4 cube maps. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-18 21:36:13 +03:00
Topi Pohjolainen	4733891e51	intel/isl: Take 3D surfaces into account in image params Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-18 21:35:44 +03:00
Jason Ekstrand	cd9fd68a50	anv: Advertise support for VK_KHR_variable_pointers We don't support the general version yet because that requires us to lower shared variables up-front in SPIR-V -> NIR. This shouldn't be a whole lot of work but it's not something we support today. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-07-18 09:43:13 -07:00
Jason Ekstrand	bc9319583a	anv: Advertise support for VK_KHR_storage_buffer_storage_class Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-07-18 09:43:13 -07:00
Jason Ekstrand	828c437078	intel/isl: Add a row_pitch parameter to surf_get_ccs_surf Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-07-17 13:48:38 -07:00
Jason Ekstrand	c5700ed72e	anv/image: Add INPUT_ATTACHMENT to the list of required usages From the Vulkan 1.0.53 spec VU for vkCreateImageView: "image must have been created with a usage value containing at least one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT, VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT, VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT" We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-07-17 08:18:46 -07:00
Jason Ekstrand	cbdfd1daa2	anv: Stop leaking the no_aux sampler surface state Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-07-17 08:18:46 -07:00
Jason Ekstrand	bd41564746	anv/cmd_buffer: Properly handle render passes with 0 attachments We were early returning and never created the NULL surface state. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: James Legg <jlegg@feralinteractive.com> Cc: mesa-stable@lists.freedesktop.org	2017-07-17 08:18:46 -07:00
Emil Velikov	43c188f970	anv: advertise v6 of the wayland surface extension Jason updated the Khronos spec to explicitly state that Wayland surfaces must support VK_PRESENT_MODE_MAILBOX_KHR. ANV did so since day one (back in 2015) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-17 15:24:32 +01:00
Lionel Landwerlin	59adde0eab	anv: ensure device name contains terminating character v2: Use sizeof() (Chris) CID: 1415113 Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-07-17 14:36:38 +01:00
Jason Ekstrand	0ee8d81718	anv: Implement VK_KHR_external_memory_* Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-15 08:59:38 -07:00
Jason Ekstrand	c02da9cad6	anv: Implement VK_KHR_dedicated_allocation We always recommend sub-allocation and don't do anything special for dedicated allocations. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-15 08:59:38 -07:00
Jason Ekstrand	8c82aa5f43	anv: Implement VK_KHR_get_memory_requirements2 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-15 08:59:38 -07:00
Jason Ekstrand	5b57bdc1cf	anv: Advertise version 1.0.54 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-15 08:59:38 -07:00
Jason Ekstrand	227debdc92	vulkan: Update to the new 1.0.54 spec XML and headers There is one small ANV change here because we used the VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had to be updated to have the _KHR suffix. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-15 08:59:38 -07:00
Jason Ekstrand	dc179aa123	anv: Drop support for VK_KHX_external_semaphore_* These have been formally deprecated by Khronos never to be shipped again. The KHR versions should be implemented/used instead. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-15 08:58:51 -07:00
Jason Ekstrand	4ac94d0dee	anv: Drop support for VK_KHX_external_memory_* These have been formally deprecated by Khronos never to be shipped again. The KHR versions should be implemented/used instead. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-14 22:12:39 -07:00
Juan A. Suarez Romero	5cd4ece34e	anv/pipeline: do not use BITFIELD64_BIT() In the previous commit, forgot to apply v2 suggestions. Fixes: `28d0c38` (anv/pipeline: use unsigned long long constant to check enable vertex inputs) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-07-14 10:33:19 +00:00
Juan A. Suarez Romero	28d0c38d85	anv/pipeline: use unsigned long long constant to check enable vertex inputs When initializing the ANV pipeline, one of the tasks is checking which vertex inputs are enabled. This is done by checking if the enabled bits in inputs_read. But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 + desc->location))`. The problem here is that if location is 15 or greater, the sum is 32 or greater. But C is handling 1 as a 32-bit integer, which means the displaced bit is out of range and thus the full value is 0. Thus, use 1ull, which is an unsigned long long value. This fixes: dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-07-14 08:09:18 +00:00
Kenneth Graunke	b2da123801	i965: Use pushed UBO data in the scalar backend. This actually takes advantage of the newly pushed UBO data, avoiding pull loads. Improves performance in GLBenchmark Manhattan 3.1 by: HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%. (thanks to Eero Tamminen for these numbers) shader-db results on Skylake, ignoring programs with spill/fill changes: total instructions in shared programs: 13963994 -> 13651893 (-2.24%) instructions in affected programs: 4250328 -> 3938227 (-7.34%) helped: 28527 HURT: 0 total cycles in shared programs: 179808608 -> 172535170 (-4.05%) cycles in affected programs: 79720410 -> 72446972 (-9.12%) helped: 26951 HURT: 1248 LOST: 46 GAINED: 21 Many "Deus Ex: Mankind Divided" shaders which already spilled end up spilling a lot more (about 240 programs hurt, 9 helped). The cycle estimator suggests this is still overall a win (-0.23% in cycle counts) presumably because we trade pull loads for fills. v2: Drop "PULL" environment variable left in for initial debugging (caught by Matt). Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-13 20:18:54 -07:00
Kenneth Graunke	c9ef27e77b	i965: Factor out push locations. With UBOs, the answer of "have we decided to push this uniform" gets a bit more complicated - for one, we have multiple surfaces. This patch refactors things so we can add the new code in a single place. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-13 20:18:54 -07:00
Kenneth Graunke	4f586cd8f1	i965: Push UBO data, but don't use it just yet. This patch starts uploading UBO data via 3DSTATE_CONSTANT_* packets, and updates the compiler to know that there's extra payload data, so things continue working. However, it still issues pull loads for all data. I wanted to separate the two aspects for greater bisectability. v2: Update for new intel_bufferobj_buffer parameter. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-13 20:18:30 -07:00
Kenneth Graunke	6d28c6e52c	i965: Select ranges of UBO data to be uploaded as push constants. This adds a NIR pass that decides which portions of UBOs we should upload as push constants, rather than pull constants. v2: Switch to uint16_t for the UBO block number, because we may have a lot of them in Vulkan (suggested by Jason). Add more comments about bitfield trickery (requested by Matt). v3: Skip vec4 stages for now...I haven't finished wiring up support in the vec4 backend, and so pushing the data but not using it will just be wasteful. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-13 19:56:49 -07:00
Kenneth Graunke	8ec5a4e4a4	i965: Switch to absolute addressing for constant buffer 0. By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic state base address. This makes it unusable for pushing UBOs. I'd like to be able to use all four push buffers. There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake) which controls whether buffer 0 is relative to dynamic state base address, or simply a normal pointer. Setting that gives us full flexibility. We can't currently write this on Haswell and earlier, and will need to update the kernel command parser, and then do the whole version checking song and dance. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-13 19:56:49 -07:00
Lionel Landwerlin	6131a1ae40	aubinator: don't leak fd of opened aubfile CID: 1373563 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-07-13 22:50:50 +01:00
Lionel Landwerlin	d1bd731e30	anv: don't use strcpy for copying strings CID: 1358935 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-07-13 22:50:47 +01:00
Lionel Landwerlin	226fae7849	intel/compiler: no need to check unsigned is >= 0 CID: 1338342 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-07-13 22:50:45 +01:00
Lionel Landwerlin	95c917668c	intel/compiler: don't check unsigned is >= 0 CID: 1224468 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-07-13 22:50:38 +01:00
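
The shift problem described in 28d0c38d85 (and its follow-up 5cd4ece34e) can be illustrated with a minimal sketch. This is not the actual anv code: the function name `input_is_read` and the macro `EXAMPLE_VERT_ATTRIB_GENERIC0` are hypothetical, and the value 17 is only inferred from the commit message's arithmetic (location 15 giving bit 32). The point is that the constant being shifted must be 64 bits wide (`1ull`) when the bit index can reach 32 or more; shifting a plain 32-bit `1` that far does not produce the intended mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Mesa's VERT_ATTRIB_GENERIC0. The value 17 is an
 * assumption inferred from the commit message (17 + 15 == 32). */
#define EXAMPLE_VERT_ATTRIB_GENERIC0 17u

/* Returns true if the 64-bit inputs_read bitfield has the bit for the given
 * generic vertex attribute location set.
 *
 * The constant is written as 1ull so the shift is carried out in 64 bits.
 * With a plain `1` (a 32-bit int), a shift count of 32 or more is out of
 * range and the mask would not select the intended bit. */
static bool input_is_read(uint64_t inputs_read, unsigned location)
{
    unsigned bit = EXAMPLE_VERT_ATTRIB_GENERIC0 + location;

    /* Guard against shift counts that do not fit even in 64 bits. */
    assert(bit < 64);

    return (inputs_read & (1ull << bit)) != 0;
}
```

Under these assumptions, `input_is_read(1ull << 32, 15)` is true, which is the case that a 32-bit mask could not detect.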

1975 commits