fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-25 02:10:11 +01:00

Author	SHA1	Message	Date
Jason Ekstrand	90108deb27	anv: Update to use the new features struct names These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times. Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-15 13:25:43 +00:00
Lionel Landwerlin	9e7b0988d6	anv: leave the top 4Gb of the high heap VMA unused In `628c9ca908` I forgot to apply the same -4Gb of the high address of the high heap VMA. This was previously computed in the HIGH_HEAP_MAX_ADDRESS. Many thanks to James for pointing this out. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Xiong, James <james.xiong@intel.com> Fixes: `628c9ca908` ("anv: store heap address bounds when initializing physical device") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-13 12:08:23 +00:00
Lionel Landwerlin	628c9ca908	anv: store heap address bounds when initializing physical device We can then reuse those bounds to initialize the VMA heaps at logical device creation. This fixes an issue on EHL which has only 36 bits of VMA. We were incorrectly using the fixed 48-bit upper bound to initialize the logical device heap, resulting in addresses beyond the device's limits. v2: Don't confuse heap size (limited by system memory) and VMA size (limited by number of addressing bits the platform has) v3: Fix low heap vma_size :( (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: James Xiong <james.xiong@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-04-11 22:56:43 +01:00
Juan A. Suarez Romero	ec7a33af58	anv: advertise 8 subtexel/mipmap precision bits So far ANV was advertising 4 bits for both subTexelPrecisionBits and mipmapPrecisionBits, but these values were not actually verified. It seems the right value is actually 8 bits for both cases. Unfortunately the Intel PRM does not clarify how many bits the hardware uses. For the mipmap case, there is the following reference in PRM Volume 6 (3D Media GPGPU), specifically in LOD Computation Pseudocode: ``` Bias: S4.8 MinLod: U4.8 MaxLod: U4.8 Base: U4.1 MIPCnt: U4 SurfMinLod: U4.8 ResMinLod: U4.8 ``` We have other clues, though: - On one side, dEQP-VK.texture.explicit_lod.* tests fail when using 4 bits, but work when using 8 bits. These tests try to mimic the expected behaviour as closely as possible, and they use the reported subTexelPrecisionBits and mipmapPrecisionBits to do so. - On the other side, the equivalent driver for Windows is reporting 8 bits for both elements. Not sure if they got to verify it from the PRM or from a different source. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-04-09 15:28:42 +00:00
Caio Marcelo de Oliveira Filho	45a4129392	anv: Implement VK_NV_compute_shader_derivatives Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-08 19:29:33 -07:00
Jason Ekstrand	ce47999cee	Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir" This reverts commit `4e1bbb000c`. It turns out that some DXVK apps due to some implementation detail of DXVK or other create and destroy instances in an interleaved way. Freeing the glsl_type memory without being a bit more careful causes use-after-free issues. Looks like we need to try again.	2019-03-27 11:24:58 -05:00
Gurchetan Singh	620df57dbb	anv: fix build on Nougat AHardwareBuffer is only available on O and above. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-03-21 15:36:39 -07:00
Tapani Pälli	4e1bbb000c	anv/radv: release memory allocated by glsl types during spirv_to_nir Fixes leaks for each glsl_type generated: ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18 ==32470== at 0x483880B: malloc (vg_replace_malloc.c:309) ==32470== by 0x4C43F4A: ralloc_size (ralloc.c:119) ==32470== by 0x4C44014: rzalloc_size (ralloc.c:151) ==32470== by 0x4C44258: rzalloc_array_size (ralloc.c:215) ==32470== by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:114) ==32470== by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:1146) ==32470== by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501) ==32470== by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269) ==32470== by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018) ==32470== by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365) ==32470== by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490) ==32470== by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173) v2: move release call to vkDestroyInstance v3: apply fix also to radv driver Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-21 08:30:22 +02:00
Jason Ekstrand	9a129510f5	anv: Bump maxComputeWorkgroupInvocations We initially set this lower because we didn't have SIMD32 support yet but we've supported SIMD32 for quite some time now. We should bump it up to the real limit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-20 09:26:56 -05:00
Jason Ekstrand	887041c763	anv: Implement VK_EXT_host_query_reset Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-18 14:48:41 +00:00
Jason Ekstrand	13099d4490	anv: Stop using VK_TRUE/FALSE We've been fairly inconsistent about this so we should really choose whether we're going to use VK_TRUE/FALSE or the C boolean values. The Vulkan #defines are set to 1 and 0 respectively so it's the same value as C gives you when you cast a boolean expression to an integer. Since there are several places where we set a VkBool32 to a C logical expression, let's just embrace C booleans and stop using the VK defines. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-13 17:58:27 -05:00
Tapani Pälli	bef354321b	anv: revert "anv: release memory allocated by glsl types during spirv_to_nir" This reverts commit `47fc359822`. The reason is that the patch did not take into account the situation where we might have both OpenGL and Vulkan using glsl_types at the same time. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-12 14:12:36 +02:00
Tapani Pälli	47fc359822	anv: release memory allocated by glsl types during spirv_to_nir Fixes leaks for each glsl_type generated: ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18 ==32470== at 0x483880B: malloc (vg_replace_malloc.c:309) ==32470== by 0x4C43F4A: ralloc_size (ralloc.c:119) ==32470== by 0x4C44014: rzalloc_size (ralloc.c:151) ==32470== by 0x4C44258: rzalloc_array_size (ralloc.c:215) ==32470== by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:114) ==32470== by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const, unsigned int, char const) (glsl_types.cpp:1146) ==32470== by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501) ==32470== by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269) ==32470== by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018) ==32470== by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365) ==32470== by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490) ==32470== by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173) v2: move release call to vkDestroyInstance Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-11 13:14:45 +02:00
Timothy Arceri	051b4064da	anv: add support for dumping shader info via VK_EXT_debug_report This information will be used by the vkpipeline-db tool. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 16:16:04 +11:00
Kenneth Graunke	4787bc944a	isl: Add a swizzle parameter to isl_buffer_fill_state() This is necessary for legacy texture buffer object formats, where we'll need to use a swizzle to fake e.g. luminance. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Lionel Landwerlin	acb50d6b1f	intel/decoders: handle decoding MI_BBS from ring An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Caio Marcelo de Oliveira Filho	69cc6272fb	anv: Implement VK_EXT_external_memory_host v2: Ignore the import if handleType == 0. (Jason) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-05 12:59:50 -08:00
Jason Ekstrand	43f40dc7cb	anv: Implement VK_EXT_inline_uniform_block Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:06:50 -06:00
Tapani Pälli	3bb8768b9d	anv: toggle on support for VK_EXT_ycbcr_image_arrays We already propagate coord_components correctly and did not have layer restrictions for ycbcr formats. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-05 10:39:17 +00:00
Lionel Landwerlin	32ffd90002	anv: add support for INTEL_DEBUG=bat As requested by Ken ;) v2: Also decode simple batches (Caio) Fix u_vector usage issues (Lionel) v3: Make binding/instruction/state/surface available (Lionel) v4: Going through device pools for simple batches (Lionel) Centralize search BO callbacks into anv_device.c (Lionel) v5: Clear decoded batch buffer var after use (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-02 12:53:21 +00:00
Juan A. Suarez Romero	4f917e6a61	anv: advertise 8 subpixel precision bits On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is used to select between 8 bit subpixel precision (value 0) or 4 bit subpixel precision (value 1). As this value is not set, means it is taking the value 0, so 8 bit are used. On the other side, in the Vulkan CTS tests, if the reference rasterizer, which uses 8 bit precision, as it is used to check what should be the expected value for the tests, is changed to use 4 bit as ANV was advertising so far, some of the tests will fail. So it seems ANV is actually using 8 bits. v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason) v3: use _8Bit definition as value (Jason) v4: (by Jason) anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect This field was added on gen8 even though there's an identically defined one in 3DSTATE_SF. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Kenneth Graunke <kenneth@whitecape.org> CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-22 17:53:55 +01:00
Lionel Landwerlin	f509213675	anv: implement VK_EXT_depth_clip_enable A new extension allowing the user to explicitly specify the clipping behavior. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-20 09:57:58 +00:00
Eric Engestrom	f1374805a8	drm-uapi: use local files, not system libdrm There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-14 11:20:00 +00:00
Caio Marcelo de Oliveira Filho	5299c9cbcc	anv: skip bit6 swizzle detection in Gen8+ It is always false on Gen8+. Also, move the variable definition near its use. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-04 20:44:41 -08:00
Jason Ekstrand	48ed2a7bb0	anv: Implement VK_EXT_buffer_device_address Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-01 17:09:42 -06:00
Eric Engestrom	9af77fcf98	anv: drop always-successful VkResult Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-25 09:45:27 +00:00
Jason Ekstrand	ac0f8a6ea0	anv: Implement transform feedback queries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:57 -06:00
Jason Ekstrand	2be89cbd82	anv: Implement vkCmdDrawIndirectByteCountEXT Annoyingly, this requires that we implement integer division on the command streamer. Fortunately, we're only ever dividing by constants so we can use the mulh+add+shift trick and it's not as bad as it sounds. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	36ee2fd61c	anv: Implement the basic form of VK_EXT_transform_feedback Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Danylo Piliaiev	1952fd8d2c	anv: Implement VK_EXT_conditional_rendering for gen 7.5+ Conditional rendering affects next functions: - vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect - vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR - vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase - vkCmdClearAttachments Value from conditional buffer is cached into designated register, MI_PREDICATE is emitted every time conditional rendering is enabled and command requires it. v2: by Jason Ekstrand - Use vk_find_struct_const instead of manually looping - Move draw count loading to prepare function - Zero the top 32-bits of MI_ALU_REG15 v3: Apply pipeline flush before accessing conditional buffer (The issue was found by Samuel Iglesias) v4: - Remove support of Haswell due to possible hardware bug - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to define registers in one place. v5: thanks to Jason Ekstrand and Lionel Landwerlin - Workaround the fact that MI_PREDICATE_RESULT is not accessible on Haswell by manually calculating MI_PREDICATE_RESULT and re-emitting MI_PREDICATE when necessary. v6: suggested by Lionel Landwerlin - Instead of calculating the result of predicate once - re-emit MI_PREDICATE to make it easier to investigate error states. v7: suggested by Jason - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set. v8: suggested by Lionel - Precompute conditional predicate's result to support secondary command buffers. - Make prepare_for_draw_count_predicate more readable. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-18 18:31:44 +00:00
Rafael Antognolli	643248b66a	anv: Remove state flush. We have all the state buffers snooped, so we don't need to clflush everything anymore. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:22 -08:00
Iago Toral Quiroga	f92c5bc8f3	anv/device: fix maximum number of images supported We had defined MAX_IMAGES as 8, which we used to size the array for image push constant data. The comment there stated that this was for gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes that array, asserting that we don't exceed that number of images, which imposes a limit of MAX_IMAGES on all gens. Furthermore, despite this, we are exposing up to 64 images per shader stage on all gens, gen8 included. This patch lowers the number of images we expose in gen8 to 8 and keeps 64 images for gen9+ while making sure that only pre-SKL gens use push constant space to handle images. v2: - <= instead of < in the assert (Eric, Lionel) - Change the way the assertion is written (Eric) v3: - Revert the way the assertion is written to the form it had in v1, the version in v2 was not equivalent and was incorrect. (Lionel) v4: - gen9+ doesn't need push constants for images at all (Jason) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3)	2019-01-17 07:59:00 +01:00
Jason Ekstrand	5e4f9ea363	anv: Implement VK_KHR_depth_stencil_resolve	2019-01-14 10:16:52 -06:00
Eric Engestrom	4f5a526789	anv: drop unneeded KHR suffix Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:47:56 +00:00
Jason Ekstrand	754eff07d2	anv: Sort properties and features switch statements Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-07 18:41:15 -06:00
Tapani Pälli	f1654fa7e3	anv/android: support creating images from external format Since we don't know the exact format at creation time, some initialization is done only when bound with memory in vkBindImageMemory. v2: demand dedicated allocation in vkGetImageMemoryRequirements2 if image has external format v3: refactor prepare_ahw_image, support vkBindImageMemory2, calculate stride correctly for rgb(x) surfaces, rename as 'resolve_ahw_image' v4: rebase to `b43f955037` changes v5: add some assertions to verify input correctness (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	c79a528d2b	anv/android: support import/export of AHardwareBuffer objects v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB) v3: properly handle usage bits when creating from image v4: refactor, code cleanup (Jason) v5: rebase to `b43f955037` changes, initialize bo flags as ANV_BO_EXTERNAL (Lionel) v6: add assert that anv_bo_cache_import succeeds, add comment about multi-bo support to clarify current implementation (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	5c65c60d6c	anv: refactor, remove else block in AllocateMemory This makes it cleaner to introduce more cases where we import memory from different types of external memory buffers. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Jason Ekstrand	a10a450db2	anv: Advertise support for MinLod on Skylake+ These are usually used for dealing with sparse resources but there's no reason why we can't hook them up before we have sparse. We have the hardware; let's light it up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Eric Engestrom	56d126f8fd	anv: correctly use vulkan 1.0 by default Per chapter 3.2 "Instances": > Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing > an apiVersion of 0 is equivalent to providing an apiVersion of > VK_MAKE_VERSION(1,0,0). Reported-by: Niklas Haas <git@haasn.xyz> Fixes: `8c048af589` "anv: Copy the appliation info into the instance" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-26 22:05:02 +00:00
Jason Ekstrand	a845c2bc10	anv: Expose VK_EXT_scalar_block_layout Our compiler already splits UBO loads into scalars and the untyped surface read messages we use for SSBO reads and writes only require dword alignment. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-22 08:16:47 -06:00
Jason Ekstrand	07eb8e7466	anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost This lets us get rid of a bunch of duplicated error messages. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 13:27:21 -05:00
Jason Ekstrand	292ebdbf98	anv: Handle the device loss abort in anv_device_set_lost Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 08:40:23 -05:00
Jason Ekstrand	cd0960b430	anv: Add helpers for setting/checking device lost Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 08:40:21 -05:00
Jason Ekstrand	319ff6f1ad	anv: Provide a error message with a DEVICE_LOST Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 08:40:10 -05:00
Eric Engestrom	ed5d65a6a1	anv: use snprintf() instead of memset()+strcpy() snprintf() guarantees that it will not write more chars than allowed, and that the string will be null-terminated, without the need to fill the whole thing with zeroes to begin with. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-24 18:15:56 +01:00
Jason Ekstrand	0d380af809	anv: Define trampolines as the weak functions Instead of having weak references to the anv functions and separate trampoline functions with their own dispatch table, just make the trampoline functions weak. This gets rid of a dispatch table and potentially lets the compiler delete the unused weak function. The end result is a reduction in the .text section of 5.7K and a reduction in the .data section of 1.4K. Before: text data bss dec hex filename 3190329 282232 8960 3481521 351fb1 _install/lib64/libvulkan_intel.so After: text data bss dec hex filename 3184548 280792 8960 3474300 35037c _install/lib64/libvulkan_intel.so Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-19 11:52:00 -05:00
Keith Packard	67a2c1493c	vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5] Offers three clocks, device, clock monotonic and clock monotonic raw. Could use some kernel support to reduce the deviation between clock values. v2: Ensure deviation is at least as big as the GPU time interval. v3: Set device->lost when returning DEVICE_LOST. Use MAX2 and DIV_ROUND_UP instead of open coding these. Delete spurious TIMESTAMP in radv version. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> v4: Add anv_gem_reg_read to anv_gem_stubs.c Suggested-by: Jason Ekstrand <jason@jlekstrand.net> v5: Adjust maxDeviation computation to max(sampled_clock_period) + sample_interval. Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-17 20:10:15 -07:00
Lionel Landwerlin	322a919a41	anv: Implement VK_EXT_pci_bus_info Even though the Intel GPU are always at the same PCI location, all the info we need is already provided by libdrm. Let's be future proof. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-16 12:47:55 +01:00
Jason Ekstrand	ae18c53ba6	anv: Split dispatch tables into device and instance There's no reason why we need to generate trampoline functions for instance functions or carry N copies of the instance dispatch table around for every hardware generation. Splitting the tables and being more conservative shaves about 34K off .text and about 4K off .data when built with clang. Before splitting dispatch tables: text data bss dec hex filename 3224305 286216 8960 3519481 35b3f9 _install/lib64/libvulkan_intel.so After splitting dispatch tables: text data bss dec hex filename 3190325 282232 8960 3481517 351fad _install/lib64/libvulkan_intel.so Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-15 13:30:24 -05:00
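
Commit 2be89cbd82 in the list above mentions dividing by constants with "the mulh+add+shift trick". The sketch below illustrates that general idea on the CPU, using the textbook Granlund-Montgomery scheme for unsigned 32-bit division. It is not taken from the anv sources: the names `udiv_magic`, `udiv_magic_for` and `udiv_by_magic` are hypothetical, and the command-streamer implementation may well choose its constants differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: precomputed constants for dividing by a fixed d. */
struct udiv_magic {
   uint32_t multiplier; /* low 32 bits of the scaled reciprocal */
   uint32_t shift1;     /* 0 only when d == 1, otherwise 1 */
   uint32_t shift2;     /* ceil(log2(d)) - 1, clamped at 0 */
};

/* Compute the constants once, when the divisor is known. */
static struct udiv_magic udiv_magic_for(uint32_t d)
{
   assert(d != 0);

   /* l = ceil(log2(d)) */
   uint32_t l = 0;
   while (((uint64_t)1 << l) < d)
      l++;

   struct udiv_magic m;
   /* floor(2^32 * (2^l - d) / d) + 1 always fits in 32 bits. */
   m.multiplier =
      (uint32_t)((((uint64_t)1 << 32) * (((uint64_t)1 << l) - d)) / d + 1);
   m.shift1 = l < 1 ? l : 1;
   m.shift2 = l > 1 ? l - 1 : 0;
   return m;
}

/* n / d using only a high multiply, an add (plus a subtract) and shifts. */
static uint32_t udiv_by_magic(uint32_t n, struct udiv_magic m)
{
   /* mulh: upper 32 bits of the 64-bit product */
   uint32_t t = (uint32_t)(((uint64_t)n * m.multiplier) >> 32);
   /* (n - t) cannot underflow because t <= n; the sum cannot overflow. */
   return (t + ((n - t) >> m.shift1)) >> m.shift2;
}
```

The constants depend only on the divisor, so they can be computed ahead of time and the per-value work reduces to one multiply, one add and two shifts.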

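Commit 56d126f8fd quotes the rule that a NULL `pApplicationInfo` or an `apiVersion` of 0 is equivalent to `VK_MAKE_VERSION(1,0,0)`. A minimal sketch of that defaulting logic follows. To stay self-contained it does not include the Vulkan headers: `MAKE_VERSION` mirrors the bit layout of `VK_MAKE_VERSION`, while `struct app_info` and `effective_api_version` are hypothetical stand-ins, not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same packing as VK_MAKE_VERSION: major in bits 22+, minor in bits 12-21. */
#define MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical stand-in for the one VkApplicationInfo field used here. */
struct app_info {
   uint32_t api_version;
};

/* Return the API version the instance should assume for this application. */
static uint32_t effective_api_version(const struct app_info *info)
{
   /* Both "no application info" and "apiVersion == 0" mean Vulkan 1.0. */
   if (info == NULL || info->api_version == 0)
      return MAKE_VERSION(1, 0, 0);

   return info->api_version;
}
```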

376 commits
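
Commit ed5d65a6a1 replaces a memset()+strcpy() pair with snprintf(), which bounds the write and always null-terminates a non-empty buffer. The sketch below shows the pattern in isolation; `fill_name` and its parameters are hypothetical names chosen for illustration and do not correspond to a function in the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, never writing more than dst_size bytes.
 * Returns 1 if the whole string fit, 0 if it was truncated.
 */
static int fill_name(char *dst, size_t dst_size, const char *src)
{
   assert(dst != NULL && src != NULL && dst_size > 0);

   /* snprintf writes at most dst_size - 1 characters plus the terminator. */
   int n = snprintf(dst, dst_size, "%s", src);

   /* The return value is the length the full string would have needed. */
   return n >= 0 && (size_t)n < dst_size;
}
```

One difference from the memset() approach is worth keeping in mind: bytes after the terminator are left untouched, so code that compares or copies the whole fixed-size field rather than the string would still need the buffer zeroed first.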