fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-26 12:28:12 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	c4be72934e	anv: Fix a relocation race condition Previously, we would read the offset from the BO in anv_reloc_list_add to generate the presumed offset and then again in the caller to compute the 64-bit address to write into the buffer. However, if the offset somehow changed between these two points, the presumed offset would no longer match the written offset. This is unlikely to actually ever be a problem in practice because the presumed offset gets recorded first and so if the written address is wrong then the presumed offset is almost certainly wrong and the relocation will trigger. However, it's much safer to simply have anv_reloc_list_add return the 64-bit address. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	bbf389013f	anv: Use a util_sparse_array for the GEM handle -> BO map This lets us do less allocation because the anv_bo's are now embedded in the sparse array and it also allows lock-free translation from GEM handle to BO which will be useful in future commits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	821ce7be36	anv: Move refcount to anv_bo Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Bas Nieuwenhuizen	3e86d553a4	anv: Remove _mesa_locale_init/fini calls. The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created. CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-31 09:47:56 +00:00
Rafael Antognolli	3c317e8187	anv: Add Tile Cache Flush for Unified Cache.	2019-10-30 19:51:03 +00:00
Rafael Antognolli	e51722a7c7	anv: Align fast clear color state buffer to a page. On gen11 and older, compressed images are tiled and aligned to 4K. On gen12 this 4K alignment restriction was removed. However, aligning the fast clear color buffer to only 64B (a cacheline, as stated in the documentation) is causing some bugs where the fast clear color is not converted during the fast clear operation. Aligning things to 4K seems to fix it. v2: Assert that image->planes[plane].offset is 4K aligned (Nanley) Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-30 19:41:29 +00:00
Jason Ekstrand	beca63c6c0	anv: Avoid emitting UBO surface states that won't be used This shaves around 4-5% off of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Jason Ekstrand	52aa7f3e05	anv: Reduce the minimum number of relocations The original value of 256 was under the assumption that you're a batch buffer which is likely going to have a large number of relocations. However, pipeline objects on Gen7 will have at most 6 relocations (one per shader stage and one for the workaround BO) so this is a lot of per-pipeline wasted space. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-29 20:27:52 +00:00
Jason Ekstrand	a3153162a9	anv: Delay allocation of relocation lists The old relocation list code always allocated 256 relocations and a hash set up-front without knowing whether or not we really need them. In particular, in the softpin case, this is two fairly large allocations that we don't need to be making. Also, for pipeline objects on haswell where we don't have softpin, we don't need relocations unless scratch is used so this is extra data per-pipeline. Instead, we should do it on-demand. This shaves 3.5% off of a cpu-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-29 20:27:52 +00:00
Plamena Manolova	4fe2317601	anv: Implement new way for setting streamout buffers. For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 19:21:20 +00:00
Plamena Manolova	f9ad73cdfd	anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures. Add depth bounds testing to the list of supported physical device features. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 16:05:33 +00:00
Caio Marcelo de Oliveira Filho	e2155158e9	anv: Fix output of INTEL_DEBUG=bat for chained batches The anv_batch_bo contents are linked one to another, and when printing we have to start with the first of those. Since in `u_vector` new elements are added to the head, to get the first element we need the vector's tail. Fixes: `32ffd90002` ("anv: add support for INTEL_DEBUG=bat") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 19:34:54 -07:00
Eric Engestrom	ea8116908c	anv: add a couple printflike() annotations Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-28 23:17:16 +00:00
Nanley Chery	6451008e8b	intel: Refactor blorp_can_hiz_clear_depth() Prepare this function to be used in iris and to handle new Gen12 behavior. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	c50f8b2fc9	intel: Support HIZ_CCS in isl_surf_get_ccs_surf Add an extra aux parameter which will be filled out with CCS if the first two isl_surf parameters fit the requirements for HiZ_CCS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	8af1853331	anv/private: Modify aux slice helpers for Gen12 CCS The isl_surf structs for Gen12's CCS won't describe how many slices in the main surface can be compressed. All slices will be compressible if CCS is enabled, so look up the main surface's logical dimension. v2: Add a space before a `?`. (Jordan) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	300d77c2fa	anv/cmd_buffer: Don't assume CCS_E includes CCS_D There's no longer a clear-only compression mode of CCS on Gen12+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	4f0b5f9732	anv/image: Disable CCS_D on Gen12+ Clear-only compression no longer exists on TGL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:04 -07:00
Nanley Chery	0eaf293b47	anv/formats: Disable I915_FORMAT_MOD_Y_TILED_CCS on TGL+ The format of the CCS has changed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:04 -07:00
Nanley Chery	d0fcc2dd50	anv: Properly allocate aux-tracking space for CCS_E add_aux_state_tracking_buffer() actually checks the aux usage when determining how many dwords to allocate for state tracking. Move the function call to the point after the CCS_E aux usage is assigned. Fixes: `de3be61801` ("anv/cmd_buffer: Rework aux tracking") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:04 -07:00
Nanley Chery	698d723a6d	anv/blorp: Use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR Avoid failing the `info->use_clear_address` assertion in ISL on Gen12+. Fixes: `6c9f9a82d7` ("intel/genxml,isl: Add gen12 render surface state changes") Reported-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:04 -07:00
Plamena Manolova	939ddccb7a	anv: Add support for depth bounds testing. In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction to enable depth bounds testing. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-28 14:13:04 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Lionel Landwerlin	6af8a4acc4	anv: Add aux-map translation for gen12+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 00:09:14 -07:00
Jordan Justen	7737f56544	anv/gen12: Write GFX_AUX_TABLE base address register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:14 -07:00
Jordan Justen	d4a3299ba1	anv/gen12: Initialize aux map context Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:13 -07:00
Jordan Justen	062022f2e4	anv: Implement aux-map allocator interface This interface allows the aux-map code in the intel/common library to allocate and free buffers. Reworks: * free gen_buffer in gen_aux_map_buffer_free. (Rafael) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:13 -07:00
Eric Engestrom	493903199c	anv: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:09:14 +00:00
Caio Marcelo de Oliveira Filho	06aecb14c0	anv: Implement VK_KHR_vulkan_memory_model Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-24 11:39:56 -07:00
Eric Engestrom	47571a01ec	anv: fix error message `strerror()` takes an `errno`, not the negative value returned by the `ioctl()`. Instead of fixing this as `"%s", strerror(errno)`, let's just use the `"%m"` shortcut for it. Fixes: `2b5f30b1d9` ("anv: implement VK_INTEL_performance_query") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-24 13:57:40 +00:00
Lionel Landwerlin	2b5f30b1d9	anv: implement VK_INTEL_performance_query v2: Introduce the appropriate pipe controls Properly deal with changes in metric sets (using execbuf parameter) Record marker at query end v3: Fill out PerfCntr1&2 v4: Introduce vkUninitializePerformanceApiINTEL v5: Use new execbuf extension mechanism v6: Fix comments in genX_query.c (Rafael) Use PIPE_CONTROL workarounds (Rafael) Refactor on the last kernel series update (Lionel) v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	0dfa643feb	anv: fix unwind of vkCreateDevice fail We're skipping the context destruction in some cases, which in the grand scheme of things is not that important because closing device->fd will destroy the associated context as well. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Fixes: `b30e01aef5` ("anv: fix memory leak on device destroy")	2019-10-22 20:44:26 +00:00
Lionel Landwerlin	b30e01aef5	anv: fix memory leak on device destroy v2: handle vma destruction if vkCreateDevice fails (Jordan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1959 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-20 08:02:22 +00:00
Lionel Landwerlin	3f8f52b241	anv: fix vkUpdateDescriptorSets with inline uniform blocks With inline uniform blocks descriptor, the meaning of descriptorCount is a number of bytes to copy into the descriptor. Don't try to use that size as an index into the descriptor table. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `43f40dc7cb` ("anv: Implement VK_EXT_inline_uniform_block") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1195 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-19 13:16:40 +03:00
Caio Marcelo de Oliveira Filho	58286c7969	anv: Advertise VK_KHR_spirv_1_4 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-14 08:25:42 -07:00
Eric Engestrom	960038d550	anv: add exported symbols check Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-13 17:40:47 +01:00
Marek Olšák	dd4cc56ebd	nir: add a strip parameter to nir_serialize so that drivers don't have to call nir_strip manually. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-10-10 15:47:07 -04:00
Jason Ekstrand	c7e5d24d8f	anv/pipeline: Capture serialized NIR This allows the serialized NIR to be displayed in RenderDoc and similar tools. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-09 22:28:01 +00:00
Caio Marcelo de Oliveira Filho	44978baece	anv: Disable fast clears when running with INTEL_DEBUG=nofc Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho	9560c9b498	anv: Enable VK_EXT_shader_subgroup_{ballot,vote} Anvil now supports and passes Vulkan CTS tests matching `dEQP-VK.subgroups.*.ext_shader_subgroup_ballot.*` `dEQP-VK.subgroups.*.ext_shader_subgroup_vote.*` and crucible tests matching `func.shader-ballot.*` `func.shader-subgroup-vote.*` Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-08 16:34:00 -07:00
Tapani Pälli	e4a826b2c8	anv/android: fix images created with external format support This fixes a case where user first creates image and then later binds it with memory created from AHW buffer. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-08 07:19:05 +03:00
Caio Marcelo de Oliveira Filho	f7ca072ab2	anv: Implement VK_KHR_shader_clock Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 09:12:12 -07:00
Rafael Antognolli	cdc331c6f9	anv/block_pool: Align anv_block_pool state to 64 bits. On 64 bits platforms, some atomic operations like __sync_fetch_and_add() have constant time, but on 32 bits platforms they are implemented with a loop and might take much longer. Additionally, it seems like if their operands are not aligned to 64 bits, they also require extra memory accesses. From the Intel Architecture's Developer Manual Vol. 1, 4.1.1: "A word or doubleword operand that crosses a 4-byte boundary or a quadword operand that crosses an 8-byte boundary is considered unaligned and requires two separate memory bus cycles for access." Forcing the u64 field to be aligned to 64 bits seems to make the unit tests that are stressing this finish much faster. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-03 12:40:33 -07:00
Lionel Landwerlin	da2d67fc3b	anv: gem-stubs: return a valid fd for anv_gem_userptr() Fixes invalid close(-1) in the unit tests. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-25 22:02:51 +03:00
Kenneth Graunke	b9e93db208	intel: Increase Gen11 compute shader scratch IDs to 64. From the MEDIA_VFE_STATE docs: "Starting with this configuration, the Maximum Number of Threads must be set to (#EU * 8) for GPGPU dispatches. Although there are only 7 threads per EU in the configuration, the FFTID is calculated as if there are 8 threads per EU, which in turn requires a larger amount of Scratch Space to be allocated by the driver." It's pretty clear that we need to increase this for scratch address calculations, because the FFTID has a certain bit-pattern. The quote above seems to indicate that we should increase the actual thread count programmed in MEDIA_VFE_STATE as well, but we think the intention is to only bump the scratch space. Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8. Fixes: `5ac804bd9a` ("intel: Add a preliminary device for Ice Lake") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-23 16:59:40 -07:00
Kenneth Graunke	50c0dd8621	Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM" This reverts commit `729de1488f`. It turns out that, although the register is in the logical context, it isn't whitelisted, so we can't actually write it from userspace batch buffers. The write just becomes a noop, which is why we saw no performance changes. I manually whitelisted it, and still observed no performance gains, but it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments on the iris driver. So we might need to fix something before enabling this. To prevent it randomly getting turned on should the kernel ever whitelist this register, we revert the patch for now.	2019-09-23 16:31:23 -07:00
Jason Ekstrand	7d861ab812	anv: Advertise VK_KHR_shader_subgroup_extended_types Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Eric Engestrom	3c1a24de07	anv: implement ICD interface v4 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 08:31:58 +00:00
Eric Engestrom	19db95e78e	anv: split instance dispatch table This effectively breaks the instance dispatch table in 2 with entry points using a physical device as first argument getting their own dispatch table. As a result we now have to check instance & physical device dispatch table instead of just the instance dispatch table before. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 08:31:58 +00:00
Jason Ekstrand	0c4e89ad5b	Move blob from compiler/ to util/ There's nothing whatsoever compiler-specific about it other than that's currently where it's used. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 19:56:22 +00:00


4700 commits