fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 04:18:14 +02:00

Author	SHA1	Message	Date
José Roberto de Souza	d491742d19	anv: Add support all possible cached and coherent memory types This change allows us to support HOST_COHERENT, HOST_CACHED and HOST_COHERENT + HOST_CACHED memory types for platforms that have the PAT uAPI. Be aware that Xe KMD will not be able to support cached only memory types, anv_xe_physical_device_init_memory_types() will reflect that but internal usage should not allocate VK_MEMORY_PROPERTY_HOST_CACHED_BIT only memory, hence the assert added. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>	2023-11-29 14:57:42 +00:00
José Roberto de Souza	3baab9bb38	anv: Rename ANV_BO_ALLOC_SNOOPED to ANV_BO_ALLOC_HOST_CACHED_COHERENT Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>	2023-11-29 14:57:42 +00:00
Jan Beich	112093f9e2	intel: make CLOCK_BOOTTIME optional for non-Linux src/intel/common/xe/intel_gem.c:71:9: error: use of undeclared identifier 'CLOCK_BOOTTIME' case CLOCK_BOOTTIME: ^ Fixes: `ae0df368a8` ("intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp()") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26392>	2023-11-29 10:14:01 +00:00
Jan Beich	5c32c41f65	intel: make CLOCK_TAI optional for non-Linux src/intel/common/xe/intel_gem.c:72:9: error: use of undeclared identifier 'CLOCK_TAI' case CLOCK_TAI: ^ Fixes: `ae0df368a8` ("intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp()") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26392>	2023-11-29 10:14:01 +00:00
Tapani Pälli	ec43c20182	anv: implement dummy blit for Wa_16018063123 Insert a dummy blit prior to MI_ARB_CHECK, MI_SEMAPHORE_WAIT, MI_FLUSH_DW submitted on the copy engine. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26209>	2023-11-29 08:09:06 +00:00
Lionel Landwerlin	7dff232c09	intel/ds: add trace of buffer markers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14924>	2023-11-29 01:16:22 +00:00
Kenneth Graunke	c8e122a738	anv: Implement rudimentary VK_AMD_buffer_marker support This provides a basic implementation of VK_AMD_buffer_marker: we can write the 32-bit markers from within a command buffer. Unfortunately, our hardware has several limitations that make this difficult to implement well: 1. We don't have insight into when specific stages finish (i.e. all geometry shaders are done, but pixel rasterization may still be occurring). 2. We cannot perform pipelined writes of 32-bit values to arbitrary memory locations. PIPE_CONTROL::Write Immediate Value would be the obvious way to implement this, but it only supports 64-bit values, and the extension doesn't allow us to do that. We instead use MI_STORE_DATA_IMM to write 32-bit values, but this requires hard stalls. Despite those limitations, the extension may still be useful for tools to debug GPU hangs. We hope to offer another extension in the future which offers similar functionality but is more efficient on our GPUs. v2: Updated by Lionel Landwerlin to fix a number of flushing and cache coherency issues with these writes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14924>	2023-11-29 01:16:22 +00:00
Caio Oliveira	5de5a0d475	intel/compiler: Don't use fs_visitor::bld in thread payload classes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>	2023-11-28 19:53:51 +00:00
Caio Oliveira	2d6240ab14	intel/compiler: Don't use fs_visitor::bld in fs_reg_alloc Just set up the builder without relying on the pre-existing one. Moves one step closer to removing bld from fs_visitor. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>	2023-11-28 19:53:51 +00:00
Caio Oliveira	f55867b56c	intel/compiler: Don't use fs_visitor::bld in tests Tests create their own fs_builder now. Moves one step closer to removing bld from fs_visitor. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>	2023-11-28 19:53:51 +00:00
Caio Oliveira	9540259e1c	intel/compiler: Prefer ctor/dtors in some Google Tests Per Google Test FAQ recommendation, prefer constructors and destructors unless there's a need to use SetUp/TearDown. We will take advantage of this later to initialize an fs_builder. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>	2023-11-28 19:53:51 +00:00
José Roberto de Souza	6a245e4eea	intel: Share function to do device query in Xe KMD A "dance" is required with this uAPI: first we need to ask the KMD for the size of the given query ID, then memory needs to be allocated to match that size, and then we query again with the memory address set, at which point Xe KMD copies the query data to memory. This dance was being duplicated in xe_engine_get_info() and anv_xe_physical_device_get_parameters(), and the next patch will also use it in Iris, so add it to common/xe and reuse it as much as possible. There is one more implementation of this function in intel/dev, but due to how the libs are linked intel/dev can't depend on intel/common. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26325>	2023-11-28 18:17:45 +00:00
Lionel Landwerlin	b18006397b	anv: remove heuristic preferring dedicated allocations This heuristic doesn't show much difference when you have a beefy processor, but on lower-end SKUs it increases the number of buffers in the execbuffer ioctl, adding significant overhead in i915. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4cdd3178fb` ("anv: Meet CCS alignment reqs with dedicated allocs") Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>	2023-11-28 16:13:11 +00:00
Lionel Landwerlin	7b87e1afbc	anv: track & unbind image aux-tt binding This solves a problem when you have a big memory chunk of which some regions are bound to images. If the image is destroyed, currently the aux-tt mapping stays and prevents any new image aux-tt mapping within that region until the memory is freed. This maps & unmaps the aux-tt region at bind & destroy time respectively, so that the memory chunks can be mapped through aux-tt. If there is aliasing of memory to 2 different images, then the first one "wins" the aux mapping and gets compression support. The second one doesn't. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ee6e2bc4a3` ("anv: Place images into the aux-map when safe to do so") Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>	2023-11-28 16:13:11 +00:00
Lionel Landwerlin	b09db9d823	anv: use main image address to determine ccs compatibility The BO address is not really a good criterion since we can bind an image at an offset inside a BO. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ee6e2bc4a3` ("anv: Place images into the aux-map when safe to do so") Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>	2023-11-28 16:13:11 +00:00
Lionel Landwerlin	7c6faa1efe	intel/aux_map: introduce ref count of L1 entries To implement this feature, we need to do CPU side tracking of all L3/L2/L1 entries. This does add a little bit of CPU allocations, but the advantage is that the traversal of the page table tree is faster. No more need for the linear search of find_buffer(). With this feature, we can have multiple VkImages bound to the same main memory address, as long as they share the exact same mapping parameters. The AUX mapping will be removed when the last VkImage is destroyed. As previously, if the L1 mapping entry parameters don't match, the mapping fails. Anv handles this nicely by just disabling AUX on the image. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>	2023-11-28 16:13:11 +00:00
Lionel Landwerlin	e22e88f8ce	intel/fs: reuse set_predicate() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>	2023-11-28 13:40:07 +00:00
Lionel Landwerlin	83a1657b6c	intel/fs: fix incorrect register flag interaction with dynamic interpolator mode Once NIR code is lowered and a few optimization passes have run, there might be flag register interactions between instructions quite far away from one another. In the following case : f0 = and r0, r1 ... fs_interpolate r2, r3 ... if f0 ... endif If we lower fs_interpolate while using the f0 register, we completely garble the value meant for the if block. To fix this, emit the predication for fs_interpolate in brw_fs_nir.cpp when doing the NIR translation to the backend IR. This will guarantee that the flag register interactions are visible to the optimization passes, avoiding the problem above. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `68027bd38e` ("intel/fs: implement dynamic interpolation mode for dynamic persample shaders") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9757 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>	2023-11-28 13:40:07 +00:00
Iván Briano	6f9be9a2a0	hasvk: ensure we reapply always pipeline dynamic state in runtime state Backport of `24631d308c` Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26341>	2023-11-27 20:36:07 +00:00
Eric Engestrom	cf510e38a5	intel/ci: fix .hasvk-manual-rules Fixes: `570acf5655` ("ci: Add a manual full and 1/10th hasvk CTS runs.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26259>	2023-11-27 12:55:18 +00:00
Eric Engestrom	1942073112	intel/perf: fix regex escaping `\$` is interpreted before being passed to `re.search()`, but luckily for us the escape is also invalid and because of that, python 3.12+ warns us about it. Use a raw string instead, so that the `\` is passed untouched to `re.search()`. Fixes: `aa04b47c6e` ("intel/perf: add support for GtSlice/GtSliceXDualsubsliceY variables") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26355>	2023-11-27 11:58:03 +00:00
José Roberto de Souza	7046a9e280	intel: Rename PAT entries Rename the PAT entries to names that better express each entry. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25447>	2023-11-23 21:19:18 +00:00
Iván Briano	43cb4cb6dd	anv: use the right vertexOffset on CmdDrawMultiIndexed Fixes: `c70ef757e6` ("anv: Use extended parameters on Gen11+") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26327>	2023-11-22 13:11:34 -08:00
Sagar Ghuge	2d3f0a834a	anv: Add comment to copy image code block Anybody will be tempted to factor out the if-else block code since it looks like duplication but else block actually handles the ycbcr images where the aspect masks are compatible but don't need to be the same. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26294>	2023-11-22 17:42:43 +00:00
Daniel Schürmann	1179d83a89	nir: remove info.fs.needs_all_helper_invocations Use info.uses_wide_subgroup_intrinsics instead. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26026>	2023-11-22 11:31:11 +01:00
Tapani Pälli	d3e3c30d36	anv: implement Wa_18020335297 Set some state and implement dummy draws whenever viewport pointer is being reprogrammed. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25987>	2023-11-22 05:23:12 +00:00
Tapani Pälli	418299c120	anv: refactor state emission Add a helper that only emits hw_state, this makes it easier to modify dirty state and call helper to emit only wanted state. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25987>	2023-11-22 05:23:12 +00:00
Caio Oliveira	e8220b9319	intel/compiler: Simplify allocation of NIR related arrays Those are not reused, so this will be the first and only allocation, so no need to use the "realloc" variants. For the fs_reg arrays, there's currently no particular reason to keep them uninitialized, so zero-initialize them too -- not ideal but better than random values. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26302>	2023-11-21 18:31:05 +00:00
Francisco Jerez	6a810b0ba8	intel: Improve N-way pixel hashing computation to handle pixel pipes with asymmetric processing power. This reworks the intel_compute_pixel_hash_table_nway() pixel pipe hashing table computation helper to handle cases where some pixel pipes have processing power different from the others, this is helpful for Gfx12.7+ platforms where there are pixel pipes with 1 DSS as well as pixel pipes with 2 DSSes, which currently can lead to a serious performance bottleneck in the pixel pipes with lower processing power. In order to avoid such a load imbalance the intel_compute_pixel_hash_table_nway() function will now take two pixel pipe bitsets instead of one: Pixel pipes enabled on both bitsets will appear with twice the frequency on the table as pixel pipes which only appear on one bitset. See the comments below for more details on the algorithm used to construct a pixel hashing table with the desired properties. With this change rendering performance improves by about 25% on a fused MTL platform -- The list of specific configs this is expected to show an improvement on is not included here since the list is rather long and some of the configs may still be embargoed or may never be productized, but in order to find out whether your Gfx12.7+ device could be affected by this you can check the output of the intel_dev_info tool from the Mesa tree and see if there are multiple "pixel pipe" entries with different DSS count. That isn't expected to occur on any DG2 configuration, only on MTL+ platforms, so this change should have no effect at all on DG2 (it's easy to convince oneself that it won't since for DG2 mask1 should equal mask2 so mask2 will be set to zero at the beginning of intel_compute_pixel_hash_table_nway() and the new swzx[] permutation will be set to the identity). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26266>	2023-11-20 23:48:34 +00:00
José Roberto de Souza	205c5874d4	intel: Sync xe_drm.h Sync xe_drm.h with commit 3b8183b7efad ("drm/xe/uapi: Be more specific about the vm_bind prefetch region"). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26238>	2023-11-20 17:57:34 +00:00
Lionel Landwerlin	f9bab3566b	intel/perf: fix querying of configurations Using the unsized data field is incorrect. The data is located behind the entire drm_i915_query_perf_config structure. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26285>	2023-11-20 16:00:05 +00:00
Eric Engestrom	4de3ce1f2c	ci/piglit: specify only the traces file in the job config Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26278>	2023-11-20 15:23:40 +00:00
Shuicheng Lin	dddab9fa77	intel/xe: Correct DRM_XE_EXEC_QUEUE_SET_PROPERTY's ioctl DRM_XE_EXEC_QUEUE_SET_PROPERTY is the offset, while DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY is the real number. Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26253>	2023-11-18 10:17:45 +00:00
Paulo Zanoni	563678f310	anv/sparse: don't support YCBCR 2x1 compressed formats Regarding supporting these formats, the spec says: "A sparse image created using VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT supports all non-compressed color formats with power-of-two element size that non-sparse usage supports. Additional formats may also be supported and can be queried via vkGetPhysicalDeviceSparseImageFormatProperties. VK_IMAGE_TILING_LINEAR tiling is not supported." Regarding the formats themselves, the spec says: "VK_FORMAT_B8G8R8G8_422_UNORM specifies a four-component, 32-bit format containing a pair of G components, an R component, and a B component, collectively encoding a 2×1 rectangle of unsigned normalized RGB texel data. One G value is present at each i coordinate, with the B and R values shared across both G values and thus recorded at half the horizontal resolution of the image. This format has an 8-bit B component in byte 0, an 8-bit G component for the even i coordinate in byte 1, an 8-bit R component in byte 2, and an 8-bit G component for the odd i coordinate in byte 3. This format only supports images with a width that is a multiple of two. For the purposes of the constraints on copy extents, this format is treated as a compressed format with a 2×1 compressed texel block." Since these formats are to be considered compressed 2x1 blocks and we don't necessarily have to support non-compressed formats that non-sparse support, we can claim them as not supported with sparse. In addition to all of that, if you look at isl_gfx125_filter_tiling() you'll see that we don't even support Tile64 for these formats, so sparse residency (i.e., non-opaque image binds) doesn't really make sense for them yet. The Vulkan spec defines 4 other YCBCR "2x1 compressed" formats like the ones we have in this commit, but we don't support them even without sparse, so there's no reason to check them here. A recent change in VK-GL-CTS made tests that use these formats go from unsupported to failures: 7ecc7716a983 ("Do not use and check for STORAGE image support, when it is not used in the test") This commit "fixes" the following VK-GL-CTS failures (by making them return NotSupported): dEQP-VK.sparse_resources.image_block_shapes.2d.b8g8r8g8_422_unorm.samples_1 dEQP-VK.sparse_resources.image_block_shapes.2d.g8b8g8r8_422_unorm.samples_1 dEQP-VK.sparse_resources.image_block_shapes.2d_array.b8g8r8g8_422_unorm.samples_1 dEQP-VK.sparse_resources.image_block_shapes.2d_array.g8b8g8r8_422_unorm.samples_1 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:29 +00:00
Paulo Zanoni	a0559768db	anv: enable sparse by default on i915.ko On i915.ko we don't have the vm_bind ioctl, so sparse requires TR-TT. Unfortunately, on gfx < 20 TR-TT is not compatible with non-render queues, so we have to disable those when sparse is enabled. Notice that although we don't have TR-TT for non-render queues on gfx >= 20, vm_bind is the default, and it doesn't have this restriction. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:29 +00:00
Paulo Zanoni	fda5163f34	anv/trtt: properly handle the lifetime of TR-TT batch BOs We need to wait for the batches to complete before we return the BOs to the pool. We were previously doing this completely synchronously, which made the code unnecessarily wait. Now we have a timeline syncobj that signals completion of the previous BOs, so sometimes we check where we are in the timeline and then return the BOs that we know are unused. This, in addition to the previous patch that made us wait for the other syncobjs through the execbuf ioctl instead of through the CPU, makes TR-TT batches at least an order of magnitude faster. Still, I don't think we'll notice any changes in games' FPS as they don't bind sparse resources that often. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:29 +00:00
Paulo Zanoni	0f21836272	anv/trtt: add support for queue->sync to the TR-TT batches At this moment this patch won't buy us anything since we're already being completely synchronous, but the next patch is going to change this and so queue->sync will start making sense. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:29 +00:00
Paulo Zanoni	1534ee46b8	anv/trtt: add struct anv_trtt_batch_bo and pass it around For now it just wraps the bo and size, so there's really no value to having it. In the next commit we'll add more elements to the struct. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:29 +00:00
Paulo Zanoni	18bd00c024	anv/trtt: don't wait/signal syncobjs using the CPU anymore Pass them as part of the TR-TT batch. This is what a lot of the previous commits were building up to. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	f2206a0eb1	anv/xe: allow passing extra syncs to xe_exec_process_syncs() We're going to use this in two different patches. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	4b435d6983	anv/i915: extract setup_execbuf_fence_params() I'm about to add a 3rd caller for it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	d797d9233d	anv/sparse: process image binds before opaque image binds When sparse images are being used, applications normally use non-opaque binds and leave opaque binds just for the miptail part. Since miptails are always at the end of the array layers, processing the opaque binds after processing the non-opaque binds increases the chance that anv_sparse_submission_add() will join the miptail bind operation with the last non-opaque operation, especially if the user is trying to bind the last few non-miptail levels and the miptail in the same vkQueueBindSparse operation. In the real world this case does happen, so we're able to save a bind operation every once in a while in Steam games. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	040063c156	anv/sparse: move waiting/signaling syncobjs to the backends Move waiting/signaling to the backends so we can fix each backend separately. As I write this patch the vm_bind backend is back to using synchronous vm_binds so we can't pass syncobjs to the synchronous vm_bind ioctl anymore. We'll need more discussions and possibly some rework before we go back to asynchronous vm_binds. This commit should allow us to fix the TR-TT backend in the next commit and leave vm_bind for later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	cbf09b4254	anv/trtt: use 'queue' from anv_sparse_submission in the backend Don't pass it as a parameter when it's also part of a struct. Have to touch 9 files just for that... Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	f6d28bec6d	anv/sparse: add 'queue' to anv_sparse_submission If we're going to move syncobj waiting/signaling down to the backend we're going to need a queue to signal as lost in case those operations fail. In some places of the stack we don't have a queue available, such as when we're creating or destroying resources. For those, for vm_bind cases we don't use the queue for anything so passing it as NULL is fine. For TR-TT we are already using device->trtt.queue. For TR-TT specifically this also means we're going to start using the actual queues from the call stack instead of trtt->queue, but that shouldn't make any difference since we only ever have one queue. Still, this is more technically correct. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	576275907a	anv/sparse: pass anv_sparse_submission to the backend functions Our ultimate goal is to have the backend functions deal with the wait and signal syncobjs instead of waiting for them on the CPU inside anv_queue_submit_sparse_bind_locked(). For that, we'll need waits and signals parameters to be passed all the way to the backend functions that actually make the submission, and this is what this patch does, through struct anv_sparse_submission. This patch just deals with passing the parameters to the functions, nothing is using the new variables yet. There should be no functional changes here. The goal here is to make code review easier. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	6c7753ee0b	anv/sparse: join all submissions into a single anv_sparse_bind() call Currently, a single vkQueueBindSparse() call may lead to multiple bind calls in the backend (either a vm_bind ioctl or a command submission that updates the TR-TT page tables). These operations can be quite slow so it's better for us if we try to emit as few of them as possible. On top of that, this gives our "just extend the last operation's size if possible" code a little more chance to act and save us real time. Our ultimate goal here is to also pass submit->waits and submit->signals to the backend so we can avoid doing CPU waits, so having a single call to the backend helps simplify things a little too, and we just created the structure to carry these extra pointers forward. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	11e9a700f6	anv/sparse: drop anv_sparse_binding_data from dump_anv_vm_bind() Having it helped us printing the resource offset, which made debugging some situations easier. The problem is that we want to rework the code a little bit and we won't have a 'sparse' struct anymore to pass around. Since it's all debug code drop it for now so it doesn't get in the way of the rework. If we need it later we can find a way to add it back, or we find another way to print the value. Drive-by drop the DEBUG_SPARSE check that's already in the caller. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	b4fef9a745	anv/trtt: also join the L3/L2 writes into a single MI_STORE_DATA_IMM Same as the L1 case, but this one deals with 64bit entry addresses and pte addresses. Consecutive L3/L2 writes are much rarer than L1 writes since they require some pretty big buffers, but we can still see those cases in the wild. I just don't think any change will be noticeable though. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
Paulo Zanoni	31f720fd6e	anv/trtt: join L1 writes into a single MI_STORE_DATA_IMM when possible If the addresses are sequential, we can emit only a single MI_STORE_DATA_IMM instruction. This is a very common case, it should save us some space: 4 bytes per extra_write. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>	2023-11-17 17:58:28 +00:00
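
The intel/perf regex-escaping fix listed above (1942073112) rests on a general Python rule: an unrecognized escape such as `\$` in an ordinary string literal is kept as-is, but recent Python versions warn about it, whereas a raw string passes the backslash to `re.search()` untouched with no warning. A minimal sketch of the raw-string form follows; the pattern and the `slice_index` helper are hypothetical illustrations and are not taken from the Mesa perf scripts.

```python
import re

# Hypothetical pattern, loosely modeled on the "$GtSliceX" variable naming
# mentioned in the commit message; the real script's regex may differ.
# The r"" prefix keeps the backslash, so "\$" reaches the regex engine
# and matches a literal dollar sign.
GT_SLICE_RE = re.compile(r"\$GtSlice(\d+)$")


def slice_index(name):
    """Return the slice number from a '$GtSliceN' name, or None if no match."""
    match = GT_SLICE_RE.search(name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(slice_index("$GtSlice3"))
    print(slice_index("GtSlice3"))
```

Writing the pattern as `"\\$GtSlice(\\d+)$"` would produce the same regex; the raw string simply avoids doubling every backslash.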


10653 commits