fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 17:48:15 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	fb3ae17d96	anv: fix missing tracking for alpha-to-coverage runtime changes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9926aedc96` ("anv: enable EDS3 AlphaToCoverageEnable & RasterizationSamples") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31196>	2024-09-23 15:56:01 +00:00
Nanley Chery	b3882c4488	intel: Avoid no-op calls to anv_image_clear_color Whenever we execute a fast-clear due to LOAD_OP_CLEAR, we decrease the number of layers to clear by one. We then enter the slow clear function and possibly exit without clearing if the layer count is zero. Unfortunately, we've already compiled the shader for slow clears by the time we exit. Skip the slow clear function if there are no layers to clear. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>	2024-09-20 16:34:37 +00:00
Nanley Chery	1c7fe9ad1b	anv: Support fast clears in anv_CmdClearColorImage At least two game traces make use of this path: TWWH3 and Factorio. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>	2024-09-20 16:34:37 +00:00
Nanley Chery	46d58583ff	anv: Move exec_ccs_op and exec_mcs_op higher up The next patch will use them in anv_CmdClearColorImage(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>	2024-09-20 16:34:37 +00:00
Nanley Chery	03286117ef	anv: Move and rename anv_can_fast_clear_color_view It's no longer specific to image views. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>	2024-09-20 16:34:36 +00:00
Nanley Chery	44351d67f8	anv: Change params of anv_can_fast_clear_color_view Expand the scope to more than just image views. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>	2024-09-20 16:34:36 +00:00
José Roberto de Souza	7c01cbda6f	anv: Optimize vkQueueWaitIdle() on Xe KMD vk_common_QueueWaitIdle() creates a syncobj, does a submit with no batch buffers, which translates to executing trivial_batch_bo, and then waits for the syncobj to be signaled when trivial_batch_bo finishes. On Xe KMD, on the other hand, we can avoid the trivial_batch_bo submission and instead use the special DRM_IOCTL_XE_EXEC with num_batch_buffer == 0 to get a syncobj that is signaled when the last exec finishes execution. This should free up the GPU a bit to execute more important workloads. This will also optimize vkDeviceWaitIdle(), which calls QueueWaitIdle(). It has to fall back to vk_common_QueueWaitIdle() when the queue is in VK_QUEUE_SUBMIT_MODE_THREADED mode, because vkQueueWaitIdle() could return while there is still work in the VK/CPU submission queue. It could also cause a use after free when resources attached to a submission are freed before it is processed, for example: vkCreateFence() or vkCreateSemaphore() vkQueueSubmit() // with Fence or Semaphore created above vkQueueWaitIdle() // with the race it returns vkDestroyFence() or vkDestroySemaphore() // vk_queue_submit_thread_func() starts to process the submission above... Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30958>	2024-09-19 23:12:45 +00:00
José Roberto de Souza	2f7c9f906d	intel: Split anv_xe_wait_exec_queue_idle() and move part of it to common/ Split anv_xe_wait_exec_queue_idle() into 2 functions, the first function creates the syncobj and prepares it to be signaled when the last workload in queue is completed. And the second one that calls the first function, then waits for the syncobj to be signaled and destroy the syncobj. The main reason for that is that the first function can be reused in Iris and a future patch will add another user, so lets share it. No changes in behavior are expected here. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30958>	2024-09-19 23:12:44 +00:00
Tapani Pälli	b01d76027d	blorp: assert that color depth is not 96 for Wa_16021021469 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31263>	2024-09-19 22:44:49 +00:00
Nanley Chery	290f3a9367	intel/isl: Disable 3D Ys/Yf miptails for CCS We currently disable CCS if a 3D Ys/Yf surface uses miptails. However, ISL generally configures surfaces to be compatible with compression. For consistency, disable miptails on 3D Ys/Yf surfaces in order to allow compression. If drivers prefer to have a more compact layout, they can pass the ISL_SURF_USAGE_DISABLE_AUX_BIT flag at surface creation time. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30081>	2024-09-19 20:39:59 +00:00
Nanley Chery	19ed0e1685	intel/isl: Reduce miptail slot usage to allow CCS We currently disable CCS if a surface uses more than 11 slots in a miptail. However, ISL generally configures surfaces to be compatible with compression. For consistency, reduce the number of slots used in miptails in order to allow compression. If drivers prefer to have a more compact layout, they can pass the ISL_SURF_USAGE_DISABLE_AUX_BIT flag at surface creation time. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30081>	2024-09-19 20:39:59 +00:00
José Roberto de Souza	89c6fa1883	anv: Fix condition to clear query pool with blorp The comment above says it all, only when queue is not protected that it is possible to clear query pool with blorp but it was checking the opposite. Fixes: `d5b0526507` ("anv: propagate protected information for blorp operations") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31239>	2024-09-19 17:54:24 +00:00
José Roberto de Souza	0ced5663e2	anv: Improve readbility of khr_perf_query_availability_offset() and khr_perf_query_data_offset() No changes in behavior expected here. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31239>	2024-09-19 17:54:24 +00:00
José Roberto de Souza	3d09ffde46	anv/query: Fix batch end value This were not causing any issues but better set end to the correct value. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31239>	2024-09-19 17:54:24 +00:00
José Roberto de Souza	ac95745dc4	anv: Add documentation to some fields in anv_query_pool Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31239>	2024-09-19 17:54:24 +00:00
Sergi Blanch Torne	213f5e9152	Uprev Piglit to e9ab30aeaed97b69868cf4d6d6a3f70f3b53c362 `93b4bd2e0a...e9ab30aeae` Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com> Reviewed-by: David Heidelberg <david@ixit.cz> Acked-by: Daniel Stone <daniels@collabora.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31058>	2024-09-19 15:41:32 +00:00
Eric Engestrom	b8782c783c	intel/ci: track changes to the global driver `*-skips.txt` files Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31253>	2024-09-19 12:20:36 +00:00
Lionel Landwerlin	ed64eccab0	brw: fix virtual register splitting to not go below physical register size Otherwise we can end up generating invalid assembly not following destination/source alignments requirements. Fixes the following tests: dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_4.tan_frag dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_2.tan_frag dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.tan_frag dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_3.tan_frag Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Backport-to: 24.2 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31206>	2024-09-18 23:26:34 +00:00
Dylan Baker	ec66109c1d	intel/perf: delete dead code. The inner loop with p is dead, because n_passes_written is no longer updated as of `56bd81ee21`, so it is always comparing a uint32_t < 0, which is never true. Since the inner loop is dead code, the pass array is dead code, as it simply keeps writing to element 0, and but never reads or uses it, along with all of the pass count information. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31213>	2024-09-18 19:56:04 +00:00
José Roberto de Souza	dec5a624e9	anv: Check if vkCreateQueryPool() is being created in a supported queue Turns out not even VK CTS was calling vkEnumeratePhysicalDeviceQueueFamilyPerformanceQueryCountersKHR() to check if queue supports query. So here adding a explicity check in our implementation of vkCreateQueryPool(). https://github.com/KhronosGroup/VK-GL-CTS/pull/482 Cc: 24.2 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30652>	2024-09-18 15:29:16 +00:00
José Roberto de Souza	141e7eaca7	anv: Make sure all previous vm binds are done before execute perf query pool The query pool batch buffer or other bos could not be bound when exec starts. Cc: 24.2 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30652>	2024-09-18 15:29:16 +00:00
José Roberto de Souza	0a19d92ca5	anv: Add warning about mismatch between query queues Cc: 24.2 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30652>	2024-09-18 15:29:16 +00:00
José Roberto de Souza	c5d79d533a	anv: Fix context id or exec queue used to open perf stream It was always using device->context_id what is not valid in i915 when has_vm_control is true or when running with Xe KMD. But anv_AcquireProfilingLockKHR() don't have the queue information so at least for now we will only support queries in a single queue. And for consistency doing the same in anv_QueueSetPerformanceConfigurationINTEL() although here we have the queue parameter but queries are only supported in render engine so it would only expose other queues if user set some parameters. Cc: 24.2 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30652>	2024-09-18 15:29:16 +00:00
Dylan Baker	67bcdbf4a1	hasvk: remove useless uint >= 0 check Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31214>	2024-09-17 21:16:36 +00:00
Dylan Baker	27dd9fd677	anv: remove useless uint >= 0 check Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31214>	2024-09-17 21:16:36 +00:00
Lionel Landwerlin	45377dc5c4	brw: fix vecN rebuilds When loading a 64bit address from the push constants, we'll load a vec2, so we need to allocate 2 GRFs and MOV each component. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11831 Fixes: `339630ab05` ("brw: enable A64 loads source rematerialization") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31010>	2024-09-17 14:22:23 +00:00
Lionel Landwerlin	c16b27f66f	brw: use a builder of the size of the physical register for uniforms Should avoid any partial write non-sense on Xe2+. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `339630ab05` ("brw: enable A64 loads source rematerialization") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31010>	2024-09-17 14:22:23 +00:00
Lionel Landwerlin	02b124846f	brw: fix TGM messages to use cmask lsc opcodes This is a restriction for TGM. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b55f7716` ("intel/brw: Switch to emitting MEMORY_*_LOGICAL opcodes") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31199>	2024-09-17 09:28:58 +00:00
Lionel Landwerlin	2159e17da0	brw: remove (load\|store)_raw_intel Those are Elk specific intrinsics. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b8f264cfe4` ("intel/brw: Handle load/stores in lsc_op_for_nir_intrinsic()") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31199>	2024-09-17 09:28:58 +00:00
Dylan Baker	3f3cb1e2fa	intel/elk: delete copy constructor and copy-assignment-operator To keep the rule-of-three. This points out that the implicit copy operations would be dangerous when there is an explicit constructor and destructor, since the class is holding un-managed memory. Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29667>	2024-09-16 20:31:45 +00:00
Dylan Baker	5809209316	anv: enforce state->cmd_buffer is never null in emit_Simpler_shader_init_fragment We have a couple of checks where we allow this to be NULL, but later we unconditionally and unavoidably dereference the pointer, which means there's no way that it ever could have been NULL. Change the assert at the top to not allow NULL, and remove checks for it being NULL CID: 1616544 Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31156>	2024-09-16 19:16:58 +00:00
Dylan Baker	5ebdfc8813	anv: assert we don't write past the end of an array Our array has a fixed size of 32, and we know at the start of the block that our type_count is < 32, but in the loop we grow the block, in theory up to 31 times. Coverity notes that, and points out we could write off the end of the array. Add an assert in the loop to ensure we don't, and to help Coverity out. CID: 1615171 Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31173>	2024-09-16 17:42:40 +00:00
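The bounds assert described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the anv code: `MAX_TYPES`, `append_types`, and the parameter names are hypothetical, and only the fixed array size of 32 comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed array size taken from the commit message; the name is hypothetical. */
#define MAX_TYPES 32

/* Hypothetical helper: appends `extra` entries to `types`, asserting before
 * every write that the index is still inside the array. The assert inside the
 * loop is the part that documents the invariant for static analyzers. */
static inline uint32_t
append_types(uint32_t types[MAX_TYPES], uint32_t type_count, uint32_t extra)
{
   /* Known at the start of the block. */
   assert(type_count < MAX_TYPES);

   for (uint32_t i = 0; i < extra; i++) {
      /* The count grows inside the loop, so re-check before each write. */
      assert(type_count < MAX_TYPES);
      types[type_count++] = i;
   }

   return type_count;
}
```

In a release build with `NDEBUG` defined the asserts compile away, so a check like this documents and tests the invariant rather than enforcing it at runtime.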
Dylan Baker	7556521417	intel: replace `(uint64_t - uint64_t) > 0` with `uint64_t > uint64_t` As coverity points out, if the second uint64_t was greater than the first (I don't think it actually can be), then the overflow would result in the check succeeding when it shouldn't. We could cast this to an integer type, but since we have uint64_t, we'd need int128_t for that. Instead, replace the comparison to 0 with a direct comparison, since that would give the correct result without potential to overflow. CID: 1604833 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31175>	2024-09-16 17:12:17 +00:00
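The wraparound problem in the commit above can be shown with a small C sketch. The function names are hypothetical and are not taken from the Mesa sources; only the two comparison forms come from the commit title.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Old form: unsigned subtraction wraps around instead of going negative,
 * so the difference is greater than zero whenever end != start, including
 * the case end < start. */
static inline bool
has_remaining_old(uint64_t end, uint64_t start)
{
   return (end - start) > 0;
}

/* New form: a direct comparison has no intermediate value that can wrap. */
static inline bool
has_remaining_new(uint64_t end, uint64_t start)
{
   return end > start;
}
```

With `end = 1` and `start = 2`, the old form evaluates `UINT64_MAX > 0` and returns true, while the new form returns false.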
Rohan Garg	daea7e1651	intel/compiler: use the correct cache enum for loads and stores Fixes: `74efde7` ('intel/brw/xehp+: Drop redundant arguments of lsc_msg_desc*()') Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30742>	2024-09-16 15:18:31 +00:00
Rohan Garg	b99fd944e8	intel/compiler: version can never be above 11 due to the previous check Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30742>	2024-09-16 15:18:31 +00:00
Dylan Baker	ed8d1d3c9b	anv: if queue is NULL in vm_bind return early In the error handling path we end up creating a vk_sync and then later we vk_sync_wait() on it. If that wait fails somehow we'll end up calling vk_queue_set_lost(&queue->vk, ...) which would segfault if queue is NULL. If we end up in this situation (no queue), return directly whatever the backend's vm_bind function returned, propagating the error up if necessary. Fixes: `dd5362c78a` ("anv/xe: try harder when the vm_bind ioctl fails") Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31048>	2024-09-13 20:17:40 +00:00
Caio Oliveira	5e47c5f94a	intel/executor: Fix a couple of memory leaks in the tool Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31120>	2024-09-13 01:21:24 +00:00
Ian Romanick	447dae7c13	intel/brw: Use nir_opt_generate_bfi No shader-db changes on any Intel platform. The "regression" in SEND messages occurs because a loop containing a SEND is unrolled. v2: Move after nir_opt_algebraic. Suggested by Georg. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19787034 -> 19785933 (<.01%) instructions in affected programs: 373573 -> 372472 (-0.29%) helped: 541 / HURT: 6 total cycles in shared programs: 906012612 -> 905626304 (-0.04%) cycles in affected programs: 58456516 -> 58070208 (-0.66%) helped: 382 / HURT: 180 fossil-db: Lunar Lake Totals: Instrs: 140671401 -> 140670495 (-0.00%); split: -0.00%, +0.00% Send messages: 12891430822 -> 12891430834 (+0.00%) Loop count: 46905 -> 46904 (-0.00%) Cycle count: 21527511599 -> 21530278999 (+0.01%); split: -0.00%, +0.02% Spill count: 70728 -> 70766 (+0.05%) Fill count: 139397 -> 139254 (-0.10%); split: -0.13%, +0.02% Max live registers: 47512432 -> 47512500 (+0.00%) Totals from 355 (0.06% of 549270) affected shaders: Instrs: 878953 -> 878047 (-0.10%); split: -0.18%, +0.08% Send messages: 19289 -> 19301 (+0.06%) Loop count: 1243 -> 1242 (-0.08%) Cycle count: 1434664642 -> 1437432042 (+0.19%); split: -0.06%, +0.25% Spill count: 15826 -> 15864 (+0.24%) Fill count: 38454 -> 38311 (-0.37%); split: -0.46%, +0.08% Max live registers: 52530 -> 52598 (+0.13%) Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 152516575 -> 152516147 (-0.00%); split: -0.00%, +0.00% Send messages: 7491001 -> 7491013 (+0.00%) Loop count: 47588 -> 47587 (-0.00%) Cycle count: 17124433133 -> 17126147156 (+0.01%); split: -0.01%, +0.02% Max live registers: 31854704 -> 31854764 (+0.00%) Totals from 402 (0.06% of 633223) affected shaders: Instrs: 839338 -> 838910 (-0.05%); split: -0.09%, +0.04% Send messages: 20203 -> 20215 (+0.06%) Loop count: 1243 -> 1242 (-0.08%) Cycle count: 1327042160 -> 1328756183 (+0.13%); split: -0.11%, +0.24% Max live registers: 33237 -> 33297 (+0.18%) Tiger Lake *** Shaders only in 'before' results are ignored: fossil-db/steam-native/wolfenstein_youngblood/b8cefe7f700304c4/fs.32/0 from 1 apps: fossil-db/steam-native/wolfenstein_youngblood Totals: Instrs: 150549467 -> 150548952 (-0.00%); split: -0.00%, +0.00% Send messages: 7495582 -> 7495594 (+0.00%) Loop count: 46605 -> 46604 (-0.00%) Cycle count: 15472381586 -> 15472247085 (-0.00%); split: -0.00%, +0.00% Spill count: 59776 -> 59775 (-0.00%) Fill count: 103475 -> 103464 (-0.01%) Scratch Memory Size: 2384896 -> 2383872 (-0.04%) Max live registers: 31760724 -> 31760787 (+0.00%) Max dispatch width: 5569928 -> 5569912 (-0.00%) Totals from 525 (0.08% of 632443) affected shaders: Instrs: 349074 -> 348559 (-0.15%); split: -0.25%, +0.11% Send messages: 24355 -> 24367 (+0.05%) Loop count: 849 -> 848 (-0.12%) Cycle count: 187080291 -> 186945790 (-0.07%); split: -0.19%, +0.12% Spill count: 483 -> 482 (-0.21%) Fill count: 1372 -> 1361 (-0.80%) Scratch Memory Size: 22528 -> 21504 (-4.55%) Max live registers: 36705 -> 36768 (+0.17%) Max dispatch width: 6272 -> 6256 (-0.26%) Ice Lake Totals: Instrs: 151804923 -> 151804396 (-0.00%); split: -0.00%, +0.00% Send messages: 7553216 -> 7553228 (+0.00%) Loop count: 46196 -> 46195 (-0.00%) Cycle count: 15099805668 -> 15099533898 (-0.00%); split: -0.00%, +0.00% Fill count: 103978 -> 103979 (+0.00%) Max live registers: 32168254 -> 32168323 (+0.00%) Totals from 527 (0.08% of 637191) affected shaders: Instrs: 347482 -> 346955 (-0.15%); split: -0.25%, +0.10% Send messages: 24586 -> 24598 (+0.05%) Loop count: 849 -> 848 (-0.12%) Cycle count: 191147758 -> 190875988 (-0.14%); split: -0.16%, +0.02% Fill count: 1392 -> 1393 (+0.07%) Max live registers: 37379 -> 37448 (+0.18%) Skylake Totals: Instrs: 140981504 -> 140980647 (-0.00%); split: -0.00%, +0.00% Cycle count: 14653477192 -> 14653249734 (-0.00%); split: -0.00%, +0.00% Fill count: 99636 -> 99637 (+0.00%) Max live registers: 31472062 -> 31472126 (+0.00%) Totals from 523 (0.08% of 626432) affected shaders: Instrs: 335551 -> 334694 (-0.26%); split: -0.26%, +0.01% Cycle count: 178047284 -> 177819826 (-0.13%); split: -0.14%, +0.02% Fill count: 1100 -> 1101 (+0.09%) Max live registers: 36734 -> 36798 (+0.17%) Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>	2024-09-13 00:21:00 +00:00
Kenneth Graunke	02482604e5	intel/brw: Delete old-style surface and A64 message opcodes These have now been replaced by the MEMORY_*_LOGICAL opcodes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	7090578c35	intel/brw: Switch load_ubo_uniform_block_intel over to memory intrinsics While there are many cases that turn into the *_PULL_CONSTANT_LOAD ops or push constants, this one piece was emitting surface block loads. Switch it over to use the new intrinsics to delete a bunch of code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	b55f77161d	intel/brw: Switch to emitting MEMORY_*_LOGICAL opcodes We introduce a new fs_nir_emit_memory_access() helper that can handle image, bindless image, SSBO, shared, global, and scratch memory, and handles loads, stores, atomics, and block loads. It translates each of these NIR intrinsics into the new MEMORY_*_LOGICAL intrinsics. As a result, we delete a lot of similar surface access emitter code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	3ba97176d6	intel/brw: Switch load_num_workgroups to the new memory intrinsic A simple case we handle directly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	dc4770b005	intel/brw: Lower MEMORY_OPCODE_*_LOGICAL to HDC messages This is more complicated. We map the MEMORY_*_LOGICAL opcodes to the older HDC messages: typed and untyped surface read/write/atomic (whether float or integer), DWord and Byte scattered messages, OWord block, and both A64, BTI, and stateless messages. - MEMORY_MODE_* is used to select stateless-scratch, typed, or untyped. - MEMORY_FLAG_TRANSPOSE is used to select block access. - MEMORY_BINDING_TYPE = FLAT and 64-bit address size selects A64. - Alignment and data type size select between byte/dword scattered or surface messages. While we may not be able to handle the full generality of message possibilities, we can handle everything we generate currently. The plan here is to assert/validate that we don't generate MEMORY_*_LOGICAL ops on HDC-based platforms which can't support those particular messages. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	3255c9cc49	intel/brw: Lower MEMORY_OPCODE_*_LOGICAL to LSC messages This is pretty straightforward, as the new MEMORY_*_LOGICAL opcodes are designed to match the new LSC's capabilities. The main part is constructing the message payload. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	a82e8b1c6b	intel/brw: Pretty-print memory logical opcodes The new MEMORY_*_LOGICAL intrinsics have a lot of control sources with a bunch of LSC_* enums (opcode, memory type, address type, address and data sizes), as well as flags, coordinate components vs. components... they unfortunately are nigh-unreadable with the default printing since there's just a string of unreadable UD immediates in some order. To fix this, we add some basic pretty-printing. If a control source is simply an enum whose value communicates the entire purpose, we print it. If it has a numeric value (i.e. alignment, or data), we add a label. For example: memory_store(16) (null):UD store shared flat addr: %2:UD coord_comps:1u align:16u d32 comps:2u data0: %3:UD memory_store(16) (null):UD store typed bti:%2+0.0<0>:UD addr: %3+0.0:D coord_comps:2u align:0u d32 comps:4u data0: %4:UD This make them much easier to read. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	2c67729386	intel/brw: Expose functions to convert LSC enums to strings We had tables for these in the disassembler already, but I'd like to use them in brw_print.cpp as well. Just wrap the tables in convenience functions we can use there. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	d5f38be713	intel/brw: Introduce new MEMORY_*_LOGICAL opcodes This is a new unified set of opcodes for memory access loosely patterned after the new LSC-style data port messages introduced on Alchemist GPUs. Rather than creating an opcode for every type of memory access, it has only three opcodes: load, store, and atomic. It has various sources to indicate the rest: - Binding type (raw pointer, pointer to surface state, or BT index) - Address size (A64, A32, A16) - Data size (bit size, number of components) - Opcode (atomic opcode, or LOAD/STORE vs. LOAD_CMASK/STORE_CMASK) - Mode (typed vs. untyped vs. shared-local vs. scratch) - Address (and its dimensionality) - Data (0 for loads, 1 for stores, 2 for atomics) - Whether we want block access Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	b8f264cfe4	intel/brw: Handle load/stores in lsc_op_for_nir_intrinsic() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	8a6903e50d	intel/brw: Rename lsc_aop_for_nir_intrinsic to "op" instead of "aop" This is going to handle more than atomics shortly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	e8883bd40b	intel/brw: Use size_written for NoMask instructions in is_partial_write The intention of inst->is_partial_write() is that it should return true when any REG_SIZE (32B) chunk of inst's destination is written but not fully overwritten. This can be used to tell whether inst combines new data with existing data, or screens off any previous writes, so the old values are no longer required. The existing (exec_size * brw_type_size_bytes(this->dst.type) < 32) check doesn't work in a number of cases. For example, LSC block loads have exec_size == 1 and force_writemask_all set, but may write multiple full registers of data. (Currently, we only see them with exec_size 1 after logical-send-lowering, so our SHADER_OPCODE_SEND special case was covering those.) We had also special cased UNDEF. Instead, we can simply check: 1. Predication 2. !inst->dst.contiguous() 3. inst->dst.offset % REG_SIZE != 0 4. inst->size_written % REG_SIZE != 0 We had the first three already, but #4 is new. If either #3 or #4 are true, then that implies there is a REG_SIZE chunk of the destination which is written, but not entirely written, so it's a partial write. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
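The four conditions listed in the commit above can be read as a single predicate. The following is a simplified sketch under stated assumptions, not the brw implementation: `struct sketch_inst` and `sketch_is_partial_write` are hypothetical stand-ins, the real instruction type carries far more state, and the real predication check may have exceptions that are not modeled here. Only the 32-byte `REG_SIZE` and the four conditions come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register chunk size in bytes, as stated in the commit message. */
#define REG_SIZE 32u

/* Hypothetical, flattened view of the fields the commit message refers to. */
struct sketch_inst {
   bool predicated;       /* 1. instruction is predicated */
   bool dst_contiguous;   /* 2. destination region is contiguous */
   uint32_t dst_offset;   /* 3. byte offset of the destination */
   uint32_t size_written; /* 4. total bytes written by the instruction */
};

/* Returns true when some REG_SIZE chunk of the destination may be written
 * without being fully overwritten. */
static inline bool
sketch_is_partial_write(const struct sketch_inst *inst)
{
   return inst->predicated ||
          !inst->dst_contiguous ||
          inst->dst_offset % REG_SIZE != 0 ||
          inst->size_written % REG_SIZE != 0;
}
```

Under this reading, an unpredicated block load that writes 64 contiguous bytes at offset 0 is a full write regardless of its exec_size, which is the case the commit message says the old exec_size-based check got wrong.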

12720 commits