fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 18:08:15 +02:00

Author	SHA1	Message	Date
Sagar Ghuge	af2d51eafa	anv: enable BTP+BTI RCC keying for some workloads We can drop RT flush and PS Scoreboard stall if state cache perf fix disabled is set to 1. If the bit is set, RCC uses the sum of Binding Table Pointer and Binding Table Index as tag in state cache instead of just Binding Table Index. On DX12 this is a performance win on all workloads we've tested. On DX11 there are a bunch of performance regressions. We think this is due to the fact that to avoid trashing the RCC, we need to remove all but render targets from the binding table, meaning all shader resource accesses have to go through the bindless HW heap. This leads to additional register usage due to the need to push the base offset of descriptor sets. Improvements in the compiler would likely mitigate this. This change introduces a DRIRC key we only turn on for DX12. Also platforms prior to DG2/LSC have a really small bindless heap that leads to additional register usage, so this optimization is completely disabled there. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10872 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10873 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14075 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>	2026-03-24 18:17:42 +00:00
Lionel Landwerlin	adf18761f8	anv: rework color_aux operation tracking The current tracking seems to have issues related to MCS ambiguate that are currently hidden by the fact that we're inserting pb-stall+RT-flush on BTI changes, which we're going to remove in the next commits. The issues appear to be related to a missing pb-stall+RT-flush between MCS ambiguate and fast-clear causing failures on the following tests once BTP+BTI RCC caching is enabled : dEQP-VK.pipeline..multisample.misc.multi* dEQP-VK.pipeline..framebuffer_attachment.diff_attachments_2d_32x32_39x41_ms dEQP-VK.pipeline..framebuffer_attachment.diff_attachments_2d_32x32_48x48_ms Here we rework the tracking with a new enum to track 3 classes of operations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>	2026-03-24 18:17:42 +00:00
Lionel Landwerlin	dc79d6b13a	anv: merge null surface state packing with previous attachments Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>	2026-03-24 18:17:42 +00:00
Lionel Landwerlin	d1eed2239d	anv: batch rendering initialization commands Instead of : foreach color attachment transition layout fast clear slow clear do this : foreach color attachment transition layout foreach color attachment fast clear foreach color attachment slow clear Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>	2026-03-24 18:17:42 +00:00
Lionel Landwerlin	268c7f2a44	anv: rename variables in CmdBeginRendering Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>	2026-03-24 18:17:42 +00:00
Lionel Landwerlin	bbcb7c7838	anv: move depth/stencil BeginRendering handling prior to color When rendering only has depth/stencil, we need to look at the depth/stencil view size to generate a dummy null color attachments. So do that first, so we don't have to iterate color attachments once more with the final size. This change also has the nice impact of removing a BTI change flush due to the sequence moving from : - before blorp BTI-flush - color fast-clear - after blorp BTI-flush - depth fast-clear - change RT due to shader outputs (BTI-flush) - draw call to : - depth fast-clear - before blorp BTI-flush - color fast-clear - combined after blorp BTI-flush (pending) - change RT due to shader outputs (BTI-flush, combined with above) - draw call Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39982>	2026-03-24 18:17:42 +00:00
Tapani Pälli	735ad7cefb	anv: add required barrier for Wa_14026570320 Ensure RT is not processing rays while requesting state cache invalidate by making sure compute is done first. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13830 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40388>	2026-03-24 09:34:29 +00:00
Tapani Pälli	1cce7c79f0	anv: remove barrier special handling for RT_BTI_CHANGE This has been dead code since commit `4b2b824112`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40388>	2026-03-24 09:34:29 +00:00
José Roberto de Souza	c0f1689e11	anv: Fix invalid resource barrier signal stage Simulator is crashing when receiving GPGPU + Pixel as resource barrier signal stage, which according to the spec is invalid. So here replacing the pixel stage by color, over synchronizing it a bit but keeping it functional. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14641 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40516>	2026-03-23 16:30:39 +00:00
José Roberto de Souza	347e82c718	anv: Always have a valid Resource barrier::Wait stage set Simulator hangs if a resource barrier has wait stage = None. The HW seems to ignore it, but something bad could be happening internally. So here I'm making sure the wait stage is set to TOP when it is None. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40516>	2026-03-23 16:30:39 +00:00
Lionel Landwerlin	5d7cf5e762	anv: don't queue pipe control reasons without a trace When there is no trace pointer, there is usually another tracepoint being emitted (see STATE_BASE_ADDRESS, 3DSTATE_BINDING_TABLE_POOL_ALLOC emission). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40503>	2026-03-19 18:13:46 +00:00
José Roberto de Souza	2b91888e54	anv: Remove asserts() added in resource_barrier_wait_stage() In commit `10b5b279a4` ("anv: Fix CmdResetEvent2() with RESOURCE_BARRIER::Wait stage == none") I added an assert to catch invalid cases, but it looks like we have several tests affected by that problem, causing crashes in debug builds. So here I'm removing those asserts(), will then work on all the fixes and bring it back. Acked-by: Ivan Briano <ivan.briano@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40476>	2026-03-18 05:36:38 +00:00
Sagar Ghuge	37f26e346a	anv: Write IR header using shader instead of CS On integrated platforms, we have an issue where the L3 cache is not coherent with CS, which forces us to push data out of L3. To avoid a data cache flush, let's write the IR header with a BLORP shader. There is a small shader launch latency, but eventually that should not matter because writing data with CS (MI_STORE) commands is slower than shader execution when we consider a large number of BVH trees getting built. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39971>	2026-03-18 03:49:17 +00:00
José Roberto de Souza	10b5b279a4	anv: Fix CmdResetEvent2() with RESOURCE_BARRIER::Wait stage == none CmdResetEvent2() was calling anv_add_pending_pipe_bits() with no dst_stages, causing RESOURCE_BARRIER::Wait stage == none, which causes a GPU hang in NVL-P simulator. So here setting dst_stages to VK_PIPELINE_STAGE_2_TOP_OF_PIPE_BIT and adding an assert in resource_barrier_wait_stage() to catch hw_stage == 0. This fixes crucible func.event.cmd_buffer.q0 in simulator. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40445>	2026-03-17 16:42:55 +00:00
Tapani Pälli	a9ea5825b6	anv: update btp address after CmdExecuteCommands We need to update state.btp address with the last executed secondary command buffer btp address so that optimization will work correctly. Fixes: `8a5ac96a67` ("anv: predicate BTP emissions") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15041 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40361>	2026-03-12 11:17:45 +00:00
Lionel Landwerlin	e20f5a0a7a	anv: use companion RCS for hiz ops on compute queue Fixes new CTS tests. Similar to a previous change : `5bf3546cc6` ("anv: Use companion cmd buffer for CCS and MCS image barriers") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40332>	2026-03-11 21:34:42 +00:00
Nanley Chery	465c186fc5	anv: Prepare for format width changes in blorp_copy() blorp_copy() will soon gain the ability to increase the format bpb. Prepare anv by replicating the clear color pixel on gfx12. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39974>	2026-03-11 00:36:18 +00:00
Michael Cheng	6e92be2747	anv: Rename instruction_state_pool to shader_heap Shaders are allocated from anv_shader_heap, which is backed by the util_vma_heap. Rename the VA range field to shader_heap to match current usage and avoid confusion. Signed-off-by: Michael Cheng <michael.cheng@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40131>	2026-02-27 17:36:41 +00:00
Caio Oliveira	df4042371f	anv: Set PIPELINE_SELECT systolic mode based on shader usage For Gfx125 workloads that use systolic mode, this might mean an extra PIPELINE_SELECT when flipping between a compute shader that uses the mode and another that doesn't use the mode (or vice-versa). Reviewed-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40014>	2026-02-26 19:05:56 +00:00
Lionel Landwerlin	095c470d25	anv: add missing handling for attachment locations in secondaries Fixes: dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.interaction_with_shader_object dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.remap_single_attachment_shader_object Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d2f7b6d5` ("anv: implement VK_KHR_dynamic_rendering_local_read") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40036>	2026-02-26 20:26:58 +02:00
Lionel Landwerlin	1cd9a4e4a1	anv: avoid filling PC reason for timestamp u_trace captures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:06 +00:00
Lionel Landwerlin	79a56ef448	anv: add a debug printout for dirty descriptors Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:04 +00:00
Lionel Landwerlin	413e169f45	anv: remove snprintf for aux op transition With perfetto that string is processed later leading to use-after-free. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:03 +00:00
Lionel Landwerlin	8a5ac96a67	anv: predicate BTP emissions The previous commit enabled different command buffers to program the same 3DSTATE_BINDING_TABLE_POOL_ALLOC instruction even though they allocated different chunks of binding tables. Now we can just predicate this programming and skip the stalling, flushing & invalidation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39527>	2026-02-25 00:17:03 +00:00
Lionel Landwerlin	725c2a39d5	anv: enable sharing binding table pool programming We currently allocate 64KiB chunks of binding table pools for each command buffers and program the 3DSTATE_BINDING_TABLE_POOL_ALLOC instruction accordingly. But 3DSTATE_BINDING_TABLE_POINTERS_* instructions can address 2^20 bytes. So it's possible to have 2 command buffers share the same programming if they just add some offsets to their 3DSTATE_BINDING_TABLE_POINTERS_* programming and round down 3DSTATE_BINDING_TABLE_POOL_ALLOC addresses to 2^20. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39527>	2026-02-25 00:17:02 +00:00
Kenneth Graunke	4bdef9824a	anv, brw: Consolidate ex_bso bits to a static devinfo inline If we have extended bindless surface offset (ExBSO) support, we want to use it. Consolidate the anv_physical_device and brw_compiler bits into a single static inline that takes devinfo. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:47 +00:00
Kenneth Graunke	9531c6b89e	brw: Make indirect_ubos_use_sampler a static inline bool taking devinfo Having the named field allowed us to indicate that our code conditions are referring to the specific decision about how we handle indirect UBOs, rather than some other arbitrary hardware change. Still, there's no need to store this in a singleton struct - we can easily have a static inline bool that does the devinfo check for us. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39839>	2026-02-16 21:33:42 +00:00
Lionel Landwerlin	e94cb92cb0	anv: use internal surface state on Gfx12.5+ to access descriptor buffers As a result on Gfx12.5+ we're not holding any binding table entry to access descriptor buffers. This should reduce the amount of binding table allocations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10711 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:26 +00:00
Lionel Landwerlin	812b62a315	anv: remove set index for descriptor buffers We can check the shader's layout_type. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:25 +00:00
Lionel Landwerlin	42b70cf05a	anv: add missing constant cache invalidation for descriptor buffers A descriptor buffer promoted to push constants requires a constant cache invalidation if it is modified on the device. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:21 +00:00
Lionel Landwerlin	888ac904a3	anv: flush render caches on first pipeline select Given a situation like this : - CB_A: begin, renderDepthA, end - CB_B: begin, computeA, barrier (depth), computeB, end The depth cache is not being flushed between renderDepthA & computeB because : - it's not flushed at the end of CB_A (it's not required) - when CB_B starts, we're still on GFX pipeline mode but do not flush render caches because pipeline mode is unknown - when the barrier in CB_B is executed, we're already in compute pipeline mode and HW cannot flush depth. The fix is to flush RT/depth caches when switching from unknown pipeline mode to any pipeline mode. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `e6dae6ef5f` ("vulkan: Optimize implicit end_subpass barrier") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14816 Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Tested-by: David Gow <david@davidgow.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39824>	2026-02-12 10:10:23 +02:00
Juston Li	f84ed620c2	anv: set missing protected bit for protected depth/stencil surfaces This bit is set in mocs for other protected attachment types by anv_image_fill_surface_state(), however it was omitted for depth/stencil attachments here. Without the protected bit set, it causes heavy black artifacting when attaching a protected depth attachment image to a framebuffer. Fixes: `794b0496e9` ("anv: enable protected memory") Signed-off-by: Juston Li <justonli@google.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39818>	2026-02-11 21:45:17 +00:00
Nanley Chery	e42b2a5d70	anv: Don't partial resolve LOD1+ for non-FCV CCS We don't allow fast-clears in this case. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:54 +00:00
Nanley Chery	21d187b7f5	anv: Support fast clears on more layers On Xe2+, support multi-layer and non-zero-layer CCS fast-clears. To do this in a simple manner, drop the code which splits multi-layer clears into fast clears and slow clears. The performance CI reports no regressions nor improvements on BMG. For MCS on all platforms and for CCS on prior platforms, use a new heuristic. Instead of only allowing fast clears on the first slice/layer, do the following: For 3D images, only fast-clear if all slices are cleared. Enables fast-clearing every slice of 3D textures in: * Terminator Resistance - 480x270x128. * Ghostrunner 2 - 320x180x128. For 2D arrays, match the Xe2+ behavior and allow clearing to any layer. This is possible because we only allow fast-clearing if the clear color matches the default value. Enables fast-clearing every layer of 2D array textures in: * Assassin's Creed - 128x128, 6-layers. * Blackops 3 - 1024x1024, 6-layers. * Borderlands 3 - 128x128, 6-layers. * Cyberpunk - 1024x1024, 10-layers. * Unigine Superposition - 4K, 2-layers. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11893 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:54 +00:00
Nanley Chery	b8f6ad9060	anv: Use variable default value for some images using CLEAR A future commit will enable clearing to more than the first layer of 2D array images. To ensure consistency for the clear color, require the ANV_FAST_CLEAR_DEFAULT_VALUE for such images if they make use of ISL_AUX_STATE_CLEAR. Also, use a non-zero default value for some image formats. I tested the majority of workloads in the performance CI. This will cause those which clear to 2D array layers to gain clears on more than just the first layer. At the moment, we still only support clearing the first layer, so there should be no change in performance. Affected games are documented in the code. Acked-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:53 +00:00
Nanley Chery	390c9e3fda	anv: Inline the CCS/MCS predicated resolve functions Now we can see the MI writes performed before and after the resolves in transition_color_buffer(). Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:52 +00:00
Nanley Chery	4d8c71ab1f	anv: Delete conversion of CCS_D partial resolve Now that hasvk is the driver for supporting HSW and BDW, we no longer need to convert CCS_D partial resolves to full resolves to avoid an assert-failure in BLORP. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:51 +00:00
Nanley Chery	b1db1179c2	anv: Set compressed bit separately from fast-clear type This will make handling fast-clears on multiple layers simpler by saving us from having to pass more parameters into fast-clear state setting functions. It also allows us to set more complex fast-clear state for FCV_CCS_E without marking the image as compressed. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:50 +00:00
Nanley Chery	c054d4fe2f	anv: Support partial resolves on any level/layer Enables more support for FCV_CCS_E partial resolves if we ever need it. Also enables support for multiple layers being fast cleared and needing resolves. Support for that will arrive in several commits. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:50 +00:00
Nanley Chery	0a8ab13b9d	anv: Reset fast-clear type in transition_color_buffer() Moving the code here will simplify the task of supporting fast-clears on multiple array layers and depth slices. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:49 +00:00
Nanley Chery	ce196c9de5	anv: Fix the fast clear type for FCV writes We started allowing non-default clear colors with FCV in commit `cd8e120b97`. When rendering to an image with FCV, set the fast-clear type to ANV_FAST_CLEAR_ANY if the image properties allow such fast-clears. Fixes: `cd8e120b97` ("anv: Allow more single subresource fast-clears with FCV") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:49 +00:00
Francisco Jerez	349b09f8a2	anv/gfx12.5: Apply HIZ-CCS resolve TC flush on full resolves for all gfx12.5. This appears to be needed to guarantee that a resolved depth surface has no remaining fast-cleared blocks on DG2 as well as MTL. After this series this should no longer be hit in practice since we'll be doing partial resolves in most cases, but it seems sensible to keep and correct the workaround for our peace of mind to make sure that full resolves are truly resolving the main surface. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:17 +00:00
Francisco Jerez	8e1b4b62ce	anv/gfx12.5: Take advantage of partial resolves in depth layout transitions. Issue a partial resolve instead of a full resolve from transition_depth_buffer() when the final usage requires the CCS-compressed surface to provide a complete representation of the image. This significantly improves performance of applications that frequently interleave depth rendering and sampling on non-WT surfaces (e.g. MSAA surfaces). Nba2K23-trace-dx11-2160p-ultra improves performance by about 260% with this on MTL, DG2 shows a similar benefit. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:17 +00:00
Francisco Jerez	157a4cc6d0	anv/gfx12.5: Resolve depth during layout transitions from ISL_AUX_STATE_COMPRESSED_HIER_DEPTH. For transitions to a state that requires the image to be fully defined by the primary+CCS surface without necessarily requiring a valid primary we have to perform a resolve if the initial state was ISL_AUX_STATE_COMPRESSED_HIER_DEPTH, which isn't fully defined by its primary+CCS surface. This full resolve will be replaced with a more efficient partial resolve in a future commit, but we have to do this up front in order to avoid breaking bisectability. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:16 +00:00
Francisco Jerez	7f1ed1e411	anv/gfx12.5: Can't fast clear multisampled Z/S with HIZ CCS WT aux usage. We can end up in this situation in cases where the application uses a layout that allows both rendering and sampling from a depth surface, since in such cases we will attempt to render with HIZ CCS WT usage as a side effect of using ISL_AUX_USAGE_HIZ_CCS_WT for all layouts that allow the image to be sampled from. Disabling fast clears for that case isn't expected to cause performance regression since before this series for HiZ CCS non-WT images transitioning to such a layout we would have issued a full resolve and used ISL_AUX_USAGE_NONE, which also doesn't support fast clears. Multisample depth images should still get fast clears after this commit in cases where the rendering and sampling is split into separate render passes with a layout transition between them that transitions the image from a W/O layout into a R/W one -- Such transitions will be handled with a relatively cheap partial resolve in a subsequent commit. v2: Add details of additional findings about these hardware issues in comment. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> v3: Pass aspect bit consistent with layout to anv_layout_to_aux_usage() instead of defaulting to VK_IMAGE_ASPECT_DEPTH_BIT. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:15 +00:00
Francisco Jerez	02030b4b8f	anv: Use actual layout in anv_fast_clear_depth_stencil() instead of ANV_IMAGE_LAYOUT_EXPLICIT_AUX. Currently anv_fast_clear_depth_stencil() doesn't know the correct layout of the depth and stencil images, instead it uses ANV_IMAGE_LAYOUT_EXPLICIT_AUX to force the base AUX usage of each plane, which can be inconsistent with the VkImageLayout currently in use. Plumb the correct depth and stencil layouts. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:15 +00:00
Tapani Pälli	f66ff97d58	drirc/anv: implement steps to disable RHWO for Wa_14024015672 Disable RHWO by default for singlesample draws and for MSAA draws if a drirc key is set (avoid perf hit if not needed). Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>	2026-01-23 11:10:07 +00:00
Tapani Pälli	fcbe987e10	anv: fix setting emitted_flush_bits Fixes a crash with: dEQP-VK.api.external.semaphore.opaque_fd.signal_export_import_wait_temporary when driver calls genX(CmdSetEvent2) -> emit_apply_pipe_flushes with having NULL in emitted_flush_bits. Fixes: `8834ef8bcd` ("anv: use flushing PIPE_CONTROL for event signaling") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39343>	2026-01-16 13:19:06 +00:00
Tapani Pälli	4b2b824112	anv: hand over ANV_PIPE_RT_BTI_CHANGE to pipe control There are issues when using resource barrier for this. Fixes: `24e9afb0b7` ("anv: implement resource barrier emissions") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14533 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39132>	2026-01-04 13:35:24 +00:00
Lionel Landwerlin	d99a3d9b58	anv: remove CS-L3 coherency on Xe2 I'll try to write some crucible tests for this. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `be5f5f659f` ("anv: consider CS coherent with L3 on Xe2+") Fixes: `503355c7f8` ("anv: update pipeline barriers for Xe2+") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38966>	2025-12-16 21:35:27 +00:00
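
Note on commit 725c2a39d5 ("anv: enable sharing binding table pool programming"): the address arithmetic it describes can be sketched roughly as below. This is only an illustration, not the anv code. The helper names `btp_base` and `btp_offset`, the macro names, and the addresses in the usage note are hypothetical; only the 64KiB chunk size and the 2^20-byte addressable range come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes from the commit message; the macro names are hypothetical. */
#define BT_CHUNK_SIZE (64u * 1024u) /* binding table chunk per command buffer */
#define BTP_RANGE     (1u << 20)    /* bytes addressable by binding table pointers */

/* Pool base to program: the chunk address rounded down to a 2^20 boundary. */
static inline uint64_t btp_base(uint64_t chunk_addr)
{
   return chunk_addr & ~((uint64_t)BTP_RANGE - 1);
}

/* Offset a command buffer adds to its binding table pointers so that they
 * still land inside its own chunk relative to the shared base. */
static inline uint32_t btp_offset(uint64_t chunk_addr)
{
   uint32_t offset = (uint32_t)(chunk_addr - btp_base(chunk_addr));

   /* A 64KiB-aligned chunk never straddles a 2^20 boundary. */
   assert(chunk_addr % BT_CHUNK_SIZE != 0 ||
          offset + BT_CHUNK_SIZE <= BTP_RANGE);
   return offset;
}
```

With two hypothetical chunks at 0x100030000 and 0x100070000, both helpers yield the same base 0x100000000 and the offsets 0x30000 and 0x70000, so the two command buffers could share one pool programming and differ only in the offsets applied to their binding table pointers.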


1266 commits