fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-25 21:18:26 +02:00

Author	SHA1	Message	Date
Nanley Chery	b8f6ad9060	anv: Use variable default value for some images using CLEAR A future commit will enable clearing to more than the first layer of 2D array images. To ensure consistency for the clear color, require the ANV_FAST_CLEAR_DEFAULT_VALUE for such images if they make use of ISL_AUX_STATE_CLEAR. Also, use a non-zero default value for some image formats. I tested the majority of workloads in the performance CI. This will cause those which clear to 2D array layers to gain clears on more than just the first layer. At the moment, we still only support clearing the first layer, so there should be no change in performance. Affected games are documented in the code. Acked-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:53 +00:00
Nanley Chery	811c413f98	anv: Don't return the Xe2+ fast-clear type early Don't return early from anv_layout_to_fast_clear_type() for Xe2+. We'll need to make more use of the function for some MCS changes in later commits. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:53 +00:00
Nanley Chery	7bb7b63b96	anv: Line wrap anv_CmdClearColorImage Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:52 +00:00
Nanley Chery	390c9e3fda	anv: Inline the CCS/MCS predicated resolve functions Now we can see the MI writes performed before and after the resolves in transition_color_buffer(). Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:52 +00:00
Nanley Chery	4d8c71ab1f	anv: Delete conversion of CCS_D partial resolve Now that hasvk is the driver for supporting HSW and BDW, we no longer need to convert CCS_D partial resolves to full resolves to avoid an assert-failure in BLORP. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:51 +00:00
Nanley Chery	b1db1179c2	anv: Set compressed bit separately from fast-clear type This will make handling fast-clears on multiple layers simpler by saving us from having to pass more parameters into fast-clear state setting functions. It also allows us to set more complex fast-clear state for FCV_CCS_E without marking the image as compressed. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:50 +00:00
Nanley Chery	c054d4fe2f	anv: Support partial resolves on any level/layer Enables more support for FCV_CCS_E partial resolves if we ever need it. Also enables support for multiple layers being fast cleared and needing resolves. Support for that will arrive in several commits. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:50 +00:00
Nanley Chery	0a8ab13b9d	anv: Reset fast-clear type in transition_color_buffer() Moving the code here will simplify the task of supporting fast-clears on multiple array layers and depth slices. Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:49 +00:00
Nanley Chery	ce196c9de5	anv: Fix the fast clear type for FCV writes We started allowing non-default clear colors with FCV in commit `cd8e120b97`. When rendering to an image with FCV, set the fast-clear type to ANV_FAST_CLEAR_ANY if the image properties allow such fast-clears. Fixes: `cd8e120b97` ("anv: Allow more single subresource fast-clears with FCV") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:49 +00:00
Nanley Chery	e7854d06a5	anv: Update predicated resolve documentation * Don't mention gfx7-8 due to the hasvk split. * Account for the array of clear colors. Fixes: `0e6b132a75` ("anv: Access more colors in fast_clear_memory_range") Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:48 +00:00
Nanley Chery	eb4a581e44	intel/isl: Fix QPitch of arrayed MCS From RENDER_SURFACE_STATE::AuxiliarySurfaceQPitch on BDW+, This field must be set to an integer multiple of the Surface Vertical Alignment Accomplish this by aligning the height of each MCS layer to main surface's vertical alignment. Prevents the following test group from failing on Xe2 when a future commit enables multi-layer fast-clears in anv: dEQP-VK.api.image_clearing.. clear_color_attachment.multiple_layers. _clamp_input_sample_count_* The main test I used to debug this: dEQP-VK.api.image_clearing.core. clear_color_attachment.multiple_layers. a8b8g8r8_unorm_pack32_64x11_clamp_input_sample_count_2 Backport-to: 25.3 Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>	2026-01-27 18:46:47 +00:00
Kenneth Graunke	41d7debcfe	brw: Use nir_imul_imm in per-vertex/per-primitive offset calculation This avoids generating some useless math that would need to be cleaned up later, without complicating things too much. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	24c66d3871	brw: Vectorize URB intrinsics using nir_opt_load_store_vectorize This helps cut down URB messages on tessellation and mesh shaders significantly. fossil-db results on Battlemage: Instrs: 505172392 -> 505207187 (+0.01%); split: -0.00%, +0.01% Send messages: 23678197 -> 23656126 (-0.09%); split: -0.09%, +0.00% Cycle count: 63150470088 -> 63147482640 (-0.00%); split: -0.01%, +0.00% Spill count: 576554 -> 576616 (+0.01%) Fill count: 545304 -> 545413 (+0.02%) Max live registers: 141099192 -> 141150675 (+0.04%); split: -0.00%, +0.04% Max dispatch width: 39856192 -> 39856208 (+0.00%) Totals from 4231 (0.27% of 1583648) affected shaders: Instrs: 1620161 -> 1654956 (+2.15%); split: -0.25%, +2.40% Send messages: 128652 -> 106581 (-17.16%); split: -17.18%, +0.03% Cycle count: 24650700 -> 21663252 (-12.12%); split: -12.82%, +0.70% Spill count: 378 -> 440 (+16.40%) Fill count: 1308 -> 1417 (+8.33%) Max live registers: 364676 -> 416159 (+14.12%); split: -0.24%, +14.36% Max dispatch width: 67952 -> 67968 (+0.02%) There are several reasons we didn't go with nir_opt_vectorize_io: 1. nir_opt_vectorize_io appears to work on the slot location level. We want to be able to vectorize based on the URB offsets, especially for cases like point size, layer, and viewport which have different VARYING_SLOT_* values but live in the same vec4 in a URB entry. 2. We want vec8 stores, and nir_opt_vectorize_io only seems to vectorize within a single 32-bit vec4. It does handle 8 components, but that's only for packing 16-bit values into a 32-bit vec4. Improves performance of Sascha Willems' tessellation demo by around 4% on Meteorlake. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	aafe8967fd	brw: Avoid using URB global offset with per-slot offsets on <= Icelake Both the URB Global Offset and Per-Slot Offsets are specified to be unsigned numbers. The URB Global Offset is only 11 bits, and so is limited to be between [0, 2047]. While the per-slot offsets are given as U32 values, it would appear that adding the two offsets does not handle 32-bit overflow/unsigned wrap correctly. This pops up in Piglit's TCS variable-indexing tests, which ends up performing loads from offset (x - 16) and a base of 18, and at an offset (x) with a base of 2. These should be equivalent, but when x <= 15, the per-slot offset calculated in the shader is negative (0xfffffff[0-f]) and adding the base of 18 is not wrapping around correctly to [2, 17]. To work around this, avoid using the global offset when the per-slot offset is present, and just add the two in the shader where unsigned wrap works correctly. Tigerlake and later don't seem to have this issue. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	07ac0e3463	brw: Skip vec8 store_urb_vec4_intel noop writemasks as well We were checking for 0xf which is fine for vec4, but vec8 gets 0xff. Either way, nothing is writemasked, so we can skip sending the mask. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	dbb24ff56b	brw: Assert that urb_vec4_intel stores only have 4/8 components vec1-3, 5-7, and 9+ are not supported. Only vec4 and vec8. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Kenneth Graunke	c2f03ba12f	nir: Add memory modes to URB load intrinsics This makes it easier for NIR passes to distinguish between inputs and outputs without having to reason about which URB handle source was passed to the intrinsic. It probably also makes it a bit easier for humans to read the NIR too. v2: Don't add memory mode to store intrinsics. It's always output. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>	2026-01-27 16:08:36 +00:00
Tapani Pälli	bb84773c81	blorp: fix asserts hit with msaa blorp blits on xe3 Tested on PTL, fixes various copy_and_blit tests that utilize compute after `ab9d3528dc` that exposed this to them. Fixes: `ab9d3528dc` ("anv: fix queue check in anv_blorp_execute_on_companion on xe3") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39548>	2026-01-27 15:28:55 +00:00
Caleb Callaway	a91a636faf	driconf: LTO disable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39544>	2026-01-27 14:57:20 +00:00
Hans-Kristian Arntzen	22bd72aa58	anv: Enable VK_EXT_present_timing. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38770>	2026-01-27 11:09:51 +00:00
Hans-Kristian Arntzen	c18b14aea2	anv: Add PRESENT_STAGE_LOCAL_EXT path for calibration. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38770>	2026-01-27 11:09:50 +00:00
Francisco Jerez	c0cf14f0e2	intel/isl: Add unit tests for ISL_AUX_STATE_COMPRESSED_HIER_DEPTH. v2: Add additional AUX state transition test-cases for HIZ_CCS (Nanley). v3: Assume partial resolve is equivalent to full resolve on legacy HiZ surfaces during isl_aux_state_transition_aux_op() instead of asserting (Nanley). v4: Move some tests into different group, add more MCS tests (Nanley). Acked-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:18 +00:00
Francisco Jerez	349b09f8a2	anv/gfx12.5: Apply HIZ-CCS resolve TC flush on full resolves for all gfx12.5. This appears to be needed to guarantee that a resolved depth surface has no remaining fast-cleared blocks on DG2 as well as MTL. After this series this should no longer be hit in practice since we'll be doing partial resolves in most cases, but it seems sensible to keep and correct the workaround for our peace of mind to make sure that full resolves are truly resolving the main surface. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:17 +00:00
Francisco Jerez	8e1b4b62ce	anv/gfx12.5: Take advantage of partial resolves in depth layout transitions. Issue a partial resolve instead of a full resolve from transition_depth_buffer() when the final usage requires the CCS-compressed surface to provide a complete representation of the image. This significantly improves performance of applications that frequently interleave depth rendering and sampling on non-WT surfaces (e.g. MSAA surfaces). Nba2K23-trace-dx11-2160p-ultra improves performance by about 260% with this on MTL, DG2 shows a similar benefit. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:17 +00:00
Francisco Jerez	ef95d5243f	intel/isl: Teach ISL about HIZ CCS partial resolves. This updates the isl_aux_state transition helpers to consider partial resolves for HiZ-CCS surfaces, and as a side effect of the update to isl_aux_prepare_access() partial resolves should be implicitly enabled in iris now for platforms that support it. v2: HiZ partial resolves aren't enough to remove cleared blocks unlike color partial resolves (Nanley). v3: Treat ISL_AUX_STATE_CLEAR similar to ISL_AUX_STATE_COMPRESSED_HIER_DEPTH so we can continue using it after depth buffer fast clears. Drop flagging partial_resolve == true for HiZ usages so we don't do the wrong thing while preparing access of a surface in ISL_AUX_STATE_CLEAR state. v4: Assume partial resolve is equivalent to full resolve on legacy HiZ surfaces during isl_aux_state_transition_aux_op() instead of asserting (Nanley). Acked-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:17 +00:00
Francisco Jerez	cc66f5ff1d	intel/blorp: Add support for partial resolves of HiZ-CCS surfaces. v2: Define additional enum BLORP_OP_HIZ_PARTIAL_RESOLVE to track partial resolves (Nanley). v3: Add comment regarding fall back to full resolve on Gfx12.0 (Nanley). Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:17 +00:00
Francisco Jerez	79ab5db71b	intel/measure: Define snapshot type for HiZ partial resolves. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:16 +00:00
Francisco Jerez	f9ce1b9c40	anv/gfx12.5+: Keep HIZ_CCS aux usage while sampling from depth surfaces. As long as the surface is in a state with valid AUX state with identity contents of the HiZ surface (E.g. in ISL_AUX_STATE_COMPRESSED_CLEAR, ISL_AUX_STATE_COMPRESSED_NO_CLEAR, ISL_AUX_STATE_RESOLVED or ISL_AUX_STATE_PASS_THROUGH states) we can keep compression enabled, which works around hardware bugs on MTL and DG2, and will be helpful to switch to partial resolves in a future commit. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:16 +00:00
Francisco Jerez	baf39d4322	anv/gfx12.5: Infer ISL_AUX_STATE_COMPRESSED_HIER_DEPTH from anv_layout_to_aux_state(). Update anv_layout_to_aux_state() to return the ISL_AUX_STATE_COMPRESSED_HIER_DEPTH state in cases where we may be rendering into a HiZ surface in non-WT aux mode, instead of ISL_AUX_STATE_COMPRESSED_CLEAR. v2: No need to handle ISL_AUX_STATE_COMPRESSED_HIER_DEPTH in anv_layout_to_fast_clear_type() since it should never be reached (Nanley). Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:16 +00:00
Francisco Jerez	157a4cc6d0	anv/gfx12.5: Resolve depth during layout transitions from ISL_AUX_STATE_COMPRESSED_HIER_DEPTH. For transitions to a state that requires the image to be fully defined by the primary+CCS surface without necessarily requiring a valid primary we have to perform a resolve if the initial state was ISL_AUX_STATE_COMPRESSED_HIER_DEPTH, which isn't fully defined by its primary+CCS surface. This full resolve will be replaced with a more efficient partial resolve in a future commit, but we have to do this up front in order to avoid breaking bisectability. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:16 +00:00
Francisco Jerez	7f1ed1e411	anv/gfx12.5: Can't fast clear multisampled Z/S with HIZ CCS WT aux usage. We can end up in this situation in cases where the application uses a layout that allows both rendering and sampling from a depth surface, since in such cases we will attempt to render with HIZ CCS WT usage as a side effect of using ISL_AUX_USAGE_HIZ_CCS_WT for all layouts that allow the image to be sampled from. Disabling fast clears for that case isn't expected to cause performance regression since before this series for HiZ CCS non-WT images transitioning to such a layout we would have issued a full resolve and used ISL_AUX_USAGE_NONE, which also doesn't support fast clears. Multisample depth images should still get fast clears after this commit in cases where the rendering and sampling is split into separate render passes with a layout transition between them that transitions the image from a W/O layout into a R/W one -- Such transitions will be handled with a relatively cheap partial resolve in a subsequent commit. v2: Add details of additional findings about these hardware issues in comment. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> v3: Pass aspect bit consistent with layout to anv_layout_to_aux_usage() instead of defaulting to VK_IMAGE_ASPECT_DEPTH_BIT. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:15 +00:00
Francisco Jerez	02030b4b8f	anv: Use actual layout in anv_fast_clear_depth_stencil() instead of ANV_IMAGE_LAYOUT_EXPLICIT_AUX. Currently anv_fast_clear_depth_stencil() doesn't know the correct layout of the depth and stencil images, instead it uses ANV_IMAGE_LAYOUT_EXPLICIT_AUX to force the base AUX usage of each plane, which can be inconsistent with the VkImageLayout currently in use. Plumb the correct depth and stencil layouts. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:15 +00:00
Francisco Jerez	d283e44634	anv/gfx12.5: Allocate indirect color state for depth surfaces. The clear color state has to be allocated since we will be sampling from non-WT HiZ CCS depth surfaces without disabling compression. v2: Use isl_aux_usage_has_ccs() instead of open coding (Nanley). v3: Use stricter condition on Gfx12.0 to avoid allocating buffer unnecessarily (Nanley). Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:14 +00:00
José Roberto de Souza	8fdec3d161	intel/isl/gfx12.5: Allow hierarchical depth buffer write through for multi sampled surfaces Documentation is kind of ambiguous but at least gfx12.5 is allowed to do hierarchical depth buffer write through for multi sampled surfaces. BSpec: 46965 BSpec: 56419 Suggested-by: Nanley Chery <nanley.g.chery@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:12 +00:00
Francisco Jerez	5ed23c14da	intel/isl: Define ISL_AUX_STATE_COMPRESSED_HIER_DEPTH aux state. This state is helpful to track when resolves are needed for HiZ-CCS non-WT surfaces, since the ISL_AUX_STATE_COMPRESSED_* states that currently exist don't distinguish between the CCS and the HiZ surfaces being in a non-passthrough compression state, so we would have had to pre-emptively issue a resolve before sampling from any ISL_AUX_STATE_COMPRESSED_* HiZ-CCS surface just in case its HiZ surface has non-trivial contents, even if its HiZ surface is in pass-through state and the surface only has non-trivial CCS compression. This commit introduces a new ISL_AUX_STATE_COMPRESSED_HIER_DEPTH state that indicates that the hierarchical depth surface has non-trivial contents that have to be considered to get a complete representation of the image. While in this state the surface may also have fast-cleared blocks. The pre-existing ISL_AUX_STATE_COMPRESSED_* states now unambiguously indicate that the HiZ surface is in an identity state, so it's unnecessary to obtain a complete representation of the image e.g. while sampling from a HiZ-CCS depth surface. v2: Use more abstract aux state name instead of ISL_AUX_STATE_COMPRESSED_HIZ, don't transition legacy HIZ surfaces to new aux state on write by using COMPRESS write behavior instead of COMPRESS_HIZ (Nanley). v3: Comment clarifications (Nanley). v4: Re-apply change to transition legacy HIZ surfaces to new aux state on write by using COMPRESS_HIZ for consistent semantics of the aux state irrespective of the aux usage, this is particularly important because the HIZ aux usage coexists with HIZ_CCS in some platforms, so pretending write_behavior is just "COMPRESS" for HIZ as on v2 would cause the ISL_AUX_STATE_COMPRESSED_CLEAR state to have different meaning and require different handling depending on the aux usage that was used with the surface before. 
v5: Additional comment clarifications, express aux_state_possible() result and isl_aux_prepare_access() check in terms of aux_usage_info::write_behavior (Nanley). Move changes in behavior for ISL_AUX_STATE_CLEAR from future ISL partial resolve commit into this commit since the change is already required for correctness as part of the split of hierarchical depth states. Acked-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31139>	2026-01-27 08:52:12 +00:00
Alyssa Rosenzweig	3361ca86cf	brw: hoist fsat lower OOTL Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39539>	2026-01-26 23:24:49 +00:00
Alyssa Rosenzweig	f16ec90caa	brw: move fsign lower OOTL Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39539>	2026-01-26 23:24:49 +00:00
Nanley Chery	f208ac9f4b	intel: Enable CCS support for Yf and Ys Enable CCS with Ys on all systems, and with Yf on gfx9-11. Unfortunately, Yf + CCS isn't supported on gfx12. Tests fail and systems hang in the CI with this enabled. The simulator also complains about this combination on tests such as: dEQP-VK.api.image_clearing.core.clear_color_attachment.multiple_layers.r4g4b4a4_unorm_pack16 dEQP-VK.api.image_clearing.core.clear_color_attachment.single_layer.r4g4b4a4_unorm_pack16_200x180_sample_count_2 The simulator doesn't complain about this combination on depth/stencil surfaces, but actual hardware still has issues with this. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11057 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:05 +00:00
Nanley Chery	c5f01414da	anv,iris: Don't fast-clear 3D + Ys on gfx12.0 BSpec 46969 (r45602) tells us that we get no fast-clears for 3D: 3D/Volumetric surfaces do not support Fast Clear operation. For Y-tiled surfaces, we work around this in BLORP with convert_rt_from_3d_to_2d(). However, that function doesn't support Ys-tiling. We could modify our surface redescription code paths to support clearing entire Ys tiles, but we choose to hold off on the added complexity until we have a use-case. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:05 +00:00
Nanley Chery	525077f160	anv: Query the plane in anv_can_fast_clear_color() Instead of assuming the first plane, use anv_image_aspect_to_plane(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:05 +00:00
Nanley Chery	bbd45bb9d1	intel/isl: Prefer suggested tilings which use CCS Try to use a tiling which would not result in a loss of CCS. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:05 +00:00
Nanley Chery	07539af097	intel/isl: Drop HIZ/MCS checks in CCS support query We'll use isl_surf_supports_ccs() in a scenario in which we want to check for CCS support without creating a HIZ or MCS surface beforehand. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:05 +00:00
Nanley Chery	b7c5779ede	intel/isl: Prefer the smallest suggested tiling When choosing between the suggested tilings, create one of each allowed and pick the smallest one. One benefit of using the standard tilings is that miptails can avoid space waste in mipmapped compressed textures. From the ICL PRM, Volume 5: Memory Data Formats, "MIP Layout": If Tiling is enabled, then each MIP is layed out using one or more tiles. If TileYf or TileYs tiling is enabled (TR_MODE != NONE), then some of the MIPs may actually be stored in a MIPTail which fits in a single 64K or 4K tile. The layout above, then only applied to MIPs which are not packed in the MIP Tail. Note that, depending on surface height the Vertical Alignment that surface can actually have the last few mips layed out below LOD1. Using MIP Tail (if supported) eliminates this possibility. In the performance CI, this helps: * Hogwarts Legacy on DG2 by 0.64% * Satisfactory on BMG by 0.89% * Wukong on BMG by 0.77% Highlights on memory saved by using Tile64 from at most 10k frames in game traces on DG2: * Hogwarts. 32 instances of: Saved 128 4KB page(s). extent=4096x4096x1 dim=2d levels=13 fmt=BC7_UNORM * Assassin's Creed. 8 instances of: Saved 768 4KB page(s). extent=120x68x192 dim=3d levels=1 fmt=R16G16B16A16_FLOAT * Black Ops 3. 3 instances of: Saved 864 4KB page(s). extent=172x140x288 dim=3d levels=1 fmt=BC6H_UF16 * God of War. 1 instance of: Saved 1920 4KB page(s). extent=320x170x192 dim=3d levels=1 fmt=R16G16B16A16_FLOAT This patch may cause regressions on SKL-TGL because the smaller surface may not support compression. This will be fixed in a coming patch. v2. Don't factor in the image alignments when comparing their sizes. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14074 Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:04 +00:00
Nanley Chery	13dabd941e	intel/isl: Refactor tiling selection in isl_surf_init_s Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:04 +00:00
Nanley Chery	ab07c4066a	intel: Add and use ISL_SURF_USAGE_PREFER_4K_ALIGNMENT Does nothing for now. This will be used in future patch where a 64K-aligned image may be selected over a 4K-aligned one. Follows the alignment request behavior specified in VkImageAlignmentControlCreateInfoMESA. Specifically, this preference does not override attempts by ISL to enable compression. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:04 +00:00
Nanley Chery	6fc0e5c0aa	blorp: Fix Tile64 clear redescription assertion Prevent assert failures in a future commit where Tile64 will be selected more often. Fixes: `42ef23ecd1` ("intel/blorp: Don't redescribe some Tile64 clears") Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:03 +00:00
Nanley Chery	103ec323e3	anv: Ensure host-transfer tilings are supported by ISL ISL's tiled-memcpy functions don't support Yf, Ys, and Tile64. Remove those tilings when creating an image which will be used with host-image copies. The identical memory layout flag is checked by tests such as: dEQP-VK.image.host_image_copy.identical_memory_layout.optimal.bc5_snorm_block dEQP-VK.image.host_image_copy.query.linear.r16_unorm Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:03 +00:00
Nanley Chery	0e1cc2216d	anv: Disable multisampled host transfer support We don't actually handle this case. The next patch will limit the amount of tilings used when an image is created with VK_IMAGE_USAGE_HOST_TRANSFER_BIT_EXT. This prevents zink failures on DG2 for various multisampled test cases. For example: arb_internalformat_query2-internalformat-size-checks -auto -fbo Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:03 +00:00
Nanley Chery	78e24605db	intel/isl: Reduce scope of Yf-disabling workaround The missing bits for correct operation with compressed textures and multisampled textures were added in previous commits. The issues with lossless compression and higher miptail slots seem to affect 128bpb formats as well. However, we're only failing tests which use compression (even if those tests never actually use the compression format, just blorp_copy() up and down). Limit the workaround only to compressed formats until we get more information/testing. Tests: dEQP-VK.api.copy_and_blit.core.image_to_buffer.3d_images.mip_copies_etc2_r8g8b8a8_unorm_block_16x8x24 dEQP-VK.pipeline.monolithic.sampler.view_type.3d.format.astc_10x6_unorm_block.mipmap.linear.lod.select_bias_3_1 dEQP-VK.api.copy_and_blit.core.image_to_buffer.2d_images.mip_copies_astc_12x12_unorm_block_64x192 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:02 +00:00
Nanley Chery	ec37a06d93	intel/isl: Rework miptail restrictions with CCS This will be used to clarify some undocumented restrictions with 64bpb and 128bpb formats. Changes include: * Drop a redundant tiling check * Restrict workarounds to the right ISL_SURF_DIM * Handle the Yf case for the 2D workaround * Implement a narrower workaround for the 3D workaround Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>	2026-01-26 21:09:02 +00:00
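
The offset arithmetic described in commit aafe8967fd (a per-slot offset of `x - 16` with a base of 18 should equal an offset of `x` with a base of 2) can be checked with a small sketch. This is only an illustration of the expected 32-bit unsigned wrap, not code from Mesa: the helper names `per_slot_offset`, `add_u32`, and `combined_offset` are hypothetical, and the sketch does not model what pre-Tigerlake hardware actually computes.

```python
MASK32 = 0xFFFFFFFF                 # 32-bit unsigned wrap
GLOBAL_OFFSET_BITS = 11             # URB Global Offset width, per the commit message
GLOBAL_OFFSET_MAX = (1 << GLOBAL_OFFSET_BITS) - 1


def per_slot_offset(x: int) -> int:
    """Hypothetical helper: (x - 16) stored as an unsigned 32-bit value."""
    return (x - 16) & MASK32


def add_u32(a: int, b: int) -> int:
    """Add two values with 32-bit unsigned wrap, as shader integer math does."""
    return (a + b) & MASK32


def combined_offset(x: int, base: int = 18) -> int:
    """Workaround shape: fold the base into the per-slot offset in the shader."""
    return add_u32(per_slot_offset(x), base)


if __name__ == "__main__":
    for x in (0, 15, 16):
        print(f"x={x:2d} per-slot={per_slot_offset(x):#010x} "
              f"combined={combined_offset(x)}")
```

For `x <= 15` the per-slot offset lands in `0xfffffff0` to `0xffffffff`, and a correctly wrapped sum with the base of 18 falls in `[2, 17]`, matching the equivalent load at offset `x` with a base of 2.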

15916 commits