Enable CCS with Ys on all systems, and with Yf on gfx9-11.
Unfortunately, Yf + CCS isn't supported on gfx12. Tests fail and systems
hang in the CI with this enabled. The simulator also complains about
this combination on tests such as:
dEQP-VK.api.image_clearing.core.clear_color_attachment.multiple_layers.r4g4b4a4_unorm_pack16
dEQP-VK.api.image_clearing.core.clear_color_attachment.single_layer.r4g4b4a4_unorm_pack16_200x180_sample_count_2
The simulator doesn't complain about this combination on depth/stencil
surfaces, but actual hardware still has issues with this.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11057
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
BSpec 46969 (r45602) tells us that we get no fast-clears for 3D:
3D/Volumetric surfaces do not support Fast Clear operation.
For Y-tiled surfaces, we work around this in BLORP with
convert_rt_from_3d_to_2d(). However, that function doesn't support Ys-tiling.
We could modify our surface redescription code paths to support clearing
entire Ys tiles, but we choose to hold off on the added complexity until
we have a use-case.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
We'll use isl_surf_supports_ccs() in a scenario in which we want to
check for CCS support without creating a HIZ or MCS surface beforehand.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
When choosing between the suggested tilings, create one of each allowed
and pick the smallest one. One benefit of using the standard tilings is
that miptails can avoid space waste in mipmapped compressed textures.
From the ICL PRM, Volume 5: Memory Data Formats, "MIP Layout":
If Tiling is enabled, then each MIP is layed out using one or more
tiles. If TileYf or TileYs tiling is enabled (TR_MODE != NONE), then
some of the MIPs may actually be stored in a MIPTail which fits in a
single 64K or 4K tile. The layout above, then only applied to MIPs
which are not packed in the MIP Tail. Note that, depending on surface
height the Vertical Alignment that surface can actually have the last
few mips layed out below LOD1. Using MIP Tail (if supported)
eliminates this possibility.
In the performance CI, this helps:
* Hogwarts Legacy on DG2 by 0.64%
* Satisfactory on BMG by 0.89%
* Wukong on BMG by 0.77%
Highlights on memory saved by using Tile64 from at most 10k frames in
game traces on DG2:
* Hogwarts. 32 instances of:
Saved 128 4KB page(s). extent=4096x4096x1 dim=2d levels=13 fmt=BC7_UNORM
* Assassin's Creed. 8 instances of:
Saved 768 4KB page(s). extent=120x68x192 dim=3d levels=1 fmt=R16G16B16A16_FLOAT
* Black Ops 3. 3 instances of:
Saved 864 4KB page(s). extent=172x140x288 dim=3d levels=1 fmt=BC6H_UF16
* God of War. 1 instance of:
Saved 1920 4KB page(s). extent=320x170x192 dim=3d levels=1 fmt=R16G16B16A16_FLOAT
This patch may cause regressions on SKL-TGL because the smaller surface
may not support compression. This will be fixed in a coming patch.
v2. Don't factor in the image alignments when comparing their sizes.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14074
Reviewed-by: Rohan Garg <rohan.garg@intel.com> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Does nothing for now. This will be used in future patch where a
64K-aligned image may be selected over a 4K-aligned one.
Follows the alignment request behavior specified in
VkImageAlignmentControlCreateInfoMESA. Specifically, this preference
does not override attempts by ISL to enable compression.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Prevent assert failures in a future commit where Tile64 will be selected
more often.
Fixes: 42ef23ecd1 ("intel/blorp: Don't redescribe some Tile64 clears")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
ISL's tiled-memcpy functions don't support Yf, Ys, and Tile64. Remove
those tilings when creating an image which will be used with host-image
copies.
The identical memory layout flag is checked by tests such as:
dEQP-VK.image.host_image_copy.identical_memory_layout.optimal.bc5_snorm_block
dEQP-VK.image.host_image_copy.query.linear.r16_unorm
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
We don't actually handle this case. The next patch will limit the amount
of tilings used when an image is created with
VK_IMAGE_USAGE_HOST_TRANSFER_BIT_EXT. This prevents zink failures on DG2
for various multisampled test cases. For example:
arb_internalformat_query2-internalformat-size-checks -auto -fbo
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
The missing bits for correct operation with compressed textures and
multisampled textures were added in previous commits.
The issues with lossless compression and higher miptail slots seem to
affect 128bpb formats as well. However, we're only failing tests which
use compression (even if those tests never actually use the compression
format, just blorp_copy() up and down). Limit the workaround only to
compressed formats until we get more information/testing.
Tests:
dEQP-VK.api.copy_and_blit.core.image_to_buffer.3d_images.mip_copies_etc2_r8g8b8a8_unorm_block_16x8x24
dEQP-VK.pipeline.monolithic.sampler.view_type.3d.format.astc_10x6_unorm_block.mipmap.linear.lod.select_bias_3_1
dEQP-VK.api.copy_and_blit.core.image_to_buffer.2d_images.mip_copies_astc_12x12_unorm_block_64x192
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
This will be used to clarify some undocumented restrictions with 64bpb
and 128bpb formats. Changes include:
* Drop a redundant tiling check
* Restrict workarounds to the right ISL_SURF_DIM
* Handle the Yf case for the 2D workaround
* Implement a narrower workaround for the 3D workaround
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
When determining if an LOD can fit within a miptail, we must minify in
pixel space and then convert to elements.
Prevents the following test case from failing when Yf is force-enabled:
dEQP-VK.image.texel_view_compatible.graphic.extended.3d_image.texture_read.astc_8x5_srgb_block.r32g32b32a32_uint
Fixes: 46f45d62d1 ("intel/isl: Start using miptails")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
This bit seems to affect whether the SKL or ICL swizzles are used for
multisampled surfaces.
Prevents the following test case from failing when Yf is force-enabled:
dEQP-VK.pipeline.monolithic.multisample.misc.dynamic_rendering.multi_renderpass.r8g8b8a8_unorm_r16g16b16a16_sfloat_r16g16b16a16_sint_d32_sfloat_s8_uint.random_203
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
From the ICL PRMs Volume 5: Memory Data Formats, "Compressed
Multisampled Surfaces":
Tiling for CMS and UMS Surfaces
Multisampled CMS and UMS use a modified table from
non-mulitsampled 2D surfaces.
[...]
TileYS: In addition to u and v, the sample slice index “ss” is
included in the address swizzling according to the following
table.
[...]
TileYF: In addition to u and v, the sample slice index “ss” is
included in the address swizzling according to the following
table.
For depth/stencil surfaces with Yf/Ys tiling, don't use the MSAA
swizzles.
With the driver modified forced to prefer Ys/Yf for depth buffers, this
fixes 14 failing tests in the VK CTS group:
dEQP-VK.pipeline.monolithic.multisample.misc.clear*16x*
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Adding this mmap mode makes explicit in code that PAT compressed
buffers should not be mmaped.
Although there is no CPU access Xe KMD uAPI still requires a
cpu_caching to be set, so setting WC.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
That function is only called from i915 backend no needed to be
on common code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
XD is transient display, meaning that GT caches are flushed when
display IP needs access buffer.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
This is not used and we don't have any future plans to use it, so removing it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
This is not used and don't make sense as the transient display is
on the GPU side.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
Minor adjustments to formatting of the copyright line, but keep
dates and holders. "Authors" entries that could be
obtained via Git logs were also removed.
The license in brw_disasm.c and elk_disasm.c don't match directly
any SPDX pattern I could find, so kept as is.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39503>
The previous Gfx12+ implementation using bit masking is failing for FP8
types, so replacing with explicit lookup tables.
For float types, the encoding now aligns with brw_data_type_float, ensuring
correct behavior for DPAS and other 3-source instructions.
Fixes: d1d4e3d530 ("brw: Add EU assembler support for float8")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39448>
Code kept track of blocks both in a linked list and
in an array. Change the client code of the list to
just use the array so we just maintain one.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>
The code currently don't remove blocks, when a block is about to become
empty, the code will replace the last instruction with a NOP.
If we want to have actual block removals again, there are other
strategies than removing them as we iterate (e.g. allow empty blocks
and then collect them in a pass or right after iteration).
So remove those macros.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>
Change the few other cases to an inline function that
does the same job. This macro will change in ways that
are not compatible with the non-assembler usages.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39363>
This optimization doesn't work when the ray query index isn't uniform across
the subgroup, which is something the spec allows. While there are some smart
ways to fix this and still avoid unnecessary spilling, its not worth investing
the time until we find a realtime raytracing workload that actually needs to
use multiple live ray queries for something.
Fixes: 1f1de7eb ("anv,brw: Allow multiple ray queries without spilling to a shadow stack")
Acked-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39445>
Disable RHWO by default for singlesample draws and for MSAA
draws if a drirc key is set (avoid perf hit if not needed).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>