Does nothing for now. This will be used in a future patch where a
64K-aligned image may be selected over a 4K-aligned one.
Follows the alignment request behavior specified in
VkImageAlignmentControlCreateInfoMESA. Specifically, this preference
does not override attempts by ISL to enable compression.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Prevent assert failures in a future commit where Tile64 will be selected
more often.
Fixes: 42ef23ecd1 ("intel/blorp: Don't redescribe some Tile64 clears")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
ISL's tiled-memcpy functions don't support Yf, Ys, and Tile64. Remove
those tilings when creating an image which will be used with host-image
copies.
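A minimal sketch of the filtering described above, assuming tilings are
tracked as a bitmask. The TILING_* bits and
filter_tilings_for_host_copy() are hypothetical names used only for
illustration; they are not ISL's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tiling bits, for illustration only. */
enum {
   TILING_LINEAR = 1u << 0,
   TILING_X      = 1u << 1,
   TILING_Y      = 1u << 2,
   TILING_YF     = 1u << 3,
   TILING_YS     = 1u << 4,
   TILING_64     = 1u << 5,
};

/* Tilings the tiled-memcpy path is assumed not to handle. */
#define TILING_NO_MEMCPY_MASK (TILING_YF | TILING_YS | TILING_64)

static uint32_t
filter_tilings_for_host_copy(uint32_t allowed, bool host_transfer)
{
   if (host_transfer)
      allowed &= ~(uint32_t)TILING_NO_MEMCPY_MASK;

   /* The caller must still be left with at least one usable tiling. */
   assert(allowed != 0);
   return allowed;
}
```

Images created without host-transfer usage keep their full set of
tilings in this sketch.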
The identical memory layout flag is checked by tests such as:
dEQP-VK.image.host_image_copy.identical_memory_layout.optimal.bc5_snorm_block
dEQP-VK.image.host_image_copy.query.linear.r16_unorm
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
We don't actually handle this case. The next patch will limit the number
of tilings used when an image is created with
VK_IMAGE_USAGE_HOST_TRANSFER_BIT_EXT. This prevents zink failures on DG2
for various multisampled test cases. For example:
arb_internalformat_query2-internalformat-size-checks -auto -fbo
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
The missing bits for correct operation with compressed textures and
multisampled textures were added in previous commits.
The issues with lossless compression and higher miptail slots seem to
affect 128bpb formats as well. However, we're only failing tests which
use compression (even if those tests never actually use the compression
format, just blorp_copy() up and down). Limit the workaround only to
compressed formats until we get more information/testing.
Tests:
dEQP-VK.api.copy_and_blit.core.image_to_buffer.3d_images.mip_copies_etc2_r8g8b8a8_unorm_block_16x8x24
dEQP-VK.pipeline.monolithic.sampler.view_type.3d.format.astc_10x6_unorm_block.mipmap.linear.lod.select_bias_3_1
dEQP-VK.api.copy_and_blit.core.image_to_buffer.2d_images.mip_copies_astc_12x12_unorm_block_64x192
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
This will be used to clarify some undocumented restrictions with 64bpb
and 128bpb formats. Changes include:
* Drop a redundant tiling check
* Restrict workarounds to the right ISL_SURF_DIM
* Handle the Yf case for the 2D workaround
* Narrow the scope of the 3D workaround
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Allow them in all cases except for one which prevents
dEQP-GLES31.functional.image_load_store.3d.atomic.xor_r32i_return_value
from hitting the following assertion on TGL:
convert_rt_from_3d_to_2d:
Assertion `!isl_tiling_is_std_y(info->surf.tiling)' failed.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
GL only allows atomics on R32 formats. So, for a shader which does
atomic operations, only decompress the bound R32-formatted images
instead of every image.
Aside from the performance improvement, explicitly limiting the formats
here makes it clear which formats may be resolved on gfx12.0. This helps
us to limit the scope of the Ys + 3D-dim restriction that will be added
in the next patch.
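The format filtering described above can be sketched as follows. The
enum, struct, and function names are hypothetical stand-ins for
illustration, not the driver's actual types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format and image types, for illustration only. */
enum fmt { FMT_R32_UINT, FMT_R32_SINT, FMT_R32_FLOAT, FMT_RGBA8_UNORM };

struct bound_image {
   enum fmt format;
   bool needs_decompress; /* written by this sketch */
};

static bool
fmt_is_r32(enum fmt f)
{
   return f == FMT_R32_UINT || f == FMT_R32_SINT || f == FMT_R32_FLOAT;
}

/* Flag only the R32 images for decompression when the shader uses
 * atomics. Returns the number of images flagged. */
static size_t
flag_images_for_atomics(struct bound_image *imgs, size_t count,
                        bool shader_uses_atomics)
{
   size_t flagged = 0;

   assert(imgs != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      imgs[i].needs_decompress =
         shader_uses_atomics && fmt_is_r32(imgs[i].format);
      flagged += imgs[i].needs_decompress;
   }
   return flagged;
}
```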
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
ISL prevents certain tilings from being used on 3D shader images prior
to gfx12 due to an undocumented dataport issue. We're going to allow
these tilings soon, so increase use of the shader flag to make use of
ISL's workaround.
Test case:
arb_shader_image_load_store-layer
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
The BO may contain a surface that is tiled with a 64K tiling. Without
this change, the following piglit test assert fails on ICL:
ext_external_objects-vk-stencil-display -auto -fbo
The assertion is:
isl_gfx11_emit_depth_stencil_hiz_s: Assertion
`info->depth_address % info->depth_surf->alignment_B == 0' failed.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
Prevents the following piglit test from failing on DG2 when Tile64 is
force-enabled:
fbo-clear-formats GL_ARB_texture_rg -auto -fbo
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
When determining if an LOD can fit within a miptail, we must minify in
pixel space and then convert to elements.
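A small sketch of why the order matters for block-compressed formats.
The helper names are hypothetical and only the width is considered:

```c
#include <assert.h>
#include <stdint.h>

/* Halve a dimension `level` times, never going below 1. */
static uint32_t
minify(uint32_t n, uint32_t level)
{
   assert(level < 32);
   uint32_t v = n >> level;
   return v > 0 ? v : 1;
}

static uint32_t
div_round_up(uint32_t n, uint32_t d)
{
   assert(d > 0);
   return (n + d - 1) / d;
}

/* Minify in pixel space, then convert to elements (blocks). */
static uint32_t
level_width_el(uint32_t width_px, uint32_t block_w, uint32_t level)
{
   return div_round_up(minify(width_px, level), block_w);
}

/* The other order: convert to elements first, then minify. */
static uint32_t
level_width_el_wrong(uint32_t width_px, uint32_t block_w, uint32_t level)
{
   return minify(div_round_up(width_px, block_w), level);
}
```

For a 20-pixel-wide level 0 with 8-pixel-wide blocks, level 1 is 10
pixels wide and needs 2 elements, whereas minifying the 3-element width
of level 0 gives only 1.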
Prevents the following test case from failing when Yf is force-enabled:
dEQP-VK.image.texel_view_compatible.graphic.extended.3d_image.texture_read.astc_8x5_srgb_block.r32g32b32a32_uint
Fixes: 46f45d62d1 ("intel/isl: Start using miptails")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
This bit seems to affect whether the SKL or ICL swizzles are used for
multisampled surfaces.
Prevents the following test case from failing when Yf is force-enabled:
dEQP-VK.pipeline.monolithic.multisample.misc.dynamic_rendering.multi_renderpass.r8g8b8a8_unorm_r16g16b16a16_sfloat_r16g16b16a16_sint_d32_sfloat_s8_uint.random_203
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
From the ICL PRMs Volume 5: Memory Data Formats, "Compressed
Multisampled Surfaces":
Tiling for CMS and UMS Surfaces
Multisampled CMS and UMS use a modified table from
non-mulitsampled 2D surfaces.
[...]
TileYS: In addition to u and v, the sample slice index “ss” is
included in the address swizzling according to the following
table.
[...]
TileYF: In addition to u and v, the sample slice index “ss” is
included in the address swizzling according to the following
table.
For depth/stencil surfaces with Yf/Ys tiling, don't use the MSAA
swizzles.
With the driver modified to prefer Ys/Yf for depth buffers, this
fixes 14 failing tests in the VK CTS group:
dEQP-VK.pipeline.monolithic.multisample.misc.clear*16x*
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
We get a display fd passed in to us through wsi_display_init_wsi(), and
when that was the first open of the display device with no previous DRM
master, it got master privs and we saved that as the display fd to use for
KHR_display. However, that meant that no other client could get DRM master,
preventing things like vkAcquireDRMDisplayEXT() users from getting a
master fd to pass in to us.
Instead, we can drop master at device init time, and pick it back up when
a VK_KHR_display swapchain is created that uses that fd.
This allows dEQP-VK.wsi.acquire_drm and dEQP-VK.wsi.direct_drm CTS tests
to run, which was previously impossible (those tests try to create a
custom VK instance, while the CTS already has an instance that had been
created with KHR_display enabled, so they're not the first open of the
fd). It also means that you could successfully implement VT switching
between a KHR_display client and other userspace DRM clients. Also, we
can finally implement the text about vkAcquireDRMDisplayEXT's drmFd
needing to match the device's fd.
The risk of this change, though, is if you're implementing a compositor,
and your clients have a chance to open the DRM fd before you've created
your swapchain, they may inadvertently have master and DOS you. However,
this is no different than the previous situation, where someone with
permissions to open DRM could hold master and DOS you already.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38502>
Follow the semi-documented behavior of the blob driver and skip
rendering bins whose fragment density is 0 (i.e. fragment area is
infinite). Some Oculus VR apps using an earlier version of the Unity SDK
rely on this instead of VK_QCOM_multiview_per_view_render_areas.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>
When apps use VK_QCOM_multiview_per_view_render_areas, there may be some
bins which are only visible (i.e. overlapping the render area) in one
view. In the typical VR use-case, there is a strip of bins to the right
of the left eye and to the left of the right eye that are not used
with that eye. By making sure that the right eye is never rendered to,
we can reuse that space to double the GMEM height and merge two bins
along the left edge, partially offsetting the cost of extra bins from
offsetting the left and right viewports and render areas.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>
In order to implement this we have to modify all of the cases where we
set a scissor and then loop over attachments to conditionally set the
scissor inside each layer of the attachment based on whether per-view
render areas are supported.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>
I noticed when adding support for render areas per view that this didn't
take the number of views into account at all. Based on the code, the
right thing to do seems to be to multiply by the layer count.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>
We already had to implement per-view viewports for fragment density map.
When multiviewPerViewViewports is enabled, we just have to do what we
did before, except we also have to stop sharing the same original
viewport across all views when FDM is enabled. The app can specify a
different viewport for each view and on top of that we will also
transform it differently depending on the fragment area for that view,
instead of only the transform being different.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35894>
This is a backport of f134cc5a1e:
("Update <type category="funcpointer"> schema to simplify")
in vulkan-docs, essentially. It changed things about how vk.xml
is parsed.
Fixes: b30f780c ("vulkan: update spec to 1.4.340")
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39502>
The rectangle to clear, which is the render area for subpass clears, is
specified in framebuffer coordinates, but the hardware uses GMEM
coordinates with FDM. I assumed this was ok for subpass clears, because
the end of the bin in GMEM coordinates is always less than or equal to
the end in framebuffer coordinates, so we would clear past the end of
the bin which is still safe because only the render area would be stored
to sysmem:
bin 0   bin 1   bin 2
|---|   |---|   |---|      GMEM coordinates (what the HW "sees")
|-------|-------|-------|  framebuffer coordinates (used e.g.
                           as STORE_OP_STORE destination)
|-----------------------|  render area/clear rectangle (past end of bin
                           in GMEM coordinates!)
There was a hack for FDM offset, where framebuffer coordinates are
shifted to the left, but that was it. However this breaks down if the
render area doesn't start at (0,0), because it can miss pixels in GMEM
coordinates that should be cleared:
bin 0   bin 1   bin 2
|---|   |---|   |---|      GMEM coordinates (what the HW "sees")
|-------|-------|-------|  framebuffer coordinates (used e.g.
                           as STORE_OP_STORE destination)
     |------------------|  render area/clear rectangle (we don't clear
                           bin 0!)
Here we should clear the right half of bin 0 but instead we don't clear
it at all.
Instead of adding yet more hacks to expand the render area, just add a
patchpoint to transform the render area into GMEM coordinates. We
already do this for CmdClearAttachments where we didn't have a choice,
so just reuse that. As a bonus, we can also delete the hack for FDM
offset.
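A one-dimensional sketch of the transform described above. It assumes,
as the diagrams suggest, that a bin's GMEM span starts at the bin's
framebuffer start and is shrunk by the fragment width. The struct and
function names are hypothetical and this is not the actual patchpoint
code:

```c
#include <assert.h>
#include <stdint.h>

struct span { int32_t start, end; }; /* half-open [start, end) */

/* Map a clear span in framebuffer coordinates onto one bin in GMEM
 * coordinates. Returns an empty span (start == end) if the clear does
 * not touch the bin. */
static struct span
clear_span_to_gmem(struct span clear_fb, int32_t bin_start,
                   int32_t bin_w_fb, int32_t frag_w)
{
   assert(frag_w > 0 && bin_w_fb > 0);

   int32_t bin_end = bin_start + bin_w_fb;
   int32_t lo = clear_fb.start > bin_start ? clear_fb.start : bin_start;
   int32_t hi = clear_fb.end < bin_end ? clear_fb.end : bin_end;
   struct span out = { bin_start, bin_start };

   if (lo >= hi)
      return out;

   /* Round the start down and the end up so no covered pixel is lost. */
   out.start = bin_start + (lo - bin_start) / frag_w;
   out.end = bin_start + (hi - bin_start + frag_w - 1) / frag_w;
   return out;
}
```

With 8-pixel bins, a fragment width of 2 and a render area of [4, 24),
bin 0 maps to [2, 4) in this sketch, i.e. the right half of the bin in
GMEM coordinates.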
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39495>
The previous method to calculate imageSize().z was
incorrect for a cubearray view.
This change was tested on palm and cayman. Here is the test fixed:
spec/arb_texture_view/rendering-layers-image/layers rendering of imagecubearray: fail pass
Fixes: 6c1432f0be ("r600/eg: fix cube map array buffer images.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39063>
This extension seems to work.
This change was tested with the current piglit repository:
spec/ext_shader_realtime_clock/execution/clock2x32: skip pass
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37954>
Adding this mmap mode makes explicit in code that PAT compressed
buffers should not be mmaped.
Although there is no CPU access, the Xe KMD uAPI still requires a
cpu_caching value to be set, so set it to WC.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
That function is only called from the i915 backend, so it doesn't need
to be in common code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
XD is transient display, meaning that GT caches are flushed when the
display IP needs to access the buffer.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
This is not used and we don't have any future plans to use it, so removing it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
This is not used and doesn't make sense, as the transient display is
on the GPU side.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34222>
Similar to the low latency option for encode, this reduces latency
of decoding at the cost of increased power usage.
Can be enabled with AMD_DEBUG=lowlatencydec
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39450>
Some apps (old FFmpeg, contemporary CTS) send down pMi{Col,Row}Starts in
SB units, not MI units. Instead of depending on those values, which
could be unreliable, derive the tile sizes in SB using other parameters.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39492>
Added support for externally provided motion hints by reading the
MFSampleExtension_MoveRegions sample attribute.
The motion hint data is converted into pipe_enc_move_info and passed
down to the driver for use during encoding.
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39515>
This change fixes the clamp to max_texel_buffer_elements
issue related to rv770 and older gpus.
Here are the tests fixed on rv770:
spec/arb_texture_buffer_object/texture-buffer-size-clamp/r8ui_texture_buffer_size_via_sampler: fail pass
spec/arb_texture_buffer_object/texture-buffer-size-clamp/rg8ui_texture_buffer_size_via_sampler: fail pass
spec/arb_texture_buffer_object/texture-buffer-size-clamp/rgba8ui_texture_buffer_size_via_sampler: fail pass
Fixes: 1a441ad5cb ("r600: clamp to max_texel_buffer_elements")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39385>
This is a gl4.3 issue very similar to e8fa3b4950.
The mode r10g10b10a2_sscaled processed as vertex on palm at the
hardware level doesn't follow the current standard. Indeed, the .w
component (2-bits) is not calculated as expected. The table below
describes the situation.
This change fixes this issue by adding two gpu instructions at
the vertex fetch shader stage. An equivalent C representation and
a gpu asm dump of the generated sequence are available below.
.w(2-bits)  expected  palm  cypress
    0           0       0      0
    1           1       1      1
    2          -2       2     -2
    3          -1       3     -1
w_out = w_in - (w_in > 1. ? 4. : 0.);
0002 00000024 A0040000 ALU 2 @72
0072 801F2C0A 600004C0 1 w: SETGT*4 __.w, R10.w, 1.0
0074 839FCC0A 61400010 2 w: ADD R10.w, R10.w, -PV.w
Note: cypress returns the expected value, and does not need
this correction.
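The C expression above can be checked against the table with a small
standalone function. The name fix_sscaled_w() is hypothetical, and the
split into two steps assumes that SETGT*4 yields 4.0 when the
comparison holds and 0.0 otherwise:

```c
#include <assert.h>

/* w_in is the raw value palm returns for the 2-bit .w field (0..3). */
static float
fix_sscaled_w(float w_in)
{
   assert(w_in >= 0.0f && w_in <= 3.0f);

   /* Step 1, SETGT*4: 4.0 if w_in > 1.0, else 0.0. */
   float setgt4 = (w_in > 1.0f ? 1.0f : 0.0f) * 4.0f;

   /* Step 2, ADD with the negated result of step 1. */
   return w_in - setgt4;
}
```

Inputs 0, 1, 2 and 3 map to 0, 1, -2 and -1, matching the "expected"
column.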
This change was tested on palm, barts and cayman. Here are the tests fixed:
khr-gl4[3-6]/vertex_attrib_binding/basic-input-case6: fail pass
khr-gles31/core/vertex_attrib_binding/basic-input-case6: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38849>