Replace load_mesh_global_arg_addr_intel with a more general intrinsic
load_mesh_inline_data_intel, since inline data now hold both
a pointer descriptor information and the first few push constants.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>
The format on this platform is slightly different from the one used on
TGL. Also it's part of the surface state instead of an aux-map.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14355>
The name of the bit field is CompressionFormat. The format subsections
of the field specify the alternate names of RenderCompressionFormat or
MediaCompressionFormat depending on the compression type.
We're going to start programming this field for media compression, so
we'd like to use either the bit field name or a new
MediaCompressionFormat field. Either option seems fine, so we go with
the first.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14355>
We haven't tested the single-sampled case, so this doesn't enable any
more uses of standalone CCS. This does however, allow multisampled
surfaces with MCS or HIZ to get upgraded to MCS_CCS and HIZ_CCS,
respectively.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14431>
On XeHP, CCS doesn't require VMA on XeHP. The HW provides anything
allocated in LMEM a mapping to a CCS memory range for free. So, we:
1) use the implicit CCS framework to avoid adding an image memory
binding for the CCS surface.
2) leave each BO sized as-is instead of adding on space for the CCS.
Thankfully the framework only adds on space if an aux-map is present.
XeHP has no aux-map, so this patch doesn't explicitly do anything for
this.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14431>
The fallback is incompatible with allocations that use CCS on XeHP. On
that platform, compression can't be used in SMEM.
Apps should be okay with this change. They're able to manage local and
system memory heaps directly (see VK_EXT_memory_budget).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14431>
ISL reports no CCS support for these formats, except for
R10G10B10A2_UNORM_SRGB. Anv doesn't support that format at all however.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14431>
this is triggered by mesa's own wsi handling, so stop printing nonsense
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14743>
Commit e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
accidentally disabled CCS_E on TGL+ because it checked for
image->vk.mip_levels > 0 instead of image->vk.mip_levels > 1.
Instead of reverting it, we remove the code which disables CCS_E for
mipmapped or arrayed images now that we've sufficiently handled the
clear color issue in other ways.
Fixes: e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14723>
On TGL, if a block of fragment shader outputs match the surface's clear
color, the HW may convert them to fast-clears (see HSD 14010672564).
This can lead to rendering corruptions if not handled properly. We
restrict the clear color to zero to avoid issues that can occur with:
- Texture view rendering (including blorp_copy calls)
- Images with multiple levels or array layers
Fixes: e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14723>
CCS_E is currently disabled on TGL+, but we'll enable it soon. We choose
to explicitly disable it for certain copy operations to avoid CTS
failures in the following groups:
- dEQP-VK.drm_format_modifiers.export_import.*
- dEQP-VK.synchronization*
Fixes: e614789588 ("anv: Also disallow CCS_E for multi-LOD images")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14723>
Includes a fake framebuffer allocation that's necessary for the code we
still use from the regular render passes.
v3: (Lionel)
- Reuse the attachment count from the faux render pass, remove now
unused function
- Add a cmd_buffer_end_rendering function to match begin_rendering,
making use of the split stuff from end_subpass
v4: (Lionel)
- Don't bother with mark_images_writen or resolves on suspend case
- Remove flush at the end of end_rendering, it's not needed
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13980>
v3: (Lionel)
- Handle VkPipelineRenderingCreateInfoKHR not being present
- Rename dynamic_pass and set it for regular render passes too
v4: C99 is good (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13980>
There's two of them because they can be created from three points in the
code that provide different details and this is the least ugly way I
could think of for now.
v2: Avoid allocations (Lionel)
v3: Move definition closer to its usage (Lionel)
v4: (Lionel)
- Simplify anv_dynamic_pass_init_full
- Zero out pass/subpass to avoid stall pointers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13980>
On platforms where we don't support 64 bit instructions we shouldn't
pass such instructions for the code generator to lower into supported
instructions, because this makes their execution pipeline
unpredictable to the scoreboard lowering pass on XeHP+ platforms.
We really should be reducing all these 64 bit instructions before code
generation, so here we add an assert to help us catch and fix these
cases more easily.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[ Francisco Jerez: Also allow has_integer_dword_mul. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14273>
This fixes a bug in the CLUSTER_BROADCAST code generation that causes
the original IR region to be ignored, this will be a problem when we
start lowering 64-bit CLUSTER_BROADCAST instructions at the IR level,
since it will lead to instructions with non-trivial regioning.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14273>
One of the two SHUFFLE implementations wasn't taking into account the
destination stride at all, and the other (more commonly used) one was
taking it into account incorrectly since brw_reg::hstride represents
the stride logarithmically, so we need to use a left-shift operator
instead of product. Found by inspection.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14273>
This fixes a bug in the handcrafted SIMD lowering done by the SHUFFLE
code generation, which wasn't taking into account the source and
destination region strides while deciding whether it needs to split an
instruction.
v2: Use new element_sz() helper instead of left shift. (Lionel)
Fixes: 90c9f29518 ("i965/fs: Add support for nir_intrinsic_shuffle")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14273>