Pure refactor. No intended change in behavior.
This makes the code infinitely easier to understand. And it uncovers
a potential bug (marked with XXX comment).
v2: Fix narrowing conversions on 32-bit arch. s/size_t/uintmax_t/.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Months ago, make_surface() added *all* surfaces required for the given
aspect. It was a monster monolithic function, and difficult to reason
about its correctness. In commit c652ff8c (2020-03-06), I split the code
for aux surfaces into its own function, add_aux_surface_if_supported().
This patch continues the splitting, therefore making bugs easier to
identify.
Code changes:
- Move the code that adds the shadow surface from make_surface() to
a new function add_shadow_surface(), called from
add_all_surfaces().
- Move the call to add_aux_surface_if_supported() from make_surface()
to add_all_surfaces().
- To preserve correctness of the assertions on image layout in
make_surface(), move them to the loop in add_all_surfaces() after
all the aspect's surfaces have been added.
- Rename make_surface() to add_primary_surface() because now that's
what it does.
Pure refactor, no intended change in behavior.
v2:
- Rebase onto "anv: Fix isl_surf_usage_flags for stencil images".
- Sanitize the image's extent earlier.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
This deduplicates the loops in anv_image_create() and
resolve_ahw_image().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In anv_get_image_format_properties(), the special-case code for
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT is tiny. It is mostly a detached
'case' in the 'switch' block for VkImageType. So move the special-case
code to immediately follow the 'switch' block.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In vkGetPhysicalDeviceImageFormatProperties.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The check in anv_get_image_format_properties() is already handled in
anv_get_image_format_features().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
When filling VkImageFormatProperties, anv_get_image_format_properties()
checks the requested VkImageUsageFlags and VkImageCreateFlags against
the VkFormatFeatureFlags available to the queried VkFormat. However, we
neglected to consider if any formats given in
VkImageFormatListCreateInfo
further restricted the available VkFormatFeatureFlags.
The image view formats are more likely to introduce additional
restrictions when DRM format modifiers are present.
v2:
- Do not drop anv_formats_ccs_e_compatible().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
If anv_get_image_format_features reports that the inputs are
unsupported, fail immediately.
Without the early fail, I have less confidence in the function's
correctness when a DRM format modifier is present.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The code in anv_get_image_format_properties() that set sampleCounts
appears correct, but weirdly inconsistent. Clean the code to
consistently set sampleCounts in the same location as
maxExtent/maxMipLevels/maxArraySize.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rename it to get_drm_format_modifier_properties_list() because it is now
independent of WSI.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In vkGetPhysicalDeviceImageFormatProperties2, we advertised support for
VK_IMAGE_TILING_LINEAR and VK_IMAGE_TILING_OPTIMAL for all memory
handles.
However, when importing or exporting an image, there must exist a method
that enables the app and driver to agree on the image's memory layout.
If no method exists, then we should reject image creation.
v2:
- Reduce copy-paste for Lionel.
v3:
- Treat tiling LINEAR and DRM_FORMAT_MODIFIER as identical when
determing compatible memory handles.
- Improve comments.
v4:
- Remove DMA_BUF from opaque_fd_only_props.
v5:
- Minor changes to code style for `if`. (for jekstrand)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v4)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)
The code asserted that we supported no more than 4 formats with
modifiers: /VK_FORMAT_B8G8R8(A8)?_(SRGB|UNORM)/.
Strangely, 2 of the 4 were non-power-of-two formats, which were rejected
elsewhere.
The assertion's comment suggested that we use a hard-coded list of
formats because the driver was not yet able to determine if a given
format was compatible with a given modifier. Therefore, the list only
contained formats that were compatible with *all* modifiers. That code
deficiency no longer exists: anv_get_image_format_features() can check
format/modifier compatibility.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Refactor in get_wsi_format_modifier_properties_list().
Instead of iterating over a function-local hard-coded list, iterate over
all modifiers in isl_drm.c.
This will improve agreement in behavior between
VkDrmFormatModifierPropertiesListEXT
VkPhysicalDeviceImageDrmFormatModifierInfoEXT.
The future disagreement this patch attempts to prevent is the
combination of:
a. VkDrmFormatModifierPropertiesListEXT neglects to return a valid
modifier because its hard-coded list of modifiers drifts
out-of-sync with hard-coded lists elsewhere in the code. (Already
today, the list in get_wsi_format_modifier_properties_list() does
not match the list in isl_drm.c; though, this has produced no bug
yet).
b. vkGetPhysicalDeviceImageFormatProperties2 accepts, via
VkPhysicalDeviceImageDrmFormatModifierInfoEXT, the modifier
overlooked in (a), because it does not use the same hard-coded
list in get_wsi_format_modifier_properties_list(). (Recall that
the spec requires vkGetPhysicalDeviceImageFormatProperties2 to
correctly accept/reject any int that the app provides, even when
the int is an invalid modifier).
c. The Bug. The driver told the app in (b) that it can legally
create an image with format+modifier, but the app cannot query
the VkFormatFeatureFlags of the format+modifier due to (a).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fill VkDrmFormatModifierPropertiesEXT::drmFormatModifierTilingFeatures
with anv_get_image_format_features().
anv_formats.c:get_wsi_format_modifier_properties_list() incorrectly left
it uninitialized.
v2: Increment drmFormatModifierPlaneCount if modifier support aux.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Because anv_get_image_format_features() now understands modifiers, also
relocate most of the modifier compatibility checks from
anv_get_format_plane() into anv_get_image_format_features() in order to
avoid duplication.
The new signature forces some code movement in
anv_get_image_format_properties().
v2:
- Reject VK_FORMAT_B4G4R4A4_UNORM_PACK16 with modifiers on HSW.
v3:
- Revert the v2 change.
- Query isl_format_layout instead of pipe_format. (for jekstrand)
- Drop misguided comments. (for jekstrand)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3)
The code did not return error when VK_IMAGE_CREATE_DISJOINT_BIT was
incompatible with the other input params.
If the Vulkan spec forbids a set of input params for vkCreateImage,
but permits them for vkGetPhysicalDeviceImageFormatProperties2,
then vkGetPhysicalDeviceImageFormatProperties2 must reject those input
params with failure.
- v2: Clearer commit message.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Apply the workaround to all gen hardawre
Ref: GEN:BUG:1409725701
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7463>
We are limited to 64 threads per dispatched group, regardless of what
num_cs_threads claims, so advertise that limit correctly.
Fixes (on TGL and up):
dEQP-VK.subgroups.size_control.compute.required_subgroup_size_min
and other *.required_subgroup_size_min tests.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7453>
pool can't be NULL at this point, because it was already
dereferenced earlier.
Addresses "Dereference before null check" issue reported by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
NIR derefs currently have exactly one variable mode. This is about to
change so we can handle OpenCL generic pointers. In order to transition
safely, we need to audit every deref->mode check. This commit adds a
set of helpers that provide more nuanced mode checks and converts most
of NIR to use them.
For simple cases, we add nir_deref_mode_is and nir_deref_mode_is_one_of
helpers. These can be used in passes which don't have to bother with
generic pointers and just want to know what mode a thing is. If the
pass ever encounters generic pointers in a way that this check would be
unsafe, it will assert-fail to alert developers that they need to think
harder about things and fix the pass.
For more complex passes which require a more nuanced understanding of
modes, we add nir_deref_mode_may_be and nir_deref_mode_must_be helpers
which accurately describe the compiler's best knowledge about the given
deref. Unfortunately, we may not be able to exactly identify the mode
in a generic pointers scenario so we have to be very careful when we use
these. Conversion of these passes is left to later commits.
For the case of mass lowering of a particular mode (nir_lower_explicit_io
is one good example), we add nir_deref_mode_is_in_set. This is also
pretty assert-happy like nir_deref_mode_is but is for a set containment
comparison on deref modes where you expect the deref to either be all-in
or all-out.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
anv_bo_pool_alloc expects that the memory returned by and_gem_mmap
was annotated using VALGRIND_MALLOCLIKE_BLOCK, but anv_gem_mmap_offset
didn't do that. Move annotation from anv_gem_mmap_legacy to common
code.
Fixes: 4abf0837cd ("anv: Add support for new MMAP_OFFSET ioctl.")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7381>
In many cases those revision happened every before the first public
release of the spec and we just forgot to update our numbers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7136>
On Gen12+, stencil buffer compression does not support fast clear so we
don't have to track clear address for it.
v2:
- Use isl_aux_usage_has_fast_clears (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
Handle compressed stencil buffer transition from one layout to another
on gen12+.
When stencil compression is enabled, we have to initialize buffer via
stencil clear (HZ_OP) before any renderpass.
v2:
- Pass predicate bit false to anv_image_ccs_op (Nanley Chery)
v3:
- update aspect assertion (Nanley Chery)
v4:
- Make state decision based on anv_layout_to_aux_state instated of
anv_layout_to_aux_usage (Sagar Ghuge)
v5:
- No need to handle stencil CCS resolve case (Jason Ekstrand)
- Initialize buffer using HZ_OP (Nanley Chery)
v6: (Nanley Chery)
- Pass correct layer/level count.
- Remove local variable.
v7:
- Skip stencil initialization with HZ_OP packet if followed by fast
clear. (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
Don't check the auxiliary surface's ISL surf in order to return the
surface levels/layers instead we can return the anv_image parameter.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
When blitting from source depth range [0-3] into destination depth
range [0-2], we'll have to use a source layer that is in between 2
layers of the 3D source image.
Other than having an incorrect formula, we're also using integer which
prevent us from using the right source layer.
v2: Drop + 0.5 on application offsets
v3: Reuse num_layers (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
Now that we have the HDC, using the data cache for UBO pulls seems to
help things quite a bit:
GTA V DXVK 104.0%
Talos Principle GL 102.8%
Rise of Tomb Raider VK 102.8%
Dark Souls 3 DXVK 101.4%
Witcher3 DXVK 101.3%
Bioshock Infinite GL 100.5%
Doom 2016 VK 97.7%
Doom is a bit of a loss but it helps enough other stuff, it's probably
worth the hit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7230>
On Gen12+, we can enable additional caches in certain usage situations.
This routes that decision making to a central place in ISL, based on
surface usage flags, and updates both drivers to use it. (i965 doesn't
need to change because it doesn't support Gen12.)
We continue handling the "external" decision via an anv_mocs() wrapper
for now, since we store that flag in anv_bo, which isl doesn't know
about. (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
actually sure that would be cleaner.)
This patch should not have any functional nor performance effects, as
we continue selecting the exact same MOCS values for now.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
Most uses of this function deal with destination buffers, but for
copy_buffer_to_image, the buffer is the source, and isn't rendered
to. We should avoid setting ISL_SURF_USAGE_RENDER_TARGET_BIT.
Also, we should avoid setting ISL_SURF_USAGE_TEXTURE_BIT for the
destination, which isn't sampled from.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
It's a bit hard to exactly map our implementation to the limits
described by Vulkan. The bindless surface handle in the extended
message descriptors is 20 bits and it's an index into the table of
RENDER_SURFACE_STATE structs that starts at bindless surface base
address. This means that we can have at must 1M surface states
allocated at any given time. Since most image views take two
descriptors, this means we have a limit of about 500K image views.
However, since we allocate surface states at vkCreateImageView time,
this means our limit is actually something on the order of 500K image
views allocated at any time. The actual limit describe by Vulkan, on
the other hand, is a limit of how many you can have in a descriptor set.
Assuming anyone using 1M descriptors will be using the same image view
twice a bunch of times (or a bunch of null descriptors), we can safely
advertise a larger limit. 1M is what's required by D3D12, so let's
advertise that.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3335
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7180>
According to the Vulkan specification, the
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT flag will be ignored if
included in a VkCommandBufferBeginInfo for a primary command buffer.
This also implies pBeginInfo->pInheritanceInfo should not be read even
if the flag is present, and makes it legal to include the flag knowing
it will be ignored.
Signed-off-by: Ricardo Garcia <rgarcia@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7128>
v2: Re-wrap lines in meson.build. Suggested by Jason.
v3: Also update Makefile.sources and Android build files. Noticed by
Lionel.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>
cs_prog_data->slm_size is basically redundant with
prog_data->total_shared, which is the field that we actually use for
controlling the shared local memory size in all drivers. We were
still using it in one place for VK_EXT_pipeline_executable_properties,
but we should just fix that and delete the field.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>
This reverts commit bcfec61d1e.
The previous patch fixed the underlying issue that the above commit was
actually working around. It turns out that the previously observed
performance regression was due to invalid aux-map entries for
multi-layer HiZ+CCS buffers.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7046>