In our source languages, interpolateAtOffset() takes a floating point
offset in the range [-0.5, +0.5]. However, the hardware takes integer
valued offsets in the range [-8, 7], in units of 1/16th of a pixel.
So, we need to multiply and clamp the coordinates. We were doing this
in the FS backend, but with the advent of IBC, I'd like to avoid doing
it twice. This patch instead moves the lowering to NIR so we can reuse
it across both backends.
v2: Use nir_shader_instructions_pass (suggested by Eric Anholt).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6193>
Simple refactor. No intended change in behavior.
Replace each derivation of aux address with anv_image_get_aux_addr().
The function will soon do more in support of
VK_EXT_image_drm_format_modifier, where the image bo and aux bo may be
disjoint.
v2:
- Replace param 'aspect' with 'plane'.
v3:
- Workaround for stencil ccs. If no aux surface, then return
ANV_NULL_ADDRESS.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3)
Pre-patch, we checked the offsets once per aspect after adding all
surfaces for the aspect. The additional checks will make it easier to
diagnose layout bugs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Pure refactor. No intended change in behavior.
This makes the code infinitely easier to understand. And it uncovers
a potential bug (marked with XXX comment).
v2: Fix narrowing conversions on 32-bit arch. s/size_t/uintmax_t/.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Months ago, make_surface() added *all* surfaces required for the given
aspect. It was a monster monolithic function, and difficult to reason
about its correctness. In commit c652ff8c (2020-03-06), I split the code
for aux surfaces into its own function, add_aux_surface_if_supported().
This patch continues the splitting, therefore making bugs easier to
identify.
Code changes:
- Move the code that adds the shadow surface from make_surface() to
a new function add_shadow_surface(), called from
add_all_surfaces().
- Move the call to add_aux_surface_if_supported() from make_surface()
to add_all_surfaces().
- To preserve correctness of the assertions on image layout in
make_surface(), move them to the loop in add_all_surfaces() after
all the aspect's surfaces have been added.
- Rename make_surface() to add_primary_surface() because now that's
what it does.
Pure refactor, no intended change in behavior.
v2:
- Rebase onto "anv: Fix isl_surf_usage_flags for stencil images".
- Sanitize the image's extent earlier.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
This deduplicates the loops in anv_image_create() and
resolve_ahw_image().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In anv_get_image_format_properties(), the special-case code for
VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT is tiny. It is mostly a detached
'case' in the 'switch' block for VkImageType. So move the special-case
code to immediately follow the 'switch' block.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In vkGetPhysicalDeviceImageFormatProperties.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The check in anv_get_image_format_properties() is already handled in
anv_get_image_format_features().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
When filling VkImageFormatProperties, anv_get_image_format_properties()
checks the requested VkImageUsageFlags and VkImageCreateFlags against
the VkFormatFeatureFlags available to the queried VkFormat. However, we
neglected to consider if any formats given in
VkImageFormatListCreateInfo
further restricted the available VkFormatFeatureFlags.
The image view formats are more likely to introduce additional
restrictions when DRM format modifiers are present.
v2:
- Do not drop anv_formats_ccs_e_compatible().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
If anv_get_image_format_features reports that the inputs are
unsupported, fail immediately.
Without the early fail, I have less confidence in the function's
correctness when a DRM format modifier is present.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The code in anv_get_image_format_properties() that set sampleCounts
appears correct, but weirdly inconsistent. Clean the code to
consistently set sampleCounts in the same location as
maxExtent/maxMipLevels/maxArraySize.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rename it to get_drm_format_modifier_properties_list() because it is now
independent of WSI.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In vkGetPhysicalDeviceImageFormatProperties2, we advertised support for
VK_IMAGE_TILING_LINEAR and VK_IMAGE_TILING_OPTIMAL for all memory
handles.
However, when importing or exporting an image, there must exist a method
that enables the app and driver to agree on the image's memory layout.
If no method exists, then we should reject image creation.
v2:
- Reduce copy-paste for Lionel.
v3:
- Treat tiling LINEAR and DRM_FORMAT_MODIFIER as identical when
determing compatible memory handles.
- Improve comments.
v4:
- Remove DMA_BUF from opaque_fd_only_props.
v5:
- Minor changes to code style for `if`. (for jekstrand)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v4)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)
The code asserted that we supported no more than 4 formats with
modifiers: /VK_FORMAT_B8G8R8(A8)?_(SRGB|UNORM)/.
Strangely, 2 of the 4 were non-power-of-two formats, which were rejected
elsewhere.
The assertion's comment suggested that we use a hard-coded list of
formats because the driver was not yet able to determine if a given
format was compatible with a given modifier. Therefore, the list only
contained formats that were compatible with *all* modifiers. That code
deficiency no longer exists: anv_get_image_format_features() can check
format/modifier compatibility.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Refactor in get_wsi_format_modifier_properties_list().
Instead of iterating over a function-local hard-coded list, iterate over
all modifiers in isl_drm.c.
This will improve agreement in behavior between
VkDrmFormatModifierPropertiesListEXT
VkPhysicalDeviceImageDrmFormatModifierInfoEXT.
The future disagreement this patch attempts to prevent is the
combination of:
a. VkDrmFormatModifierPropertiesListEXT neglects to return a valid
modifier because its hard-coded list of modifiers drifts
out-of-sync with hard-coded lists elsewhere in the code. (Already
today, the list in get_wsi_format_modifier_properties_list() does
not match the list in isl_drm.c; though, this has produced no bug
yet).
b. vkGetPhysicalDeviceImageFormatProperties2 accepts, via
VkPhysicalDeviceImageDrmFormatModifierInfoEXT, the modifier
overlooked in (a), because it does not use the same hard-coded
list in get_wsi_format_modifier_properties_list(). (Recall that
the spec requires vkGetPhysicalDeviceImageFormatProperties2 to
correctly accept/reject any int that the app provides, even when
the int is an invalid modifier).
c. The Bug. The driver told the app in (b) that it can legally
create an image with format+modifier, but the app cannot query
the VkFormatFeatureFlags of the format+modifier due to (a).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This allows Vulkan and GL to iterate over the full list of modifiers
instead of hard-coding in various places the "same" list as isl.
(Anvil's list has already diverged from isl's list. It omits Gen12
modifiers).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fill VkDrmFormatModifierPropertiesEXT::drmFormatModifierTilingFeatures
with anv_get_image_format_features().
anv_formats.c:get_wsi_format_modifier_properties_list() incorrectly left
it uninitialized.
v2: Increment drmFormatModifierPlaneCount if modifier support aux.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Because anv_get_image_format_features() now understands modifiers, also
relocate most of the modifier compatibility checks from
anv_get_format_plane() into anv_get_image_format_features() in order to
avoid duplication.
The new signature forces some code movement in
anv_get_image_format_properties().
v2:
- Reject VK_FORMAT_B4G4R4A4_UNORM_PACK16 with modifiers on HSW.
v3:
- Revert the v2 change.
- Query isl_format_layout instead of pipe_format. (for jekstrand)
- Drop misguided comments. (for jekstrand)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3)
If each format channel has the same base type (such unorm), then that
is the format's "uniform channel type".
Calculating the field at buildtime is probably better than looping over
all channels at runtime each time we wish to query it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Return the modifier's score, which indicates the driver's preference for the
modifier relative to others. A higher score is better. Zero means
unsupported.
Intended to assist selection of a modifier from an externally provided list,
such as VkImageDrmFormatModifierListCreateInfoEXT.
v2:
- Rename anv_drm_format_mod_score to isl_drm_modifier_get_score.
- Squash all incremental changes to anv_drm_format_mod_score.
v3:
- Drop redundant 'unlikely'. (for nchery)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3)
The code did not return error when VK_IMAGE_CREATE_DISJOINT_BIT was
incompatible with the other input params.
If the Vulkan spec forbids a set of input params for vkCreateImage,
but permits them for vkGetPhysicalDeviceImageFormatProperties2,
then vkGetPhysicalDeviceImageFormatProperties2 must reject those input
params with failure.
- v2: Clearer commit message.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Motivation is to detect earlier certain bugs that can occur when
missing a check for the stage before using the downcast.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7540>
In commit eda3e4e055, Eric added names
to various programs. In that patch, he also renamed our passthrough
TCS shader from "passthrough" to "passthrough TCS". The passthrough
TCS directly supplies the VUE headers rather than doing the whole
"patch parameters are in backwards order" reswizzling dance.
We failed to detect this and started trying to supply vec4s starting
at component 3, leading to a stack smash on an array of 7 sources,
not to mention the values were being put in the wrong place.
Easy fix: update the code for the new name.
Fixes: eda3e4e055 ("nir/builder: Add a name format arg to nir_builder_init_simple_shader().")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3777
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7564>
This cleans up a bunch of gross sprintfs and keeps the caller from needing
to remember to ralloc_strdup. I added a couple of '"%s", name ? name :
""' to radv where I didn't fully trace through whether a non-null name was
being passed in.
I also took the liberty of adding a basic name to a few shaders (pan_blit,
unit tests)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
These two consumers were the only ones out of the ~65 calls to
init_simple_shader, so there's a pretty clear consensus on how to allocate
simple shaders. I suspect that actually these would be just fine with
b.shader being the mem_ctx, but that would take a bit more rework.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
Our driver started using this method to mmap the BOs and we need to
hook it to track the dirtiness of the BO.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7528>
v2: Apply the workaround to all gen hardawre
Ref: GEN:BUG:1409725701
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7463>
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:
- Byte operations don't actually execute with a byte type. The
execution type for byte operations is actually word. (I don't know
if this has implications for the HW implementation. Probably?)
- Destinations are required to be strided out to at least the
execution type size. This means that B-type operations always have
a stride of at least 2. This means wreaks havoc on the back-end in
multiple ways.
- Thanks to the strided destination, we don't actually save register
space by storing things in bytes. We could, in theory, interleave
two byte values into a single 2B-strided register but that's both a
pain for RA and would lead to piles of false dependencies pre-Gen12
and on Gen12+, we'd need some significant improvements to the SWSB
pass.
- Also thanks to the strided destination, all byte writes are treated
as partial writes by the back-end and we don't know how to copy-prop
them.
- On Gen11, they added a new hardware restriction that byte types
aren't allowed in the 2nd and 3rd sources of instructions. This
means that we have to emit B->W conversions all over to resolve
things. If we emit said conversions in NIR, instead, there's a
chance NIR can get rid of some of them for us.
We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us. It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus. There is still a bit we have to handle in the back-end. In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
We can't really support these directly on any platform. May as well let
NIR lower them. The NIR lowering is potentially one more instruction
for scan/reduce ops thanks to not being able to do the B->W conversion
as part of SEL_EXEC. For imax/imin exclusive scan, it's yet another
instruction thanks to the extra imax/imin NIR has to insert to deal with
the fact that the first live channel will contain the identity value
which, when signed, will cast wrong. However, it does let us drop some
complexity from our back-end so it's probably worth it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
We are limited to 64 threads per dispatched group, regardless of what
num_cs_threads claims, so advertise that limit correctly.
Fixes (on TGL and up):
dEQP-VK.subgroups.size_control.compute.required_subgroup_size_min
and other *.required_subgroup_size_min tests.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7453>
Refactor out the part of fail_if function that never returns into
NORETURN function and put the condition check outside.
Addresses many false positive warnings by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
pool can't be NULL at this point, because it was already
dereferenced earlier.
Addresses "Dereference before null check" issue reported by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
There's already code handling that case and help text also says
it's possible.
Found, because Coverity complained about optarg NULL check,
suggesting optarg can be NULL for other options, where it's not
possible. IOW, false positive lead me to finding an unrelated issue.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>