The code did not return error when VK_IMAGE_CREATE_DISJOINT_BIT was
incompatible with the other input params.
If the Vulkan spec forbids a set of input params for vkCreateImage,
but permits them for vkGetPhysicalDeviceImageFormatProperties2,
then vkGetPhysicalDeviceImageFormatProperties2 must reject those input
params with failure.
- v2: Clearer commit message.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Apply the workaround to all gen hardawre
Ref: GEN:BUG:1409725701
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7463>
We are limited to 64 threads per dispatched group, regardless of what
num_cs_threads claims, so advertise that limit correctly.
Fixes (on TGL and up):
dEQP-VK.subgroups.size_control.compute.required_subgroup_size_min
and other *.required_subgroup_size_min tests.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7453>
pool can't be NULL at this point, because it was already
dereferenced earlier.
Addresses "Dereference before null check" issue reported by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
NIR derefs currently have exactly one variable mode. This is about to
change so we can handle OpenCL generic pointers. In order to transition
safely, we need to audit every deref->mode check. This commit adds a
set of helpers that provide more nuanced mode checks and converts most
of NIR to use them.
For simple cases, we add nir_deref_mode_is and nir_deref_mode_is_one_of
helpers. These can be used in passes which don't have to bother with
generic pointers and just want to know what mode a thing is. If the
pass ever encounters generic pointers in a way that this check would be
unsafe, it will assert-fail to alert developers that they need to think
harder about things and fix the pass.
For more complex passes which require a more nuanced understanding of
modes, we add nir_deref_mode_may_be and nir_deref_mode_must_be helpers
which accurately describe the compiler's best knowledge about the given
deref. Unfortunately, we may not be able to exactly identify the mode
in a generic pointers scenario so we have to be very careful when we use
these. Conversion of these passes is left to later commits.
For the case of mass lowering of a particular mode (nir_lower_explicit_io
is one good example), we add nir_deref_mode_is_in_set. This is also
pretty assert-happy like nir_deref_mode_is but is for a set containment
comparison on deref modes where you expect the deref to either be all-in
or all-out.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
anv_bo_pool_alloc expects that the memory returned by and_gem_mmap
was annotated using VALGRIND_MALLOCLIKE_BLOCK, but anv_gem_mmap_offset
didn't do that. Move annotation from anv_gem_mmap_legacy to common
code.
Fixes: 4abf0837cd ("anv: Add support for new MMAP_OFFSET ioctl.")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7381>
In many cases those revision happened every before the first public
release of the spec and we just forgot to update our numbers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7136>
On Gen12+, stencil buffer compression does not support fast clear so we
don't have to track clear address for it.
v2:
- Use isl_aux_usage_has_fast_clears (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
Handle compressed stencil buffer transition from one layout to another
on gen12+.
When stencil compression is enabled, we have to initialize buffer via
stencil clear (HZ_OP) before any renderpass.
v2:
- Pass predicate bit false to anv_image_ccs_op (Nanley Chery)
v3:
- update aspect assertion (Nanley Chery)
v4:
- Make state decision based on anv_layout_to_aux_state instated of
anv_layout_to_aux_usage (Sagar Ghuge)
v5:
- No need to handle stencil CCS resolve case (Jason Ekstrand)
- Initialize buffer using HZ_OP (Nanley Chery)
v6: (Nanley Chery)
- Pass correct layer/level count.
- Remove local variable.
v7:
- Skip stencil initialization with HZ_OP packet if followed by fast
clear. (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
Don't check the auxiliary surface's ISL surf in order to return the
surface levels/layers instead we can return the anv_image parameter.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
When blitting from source depth range [0-3] into destination depth
range [0-2], we'll have to use a source layer that is in between 2
layers of the 3D source image.
Other than having an incorrect formula, we're also using integer which
prevent us from using the right source layer.
v2: Drop + 0.5 on application offsets
v3: Reuse num_layers (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
Now that we have the HDC, using the data cache for UBO pulls seems to
help things quite a bit:
GTA V DXVK 104.0%
Talos Principle GL 102.8%
Rise of Tomb Raider VK 102.8%
Dark Souls 3 DXVK 101.4%
Witcher3 DXVK 101.3%
Bioshock Infinite GL 100.5%
Doom 2016 VK 97.7%
Doom is a bit of a loss but it helps enough other stuff, it's probably
worth the hit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7230>
On Gen12+, we can enable additional caches in certain usage situations.
This routes that decision making to a central place in ISL, based on
surface usage flags, and updates both drivers to use it. (i965 doesn't
need to change because it doesn't support Gen12.)
We continue handling the "external" decision via an anv_mocs() wrapper
for now, since we store that flag in anv_bo, which isl doesn't know
about. (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
actually sure that would be cleaner.)
This patch should not have any functional nor performance effects, as
we continue selecting the exact same MOCS values for now.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
Most uses of this function deal with destination buffers, but for
copy_buffer_to_image, the buffer is the source, and isn't rendered
to. We should avoid setting ISL_SURF_USAGE_RENDER_TARGET_BIT.
Also, we should avoid setting ISL_SURF_USAGE_TEXTURE_BIT for the
destination, which isn't sampled from.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
It's a bit hard to exactly map our implementation to the limits
described by Vulkan. The bindless surface handle in the extended
message descriptors is 20 bits and it's an index into the table of
RENDER_SURFACE_STATE structs that starts at bindless surface base
address. This means that we can have at must 1M surface states
allocated at any given time. Since most image views take two
descriptors, this means we have a limit of about 500K image views.
However, since we allocate surface states at vkCreateImageView time,
this means our limit is actually something on the order of 500K image
views allocated at any time. The actual limit describe by Vulkan, on
the other hand, is a limit of how many you can have in a descriptor set.
Assuming anyone using 1M descriptors will be using the same image view
twice a bunch of times (or a bunch of null descriptors), we can safely
advertise a larger limit. 1M is what's required by D3D12, so let's
advertise that.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3335
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7180>
According to the Vulkan specification, the
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT flag will be ignored if
included in a VkCommandBufferBeginInfo for a primary command buffer.
This also implies pBeginInfo->pInheritanceInfo should not be read even
if the flag is present, and makes it legal to include the flag knowing
it will be ignored.
Signed-off-by: Ricardo Garcia <rgarcia@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7128>
v2: Re-wrap lines in meson.build. Suggested by Jason.
v3: Also update Makefile.sources and Android build files. Noticed by
Lionel.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899>
cs_prog_data->slm_size is basically redundant with
prog_data->total_shared, which is the field that we actually use for
controlling the shared local memory size in all drivers. We were
still using it in one place for VK_EXT_pipeline_executable_properties,
but we should just fix that and delete the field.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>
This reverts commit bcfec61d1e.
The previous patch fixed the underlying issue that the above commit was
actually working around. It turns out that the previously observed
performance regression was due to invalid aux-map entries for
multi-layer HiZ+CCS buffers.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7046>
Fixes rendering corruption in the shadowmappingcascade Sascha Willems
Vulkan demo. To see the corruption, I adjusted the demo options as
follows:
1. Enable "Display depth map"
2. Set "Split lambda" to 0.100
3. Make "Cascade" non-zero.
Fixes: 80ffbe915f ("anv: Add support for HiZ+CCS")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7046>
On Gen7, the data cache is pretty terrible so we'd rather avoid it
there. On Gen8+, it should be fine and is less likely to conflict with
texturing so we should get less cache thrashing there.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>
Use the same generators as used in anv driver so both Vulkan and OpenGL
drivers can share the same external memory objects.
v2: removed extra parameter from function gen_uuid_compute_device_id
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Eleni Maria Stea <estea@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7025>
We need Vulkan and GL to produce the same UUIDs. So move the generator
from ANV to a common code that can be shared by ANV and Iris driver.
v2: fix android build (Tapani)
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7025>
It's included in declaration of INTEL_DEBUG.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>
On Fallout4, enabling HIZ_CCS_WT compression for D16_UNORM format
regress the performance by 2%, in order to avoid that disable
compression via driconf option.
The experiment showed that, running Fallout4 with HIZ performs better
than HIZ_CCS and HIZ_CCS_WT. Reason behind that is the benchmark uses
the depth pass with D16_UNORM surfaces format which fills the L3 cache
and next pass doesn't make use of it where we end up clearing cache.
v2:
- Don't add conditional check in isl (Nanley, Jason)
- Move disable_d16unorm_compression flag to instance (Lionel)
- Use plane_format.isl_format (Nanley)
v3:
- Add more descriptive comment (Marcin Ślusarz)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6734>
Things work a little different on Gen7 than they do on Gen8+. In
particular, SOBufferEnable lives in 3DSTATE_STREAMOUT but BufferPitch
lives in 3DSTATE_SO_BUFFER. This leaves us having to marshal data
around a bit more than we did on Gen8. Still, it's not too bad.
Normally, I don't spend much time on Gen7 but XFB just became a hard
requirement for DXVK so it stopped working for all our Haswell users.
Let's get them happily playing their games again. 😸
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3532
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6997>
Now that we're not trying to evade preprocessor macro expansion in
preprocessor string concatenation, we can use plain old bools in option
setup.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6916>
We can generate the XML if anybody actually queries it, but this reduces
the amount of work in driver setup and means that we'll be able to support
driconf option queries on Android without libexpat.
This updates the driconf interface struct version for i965, i915, and
radeon to use the new getXml entrypoint to call the on-demand xml
generation. Note that our loaders (egl, glx) implement the v2 function
interface and don't use .xml when that's set, and the X server doesn't use
this interface at all.
XML generation tested on iris and i965 using adriconf
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6916>
Workaround # 22011374674
Applied to i965, iris and anv drivers
No performance impact is observed with WA.
Cc: mesa-stable
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>