The current code is not handling the potential NULL pointer in
VkDebugUtilsObjectNameInfoEXT::pObjectName
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e1b9a6e4f3 ("anv: initial RMV support")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30516>
Like AHB, we don't know the layout for an image backed by gralloc
swapchain memory until bind when gralloc information is passed by the
platform.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29850>
Refactor out shared code for the u_gralloc tiling query so it can also
be used by ahw and later anb resolves.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29850>
Dota 2 and Counter-Strike 2 really want to be able to allocate memory
for both VkImages and VkBuffers from the same memory type. Xe2's
special compression-only memory type does not support buffers, which
makes these games crash. Disable CCS on these games as a workaround.
This is a temporary workaround as we're still working towards a
long-term solution (either by fixing the engine or finding a way
better expose our memory types).
Backport-to: 24.2
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11520
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11521
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30481>
These memory types are useless when CCS is disabled, don't leave them
there so they don't confuse applications.
Backport-to: 24.2
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30481>
Xe KMD will now report the different topology mask types based on the
type of the EU of running platform.
With this we don't need to divide the EU count by 2 in intel_perf.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30127>
Allows a driver to declare indirect arguments for its tracepoints and
pass an address. u_trace will request a copy of the data which should
be implemented on the command processor.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-Authored-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29944>
We want to reduce the buffer allocations for other type of data than
timestamps.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29944>
We're about to add indirect arguments, having a better way to describe
arguments (as capture/storage) will be useful.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29944>
Our validation code doesn't need to know which bytes are accessed. It
only needs to know which grfs were accessed by an element. This also
helps to easily handle the Xe2 register size change.
Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>
Previously this check would create a mask of the bytes used in the
grf, and then shift the mask. This worked well when there was 32 bytes
in the register because a 64-bit uint64_t could easily detect that
bytes were used in the next regiter. (The next register was the high
32-bits of the `access_mask` variable.)
With Xe2, the register size becomes 64 bytes, meaning this strategy
doesn't work. Instead of a mask, we can just check to see if more than
1 grfs are used during each loop iteration. (Suggested by Ken.) This
will make it easier to extend for Xe2 in a follow on commit.
Verified this with
dEQP-VK.subgroups.arithmetic.compute.subgroupexclusivemul_u64vec4_requiredsubgroupsize
on Xe2, which otherwise would cause the program to fail to validate
because it assumed a grf was 32 bytes.
Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>
Also add a stub for VkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27810>
Anv supports VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BUFFER_OFFSET_BIT_KHR
and VK_VIDEO_ENCODE_FEEDBACK_BITSTREAM_BYTES_WRITTEN_BIT_KHR.
Also add to handle the VK_QUERY_RESULT_WITH_STATUS_BIT_KHR flag.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27810>
We don't actually need to extend g0's live range to the EOT message
generally - most messages that end a shader are headerless. The main
implicit use of g0 is for constructing scratch headers. With the last
two patches, we now consider scratch access that may exist in the IR
and already extend the liveness appropriately.
There is one remaining problem: spilling. The register allocator will
create new scratch messages when spilling a register, which need to
create scratch headers, which need g0. So, every new spill or fill
might extend the live range of g0, which would create new interference,
altering the graph. This can be problematic.
However, when compiling SIMD16 or SIMD32 fragment shaders, we don't
allow spilling anyway. So, why not use allow g0? Also, when trying
various scheduling modes, we first try allocation without spilling.
If it works, great, if not, we try a (hopefully) less aggressive
schedule, and only allow spilling on the lowest-pressure schedule.
So, even for regular SIMD8 shaders, we can potentially gain the use
of g0 on the first few tries at scheduling+allocation.
Once we try to allocate with spilling, we go back to reserving g0
for the entire program, so that we can construct scratch headers at
any point. We could possibly do better here, but this is simple and
reliable with some benefit.
Thanks to Ian Romanick for suggesting I try this approach.
fossil-db on Alchemist shows some more spill/fill improvements:
Totals:
Instrs: 149062395 -> 149053010 (-0.01%); split: -0.01%, +0.00%
Cycles: 12609496913 -> 12611652181 (+0.02%); split: -0.45%, +0.47%
Spill count: 52891 -> 52471 (-0.79%)
Fill count: 101599 -> 100818 (-0.77%)
Scratch Memory Size: 3292160 -> 3197952 (-2.86%)
Totals from 416541 (66.59% of 625484) affected shaders:
Instrs: 124058587 -> 124049202 (-0.01%); split: -0.01%, +0.01%
Cycles: 3567164271 -> 3569319539 (+0.06%); split: -1.61%, +1.67%
Spill count: 420 -> 0 (-inf%)
Fill count: 781 -> 0 (-inf%)
Scratch Memory Size: 94208 -> 0 (-inf%)
Witcher 3 shows a 33% reduction in scratch memory size, for example.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30319>
brw_send_indirect_split_message() implicitly reads g0 to construct the
extended message descriptor for certain send messages when this is set.
Record that liveness explicitly.
Thanks to Francisco Jerez for reminding me about this use of g0.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30319>
The generator code for emitting legacy scratch headers was implicitly
using g0 as a source. But the IR wasn't indicating any usage of g0,
which means the liveness isn't properly tracked at the IR level.
It works because we reserve g0 as permanently live for the whole
program. In order to stop doing that, we need to record it properly.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30319>
Fall through to common vk_ahb_format_to_image_format() to handle
R8G8B8X8 as R8G8B8A8.
Fixes issues with querying for format feature support when its handled
as R8G8B8.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30080>
API requires samplers to return 32-bit even though hardware can handle
16-bit floating point, so we detect that case and make more efficient
use of memory BW. This is helping improve performance of encode and
decode tokens during LLM by at least 5% across multiple platforms.
Thank you Kenneth Graunke for suggesting and guiding me throughout
this implementation.
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30447>
While extending our backend to handle 16-bit sampler return payloads, we
found that in piglit's arb_texture_view-rendering-formats, the SIMD8 FS
was missing the sampling operation altogether. This was because we were
first emitting the texturing instruction, and then calling
get_nir_def(), which adds an UNDEF instruction when the destination is
smaller than the 32-bit. So the texturing was dead code elimated. Fix
this by calling get_nir_def() earlier.
Thank you to Kenneth Graunke for suggesting and guiding me throughout
this implementation.
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30447>
This option is used for Gfx < 12, elk already set it to true,
so set it in brw and change the drivers to not set it anymore.
Because the dual-compiler support in Iris, the helper function
there had to change to consult the right compiler value instead.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30393>
Documentation says we should leave this field to the default value
(Normal). Instead we set 3DSTATE_PS_EXTRA::PixelShaderHasUAV when we
see that a fragment shader has side effects.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30408>
ANB is only used by Android WSI which uses explicit sync so these
flags can be dropped.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29883>
The condition of flat ccs and vram_only checker causes different
aux usage at binding stage. The current design is reusing CCS_E
on Xe2, so we want both Xe2 integrated and discreted GPUs behave
the same way.
Xe2 shouldn't need any special setup of CCS in the loop.
Backport-to: 24.2
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30111>
On pre-Xe2 platforms, the compression on these modifiers that
don't support compression are enabled. The compressed will be
resolved when needed. On Xe2+ we haven't support explicit
resolve, so all the paths to resolves are prohibited now. But
the code is still doing it, causing an assertion failure:
Fixes: vkcube
src/intel/vulkan/anv_private.h:5467:
anv_image_get_fast_clear_type_addr: Assertion
`device->info->ver < 20' failed.
Backport-to: 24.2
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30111>