This enables us to create more logical engines than HW engines are
available. This also brings the uAPI usage closer to what is happening
on Xe.
Rework: (Sagar)
- Correct exec_flag at the time of submission
- Handle device status check
- Set queue parameters
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
This helper takes the main command buffer as input and then create a
companion RCS command buffer.
v2:
- Rename anv_get_render_queue_index helper to
anv_get_first_render_queue_index (Jose)
- Rename RCS command buffer to companion RCS command buffer (Lionel)
- Add early return in anv_get_first_render_queue_index (Lionel)
- Add lock around the function (Jose)
- Move companion rcs command pool creation in device create (Sagar)
- Reset companion RCS cmd buffer (Sagar)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23661>
A single Vulkan state can map to multiple fields in different GPU
instructions. This change introduces the bottom half of a simplified
emission mechanism where we do the following :
Vulkan runtime state
|
V
Intermediate driver state
|
V
Instruction programming
This way we can detect that the intermediate state didn't change and
avoid HW instruction emission.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
Add a driconf to force the swapchain size to match
`VkSurfaceCapabilities2KHR::currentExtent` as a workaround for
misbehaved games
Fixes: 6139493ae3 ("vulkan/wsi: return VK_SUBOPTIMAL_KHR for sw/x11 on window resize")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24818>
This change allow us to insert the MI_SEMAPHORE_WAIT before/after
specific draw call. With GTX tool, we can always update the memory
address to unblock spinning wait.
v2:
- Make sure draw_call_count is thread-safe (Lionel)
- Add static inline helper (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24308>
Some DX12 games require sparse resource support but don't
actually use sparse resources. Add a way to make these games work
while we still don't have sparse resources fully working on every
KMD backend.
When fake_sparse=true, anv advertises sparse resource support
despite lacking full support.
Based-on-patch-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24046>
Xe KMD don't need relocs, so calling a nop function and avoiding the
CPU cycles and memory waste with reloc.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24411>
Calling the ra_allocate function after each register spill can take
several minutes. This option speeds up shader compilation by spilling
more registers after the ra_allocate failure.Required for
Cyberpunk 2077, which uses a watchdog thread to terminate the process
in case the render thread hasn't responded within 2 minutes.
Execution time of my Cyberpunk2077 shader compilation test:
https://gitlab.freedesktop.org/illia.a.polishchuk/cyberpunk-vulkan-compute-hang-test-anv
Before the patch:
real 1m28,738s
user 1m28,329s
sys 0m0,400s
After the patch
real 0m33,245s
user 32m,835s
sys 0m0,404s
I think it's acceptable patch because Cyberpunk benchmarks has
the same FPS with and without patch. (I started
it without patch with a patched binary with disabled watchdog thread)
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Requires: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9241
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>
HEVC support on Gfx9 is only available on VCS0. So limit the number of video queues
to the first VCS engine instance.
We should be able to query HEVC support from the kernel using the engine query uAPI,
but this appears to be broken : https://gitlab.freedesktop.org/drm/intel/-/issues/8832
When this bug is fixed we should be able to check HEVC support to determine the
correct number of queues.
Closes: mesa/mesa#9172, mesa/mesa#9314
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24065>
On DG2 I ran into a case where the surface state was not being decoded
with INTEL_DEBUG=bat. This is because the surface states are not part
of a state pool there anymore. Instead BO are allocate manually and
placed in vma heap.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 96c33fb027 ("anv: enable direct descriptors on platforms with extended bindless offset")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23891>
The Vulkan CTS started generating the list of valid versions the driver
can report as conformant against based on the active branches, and the
1.3.0 branch we were reporting up to now is no longer valid.
Fixes dEQP-VK.api.driver_properties.conformance_version
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23784>
As Matt Turner pointed out, the commit here fixed breaks in Iris and
ANV in kernel versions without support for DRM_I915_QUERY_ENGINE_INFO.
As compute engines are only present in gfx12 and newer, and support
for DRM_I915_QUERY_ENGINE_INFO was added before any gfx12 platform,
we can check for gfx version before trying to get engine info.
For ANV, this is done by checking if engine_info is not NULL, like in
other places in the ANV source code.
Fixes: a364f23a6c ("intel: Make gen12 URB space reservation dependent on compute engine presence")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9099
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23257>
A recent update to Cyberpunk 2077 enables XeSS code for Intel GPUs
which is causing the game to crash in the XeSS libraries. As a
temporary work around, stop identifying as Intel for Cyberpunk so
XeSS falls back to the cross-vendor path.
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8860
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23271>
When in direct descriptor mode, the descriptor pool buffers will hold
surface states directly. We won't allocate surface states in image &
buffer views.
Instead views will hold a packed RENDER_SURFACE_STATE ready to copied
into the descriptor buffers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
We'll use the fact that the pool is aligned to 4Gb to limit the amount
of address computations to build the address in the shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
This is the default for now. It needs to be part the pipeline hashing
as we will allow this to be tweaked per application.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
We bump the max surfaces to ~16 million instead of ~1 million on
Gfx9-12. We could do more but that'll come later.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
At the beginning of the buffer is located the driver identifier for
error states.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
global load/store (or A64 messages) need the NIR bound checking which
is enabled by "robust" behavior even when robust behavior is disabled.
Many thanks to Christopher Snowhill for pointing out the pushed
constant related issue with the initial version of this patch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
We make the compiler assume the worst possible case (it's not great
because we have to burn 32 GRFs of potential input data) and then we
push the actual value through push constants.
This enables VK_EXT_gpl usage on zink, which causes two traces to change
their results. Raven is an imperceptible change, blender has missing
original pngs but looks plausible.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>
In i915 if the device has local memory it can only mmap bo with
I915_MMAP_OFFSET_FIXED, so all this set of ANV_BO_ALLOC_WRITE_COMBINE
were useless.
In Xe KMD there is no way to change mmap mode for all GPUs types.
So we can nuke bo->map_wc, ANV_BO_ALLOC_WRITE_COMBINE and related
dead code.
No changes in behavior expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22483>
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT is also set in all memory types of
integrated GPUs.
This flag means that memory will be allocated in the most efficient
place for the GPU to access, which is true in integrated GPUs.
However, this was causing ANV_BO_ALLOC_WRITE_COMBINE to be set in
integrated GPUs in the block right below when allocating in the non-cached memory type.
But the comment only talks about lmem, so to still keep the write
combine behavior for iGPUs it was used VkMemoryPropertyFlags in mmap_calc_flags().
Additionally, this was causing anv_bo.has_implicit_ccs to always be
set, which could change the expected behavior of
anv_BindImageMemory2() in MTL.
Fixes: fbd32a04da ("anv: add a third memory type for LLC configuration") added a new heap
Fixes: 582bf4d9f7 ("anv: flag BO for write combine when CPU visible and potentially in lmem")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22483>