This is, unfortunately, a large flag-day mega-commit. However, any
other approach would likely be fragile and involve a lot more churn as
we try to plumb the new vk_fence and vk_semaphore primitives into ANV's
submit code before we delete it all. Instead, we do it all in one go
and accept the consequences.
While this should be mostly functionally equivalent to the previous
code, there is one potential perf-affecting change. The command buffer
chaining optimization no longer works across VkSubmitInfo structs.
Within a single VkSubmitInfo, we will attempt to chain all the command
buffers together but we no longer try to chain across a VkSubmitInfo
boundary. Hopefully, this isn't a significant perf problem. If it ever
is, we'll have to teach the core runtime code how to combine two or more
VkSubmitInfos into a single vk_queue_submit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
If we ever want to stop depending on EXEC_OBJECT_PINNED to detect
when something is pinned (like for VM_BIND), having a helper will reduce
the code churn. This also gives us the opportunity to make it compile
away to true/false when we can figure it out just based on compile-time
GFX_VERx10.
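A minimal sketch of what such a helper could look like. The names
demo_bo, demo_bo_is_pinned and DEMO_EXEC_OBJECT_PINNED are hypothetical
stand-ins, and the generation thresholds are assumptions chosen only to
show the compile-time folding, not the driver's actual cut-offs:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel flag; illustration only. */
#define DEMO_EXEC_OBJECT_PINNED (1u << 4)

/* Hypothetical, reduced buffer object: only the flags matter here. */
struct demo_bo {
   uint32_t flags;
};

/* Callers ask the helper instead of testing the flag directly, so the
 * detection mechanism can change later without touching every caller.
 * When GFX_VERx10 is known at compile time, the answer becomes a constant
 * and the branch at the call site can be folded away. */
static inline bool
demo_bo_is_pinned(const struct demo_bo *bo)
{
#if defined(GFX_VERx10) && GFX_VERx10 >= 90
   (void)bo;
   return true;   /* assumed threshold: always pinned on this generation */
#elif defined(GFX_VERx10) && GFX_VERx10 < 80
   (void)bo;
   return false;  /* assumed threshold: never pinned on this generation */
#else
   /* Generation unknown at compile time: fall back to the run-time flag. */
   return (bo->flags & DEMO_EXEC_OBJECT_PINNED) != 0;
#endif
}
```

Built without GFX_VERx10 defined, the helper simply reports whether the
flag is set on the given BO.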
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
Soft-pin is but one possible mechanism for pinning buffers. We're
working on another called VM_BIND. Most of the time, the real question
we're asking isn't "are we using soft-pin?" but rather "are we using
relocations?" because it's relocations, and not soft-pin, that cause us
all the extra pain we have to write code to handle. This commit flips
the majority of those checks around. The new helper is currently just
the exact inverse of the old use_softpin helper.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off the upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be a function-like
macro with a "flags" argument. The new definition returns 1 when any of
the flags match and 0 otherwise.
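The difference can be sketched as follows. All DEMO_* names and
demo_debug are hypothetical, and the old behaviour is emulated with an
explicit cast to uint32_t so that the truncation shows up regardless of
the host's sizeof(long). __builtin_expect assumes GCC or Clang:

```c
#include <stdint.h>

/* Hypothetical flags: one in the low 32 bits, one in the high 32 bits. */
#define DEMO_DEBUG_LOW   (1ull << 3)
#define DEMO_DEBUG_HIGH  (1ull << 40)

/* Hypothetical debug mask with only the high flag enabled. */
static uint64_t demo_debug = DEMO_DEBUG_HIGH;

/* Emulation of the old object-like macro on a platform with a 32-bit long:
 * the mask is narrowed before the caller gets a chance to test a flag. */
#define DEMO_DEBUG_OLD ((uint64_t)(uint32_t)demo_debug)

/* Function-like form: test the flags at full 64-bit width first, then
 * collapse the result to 0 or 1 so it always fits in a long. */
#define DEMO_DEBUG(flags) __builtin_expect(!!(demo_debug & (flags)), 0)
```

With the emulated old macro, `DEMO_DEBUG_OLD & DEMO_DEBUG_HIGH` is 0 even
though the flag is set, while `DEMO_DEBUG(DEMO_DEBUG_HIGH)` is 1.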
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
    perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
    perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
Make u_vector_init a wrapper to u_vector_init_pot. Let both take
(element_count, element_size) as parameters.
Motivated by eed0fc4caf ("vulkan/wsi/wayland: fix an invalid
u_vector_init call")
v2: rename u_vector_init_pot to u_vector_init_pow2
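The wrapper relationship can be sketched like this. The demo_* names and
the reduced struct are hypothetical, and rounding both arguments up to
the next power of two in the wrapper is an assumption about how the two
entry points relate, not a copy of the real code:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced vector: a byte buffer of power-of-two size. */
struct demo_vector {
   uint32_t head;
   uint32_t tail;
   uint32_t element_size;
   uint32_t size;   /* capacity in bytes */
   void *data;
};

static bool
demo_is_pow2(uint32_t x)
{
   return x != 0 && (x & (x - 1)) == 0;
}

/* Smallest power of two >= x; returns 0 if it does not fit in 32 bits. */
static uint32_t
demo_next_pow2(uint32_t x)
{
   if (x > (1u << 31))
      return 0;
   uint32_t p = 1;
   while (p < x)
      p <<= 1;
   return p;
}

/* Strict variant: both arguments must already be powers of two. */
static bool
demo_vector_init_pow2(struct demo_vector *v,
                      uint32_t element_count, uint32_t element_size)
{
   if (!demo_is_pow2(element_count) || !demo_is_pow2(element_size))
      return false;
   uint64_t bytes = (uint64_t)element_count * element_size;
   if (bytes > UINT32_MAX)
      return false;
   v->head = 0;
   v->tail = 0;
   v->element_size = element_size;
   v->size = (uint32_t)bytes;
   v->data = malloc(v->size);
   return v->data != NULL;
}

/* Lenient wrapper: same (element_count, element_size) parameter order,
 * but accepts arbitrary values and rounds them up before delegating. */
static bool
demo_vector_init(struct demo_vector *v,
                 uint32_t element_count, uint32_t element_size)
{
   return demo_vector_init_pow2(v, demo_next_pow2(element_count),
                                demo_next_pow2(element_size));
}
```

Giving both functions the same parameter order makes it harder to swap
the count and the size by accident at a call site.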
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
This is the first time we see an application running out of mmap().
We essentially allocate too many batches (+65k) and end up not being
able to mmap them, at which point we can't mmap anything anymore and
things go sideways.
This change allocates bigger batch BOs as we grow an existing command
buffer. This drastically reduces the number of BOs we need to allocate
(the benchmark that reported the issue now reaches a max of ~630 BOs,
instead of reaching 65k and failing previously).
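A toy model of the growth policy, assuming each new batch BO is sized to
the command buffer's running total and capped at 16 MB. The demo_* names
and the 8 KB initial size are assumptions made for illustration:

```c
#include <stdint.h>

#define DEMO_MIN_BATCH_SIZE (8u * 1024)          /* assumed initial BO size */
#define DEMO_MAX_BATCH_SIZE (16u * 1024 * 1024)  /* growth cap per BO */

/* Size of the next batch BO given the bytes already allocated for this
 * command buffer: match the running total, but never exceed the cap. */
static uint32_t
demo_next_batch_size(uint64_t total_batch_size)
{
   if (total_batch_size == 0)
      return DEMO_MIN_BATCH_SIZE;
   if (total_batch_size < DEMO_MAX_BATCH_SIZE)
      return (uint32_t)total_batch_size;
   return DEMO_MAX_BATCH_SIZE;
}

/* Number of BOs needed to hold target_bytes of commands, either with
 * growing BOs (grow != 0) or with fixed-size ones (grow == 0). */
static uint32_t
demo_count_bos(uint64_t target_bytes, int grow)
{
   uint64_t total = 0;
   uint32_t count = 0;
   while (total < target_bytes) {
      total += grow ? demo_next_batch_size(total) : DEMO_MIN_BATCH_SIZE;
      count++;
   }
   return count;
}
```

In this model, a 1 GiB command stream takes 75 BOs with growth and
131072 BOs at a fixed 8 KB. These figures describe the toy model only,
not the benchmark mentioned above.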
v2: Track the total batch size of command buffers (Jason)
Just give 0 for batch_len to i915 (Jason)
v3: Fix indentation (Jason)
v4: Drop unnecessary reshuffling of error labels (Jason)
v5: Remove empty lines (Marcin)
v6: Limit BO growing to chunks of 16MB (Jason)
v7: Add assert on initial size (Jason)
v8: Add define for max size (Jason)
v9: Fixup v7 assert for non softpin platforms (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4956
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11482>
This should drop the CPU overhead of processing buffers on SKL+ by
dropping some of the logic contained in anv_reloc_list_add() whenever we
have enough compile-time information to know we have softpin.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
The relocation list currently serves two purposes. One is for
relocations on older non-softpin platforms. The second is to keep track
of driver-managed BOs which are used by the given command buffer. We're
going to need a mechanism to add BOs to the command buffer without doing
a relocation into the batch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
This patch renames functions, structures, enums etc. with "gen_"
prefix defined in common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
v2: Fixup crash spotted by Mark about missing alloc vfuncs
v3: Fixup double iteration over device->memory_objects (that ought to
be expensive...) (Ken)
v4: Add more asserts for non-softpin cases (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2371>
We would like to chain multiple primary command buffers so that they are
submitted together to i915. To prepare for this, end the command buffers
with an MI_BATCH_BUFFER_START and, at submit time, replace it with
MI_BATCH_BUFFER_END if needed.
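A toy model of the patching step. The struct, the demo_* functions and
the opcode values are hypothetical placeholders, not the hardware
encodings or the driver's data structures:

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder opcodes; the real commands have different encodings. */
enum {
   DEMO_MI_BATCH_BUFFER_START = 1,
   DEMO_MI_BATCH_BUFFER_END = 2,
};

/* Hypothetical, reduced command buffer: only its entry point and the
 * final command of its batch are modelled. */
struct demo_cmd_buffer {
   uint64_t start_address;  /* address of the first command */
   uint32_t tail_opcode;    /* last command in the batch */
   uint64_t tail_target;    /* jump target when the tail is a START */
};

/* At record time, every buffer ends with a START whose target is not
 * known yet. */
static void
demo_end_cmd_buffer(struct demo_cmd_buffer *cb)
{
   cb->tail_opcode = DEMO_MI_BATCH_BUFFER_START;
   cb->tail_target = 0;
}

/* At submit time, point each buffer's tail at the next buffer and turn
 * the tail of the last one into an END. The opcode is rewritten for
 * every buffer so that one which ended a previous submission can be
 * chained again. */
static void
demo_chain_cmd_buffers(struct demo_cmd_buffer **cbs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (i + 1 < count) {
         cbs[i]->tail_opcode = DEMO_MI_BATCH_BUFFER_START;
         cbs[i]->tail_target = cbs[i + 1]->start_address;
      } else {
         cbs[i]->tail_opcode = DEMO_MI_BATCH_BUFFER_END;
         cbs[i]->tail_target = 0;
      }
   }
}
```

Because START and END occupy the same slot at the end of the batch, the
submit path only has to patch that one command per buffer.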
v2: Don't even consider non softpin platforms
v3: Fix inverted condition
v4: Limit is_chainable() to checking device->use_softpin (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2371>
When a secondary command buffer is encountered, insert an event that
links to the new batch.
This commit leaves intel_measure timestamp buffer objects mmapped,
which is more efficient than mapping/unmapping several times. With
the BOs mapped at all times, timestamp buffers can be managed directly
by intel_measure, where it will iterate over timestamps of linked
secondary buffers.
With timestamp buffers managed by intel_measure, a more efficient and
accurate check for render completion can be moved into intel_measure
from anv/iris.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
This may vary based on the newer kernel engine-based contexts.
v2 (Jason Ekstrand):
- Initialize anv_queue::exec_flags in anv_queue_init
- Don't conflate this with refactors to get_reset_stats
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8667>
It's included in declaration of INTEL_DEBUG.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>
This catches some undefined behavior, e.g. using a stale descriptor set
that references deleted BOs, which I would absolutely never do.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6747>
Fixes crashes in:
- Rise of the Tomb Raider (on benchmark start)
- Total War: Three Kingdoms (on game start)
- Total War: Warhammer II (on game start)
Fixes: 34a0ce58c7 ("anv: add a new execution mode for secondary command buffers")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6546>
This implements timeline semaphores using a new type of dma-fence
stored into drm-syncobjs. We use a thread to implement delayed
submissions.
v2: Drop cloning of temporary semaphores and just transfer their ownership (Jason)
Drain queue when dealing with binary semaphore
Ensure we don't submit to the thread as long as we don't need to
v3: Use __u64 not uintptr_t for kernel pointers
Fix commented code for INTEL_DEBUG=bat
Set DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES in timeline fence execbuf extension
Add new anv_queue_set_lost()
Drop multi queue stuff meant for the fake multi queue patch
Rework temporary syncobj handling
Don't use syncobj when not available (DeviceWaitIdle/CreateDevice)
Use ANV_MULTIALLOC
And a few more tweaks...
v4: Drop drained condition helper (Lionel)
Fix missing EXEC_OBJECT_WRITE on BOs we want to wait on (Jason)
v5: Add missing device->lost_reported in _anv_device_report_lost (Lionel)
Fix missing free on submit->simple_bo (Lionel)
Don't drop setting the device in lost state on QueueSubmit error (Jason)
Store submit->fence_bos as an array of uintptr_t (Jason)
v6: condition device->has_thread_submit to i915 & core DRM support (Jason)
v7: Fix submit->in_fence leakage on error (Jason)
Keep dummy semaphore with no thread submission (Jason)
v8: Move ownership of submit->out_fence to submit (Jason)
v9: Don't forget to read the VkFence's syncobj binary payload (Lionel)
v10: Take the mutex lock on anv_gem_close() (Jason/Lionel)
v11: Fix void* -> u64 cast on 32bit (Lionel)
v12: Rebase after BO backed timeline semaphore (Lionel)
v13: Fix missing snippets lost after rebase (Lionel)
v14: Drop update_binary usage (Lionel)
v15: Use ANV_MULTIALLOC (Lionel)
v16: Fix some realloc issues (Ivan)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901>
This is a leftover from the earlier version of
VK_KHR_performance_query where we used kernel relocs to implement
multi-pass queries.
We're using self-modifying batches now, so we shouldn't need any
relocations.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2001a80d4a ("anv: Implement VK_KHR_performance_query")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6291>
Those are currently hurting Felix' ability to look at the batches.
We can probably detect this in the aubinator but that's a bit more
work than falling back to the previous behavior.
v2: Condition VK_KHR_performance_query to not using this variable (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5391>
A buffer added to all execbufs so that we can attribute a batch that
caused a hang to a particular driver.
v2: Reuse workaround BO
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3203>
We initially used this debug option to mean "don't bother registering
the OA configuration into the kernel".
This change makes this option suppress any interaction with the
i915/perf interface. This is useful when debugging self modifying
batches with performance queries while running on the intel_mi_runner.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
This has the same kernel requirements as VK_INTEL_performance_query.
v2: Fix empty queue submit (Lionel)
v3: Fix autotool build issue (Piotr Byszewski)
v4: Fix Reset & Begin/End in same command buffer, using soft-pin &
relocation on the same buffer won't work currently. This version
uses a somewhat dirty trick in anv_execbuf_add_bo (Piotr Byszewski)
v5: Fix enumeration with null pointers for either pCounters or
pCounterDescriptions (Piotr)
Fix return condition on enumeration (Lionel)
Set counter uuid using sha1 hashes (Lionel)
v6: Fix counters scope, should be COMMAND_KHR not COMMAND_BUFFER_KHR (Lionel)
v7: Rebase (Lionel)
v8: Rework checking for loaded queries (Lionel)
v9: Use new i915-perf interface
v10: Use anv_multialloc (Jason)
v11: Implement perf query passes using self modifying batches (Lionel)
Limit support to softpin/gen8
v12: Remove spurious changes (Jason)
v13: Drop relocs (Jason)
v14: Avoid overwriting .sType in
VkPerformanceCounterKHR/VkPerformanceCounterDescriptionKHR (Lionel)
v15: Don't copy the entire
VkPerformanceCounterKHR/VkPerformanceCounterDescriptionKHR (Jason)
Reuse anv_batch rather than custom packing (Jason)
v16: Fix missing MI_BB_END in reconfiguration batch
Only report the extension with kernel support (perf_version >= 3)
v17: Some cleanup of unused stuff
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
This change adds a call/return execution mode for secondary command
buffers, as an alternative to the existing mode that copies them into
the primary batch.
v2: Rework convention to avoid burning an ALU register (Jason)
v3: Use anv_address_add() (Jason)
v4: Move command emissions to anv_batch_chain.c (Jason)
v5: Also move last MI_BBS emission in secondary command buffer to
anv_batch_chain.c (Jason)
v6: Fix secondary command buffer end (Jason)
v7: Refactor anv_batch_address() to remove additional emit functions
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>