This code assumed that batch->exec_bos[i] matched validation_list[i],
which won't be true once we start suballocating BOs. This patch changes
it to print the full exec_bos[i] list instead of the validation list,
as that has the logical list of objects, names, addresses, placement,
whether they are suballocated, and so on.
It may be useful to look at the actual validation list as well; I'm not
sure how common that is. We may want to add additional debug prints in
the future.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
We don't want to suballocate some buffers, such as ones that we know
we're intending to export to other clients, or ones with special
semantics (such as the workaround BO not having proper synchronization).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
The vast majority of cases need no handling at all, as they simply
read bo->address or similar fields, which works in either case.
Some other cases simply need to unwrap to look at the underlying BO.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
With suballocation, our batch BO list may have multiple BOs that are
suballocated from the same GEM object. We still need to track each of
those buffers for cross-batch write tracking, cache tracking, and busy
tracking. However, we only want to include underlying GEM objects in
the actual validation list building. The validation list entry should
have EXEC_OBJECT_WRITE if any of the BOs are marked as writable.
We use a temporary array to map GEM handles to validation list entries
so we can quickly see if we've already emitted one and update the
EXEC_OBJECT_WRITE flag as needed.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
We would like to start performing slab allocation of resources, where
multiple resources can be backed by a single GEM object.
Originally, I had thought to move busy tracking, cache domain tracking,
and so on into resources themselves, instead of having them at the BO
level. Multiple resources would point at the same BO with an offset.
Unfortunately, this meant adjusting the batch BO pinning code to take
resources rather than BOs. That cascades into needing iris_address
for genxml packing to store resources, not BOs. Which means that places
which have use raw BOs would need to start creating resources instead.
Except some places, like aux BO handling, really don't make sense as
pipe resources and really would rather use raw BOs. So iris_address
would need to store both, which convolutes the genxml field. And,
having a BO and resource means that every place in the code needs to
handle that offset correctly. It sounds simple, but is a giant mess.
Instead, we take a different route: adjust iris_bo itself, so that BOs
are either be backed by a GEM object (as is the case today), or backed
by another underlying BO. "Real" BOs have bo->gem_handle != 0. "Slab
allocated" or "fake" or "wrapper" BOs have bo->gem_handle == 0. We move
fields into a union based on these cases. amdgpu takes this approach.
This sounds complex at first glance---in theory, every place that
interacts with BOs might need to handle the wrapper BO special case.
But in practice, they don't. For suballocated BOs, we can set the
wrapper's address field to the underlying BO's address plus any offset,
at which point it looks like any other BO. Most other properties are
easily queried; the main code that needs updating is execbuf handling
and bufmgr internals.
For now, we simply move the fields. Any code that accesses either
bo->real.* or bo->gem_handle will need updating in future patches to
actually handle the slab-allocated case.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12623>
When I wrote this code originally, I decided to try and construct the
validation list up front, rather than at submission time. That worked
okay, but it's not really necessary. It's a fair amount of data to
store (struct drm_i915_gem_exec_object2 is 56 bytes per object), when
we can easily construct it on the fly.
More importantly, with suballocation, batch->exec_bos[i] may have
multiple entries corresponding to a single validation list entry.
Rather than tracking two lists with an awkward mapping between them,
we choose to just store the BO list and generate the other on the fly.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12848>
When we start suballocating BOs, multiple logical BOs may map to the
same GEM object, and thus share a validation list entry. However, we
want to track whether logical BOs are written, to avoid unnecessary
cross-batch data dependencies.
Just track that in a bitfield instead, where bit i corresponds to
batch->exec_bos[i].
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12848>
When a batch fails we either recreate our context (in case we got -EIO
or -ENOMEM) or we abort() (every other error). If we don't abort and a
later batch has a dependency on the batch that failed, then this newer
batch will fail with -EINVAL since it requires a syncobj that was
never submitted, which means we'll abort().
To avoid this problem, in this patch we simply signal syncobjs of
failed batches. This means we may be breaking our dependency tracking,
but IMHO it's better than simply letting it abort() later.
In other words, this moves the situation for some apps from "app
causes a GPU hang and then aborts" to "app causes a GPU hang but keeps
running".
Note: on some older Kernels (like today's Debian 5.10 Kernel) I see X
simply freezing after the GPU hang when the app doesn't decide to
abort(). Switching to a more recent Kernel fixes this issue for me, so
in case it happens to you make sure you have the most recent stable
trees.
v2:
- Fix coding style (Ken).
- Use the big comment block provided by Ken (Ken).
- Adjust the commit message so avoid saying we retry (Ken).
- Rebase after the syncobj ownership changes.
- Drive-by add a missing white space in the header.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12657>
We're moving away from relying on the Kernel's implicit busy tracking
into our own tracking, except for shared buffers.
Not only this shouldn't hurt now (it doesn't, according to my
measurements), when we switch to vm_bind we will be able to cut some
significant overhead by simply omitting all the async buffers from the
execbuf ioctl.
v2:
- Change iris_bo_busy() to bool (Ken).
- Fix coding style issues (Ken).
- Rebase on not having the refcount _inc and _dec helpers anymore
(Ken).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4748
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12363>
The next patches will justify the new ownership. We want the BOs to
have references on the batches' syncobjs so we can implement implicit
tracking. In other words: BOs will be able to wait on syncobjs owned
by different screens. Since our syncobjs are actually just a Kernel
handle with a refcount, they can be used globally and it makes more
sense to map them to the bufmgr, just like the BOs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12363>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
Again, we don't need all the dependency checking, seqno incrementing
and duplicate tracking for batch->bo. Just use the unchecked version.
This commit is not particularly significant since it really just saves
us a check in the iris_use_pinned_bo() hot path, but since we already
have the helper function, why not?
v2:
- (turns out the answer to "why not?" is because the patch had a bug)
- Call ensure_exec_obj_space() since batch batch chaining can happen
and doesn't guarantee pre-reserved space (Ken).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
Don't use iris_use_pinned_bo(), go directly with add_bo_to_batch(),
skipping every check. This allows us to early return from
iris_use_pinned_bo when the workaround bo is used, saving us the call
to find_validation_entry() which ends up doing nothing except
iterating over every bo in the batch. Also don't bother with
ensure_exec_obj_space() since we just reset the batch and this is the
second BO we're adding to it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
We want to add a new caller, so extract this first.
v2: kflags can never contain EXEC_OBJECT_WRITE (Ken).
v3: Rebase after s/gtt_offset/address/.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
I don't see these BOs being searched for in the benchmarks I tested so
I don't think this should improve anything. On the other hand, it
shouldn't hurt either since it's just an extra assignment.
I want to unify both places where we have this code into a single
function and the lack of the bo->index assignment was the only
difference between the two places. So first we make both functions the
same and in the next commit we'll unify things. This should make
bisecting easier in case I'm wrong.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
The last_seqnos list is used by iris_emit_buffer_barrier_for() and as
far as I can understand we don't emit barriers for the workaround bo,
so don't even bother doing the atomic operations required to bump the
workaround_bo seqno list.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
This is the virtual memory address of the buffer object. Calling it the
BO's address is a lot more obvious than calling it an offset in one of
the now many graphics translation tables.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12206>
When the system doesn't have enough memory, GNOME Shell may be crashed
by iris:
gnome-shell[1161]: iris: Failed to submit batchbuffer: Cannot allocate memory
gnome-shell[1161]: GNOME Shell crashed with signal 6
So don't abort() when kernel can't allocate memory to avoid crashing the
entire desktop.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11178>
This is rarely useful, but after the next patch removes tiling tracking,
this would literally be the only difference between iris_bo_alloc and
iris_bo_alloc_tiled, so we may as well add it.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11169>
Based on a patch by Rafael Antognolli.
We already had a flags parameter, but omitted it from the simple alloc
interface because most callers were passing 0. However, we'll want to
use it for selecting between device local memory and system memory, and
possibly mmap cacheability modes, in the future. At that point, many
more callers will want to specify, so I think we should include flags
in iris_bo_alloc() as well.
A few places used the iris_bo_alloc_tiled() function simply to pass
flags, so this patch converts them to use iris_bo_alloc() instead now
it does everything they want.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11169>
As requested by Ken since we're now (after 20e2c7308f)
re-emitting constants at the beginning of every batch which may lead
to some redundant constant restore overhead. No statistically
significant performance changes observed with either change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9903>
It tries to be helpful by returning BO metadata matching exactly
the requested address, but it "forgets" to fix the remaining size.
The only caller of this function (ctx_get_bo) already deals with
raw BO metadata, so return it as such instead of fixing size too.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9707>
This patch renames all macros with "GEN_" prefix defined in
common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
This patch renames functions, structures, enums etc. with "gen_"
prefix defined in common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
Changes in this patch include:
- Rename all files in src/intel/common path
- Update the filenames used in source and build files
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
EXT_external_objects require we call glSignalSemaphoreEXT followed by a
glFlush. If the rendering workload is small when Signal and Flush
take place the relevant batch buffers with the actual rendering might
have been submitted already. In that case the following condition is met:
(iris_batch_bytes_used(batch) == 0). This causes:
glFlush() --> iris_fence_flush() -> iris_batch_flush() ->
_iris_batch_flush() to no-op, and so the fence doesn't get submitted to the
kernel. Then when anv tries to submit an execuf2 that must wait on the
shared VkSempahore / drm_syncobj fence, there isn't one and the kernel
rejects the batchbuffer causing an -EINVAL return of the execbuf2 ioctl
and a VK_DEVICE_LOST error. Empty batch buffers do have typically one
fence attached, but the ones carrying the extra fence from a
glSignalSempahore() call do have at least 2.
See also: the discussion in MR!4337.
v2: Changed the batch struct to have a contains_fence_signal variable
that is set to true when i915_EXEC_FENCE_SIGNAL fence is added to the
batch and off when batch is reset (Tapani Pälli)
Authored-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reported-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8861>
These hooks were written in the initial IRIS_MEASURE implementation.
Minor changes by Mark Janes <markjanes@swizzler.org> to adapt to the
INTEL_MEASURE reimplementation.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
This eliminates the need to use container_of in error handling code.
INTEL_MEASURE will need to access the iris context from each batch.
suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
It's included in declaration of INTEL_DEBUG.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>
In FreeBSD x86 and aarch64 __u64 is typedef to unsigned long and
is the same size as unsigned long long.
Since we are explicitly specifying the format, cast the value
to the proper type.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emmanuel <manu@FreeBSD.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3559>
The render cache hash table is now *mostly* redundant with the more
general seqno matrix-based cache tracking mechanism. Most hash table
operations are now gone except for the format mismatch checks done in
iris_cache_flush_for_render(). Redundant code removed as a separate
patch for bisectability.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
The depth cache set is now redundant with the more general seqno
matrix-based cache tracking mechanism. Removed as a separate patch
for bisectability.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
This introduces a representation of the cache coherency status of the
GPU at any point in the batch. This is done by defining a matrix C of
synchronization sequence numbers such that at any point of batch
construction, a memory operation from domain i introduced into the
batch is guaranteed to be ordered after any memory operation from
domain j in a previous batch section with seqno n if the following
condition holds:
C_i_j >= n
This allows us to efficiently determine whether additional flushing
and/or invalidation is required in order to access a buffer object
from some arbitrary domain.
Except for batch buffer reset which requires clearing the whole
matrix, all operations on the matrix are either O(n) or O(1) on the
number of caching domains (which is basically constant).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
Probably the most annoying patch to review from the whole series --
Mark every buffer object use as accessed through some caching domain
with the sequence number of the current synchronization section of the
batch. The additional argument of iris_use_pinned_bo() makes sure I'd
have gotten a compile error if I had missed any buffer added to the
batch validation list.
There are only a few exceptions where a buffer is left untracked while
adding it to the validation list, justified below:
- Batch buffers: These are strictly read-only for the moment.
- BLORP buffer objects: Their seqnos are bumped manually at the end
of iris_blorp_exec() instead, in order to avoid plumbing domain
information through BLORP address combining.
- Scratch buffers: The contents of these are strictly thread-local.
- Shader images and SSBOs: Accesses of these buffers are explicitly
synchronized at the API level.
v2: Opt out of tracking more aggressively (Ken): In addition to the
above, surface states, binding tables, instructions and most
dynamic states are now left untracked, which means a *lot* more BO
uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely
carefully, since the cache tracker won't be able to provide any
coherency guarantees for them.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
This introduces some minimalistic infrastructure which will be used in
order to partition the batch into a series of sections, each one with
a unique, monotonically-increasing sequence number. Section
boundaries will typically lie at points in the batch where the
execution and memory coherency status of some previous commands are
known, e.g. at batch buffer boundaries or PIPE_CONTROL commands.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>
This makes iris_batch_prepare_noop() return a boolean instead of
passing through the relevant set of dirty flags. It will make it
easier to change the representation of dirty flags.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>
Rename iris_seqno to iris_fine_fence, borrowed from si_fine_fence, to
avoid introducing any confusion with any other seqno used for tracking
pipelines.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5233>
A buffer added to all execbufs so that we can attribute a batch that
caused a hang to a particular driver.
v2: Reuse workaround BO
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3203>
We can use seqno as a basic for fast userspace fences: where we can
check a value directly to test for fence completion without having to
query using the kernel. To do so we need to write a breadcrumb from the
batch and track those writes as the basis for our lightweight fences.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
We're going to need it to create a uploader in the batch soon. We still
avoid storing it, to maintain the charade of separation, and make people
think twice about fetching random fields from there and intertwining
things even worse.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
This is just a refcounted wrapper around a drm_syncobj. There is
enough terminology going on in the area of synchronization (sync
objects, sync files, ...) that I'd rather not invent our own.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3802>
Iris allocates gem buffers using buckets of allocation sizes that are
page aligned. We always ask for batch buffers of size BATCH_SZ +
BATCH_RESERVED, which is not page aligned: we ask for 65552 bytes,
which ends up in the bucket of size 81920, resulting in 20% unused
space. Adjust things so there is no waste of space: BATCH_SZ +
BATCH_RESERVED is now 65536.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4561>