Commit graph

76 commits

Author SHA1 Message Date
Lionel Landwerlin
4a61be24fe anv: only resort to sync fds internally with no syncobj support
We can rely on only one kind of synchronization object (drm-syncobj)
when it is available. This reduces the number of file descriptors we
use in our implementation.

This will be required later for timeline semaphores implementation, at
this point we won't ever want to use anything else but syncobjs.

v2: Only use has_syncobj for semaphores (Jason)

v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-20 14:59:51 +00:00
Jason Ekstrand
83b943cc2f anv: Make all VkDeviceMemory BOs resident permanently
We spend a lot of time in the driver adding things to hash sets to track
residency.  The reality is that a properly built Vulkan app uses large
memory objects and sub-allocates from them.  In a typical frame, most of
if not all of those allocations are going to be resident for the entire
frame so we're really not saving ourselves much by tracking fine-grained
residency.  Just throwing everything in the validation list does make it
a little bit more expensive inside the kernel to walk the list and
ensure that all our VA is in order.  However, without relocations, the
overhead of that is pretty small.

If we ever do run into a memory pressure situation where the fine-
grained residency could even potentially help, we would likely be
swapping one page out to make room for another within the draw call and
performance is totally lost at that point.  We're better off swapping
out other apps and just letting ours run a whole frame.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Lionel Landwerlin
acb50d6b1f intel/decoders: handle decoding MI_BBS from ring
An MI_BATCH_BUFFER_START in the ring buffer acts as a second level
batchbuffer (aka jump back to ring buffer when running into a
MI_BATCH_BUFFER_END).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-03-07 15:08:31 +00:00
Lionel Landwerlin
32ffd90002 anv: add support for INTEL_DEBUG=bat
As requested by Ken ;)

v2: Also decode simple batches (Caio)
    Fix u_vector usage issues (Lionel)

v3: Make binding/instruction/state/surface available (Lionel)

v4: Going through device pools for simple batches (Lionel)
    Centralize search BO callbacks into anv_device.c (Lionel)

v5: Clear decoded batch buffer var after use (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-03-02 12:53:21 +00:00
Lionel Landwerlin
239b0d8570 Revert "anv: add support for INTEL_DEBUG=bat"
This reverts commit e4d88396d2.

Apologies, I pushed the wrong commit.
2019-02-24 01:06:39 +00:00
Lionel Landwerlin
e4d88396d2 anv: add support for INTEL_DEBUG=bat
As requested by Ken ;)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-23 23:29:04 +00:00
Jason Ekstrand
48ed2a7bb0 anv: Implement VK_EXT_buffer_device_address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-01 17:09:42 -06:00
Rafael Antognolli
f39dad7e4e anv: Validate the list of BOs from the block pool.
We now have multiple BOs in the block pool, but sometimes we still
reference only the first one in some instructions, and use relative
offsets in others. So we must be sure to add all the BOs from the block
pool to the validation list when submitting commands.

v2:
   - Don't add block pool BOs to the dependency list right before
   execbuf (Jason)
   - Call anv_execbuf_add_bo() to each BO in the block pools (Jason)
   - Use anv_execbuf_add_bo_set() to add surface state dependencies to
   execbuf.

v3:
   - Add comment to the non-softpin case (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:10 -08:00
Rafael Antognolli
11a5d4620b anv: Split code to add BO dependencies to execbuf.
This part of the anv_execbuf_add_bo() code is totally independent of the
BO being added. Let's split it out, so we can reuse it later.

v3: rename to anv_execbuf_add_bo_set (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:08 -08:00
Rafael Antognolli
e3dc56d731 anv: Update usage of block_pool->bo.
Change block_pool->bo to be a pointer, and update its usage everywhere.
This makes it simpler to switch it later to a list of BOs.

v3:
 - Use a static "bos" field in the struct, instead of malloc'ing it.
 This will be later changed to a fixed length array of BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:02 -08:00
Rafael Antognolli
e8b6e0a5ba anv/allocator: Add getter for anv_block_pool.
We will need the anv_block_pool_map to find the map relative to some BO
that is not at the start of the block pool.

v2: just return a pointer instead of a struct (Jason)
v4: Update comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:43 -08:00
Caio Marcelo de Oliveira Filho
09c3ff01df src/intel: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:33 -08:00
Eric Engestrom
e27902a261 util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Jason Ekstrand
f5bab06428 anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START
Previously, we just went ahead and emitted MI_BATCH_BUFFER_START as
normal.  If we are near enough to the end, this can cause us to start a
new BO just for the MI_BATCH_BUFFER_START which messes up chaining.  We
always reserve enough space at the end for an MI_BATCH_BUFFER_START so
we can just increment cmd_buffer->batch.end prior to emitting the
command.

Fixes: a0b133286a "anv/batch_chain: Simplify secondary batch return..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:12 -05:00
Jason Ekstrand
7a89a0d9ed anv: Use separate MOCS settings for external BOs
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:03 -05:00
Lionel Landwerlin
f430a37fa7 intel: decoder: unify MI_BB_START field naming
The batch decoder looks for a field with a particular name to decide
whether an MI_BB_START leads into a second batch buffer level. Because
the names are different between Gen7.5/8 and the newer generation we
fail that test and keep on reading (invalid) instructions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-24 23:10:08 +01:00
Jason Ekstrand
64e619674e anv: Don't even bother processing relocs if we have softpin
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:34:26 -07:00
Jason Ekstrand
c7be17c8d3 anv: Refactor reloc handling in execbuf_add_bo
This just separates the reloc list vs. BO set cases and lets us avoid an
allocation if relocs->deps->entries == 0.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 16:34:25 -07:00
Scott D Phillips
f3dbe0419d anv: Soft-pin batch buffers
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
a0b133286a anv/batch_chain: Simplify secondary batch return chaining
Previously, we did this weird thing where we left space and an empty
relocation for use in a hypothetical MI_BATCH_BUFFER_START that would be
added to the secondary later.  Then, when it came time to chain it into
the primary, we would back that out and emit an MI_BATCH_BUFFER_START.
This worked well but it was always a bit hacky, fragile and ugly.  This
commit instead adds a helper for rewriting the MI_BATCH_BUFFER_START at
the end of an anv_batch_bo and we use that helper for both batch bo list
cloning and handling returns from secondaries.  The new helper doesn't
actually modify the batch in any way but instead just adjusts the
relocation as needed.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
4f20c665b4 anv/batch_chain: Call batch_bo_finish at the end of end_batch_buffer
The only reason we were calling it in the middle was that one of the
cases for figuring out the secondary command buffer execution type
wanted batch_bo->length which gets set by batch_bo_finish.  It's easy
enough to recalculate and now batch_bo_finish is called in a sensible
location.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Scott D Phillips
27cc68d9e9 anv: For pinned BOs, skip relocations, but track bo usage
References to pinned BOs won't need to be relocated at a later
point, so just write the final value of the reference into the bo
directly.

Add a `set` to the relocation lists for tracking dependencies that
were previously tracked by relocations. When a batch is executed, we
add the referenced pinned BOs to the exec list.

v2: - visit bos from the dependency set in a deterministic order (Jason)
v3: - compar => compare, drat (Jason)
    - Reworded commit message, provided by (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-01 14:27:10 -07:00
Scott D Phillips
c7db0ed4e9 anv: Use a separate pool for binding tables when soft pinning
Soft pinning lets us satisfy the binding table address
requirements without using both sides of a growing state_pool.

If you do use both sides of a state pool, then you need to read
the state pool's center_bo_offset (with the device mutex held) to
know the final offset of relocations that target the state pool
bo.

By having a separate pool for binding tables that only grows in
the forward direction, the center_bo_offset is always 0 and
relocations don't need an update pass to adjust relocations with
the mutex held.

v2: - don't introduce a separate state flag for separate binding tables (Jason)
    - replace bo and map accessors with a single binding_table_pool accessor (Jason)
v3: - assert bt_block->offset >= 0 for the separate binding table (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-01 14:27:10 -07:00
Scott D Phillips
29a139b308 anv/blorp: Write relocated values into surface states
v2 (Jason Ekstrand):
 - Split the blorp bit into it's own patch and re-order a bit
 - Use anv_address helpers

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-31 16:51:47 -07:00
Scott D Phillips
4714784dae anv: move canonical_address calculation into a separate function
A later patch will make use of this in other places. Also, remove
dependency on undefined behavior of left-shifting a signed value.

v2: - move function into a separate header (Chris)
v3: (by Ken) Add new header to the various build systems.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-05-27 19:24:33 -07:00
Jason Ekstrand
bd1279bd9f Get rid of a bunch of KHR suffixes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Kenneth Graunke
bd87bd178c anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.
The kernel used to have execbuf parameters to program the INSTPM bit
for whether 3DSTATE_CONSTANT_* should be relative to dynamic state
base address or an absolute address.  However, they never worked in
the presence of hardware contexts, so I deleted them a while back.

It doesn't make sense to set this flag, as it doesn't exist anymore.
It also never did anything anyway - the flag is zero, so |'ing it in
did nothing.  The default is relative anyway.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 07:00:41 -08:00
Matt Turner
6cfc49287d anv: Remove 'inline' keywords
Unless you have data, the compiler knows better than you whether a
function should be inlined.

No difference in the resulting binary with gcc-6.3.0 or clang-4.0.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Jason Ekstrand
49c59c88eb anv: Implement VK_KHR_external_fence
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
5f372d93a9 anv: Use DRM sync objects to back fences whenever possible
In order to implement VK_KHR_external_fence, we need to back our fences
with something that's shareable.  Since the kernel wait interface for
sync objects already supports waiting for multiple fences in one go, it
makes anv_WaitForFences much simpler if we only have one type of fence.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
caa71343c6 anv: Rename anv_fence_state to anv_bo_fence_state
It only applies to legacy BO fences.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:30 -07:00
Jason Ekstrand
92286dc08a anv: Pull the guts of anv_fence into anv_fence_impl
This is just a refactor, similar to what we did for semaphores, in
preparation for handling VK_KHR_external_fence.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:27 -07:00
Jason Ekstrand
f992bb205c anv: Rework fences to work more like BO semaphores
This commit changes fences to work a bit more like BO semaphores.
Instead of the fence being a batch, it's simply a BO that gets added
to the validation list for the last execbuf call in the QueueSubmit
operation.  It's a bit annoying finding the last submit in the execbuf
but this allows us to avoid the dummy execbuf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:22 -07:00
Francisco Jerez
e29ccaac29 anv: Check that in_fence fd is valid before closing it.
Probably harmless, but will overwrite errno with a failure status
code.  Reported by coverity.

CID 1416600: Argument cannot be negative (NEGATIVE_RETURNS)
Fixes: 5c4e4932e0 (anv: Implement support for exporting semaphores as FENCE_FD)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-22 11:56:38 -07:00
Francisco Jerez
7ca124a6a3 anv: Add error handling to setup_empty_execbuf().
The anv_execbuf_add_bo() call can actually fail in practice, which
should cause the QueueSubmit operation to fail.  Reported by Coverity.

CID: 1416606: Unchecked return value (CHECKED_RETURN)
Fixes: 017cdb10cf (anv: Submit a dummy batch when only semaphores are provided.)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-22 11:54:16 -07:00
Jason Ekstrand
55bce22d8d anv: Use DRM sync objects for external semaphores when available
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
5c4e4932e0 anv: Implement support for exporting semaphores as FENCE_FD
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
017cdb10cf anv: Submit a dummy batch when only semaphores are provided.
Vulkan allows you to do a submit whose only job is to wait on and
trigger semaphores.  The easiest way for us to support that right
now is to insert a dummy execbuf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
031f57eba3 anv: Add a basic implementation of VK_KHX_external_semaphore
This patch adds an implementation based on DRM BOs.  We don't actually
advertise the extension yet because we want to add a couple more paths
first.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
8e3d9c5d09 anv: Round u_vector element sizes to a power of two
This fixes 32-bit builds of the driver.  Commit 08413a81b9
changed things so that we now put struct anv_states in the u_vector for
binding tables.  On 64-bit builds, sizeof(struct anv_state) is a power
of two but it isn't on 32-bit builds.

Fixes: 08413a81b9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 10:34:13 -07:00
Kenneth Graunke
3e50607a40 intel: Move clflush helpers from anv to common/gen_clflush.h.
I want to use these in the OpenGL driver as well.

v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:19 -07:00
Jason Ekstrand
781263486f anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE
The reason we were doing this was to ensure that the kernel did the
appropriate cross-ring synchronization and flushing.  However, the
kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to
insert a fence.  It only cares about the domain for determining whether
or not it needs to clflush the BO before using it for scanout but the
domain automatically gets set to RENDER internally by the kernel if
EXEC_OBJECT_WRITE is set.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-10 08:55:47 -07:00
Lionel Landwerlin
e3a5ab2d66 anv: check return value of anv_execbuf_add_bo
CID: 1405919 (Error handling issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-08 14:38:27 +01:00
Jason Ekstrand
d3ed72e2c2 anv/allocator: Embed the block_pool in the state_pool
Now that the state stream is allocating off of the state pool, there's
no reason why we need the block pool to be separate.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
08413a81b9 anv: Allocate binding table blocks through the state pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
49ecaf88d1 anv/allocator: Drop the block_size field from block_pool
Since the state_stream is now pulling from a state_pool, the only thing
pulling directly off the block pool is the state pool so we can just
move the block_size there.  The one exception is when we allocate
binding tables but we can just reference the state pool there as well.

The only functional change here is that we no longer grow the block pool
immediately upon creation so no BO gets allocated until our first state
allocation.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
4201cc2dd3 anv: Implement VK_KHX_external_semaphore_fd
This implementation allocates a 4k BO for each semaphore that can be
exported using OPAQUE_FD and uses the kernel's already-existing
synchronization mechanism on BOs.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Jason Ekstrand
ef2e427d78 anv: Pull the guts of cmd_buffer_execbuf into a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Jason Ekstrand
bd3a9813b9 anv/cmd_buffer: Use the device allocator for QueueSubmit
The command is really operating on a Queue not a command buffer and the
nearest object to that with an allocator is VkDevice.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: "17.0 17.1" <mesa-dev@lists.freedesktop.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
439da38d18 anv: Replace anv_bo::is_winsys_bo with a uint32_t flags
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00