Commit graph

109138 commits

Author SHA1 Message Date
Daniel Schürmann
0d42e4d7a0 aco: Initial GFX7 Support
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
3177346bfc aco: refactor visit_store_fs_output() to use the Builder
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Jason Ekstrand
0f60aa4037 anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
bce1c3c668 anv: Re-capture all batch and state buffers
When we moved from allocating BOs directly to using the BO cache, we
lost the EXEC_OBJECT_CAPTURE flag on all our state buffers.

Fixes: 3119b96bdf "anv: Allocate block pool BOs from the cache"
Fixes: ee77938733 "anv: Allocate batch and fence buffers from..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
865ffe4e02 anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 22:32:05 +00:00
Eric Anholt
e3b249f166 freedreno: Enable texture upload memory throttling.
Fixes oom-killer during streaming-texture-upload, which I found while
trying to enable piglit in CI.

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-06 14:03:50 -08:00
Fritz Koenig
c496d44284 freedreno: reorder format check
With the addition of the planar formats helper, the
planar formats no longer have a valid block.bits field.
Calling util_format_get_blocksize therefore asserts.

Reorder the check to see if the format is supported
before doing the query to get the blocksize.

Fixes: 20f132e5ef ("gallium/util: add planar format layouts and helpers")

Signed-off-by: Fritz Koenig <frkoenig@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-12-06 21:27:10 +00:00
Nanley Chery
21376cffb3 iris: Fix import of multi-planar surfaces with modifiers
Multi-planar surfaces are allowed to have modifiers. Don't require
DRM_FORMAT_MOD_INVALID in order to create a surface for each plane
defined by the format.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-06 20:31:48 +00:00
Nanley Chery
51ee8fff9b gallium: Store the image format in winsys_handle
This format will be used to properly handle planar images with modifiers
in iris.

Fixes: 246eebba4a ("iris: Export and import surfaces with modifiers that have aux data")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-06 20:31:48 +00:00
Nanley Chery
d5c857837a gallium/dri2: Fix creation of multi-planar modifier images
The commit noted below assumed and enforced that DRM_MOD_INVALID was the
only valid modifier for multi-planar imported images. Due to that, it
required that modifier on multi-planar images to:

   1. Allow multiple planes.
   2. Perform YUV format lowering and extent adjustments.
   3. Use buffer_index to correctly map the given planes.

Fix these issues by removing or updating the code built on that
assumption.

Fixes: 2066966c10 ("gallium/dri2: Support creating multi-planar modifier images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-06 20:31:48 +00:00
Jason Ekstrand
f9a3d9738b anv: Use BO fences/semaphores for AcquireNextImage
Instead of doing a dummy submit on the command buffer for the fence or a
dummy semaphore and trusting in implicit sync, this commit moves us to
take advantage of implicit sync and just use the WSI image BO as the
fence.  Both semaphores and fences require a tiny bit of extra plumbing
to do this but the result is that we can get rid of a bunch of the extra
synchronization we're doing today.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
ecc119a96e anv: Add a fence_reset_reset_temporary helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
ccb7d606f1 anv: Use submit-time implicit sync instead of allocate-time
In 83b943cc2f, we started making all VkDeviceMemory BOs resident all
the time.  One unfortunate side-effect of this is that every
vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which
means that X server or Wayland compositor, instead of waiting on the
last vkQueueSubmit to actually write the buffer, now waits on the last
vkQueueSubmit to from that driver instance relative to whenever the
compositor's GL driver instance calls execbuf.  This potentially leads
to a lot of extra synchronization that we didn't intend to have.

Instead, this commit makes it so that we leave WSI memory objects with
EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and
set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of
vkQueuePresent.  This should hopefully result in tighter integration
with the compositor, lower latency, and better performance.

Testing with DOOM 2016, this seems to reduce latency by at least a frame
if not two and makes the game much more responsive.  Testing was,
however, subjective, so we don't have any hard data on that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
6ebf677cfd anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags
Otherwise, we're trusting in the execbuf_add_bo which sets
EXEC_OBJECT_WRITE to to always be the first one that gets called.  This
is likely true for fences but it seems somewhat fragile.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
778b51f491 vulkan/wsi: Add a hooks for signaling semaphores and fences
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
48e23a6406 vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit
This lets us treat the implicit synchronization that we need for X11 and
Wayland like a semaphore.  Instead of trusting the driver to somehow
figure out when that memory object needs to be signaled, we provide an
explicit point where the driver can set EXEC_OBJECT_WRITE and signal the
dma_fence on the BO.  Without this, we have to somehow track inside the
driver when WSI buffers are actually used to avoid extra synchronization
dependencies.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:06 +00:00
Urja Rannikko
d07ed0c9c9 panfrost: free spill cost table in mir_spill_register
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Urja Rannikko
12e393bacf panfrost: add lcra_free() to free lcra state
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Urja Rannikko
5b6108834b panfrost: free allocations in schedule_block
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Urja Rannikko
e2dbea683c panfrost: free last_read/write tables in mir_create_dependency_graph
Signed-off-by: Urja Rannikko <urjaman@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 15:26:13 +00:00
Alyssa Rosenzweig
adf716dc7f panfrost: Rename SET_VALUE to WRITE_VALUE
See
https://lists.freedesktop.org/archives/dri-devel/2019-December/247601.html

Write value emphasises that it's just a generic write primitive.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 14:37:17 +00:00
Alyssa Rosenzweig
9eae950342 panfrost: Update SET_VALUE with information from igt
It's not a tiler specific initialization; it's a generic GPU-side write
primitive that may be used for tiler reset on midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-06 14:37:17 +00:00
Jonathan Marek
0796e7e70d turnip: implement border color
Fixes the deqp fails in:
dEQP-VK.pipeline.sampler.*border*
(minus 1d array/d24 cases which fail for other reasons)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-05 22:12:30 -05:00
Jonathan Marek
095d35eff8 turnip: improve emit_textures
Two things:
* Texture/sampler pointers aligned to the size of texture/sampler state
* Returning errors instead of crashing on OOM

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-05 22:12:30 -05:00
Jonathan Marek
3ab4f99461 turnip: add function to allocate aligned memory in a substream cs
To use with texture states that need alignment (texconst, sampler, border)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-12-05 22:12:29 -05:00
Timothy Arceri
1abca2b3c8 glsl/nir: iterate the system values list when adding varyings
Iterate the system values list when adding varyings to the program
resource list in the NIR linker. This is needed to avoid CTS
regressions when using the NIR to build the GLSL resource list in
an upcoming series. Presumably it also fixes a bug with the current
ARB_gl_spirv support.

Fixes: ffdb44d3a0 ("nir/linker: Add inputs/outputs to the program resource list")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-05 22:04:31 +00:00
Dave Airlie
201ed4b4e7 llvmpipe: enable support for primitives generated outside streamout
This enables the draw support when the queries are enabled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Dave Airlie
5f8af9731e draw: add support for collecting primitives generated outside streamout
GL/gallium require gathering primitives generated outside streamout
stats. This introduces the draw interfaces to enabling collecting this.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Dave Airlie
f137672197 llvmpipe: disable occlusion queries when requested by state tracker
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Dave Airlie
3b8e1b3ee4 llvmpipe: add queries disabled flag
This flag is set when the state tracker request queries
be disabled for meta operations.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-12-06 06:48:30 +10:00
Kenneth Graunke
ef893db468 main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)
The main and Gallium implementations were recently merged, and the
align2 parameter in the Gallium one is in bits.  execmem.c expected
bytes still.  This led to every call here asserting.

Fixes: b6fd679a9e("mesa/main/util: moving gallium u_mm to util, remove main/mm")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2019-12-05 21:07:09 +01:00
Jason Ekstrand
752196a493 util/atomic: Add p_atomic_add_return for the unlocked path
Fixes: 385d13f26d "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-12-05 11:55:21 -06:00
Jason Ekstrand
1b6991ba1d anv: Implement VK_KHR_buffer_device_address
The primary difference between the KHR and EXT versions of the extension
is that the KHR provides the address at AllocateMemory time for replay
so we can replay it safely without moving to a sparse address model.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
4428cd9127 anv: Use a pNext loop in AllocateMemory
This function has a lot of possible extensions and some of them we can
easily handle on-the-fly so it's easier to just have a loop than to find
each structure manually.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a8e59b3708 anv: Add allocator support for client-visible addresses
When a BO is flagged as having a client visible address, we put it in
its own heap.  We also support the client explicitly specifying an
address in said heap.  If an address collision happens, we return false
from anv_vma_alloc which turns into a VK_ERROR_OUT_OF_DEVICE_MEMORY.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
96e3328ac2 util/vma: Add a function to allocate a particular address range
This new function lets you request to remove a specific address range
from the allocator.  It returns true on success and leaves the allocator
unmodified and returns false on failure.  It doesn't need to return an
offset because, if it succeeds, the offset passed in is the allocated
offset.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
782fb5407d util/vma: Factor out the hole splitting part of util_vma_heap_alloc
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
03450e9cfc anv: Add an explicit_address parameter to anv_device_alloc_bo
We already have a mechanism for specifying that we want a fixed address
provided by the driver internals.  We're about to let the client start
specifying addresses in some very special scenarios as well so we want
to pass this through to the allocation function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
597fdb9e21 anv: Stop advertising two heaps just for the VF cache WA
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
b47bc0202a anv: Set up VMA heaps independently from memory heaps
Our VMA allocations are really independent from the memory heaps we
expose via the API.  The only thing that really matters is the GTT size
so we can make the high heap the right size.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
1037b52cf4 anv: Stop tracking VMA allocations
util_vma_heap_alloc will already return 0 if it doesn't have enough
space.  The only thing the vma_*_available tracking was doing was
preventing us from allocating too much on any given heap.  Now that
we're tracking that in the heap itself, we can drop these.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a4e3d8f0db anv: Disallow allocating above heap sizes
We're already tracking the amount of memory used in each heap.  This
commit just makes us start rejecting memory allocations if the heap
would grow too large.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
385d13f26d util/atomic: Add a _return variant of p_atomic_add
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
0a36fafa95 anv: Don't leak when set_tiling fails
Fixes: a44744e01d "anv: Require a dedicated allocation for..."
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
46af0ecc1d anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
1b5cb92b62 anv: Apply cache flushes after setting index/draw VBs
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
7ce39a55c1 anv: Always invalidate the VF cache in BeginCommandBuffer
I think the reason why we only do this for primaries is that we didn't
expect to have blorp calls in secondaries.  However, you are allowed to
have a full render pass in a secondary command buffer so resolves and
clears can end up in there.  We should just always invalidate.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a500a6b7f1 blorp: Pass the VB size to the VF cache workaround
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
c142a40a92 anv: Add a has_softpin boolean
This separates "has" from "use" which will make the next commit a bit
cleaner.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
0bba88081b anv: Drop bo_flags from anv_bo_pool
In ee77938733, we started using the BO cache for anv_bo_pool and
stopped using the bo_flags parameter.  However, we never dropped it from
the struct or the init function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:58:14 -06:00