Commit graph

4436 commits

Author SHA1 Message Date
Jason Ekstrand
7e3e2ce702 anv: Bump the patch version to 131 2020-01-15 08:34:57 -06:00
Lionel Landwerlin
a014105498 anv: fix pipeline switch back for non pipelined states
Setting state base address can happen even before pipeline is
selected. Also we must ensure it is set to 3D for Gen12, we can't
switch back to an invalid pipeline value (UINT32_MAX).

v2: Reuse helpers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b34422db5e ("anv: Implement Gen12 workaround for non pipelined state")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>
2020-01-15 11:14:43 +02:00
Jason Ekstrand
7c16a1ae4e vulkan/wsi: Add a driconf option to force WSI to advertise BGRA8_UNORM first
The Aztec Ruins benchmark just grabs the first format in the list and
SRGB causes it to render washed out.  With this workaround, it renders
the same as OpenGL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
2020-01-14 19:27:13 +00:00
Lionel Landwerlin
a19cdf989b anv: only use VkSamplerCreateInfo::compareOp if enabled
The spec says nothing about the validity of the compareOp field when
compareEnable is false.

v2: use vulkan enum to pick default value (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
2020-01-14 16:40:16 +00:00
Lionel Landwerlin
b34422db5e anv: Implement Gen12 workaround for non pipelined state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
2020-01-14 11:52:36 +02:00
Jason Ekstrand
7978f2401b anv: Memset array properties
This is probably better than possibly leaving those bytes uninitialized
even if the app will theoretically not use them.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
2020-01-13 22:33:55 +00:00
Jason Ekstrand
d36eed3e69 anv: Don't over-advertise descriptor indexing features
We should only advertise sub-features if we advertise the extension.

Fixes: 6e230d7607 "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
2020-01-13 22:33:55 +00:00
Lionel Landwerlin
2cc14bd7b8 anv: set stencil layout for input attachments
If an input attachment has a stencil format, we need to set this.

v2: Fish out VkAttachmentReferenceStencilLayoutKHR from
    VkAttachmentReference2KHR::pNext (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: c1c346f166 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
2020-01-13 21:57:33 +02:00
Jason Ekstrand
21bc16a723 anv: Drop an unused variable 2020-01-13 12:20:48 -06:00
Jason Ekstrand
9b71171442 anv: Re-use flush_descriptor_sets in flush_compute_state
There's no reason to hand-roll all of the memory re-allocation fall-back
code for compute shaders.  It's  just duplicated complexity.  This also
makes it more clear in flush_compute_state where the
MEDIA_INTERFACE_DESCRIPTOR_LOAD command gets emitted relative to other
packets in the command stream.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:45:00 -06:00
Jason Ekstrand
ae72d1238c anv: Flag descriptors dirty when gl_NumWorkgroups is used
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:45:00 -06:00
Jason Ekstrand
ca6b3b11af anv: Don't add dynamic state base address to push constants on Gen7
Because Gen7 push constants are already relative to dynamic state base
address, they aren't really an address.  It's deceptive to return an
address from the helper function.  Instead, let's leave it as a
special-case in the gen7-11 helper; we don't need the helper for code
de-duplication for Gen7 anyway.

Fixes: 67d2cb3e93 "anv: Add get_push_range_address() helper"
Closes: #2323
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-09 19:44:06 -06:00
Lionel Landwerlin
60e0db3bfb anv: fix intel perf queries availability writes
The availability is not written at the location changed in
ee6fbb95a74d...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6fbb95a7 ("anv: Properly handle host query reset of performance queries")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-09 20:42:36 +02:00
Lionel Landwerlin
4578d4ae52 anv: don't close invalid syncfd semaphore
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-08 18:20:50 +02:00
Kenneth Graunke
defb3a9465 anv: Only enable EWA LOD algorithm when doing anisotropic filtering.
Updated documentation renames "Anisotropic Algorithm" to "LOD Algorithm"
and adds a note for Gen9+ saying "The EWA Algorithm should only be
enabled for Anisotropic Filtering modes." and indicating that the extra
accuracy shouldn't be necessary for other modes, and comes at a cost.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-04 14:27:22 -08:00
Jason Ekstrand
52ad1712ed anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-04 12:25:54 -08:00
Caio Marcelo de Oliveira Filho
75a19186b2 anv: Ignore some CreateInfo structs when rasterization is disabled
According to the description of VkGraphicsPipelineCreateInfo(),
pViewportState, pMultisampleState, pDepthStencilState and
pColorBlendState must be ignored when rasterization is not enabled.

This avoids potentially invalid pointers being dereferenced when
rasterization is disabled.  Tested with `demos_x64 VK_Parameter_Zoo`
from Renderdoc repository.

v2: Don't store the `raster_enabled` as part of anv_pipeline, just
    query it from the create info.  This avoids storing a state that's
    only used during pipeline creation. (Jason)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2258
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch> [v1]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-03 13:57:31 -08:00
Caio Marcelo de Oliveira Filho
6755b6315b anv: Drop unused function parameter
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-03 13:29:49 -08:00
Jason Ekstrand
9bd8000c6c anv: Drop unneeded struct keywords
All VkFoo structs are typedef'd to not need the struct keyword.  Leaving
it in there is just extra characters and breaks Vulkan's aliasing when
stuff gets promoted to core versions.  It's better to just never use
struct for VkFoo.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2020-01-03 11:32:34 -06:00
Kenneth Graunke
7a9c0fc0d7 intel: Drop Gen11 WaBTPPrefetchDisable workaround
This isn't needed on production Icelake hardware.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250>
2020-01-03 00:20:17 +00:00
Jason Ekstrand
ac70442ce1 anv: Properly advertise sampledImageIntegerSampleCounts
We support the same set of samples for integer color formats as for
non-integer.  We've been advertising it wrong since before the initial
Vulkan 1.0 release. :-(

Fixes: d689745303 "vk/0.210.0: Rework device features and limits"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-24 08:31:44 -06:00
Caio Marcelo de Oliveira Filho
c61ad77cd2 anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT)
For the sake of our testing infrastructure, disable this extension
for TGL until we can sort out a hang in Vulkan CTS.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-17 11:07:41 -08:00
Iván Briano
a649bbffee anv: Export VK_KHR_buffer_device_address only when really supported
Fixes: 1b6991ba1d ("anv: Implement VK_KHR_buffer_device_address")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
2019-12-16 19:24:46 +00:00
Iván Briano
0fd93b9589 anv: Export filter_minmax support only when it's really supported
Fixes: bea4d4c78c ("anv: add VK_EXT_sampler_filter_minmax support")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>
2019-12-16 19:24:46 +00:00
Lionel Landwerlin
c056193288 anv: drop unused parameter from apply layout pass
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:25 +02:00
Lionel Landwerlin
7c223cf316 anv: constify pipeline layout in nir passes
Was hoping to find potential issues but nothing. Still probably a good
idea.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-16 14:35:22 +02:00
Eric Engestrom
c327245257 anv: drop unused #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:42:40 +00:00
Lionel Landwerlin
2c5eb1df68 anv: fix assumptions about temporary fence payload
Since f9a3d9738b temporary BO_WSI are definitely a thing so we have
an assert wrong.

Take that opportunity to expand a bit on an existing comment.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f9a3d9738b ("anv: Use BO fences/semaphores for AcquireNextImage")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
2019-12-12 10:10:48 +00:00
Lionel Landwerlin
52bc235f2a anv: fix fence underlying primitive checks
We appear to have got lucky that the only type of temporary fence
payload we could have was a syncobj and that would only happen when
the type of the permanent payload was also a syncobj.

This code was broken if that assumption changed and it did in commit
f9a3d9738b.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
2019-12-12 10:10:48 +00:00
Jason Ekstrand
776cfde699 anv: Bump the advertised patch version to 129
We've been keeping up with the spec updates.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 18:52:08 +00:00
Jason Ekstrand
5f5f5019bd anv: Unconditionally advertise Vulkan 1.1
Vulkan 1.1 requires VK_KHR_external_fence which requires syncobj support
to be actually usable.  However, it doesn't strictly require that we
support any external handle types.  We should be able to advertise 1.1
even on old kernels that don't have syncobj support.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 18:52:08 +00:00
Jason Ekstrand
98a83d0fce anv: Flush the queue on DeviceWaitIdle
When we have syncobj_wait, we can trust in WAIT_FOR_SUBMIT but when we
don't, we only have BO waits and those aren't quite as nice.  This
commit adds a flag to _anv_queue_submit to wait for the queue to drain
before returning.  This gives us the behavior we need to implement
DeviceWaitIdle.

Fixes: 246261f0ad "anv: prepare the driver for delayed submissions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-11 18:52:08 +00:00
Kenneth Graunke
0f2f561a10 anv: Enable Gen11 Color/Z write merging optimization
TCCNTLREG contains additional L3 cache write merging optimizations.

The default value on my system appears to be:
- URB Partial Write Merging (bit 0)
- L3 Data Partial Write Merging (bit 2)
- TC Disable (bit 3)

Windows drivers appear to set bit 1 as well to enable "Color/Z Partial
Write Merging".  This should solve an issue we were seeing where MRT
benchmarks were using substantially more bandwidth than they ought.
However, we have not observed it to cause measurable FPS gains.

It is unclear whether we should be setting bit 0 or bit 3, so for now
we leave those at the hardware default value.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:19:46 -08:00
Jason Ekstrand
41691ac016 ANV: Stop advertising smoothLines support on gen10+
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
2019-12-10 20:13:56 +00:00
Lionel Landwerlin
5fdea9f401 anv: fix incorrect VMA alignment for CCS main surfaces
Maybe finer way of dealing with this requirement would be to increase
the number of pdevice->memory.types[] to add a category for special
alignment cases.

Meanwhile this fixes the problem of CCS surface alignment and it's
probably not going to cause issues given the size of our address
space.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6af8a4acc4 ("anv: Add aux-map translation for gen12+")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:06:54 +00:00
Lionel Landwerlin
dcfe1903c3 anv: fix missing gen12 handling
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 181be14d43 ("anv: Build for gen12")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-10 16:06:54 +00:00
Jason Ekstrand
0f60aa4037 anv: Re-emit all compute state on pipeline switch
It's a very odd case to hit in the real world.  However, there are some
CTS tests which switch back and forth between dispatch and clear without
changing the pipeline.

Fixes: bc612536eb "anv: Emit a dummy MEDIA_VFE_STATE before switching..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
bce1c3c668 anv: Re-capture all batch and state buffers
When we moved from allocating BOs directly to using the BO cache, we
lost the EXEC_OBJECT_CAPTURE flag on all our state buffers.

Fixes: 3119b96bdf "anv: Allocate block pool BOs from the cache"
Fixes: ee77938733 "anv: Allocate batch and fence buffers from..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-07 04:03:35 +00:00
Jason Ekstrand
865ffe4e02 anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 22:32:05 +00:00
Jason Ekstrand
f9a3d9738b anv: Use BO fences/semaphores for AcquireNextImage
Instead of doing a dummy submit on the command buffer for the fence or a
dummy semaphore and trusting in implicit sync, this commit moves us to
take advantage of implicit sync and just use the WSI image BO as the
fence.  Both semaphores and fences require a tiny bit of extra plumbing
to do this but the result is that we can get rid of a bunch of the extra
synchronization we're doing today.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
ecc119a96e anv: Add a fence_reset_reset_temporary helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
ccb7d606f1 anv: Use submit-time implicit sync instead of allocate-time
In 83b943cc2f, we started making all VkDeviceMemory BOs resident all
the time.  One unfortunate side-effect of this is that every
vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which
means that X server or Wayland compositor, instead of waiting on the
last vkQueueSubmit to actually write the buffer, now waits on the last
vkQueueSubmit to from that driver instance relative to whenever the
compositor's GL driver instance calls execbuf.  This potentially leads
to a lot of extra synchronization that we didn't intend to have.

Instead, this commit makes it so that we leave WSI memory objects with
EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and
set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of
vkQueuePresent.  This should hopefully result in tighter integration
with the compositor, lower latency, and better performance.

Testing with DOOM 2016, this seems to reduce latency by at least a frame
if not two and makes the game much more responsive.  Testing was,
however, subjective, so we don't have any hard data on that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
6ebf677cfd anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags
Otherwise, we're trusting in the execbuf_add_bo which sets
EXEC_OBJECT_WRITE to to always be the first one that gets called.  This
is likely true for fences but it seems somewhat fragile.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-06 19:58:07 +00:00
Jason Ekstrand
1b6991ba1d anv: Implement VK_KHR_buffer_device_address
The primary difference between the KHR and EXT versions of the extension
is that the KHR provides the address at AllocateMemory time for replay
so we can replay it safely without moving to a sparse address model.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
4428cd9127 anv: Use a pNext loop in AllocateMemory
This function has a lot of possible extensions and some of them we can
easily handle on-the-fly so it's easier to just have a loop than to find
each structure manually.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
a8e59b3708 anv: Add allocator support for client-visible addresses
When a BO is flagged as having a client visible address, we put it in
its own heap.  We also support the client explicitly specifying an
address in said heap.  If an address collision happens, we return false
from anv_vma_alloc which turns into a VK_ERROR_OUT_OF_DEVICE_MEMORY.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
03450e9cfc anv: Add an explicit_address parameter to anv_device_alloc_bo
We already have a mechanism for specifying that we want a fixed address
provided by the driver internals.  We're about to let the client start
specifying addresses in some very special scenarios as well so we want
to pass this through to the allocation function.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
597fdb9e21 anv: Stop advertising two heaps just for the VF cache WA
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
b47bc0202a anv: Set up VMA heaps independently from memory heaps
Our VMA allocations are really independent from the memory heaps we
expose via the API.  The only thing that really matters is the GTT size
so we can make the high heap the right size.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00
Jason Ekstrand
1037b52cf4 anv: Stop tracking VMA allocations
util_vma_heap_alloc will already return 0 if it doesn't have enough
space.  The only thing the vma_*_available tracking was doing was
preventing us from allocating too much on any given heap.  Now that
we're tracking that in the heap itself, we can drop these.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-05 10:59:10 -06:00