on GFX8 onwards, we have only single bit to determine correct return
format.
v2:
- Define macro and use it instead of hardcoded value. (Lionel)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11766>
At high frequency sampling, this generates a lot of messages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
Otherwise we need to include intel headers in generic code.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
On newer HW it will require more work than just reading a dword. It
could also vary depending on the report format.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
For this each driver must :
- report its clock_id (if no particular clock just default to cpu
boottime one)
- be able to sample its clock (gpu_timestamp())
The PPSDataSource will then emit timestamp correlation events in the
trace ensuring perfetto is able to display GPU & CPU events
appropriately on its timeline.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13831>
Now that we removed the intel intrinsic and just use the generic one,
we can skip it in the intel call lowering pass and just deal with it
in the intel rt intrinsic lowering.
v2: rewrite with nir_shader_instructions_pass() (Jason)
v3: handle everything in switch (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 423c47de99 ("nir: drop the btd_resume_intel intrinsic")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12113>
Instead of having a bunch of const vk_sync_type for each permutation of
vk_drm_syncobj capabilities, have a vk_drm_syncobj_get_type helper which
auto-detects features. If a driver can't support a feature for some
reason (i915 got timeline support very late, for instance), they can
always mask off feature bits they don't want.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
BO waits aren't going away any time soon so using a syncobj here doesn't
really gain us anything. It just makes it more complicated.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are entirely unused except for the syncobj in submit_simple_batch.
We can use the libdrm wrappers for that as they're basically equivalent
and the core Vulkan sync code already depends on them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This is, unfortunately, a large flag-day mega-commit. However, any
other approach would likely be fragile and involve a lot more churn as
we try to plumb the new vk_fence and vk_semaphore primitives into ANV's
submit code before we delete it all. Instead, we do it all in one go
and accept the consequences.
While this should be mostly functionally equivalent to the previous
code, there is one potential perf-affecting change. The command buffer
chaining optimization no longer works across VkSubmitInfo structs.
Within a single VkSubmitInfo, we will attempt to chain all the command
buffers together but we no longer try to chain across a VkSubmitInfo
boundary. Hopefully, this isn't a significant perf problem. If it ever
is, we'll have to teach the core runtime code how to combine two or more
VkSubmitInfos into a single vk_queue_submit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are about to be the only use of anv_gettime_ns() so lets switch
them over so the next patch can delete the helper outright.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This gets rid of the bespoke vfunc interface in the WSI code.
Eventually, I'd love to get rid of wsi_display_fence entirely but
this at least makes it a lot more palatable.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This effectively partially reverts 13fe43714c ("anv: Add helpers in
anv_allocator for mapping BOs") where we both added helpers and reworked
memory mapping to stash the maps on the BO. The problem comes with
external memory. Due to GEM rules, if a memory object is exported and
then imported or imported twice, we have to deduplicate the anv_bo
struct but, according to Vulkan rules, they are separate VkDeviceMemory
objects. This means we either need to always map whole objects and
reference-count the map or we need to handle maps separately for
separate VkDeviceMemory objects. For now, take the later path.
Fixes: 13fe43714c ("anv: Add helpers in anv_allocator for mapping BOs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5612
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13795>
We prefix names with an underscore to make them "safe" C identifiers
when necessary. For example, a value of "32x32" would become "_32x32".
However, when specifying something like
<field ... prefix="BLOCK_SIZE">
<value name="32x32" value="0"/>
</field>
we already have a prefix that makes the field name safe. We'd rather
generate a name with a single underscore, i.e.
#define BLOCK_SIZE_32x32 0
rather than
#define BLOCK_SIZE__32x32 0
This also fixes up affected defines in crocus.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
When a <field> tag has multiple <value> children, listing symbolic names
for possible field values, we generate #defines for each value, with an
optional prefix. I don't know why, but this code was checking whether
self.default is None. We want to generate the same list of #defines,
with a prefix, regardless of whether the field has a default value
specified or not.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13809>
When you enable video genxml, lots of warnings about redefined things
appear, just clean those up before things get started.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13788>
We need to guarantee that when vkQueueSubmit() returns the application
can actually wait on a signaled semaphore/syncobj.
When using a thread to do the submission to i915, this gets a bit
tricky in the following case :
A syncobj is used both as a wait & signal semaphore and has been
signaled once already. It contains a fence before entering
vkQueueSubmit().
This means we need to reset the syncobj to ensure when we return
from vkQueueSubmit(), the syncobj contains no stale fence.
Currently in the Anv, the submission thread is in charge of putting
the new fence in the syncobj and also picks up the wait fence
directly from the syncobj. This means we can't reset the syncobj
from vkQueueSubmit().
The solution to this has been pointed by Bas & Jason :
In vkQueueSubmit(), clone the wait syncobj fence into a new
temporary syncobj that will be destroy after submission and use
this temporary syncobj as a wait fence for i915. This allows us to
reset the original syncobj in vkQueueSubmit().
For this to work with wait_before_signal behavior, we also need to
do a wait-on-materialize on binary semaphores from vkQueueSubmit().
Otherwise the application thread calling vkQueueSubmit() could race
the submission thread and pick up the wrong fence when cloing.
v2: Use copy semantic for clone_syncobj_dma_fence() (Jason)
Do the cloning prior to adding the syncobj to anv_queue_submit so
that if the cloning fails don't have an invalid syncobj in
anv_queue_submit (Jason)
v3: Fix another syncobj leak (Jason)
v4: Fix invalid argument order (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4945
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11474>
From VUID-VkAttachmentReference2-attachment-04755:
"If attachment is not VK_ATTACHMENT_UNUSED, and the format of the
referenced attachment is a depth/stencil format which includes both
depth and stencil aspects, and layout is
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL or
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL, the pNext chain must include
a VkAttachmentReferenceStencilLayout structure."
We did not check that there even could be a stencil layout
before fetching it.
Fixes a few tests from:
dEQP-VK.image.depth_stencil_descriptor.depth_read_only_optimal.*
Fixes: 979ea394e5 "vulkan/util: Move
helper functions for depth/stencil images to vk_iamge"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13740>
We reference the scratch BO using a bindless index in the command
streamer instructions, but we forgot to add them to the BO list.
v2: Make use of pipeline reloc list (Jason)
v3: Don't add NULL BOs to the reloc list (Lionel)
v4: Don't add BOs twice to reloc list when dealing with addresses
(Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eeeea5cb87 ("anv: Add support for scratch on XeHP")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13544>
If we ever want to stop depending on the EXEC_OBJECT_PINNED to detect
when something is pinned (like for VM_BIND), having a helper will reduce
the code churn. This also gives us the opportunity to make it compile
away to true/false when we can figure it out just based on compile-time
GFX_VERx10.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
Starting with 3b363d5b55 ("anv: Assume syncobj support"), we assume
syncobj support and no longer use the execbuf sync_file API directly so
there's no point in checking for it. For the one physical device check
this deletes, we can assume has_exec_fence is always true because every
kernel with syncobj support also has sync_file.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
Soft-pin is but one possible mechanism for pinning buffers. We're
working on another called VM_BIND. Most of the time, the real question
we're asking isn't "are we using soft-pin?" but rather "are we using
relocations?" because it's relocations, and not soft-pin, that cause us
all the extra pain we have to write code to handle. This commit flips
the majority of those checks around. The new helper is currently just
the exact inverse of the old use_softpin helper.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>