This patch renames functions, structures, enums etc. with "gen_"
prefix defined in common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
While we're here, add __gen_get_batch_address declarations to more files
because we're about to start requiring it on all GFX 12.5+.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>
mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining
us anything except extra characters. It's possible that MI_ may
conflict a tiny bit with GenXML but it doesn't seem to be a problem
today and we can deal with that in the future if it's ever an issue.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>
Apart from the single additional marker field, these queries will now
use the same layout as all other drivers.
This should allow us to modify a single component to add an additional
register for new metrics.
v2: Capture the query beging registers in reverse order to ensure
timestamp is as close as possible from measured draw call.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
This unifies performance data gathering between the GL & Vulkan
drivers.
v2: Also move all NOOPs to before the query, leaving none inside
v3: Capture the query beging registers in reverse order to ensure
timestamp is as close as possible from measured draw call.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
For all generations supported we had a layout describing what register
to store to implement a MI_RPC replacement.
This is because, on Gen12 we need to snapshot OAG registers to get
correct values for the perf equations. There, the MI_RPC instruction
captures OAR register which do not have all the information we need.
v2: Fix commented code for debug (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
Those are not part of the OA reports and need some additional
scaffolding. Those counters are only available when doing queries as
we need to emit MI_SRMs to record them.
Equations making use of those counters are not there yet, they will
come in a follow up commit updating a bunch of oa-*.xml files.
v2: Fix typo
v3: Use PERF_CNT_VALUE_MASK (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
We want to unify things a bit between GL & Vulkan. So store those
values in the results rather than just in the GL query code.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8525>
This has the same kernel requirements are VK_INTEL_performance_query
v2: Fix empty queue submit (Lionel)
v3: Fix autotool build issue (Piotr Byszewski)
v4: Fix Reset & Begin/End in same command buffer, using soft-pin &
relocation on the same buffer won't work currently. This version
uses a somewhat dirty trick in anv_execbuf_add_bo (Piotr Byszewski)
v5: Fix enumeration with null pointers for either pCounters or
pCounterDescriptions (Piotr)
Fix return condition on enumeration (Lionel)
Set counter uuid using sha1 hashes (Lionel)
v6: Fix counters scope, should be COMMAND_KHR not COMMAND_BUFFER_KHR (Lionel)
v7: Rebase (Lionel)
v8: Rework checking for loaded queries (Lionel)
v9: Use new i915-perf interface
v10: Use anv_multialloc (Jason)
v11: Implement perf query passes using self modifying batches (Lionel)
Limit support to softpin/gen8
v12: Remove spurious changes (Jason)
v13: Drop relocs (Jason)
v14: Avoid overwritting .sType in
VkPerformanceCounterKHR/VkPerformanceCounterDescriptionKHR (Lionel)
v15: Don't copy the entire
VkPerformanceCounterKHR/VkPerformanceCounterDescriptionKHR (Jason)
Reuse anv_batch rather than custom packing (Jason)
v16: Fix missing MI_BB_END in reconfiguration batch
Only report the extension with kernel support (perf_version >= 3)
v17: Some cleanup of unused stuff
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
We're about to use the offset fields from the query object. We can't
just use a made up object.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
By using the same mi_builder throughout the draw call, we can just
allocate a register from the mi_builder and unref it when we're done.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4690>
We should keep this very minimal; I don't know that we need to go all
struct gl_context on it. However, this gives us at least a tiny base on
which we can start building some common functionality.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4690>
We were not capturing the register already so don't bother writing the
delta in the results (we were previously doing a delta between two 0
values).
v2: Fix unused function warning
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4586>
We've been missing this workaround for a while and since it's required
for Gen12, let's implement it for Gen9 first.
v2: Update comment for Gen9.
v3: Fix clearing of bits... (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
If VK_QUERY_RESULT_WAIT_BIT is not set, there is currently no
special handling of unavailable queries in vkCmdCopyQueryPoolResults,
and anv will write an invalid value for the query result.
This commit updates vkCmdCopyQueryPoolResults for unavailable
queries to return 0 if the VK_QUERY_RESULT_PARTIAL_BIT flag is set
and if not, skip writing altogether.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
Currently, fetching the partial results (VK_QUERY_RESULT_PARTIAL_BIT)
of an unavailable occlusion query via vkGetQueryPoolResults can
return invalid values. anv returns slot.end - slot.begin, but in the
case of unavailable queries, slot.end is still at the initial value
of 0. If slot.begin is non-zero, the occlusion count underflows to
a value that is likely outside the acceptable range of the partial
result.
This commit fixes vkGetQueryPoolResults by always returning 0 if the
query is unavailable and the VK_QUERY_RESULT_PARTIAL_BIT is set.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
Having to always pull the physical device from the instance has been
annoying for almost as long as the driver has existed. It also won't
work in a world where we ever have more than one physical device. This
commit adds a new field called "physical" to anv_device and switches
every location where we use device->instance->physicalDevice to use the
new field instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
The availability is not written at the location changed in
ee6fbb95a74d...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6fbb95a7 ("anv: Properly handle host query reset of performance queries")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We already have a mechanism for specifying that we want a fixed address
provided by the driver internals. We're about to let the client start
specifying addresses in some very special scenarios as well so we want
to pass this through to the allocation function.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Timeline semaphore introduce support for wait before signal behavior,
which means that it is now allowed to call vkQueueSubmit() with wait
semaphores not yet submitted for execution. Our kernel driver requires
all of the wait primitives to be created before calling the execbuf
ioctl. As a result, we must delay submissions in the userspace driver.
This change store the necessary information to be able to delay a
VkSubmitInfo submission to the kernel driver.
v2: Fold count++ into array access (Jason)
Move queue list to another patch (Jason)
v3: Document cleanup of temporary semaphores (Jason)
v4: Track semaphores of SYNC_FD type that needs updating after delayed
submission
v5: Don't forget to update sync_fd in signaled semaphores after
submission (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The host query reset entry point didn't use the availability offset
for performance queries.
To fix this, reorder the availability of performance queries to match
other queries.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Introduce the appropriate pipe controls
Properly deal with changes in metric sets (using execbuf parameter)
Record marker at query end
v3: Fill out PerfCntr1&2
v4: Introduce vkUninitializePerformanceApiINTEL
v5: Use new execbuf extension mechanism
v6: Fix comments in genX_query.c (Rafael)
Use PIPE_CONTROL workarounds (Rafael)
Refactor on the last kernel series update (Lionel)
v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
We use a mix of MI & PIPE_CONTROL commands to write our queries' data
(results & availability). Those commands' memory write order is not
guaranteed with regard to their order in the command stream, unless CS
stalls are inserted between them. This is problematic for 2 reasons :
1. We copy results from the device using MI commands even though
the values are generated from PIPE_CONTROL, meaning we could
copy unlanded values into the results and then copy the
availability that is inconsistent with the values.
2. We allow the user to poll on the availability values of the
query pool from the CPU. If the availability lands in memory
before the values then we could return invalid values.
This change does 2 things to address this problem :
- We use either PIPE_CONTROL or MI commands to write both
queries values and availability, so that the ordering of the
memory writes guarantees that if availability is visible,
results are also visible.
- For the occlusion & timestamp queries we apply a CS stall
before copying the results on the device, to ensure copying
with MI commands see the correct values of previous
PIPE_CONTROL writes of availability (required by the Vulkan
spec).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Doesn't fix anything but it's not the right function prototype.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 673f33c77d ("anv: Implement CmdBegin/EndQueryIndexed")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
In commit 9a7b319903 ("anv/query: flush render target before
copying results") we tracked all the render target writes to apply a
flushes in the vkCopyQueryResults(). But we can narrow this down to
only when we write a buffer (which is the only input of
vkCopyQueryResults).
v2: Drop newer render target write flags introduce by 1952fd8d2c
("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.
v2: Simplify logic in CopyQueryResults (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
Pipeline state pending bits should be taken into account when copying
results.
In the particular bug below, the results of the
vkCmdCopyQueryPoolResults() command was being overwritten by the
preceding vkCmdCopyBuffer() with a same destination buffer. This is
because we copy the buffers using the 3D pipeline whereas we copy the
query results using the command streamer. Those pieces of HW work in
parallel and the results are somewhat undefined.
v2: Unconditionally flush the pipeline before copying the results
(Jason)
v3: Wrap & expressions (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894
Cc: mesa-stable@lists.freedesktop.org
This lets us get rid of a bunch of duplicated error messages.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Instead of passing around BOs and offsets, use addresses which are anv's
GPU equivalent of pointers.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Each query slot is a uint64_t and we were only zeroing half of it.
Fixes: 7ec6e4e689 "anv/query: implement multiview interactions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instead of computing an index at the end which we hope maps to the
number of things written, just count the number of things as we go.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem
in nir for platforms that don't have popcount or popcountll, such as
32bit msvc.
v2: - Fix additional uses of _mesa_bitcount added after this was
originally written
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2 (Jason Ekstrand):
- Break up Scott's mega-patch
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>