Commit graph

73 commits

Author SHA1 Message Date
Lionel Landwerlin
3bb8a4bfec intel/perf: expose timestamp begin for mdapi
This was forgotten in the initial implementation.

v2: ensure the value is written for both GL & Vulkan queries

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
2020-01-16 15:29:28 +02:00
Lionel Landwerlin
afdc0121b5 i965/iris/perf: factor out frequency register capture
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
2019-12-18 14:23:17 +02:00
Lionel Landwerlin
bd888bc1d6 i965/iris: perf-queries: don't invalidate/flush 3d pipeline
Our current implementation of performance queries is fairly harsh
because it completely flushes and invalidates the 3d pipeline caches
at the beginning and end of each query. An argument can be made that
this is how performance should be measured but it probably doesn't
reflect what the application is actually doing and the actual cost of
draw calls.

A more appropriate approach is to just stall the pipeline at
scoreboard, so that we measure the effect of a draw call without
having the pipeline in a completely pristine state for every draw
call.

v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-13 11:27:22 +02:00
Lionel Landwerlin
a575b3cd5c intel/perf: drop batchbuffer flushing at query begin
This was initially intended to fix issues with the query timings going
occassionally high.

It turns out there was a bug in the attribution of OA reports to our
context when parsing the OA data. This led to reports flagged with
other context IDs to be included in our queries results.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-13 11:27:17 +02:00
Lionel Landwerlin
ddacd3d43b intel/perf: fix improper pointer access
This expression was unused by the macro, probably why it didn't
register in the compilation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
8c0b058263 intel/perf: simplify the processing of OA reports
This is a more accurate description of what happens in processing the
OA reports.

Previously we only had a somewhat difficult to parse state machine
tracking the context ID.

What we really only need to do to decide if the delta between 2
reports (r0 & r1) should be accumulated in the query result is :

   * whether the r0 is tagged with the context ID relevant to us

   * if r0 is not tagged with our context ID and r1 is: does r0 have a
     invalid context id? If not then we're in a case where i915 has
     resubmitted the same context for execution through the execlist
     submission port

v2: Update comment (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
b364e920bf intel/perf: take into account that reports read can be fairly old
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minute). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
9d0a5c817c intel/perf: set read buffer len to 0 to identify empty buffer
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.

We were using an 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
acea59dbf8 intel/perf: fix invalid hw_id in query results
Accumulation happens between 2 reports, it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and that we have a valid value
to put in there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-04 09:21:15 +00:00
Lionel Landwerlin
c061185e17 intel/perf: add EHL performance query support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-11-15 13:14:30 +00:00
Lionel Landwerlin
15b7b56eb2 intel/perf: add TGL support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-31 09:13:20 +00:00
Lionel Landwerlin
1a2246a5e0 intel/perf: update ICL configurations
A few equations/programming changes for ICL.

v2: Fix a couple of issues in naming and floating/integer operations (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-29 13:00:26 +02:00
Lionel Landwerlin
5ba6d9941b intel/perf: add mdapi writes for register perf counters
Those are not part of the OA reports.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:15 +00:00
Lionel Landwerlin
11c4bf9417 intel/perf: add support for querying kernel loaded configurations
We use this as a communication mechanism between MDAPI & Anv.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
db7a6847dd intel/perf: move registers to their own header
Will conflict with the genxml RPSTAT register.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
e1d5d75257 intel/perf: extract register configuration
We want to query the content of register configurations from the
kernel. Let's pull this out of the query.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
a338b7d739 intel/perf: expose some utility functions
The Vulkan performance query extension is a bit lower level than the
GL one. Expose some of the functions to do the result accumulation
directly in the Anv driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Lionel Landwerlin
a0e0e75db1 intel/perf: add mdapi maker helper
A simple utility to put the marker at the right location.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:14 +00:00
Greg V
ac1561088d intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 134e750e16 ("i965: extract performance query metrics")
2019-08-08 21:44:33 +01:00
Mark Janes
61c54a8878 intel/perf: fix debug typo
Misspelling was seen with INTEL_DEBUG=perfmon.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
2df1ab4d48 intel/perf: make gen_perf_query_object private
Encapsulate the details of this structure within the perf implemenation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
deea3798b6 intel/perf: make perf context private
Encapsulate the details of this data structure.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
1f4f421ce0 intel/perf: print debug information
INTEL_DEBUG=perfmon will iterate over the perf queries, printing
information about the state of each query.  Some of this information
will be private to intel/perf, and needs to a dump routine that can be
called from i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
a663c8c26e intel/perf: make internal methods private
Now that all references from i965 have been moved to perf, we can make
internal methods private again.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
be8b466cff intel/perf: make oa_sample_buffers private
All references to this data structure have been moved inside the perf
subsystem.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
f2a049b4e3 intel/perf: expose method to create query
By encapsulating this implementation within perf, we can eventually
make struct gen_perf_ctx private.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
9f5c160d82 intel/perf: move initialization of pipeline statistics metrics to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
9f84efb452 intel/perf: move get_query_data into gen_perf
This refactor moves several helper functions for get_query_data as
well:

 - accumulate_oa_reports
 - read_gt_frequency
 - get_pipeline_stats_data
 - get_oa_counter_data

Functions which are no longer referenced in brw_performance_query.c
have been removed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
73eccdc4a5 intel/perf: move delete_query to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
8c9eac1234 intel/perf: move is_query_ready to gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
a9be292722 intel/perf: move wait_query to perf
The following methods have duplicate implementation of read_oa_samples_until in
brw_performance_query.c:

 - read_oa_samples_for_query
 - read_oa_samples_until

They ar still referenced by other methods in the file and will be
removed on the subsequent commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
3c8ed58486 intel/perf: create a vtable entry for bo_busy
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
6fed756388 intel/perf: create a vtable entry for bo_wait_rendering
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
511bb15d4b intel/perf: create a vtable entry for batch_references
Iris and i965 variants of this method need to be called by perf
routines.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
3ecb23092e intel/perf: refactor gen_perf_end_query into gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:56 -07:00
Mark Janes
018f9b81e5 intel/perf: refactor gen_perf_begin_query into gen_perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
52d3db9ab6 intel/perf: move perf-related state into gen_perf_context
To move more operations into intel/perf, several state items are
needed.  Save references to that state in the perf_ctxt, rather than
passing them in for every operation.

This commit includes an initializer for gen_perf_context, to set those
references and also encapsulate the initialization of the sample
buffer state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
df18acee78 intel/perf: create a vtable entries for buffer object map/unmap
These operations are needed to refactor subsequent methods into perf

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
a330d759c5 intel/perf: move client reference counts into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
4d0d4aa1b5 intel/perf: move open_perf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
79ded7cc8f intel/perf: move close_perf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
f57c8a6dc1 intel/perf: create a vtable entry for emit_mi_flush
This method is needed to move subsequent methods into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
a157f5acb1 intel/perf: move snapshot_statistics_registers into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
8ae6667992 intel/perf: move query_object into perf
Query objects can now be encapsulated within the perf subsystem.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
7e890ed476 intel/perf: create a vtable entry for store_register_mem64
This method is needed to move subsequent methods into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
4b2c885207 intel/perf: move free_sample_bufs into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
2f712d21b9 intel/perf: move reap_old_sample_buffers into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
31758bd36c intel/perf: move get_free_sample_buf into perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
e08a69b7f4 intel/perf: move the perf context into perf
The "context" that is necessary to submit and process perf commands to
the hardware was previously present in the brw_context.perfquery
struct.  This commit moves it into perf and provides a more
understandable name.

The intention is for this struct to be private, when all methods that
access it are migrated into perf.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00
Mark Janes
fb622054f7 intel/perf: move get_metric_id to perf
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-07 21:33:55 -07:00