Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring. Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources. Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also provides various domain-specific producers.

The Mesa Perfetto support adds additional producers, to allow for visualizing
GPU performance (frequency, utilization, performance counters, etc.) on the
same timeline, to better understand and tune/debug system-level performance:

- pps-producer: A systemwide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events on the CPU timeline, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Intel
     - ``gpu.counters.i915``
     - ``gpu.renderstages.intel``
   * - Panfrost
     - ``gpu.counters.panfrost``
     -
   * - PanVK
     - ``gpu.counters.panfrost``
     - ``gpu.renderstages.panfrost``
   * - V3D
     - ``gpu.counters.v3d``
     -
   * - V3DV
     - ``gpu.counters.v3d``
     -

Run
---

To capture a trace with Perfetto you need to take the following steps:

1. Build Mesa with Perfetto enabled.

   .. code-block:: sh

      # Configure Mesa with Perfetto support
      mesa $ meson setup build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
      # Build Mesa
      mesa $ meson compile -C build

2. Build Perfetto from sources available at ``subprojects/perfetto``.

   .. code-block:: sh

      # Within the Mesa repo, build perfetto
      mesa $ cd subprojects/perfetto
      perfetto $ ./tools/install-build-deps
      perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
      perfetto $ ./tools/ninja -C out/linux

      # Example arm64 cross compile instead
      perfetto $ ./tools/install-build-deps --linux-arm
      perfetto $ ./tools/gn gen --args='is_debug=false target_cpu="arm64"' out/linux-arm64
      perfetto $ ./tools/ninja -C out/linux-arm64

   More build options can be found in `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

3. Select a `trace config <https://perfetto.dev/docs/concepts/config>`__, likely
   ``src/tool/pps/cfg/system.cfg`` (which does whole-system tracing, including GPU
   profiling for any supported GPUs). Other configs are available in that
   directory for CPU-only or GPU-only tracing, and more examples of config files
   can be found in ``subprojects/perfetto/test/configs``.

4. Start the PPS producer to capture GPU performance counters.

   .. code-block:: sh

      mesa $ sudo meson devenv -C build pps-producer

5. Start the application (and any other GPU-using system components) you want
   to trace using the Perfetto-enabled Mesa build.

   .. code-block:: sh

      mesa $ meson devenv -C build vkcube

6. Capture a Perfetto trace using ``tracebox``.

   .. code-block:: sh

      mesa $ sudo ./subprojects/perfetto/out/linux/tracebox --system-sockets --txt -c src/tool/pps/cfg/system.cfg -o vkcube.trace

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``vkcube.trace`` by clicking on **Open trace file**.

8. Alternatively you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which, despite the name, can be used to view non-Android traces).

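
If none of the shipped configs selected in step 3 fits, a GPU-only config can
also be generated from the data-source names in the table above. The following
is a minimal sketch, not part of Mesa: ``gen_gpu_cfg`` is a hypothetical helper,
and the buffer size and duration are illustrative values. The configs in
``src/tool/pps/cfg`` remain the reference and may set additional options.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical helper (not shipped with Mesa): print a Perfetto trace
   # config in text protobuf form for the data sources given as arguments.
   gen_gpu_cfg() {
       # One ring buffer; 16 MiB is an assumed, illustrative size.
       printf 'buffers {\n  size_kb: 16384\n  fill_policy: RING_BUFFER\n}\n'
       # One data_sources block per requested data-source name.
       for ds in "$@"; do
           printf 'data_sources {\n  config {\n    name: "%s"\n  }\n}\n' "$ds"
       done
       # Stop tracing after 10 seconds (assumed value).
       printf 'duration_ms: 10000\n'
   }

   # Example: Intel counters plus render stages, printed to stdout.
   gen_gpu_cfg gpu.counters.i915 gpu.renderstages.intel

Redirecting the output to a file gives a config that can be passed to
``tracebox`` with ``--txt -c`` in place of ``system.cfg`` in step 6.
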
CPU Tracing
~~~~~~~~~~~

Mesa's CPU tracepoints (``MESA_TRACE_*``) use Perfetto track events when
Perfetto is enabled. They use the ``mesa.default`` and ``mesa.slow`` categories.

Currently, only EGL and the following drivers have CPU tracepoints:

- Freedreno
- Panfrost
- Turnip
- V3D
- VC4
- V3DV

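
To record these categories, the trace config needs a ``track_event`` data
source that enables them. The sketch below prints such a fragment using
Perfetto's ``track_event_config``; the helper name ``mesa_track_event_cfg`` is
hypothetical, and whether a given shipped config already contains an equivalent
block should be checked before adding it.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical helper: print a trace config fragment that enables the
   # two Mesa track-event categories named in this section.
   mesa_track_event_cfg() {
       printf '%s\n' \
           'data_sources {' \
           '  config {' \
           '    name: "track_event"' \
           '    track_event_config {' \
           '      enabled_categories: "mesa.default"' \
           '      enabled_categories: "mesa.slow"' \
           '    }' \
           '  }' \
           '}'
   }

   mesa_track_event_cfg
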
Vulkan data sources
~~~~~~~~~~~~~~~~~~~

The Vulkan API gives the application control over recording of command
buffers as well as when they are submitted to the hardware. As a
consequence, we need to ensure command buffers are properly
instrumented for the Perfetto driver data sources prior to Perfetto
actually collecting traces.

This can be achieved by setting the :envvar:`MESA_GPU_TRACES`
environment variable before starting a Vulkan application:

.. code-block:: sh

   MESA_GPU_TRACES=perfetto ./build/my_vulkan_app

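
When several applications need to be launched this way, the variable can be
set by a small wrapper so that it never leaks into the rest of the shell
session. This is only a sketch: ``run_with_gpu_traces`` is a hypothetical name,
and the ``sh -c`` command at the end stands in for a real Vulkan application.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical wrapper: run the given command with MESA_GPU_TRACES set
   # to "perfetto" for that command only.
   run_with_gpu_traces() {
       env MESA_GPU_TRACES=perfetto "$@"
   }

   # Stand-in for a Vulkan application: show what the child process sees.
   run_with_gpu_traces sh -c 'echo "MESA_GPU_TRACES=$MESA_GPU_TRACES"'
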
Driver Specifics
~~~~~~~~~~~~~~~~

Below is driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Another option, which allows access to system-wide data without root
permissions, is to run the following:

.. code-block:: sh

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should work too.

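
Before changing the sysctl, it can be useful to check its current value. The
sketch below reads the procfs file that backs
``dev.i915.perf_stream_paranoid``; the function name is hypothetical, and the
optional path argument exists only so the function can be tried on a plain
file.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical helper: report whether i915 perf streams are currently
   # restricted to privileged users.
   i915_perf_paranoid_status() {
       path="${1:-/proc/sys/dev/i915/perf_stream_paranoid}"
       # No i915 driver loaded, or the file is not readable.
       if [ ! -r "$path" ]; then
           echo "unknown"
           return 0
       fi
       case "$(cat "$path")" in
           0) echo "unrestricted" ;;
           *) echo "restricted" ;;
       esac
   }

   i915_perf_paranoid_status
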
A particular metric set can also be selected to capture a different
set of HW counters:

.. code-block:: sh

   INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers.
To enable this for a given application, set the environment variable as
follows:

.. code-block:: sh

   PERFETTO_TRACE=1 my_vulkan_app

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel version `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two simple steps:

1. Enable Panfrost unstable ioctls via the kernel module parameter:

   .. code-block:: sh

      modprobe panfrost unstable_ioctls=1

   Alternatively you could add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: sh

      ./build/pps-producer

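
The kernel version requirement can be checked with a short script. This is a
sketch under stated assumptions: ``panfrost_kernel_ok`` is a hypothetical
helper, and it assumes that every release newer than the 5.5 series also
carries the fix, which the text above does not state explicitly.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical helper: succeed if the given kernel release is 5.4.23+
   # (within 5.4.x), 5.5.7+ (within 5.5.x), or newer than 5.5 (assumed OK).
   panfrost_kernel_ok() {
       # Drop any suffix such as "-rc1" or "-generic": "5.4.23-x" -> "5.4.23"
       ver=${1%%[!0-9.]*}
       # Split into numeric fields; missing fields become 0.
       set -- $(echo "$ver" | awk -F. '{ print $1+0, $2+0, $3+0 }')
       major=$1 minor=$2 patch=$3
       if [ "$major" -gt 5 ]; then return 0; fi
       if [ "$major" -lt 5 ]; then return 1; fi
       if [ "$minor" -gt 5 ]; then return 0; fi
       if [ "$minor" -eq 5 ] && [ "$patch" -ge 7 ]; then return 0; fi
       if [ "$minor" -eq 4 ] && [ "$patch" -ge 23 ]; then return 0; fi
       return 1
   }

   if panfrost_kernel_ok "$(uname -r)"; then
       echo "kernel version looks suitable"
   else
       echo "kernel version may be too old"
   fi
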

V3D / V3DV
^^^^^^^^^^

As only one performance monitor can be active at a given time, at most
32 performance counters can be monitored. The performance counters of interest
for pps-producer therefore need to be selected using the environment variable
``V3D_DS_COUNTER``.

.. code-block:: sh

   V3D_DS_COUNTER=cycle-count,CLE-bin-thread-active-cycles,CLE-render-thread-active-cycles,QPU-total-uniform-cache-hit ./src/tool/pps/pps-producer

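
Because of the 32-counter limit, it can help to validate a counter list before
launching the producer. The following sketch only counts the comma-separated
entries; ``check_v3d_counters`` is a hypothetical helper and does not verify
that the counter names themselves exist.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical helper: count the non-empty entries of a comma-separated
   # counter list and reject lists with more than 32 entries.
   check_v3d_counters() {
       n=$(printf '%s\n' "$1" |
           awk -F, '{ n = 0; for (i = 1; i <= NF; i++) if ($i != "") n++; print n }')
       if [ "$n" -gt 32 ]; then
           echo "too many counters: $n (limit is 32)" >&2
           return 1
       fi
       echo "$n counters selected"
   }

   check_v3d_counters "cycle-count,CLE-bin-thread-active-cycles,CLE-render-thread-active-cycles,QPU-total-uniform-cache-hit"
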
Troubleshooting
---------------
Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, you may have lost data because
the trace buffer filled up and wrapped.

To prevent this loss of data, you can tweak the trace config file in
one of the following ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: RING_BUFFER,
     }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

     write_into_file: true
     file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: DISCARD,
     }