The cairo-perf-diff-files tool would ignore perf reports with
just one test for no apparent reason. The traces take so long
that it's useful to be able to compare runs with just one trace.
Since this takes days to run now and should not find any bugs that are
not covered by the test-suite it seems like a pointless exercise.
Especially as I am trying to make a release!
The OIL routines don't work as expected on Mac OS X. The operating
system gives access to the timestamp counter through the function
mach_absolute_time.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Provide a hook for the test to be able to compute the number of ops per
second. For instance, the glyphs test uses it to report the number of
kiloglyphs per second Cairo is able to render.
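
A rough sketch of what such a hook might compute, for illustration only. The function name, its parameters, and the sample numbers are hypothetical and are not taken from cairo-perf itself.

```python
def kiloglyphs_per_second(glyphs_per_loop, loops, elapsed_seconds):
    """Return the rendering rate in thousands of glyphs per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    total_glyphs = glyphs_per_loop * loops
    return total_glyphs / elapsed_seconds / 1000.0


# Hypothetical run: 200 loops of 5000 glyphs each, taking 2 seconds in total.
rate = kiloglyphs_per_second(glyphs_per_loop=5000, loops=200, elapsed_seconds=2.0)
print(f"{rate:.1f} kglyphs/s")
```
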
Real applications that control their Drawable externally to Cairo are
'disadvantaged' by cairo-perf-trace when it creates a similar surface
for each new instance of the same Drawable. The difficulty in
maintaining one perf surface for every application surface is that the
traces do not track lifetimes for the application surfaces, so we would
just accumulate stale surfaces. The surface cache takes a different
approach and returns the same surface for each active Drawable, and
maintains a hold-over of the MRU 16 surfaces. This achieves 60-80% hit
rate with firefox, which is probably as good as can be expected.
Obviously for double-buffered applications we only ever draw to freshly
created surfaces (and Gtk+ bypasses cairo to do the final copy -- the
ideal application would just use a push-group for double buffering, in
which case we would capture and replay the entire expose event).
To enable use of the surface cache whilst replaying use -c:
./cairo-perf-trace -c firefox-talos-gfx
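
The idea behind the surface cache can be sketched as a small most-recently-used map keyed by Drawable. This is a simplified model, not the cairo-perf-trace code: the class name, the create_surface callback, and the hit/miss counters are assumptions made for the example, and only the limit of 16 comes from the description above.

```python
from collections import OrderedDict


class SurfaceCache:
    """Hand back one surface per Drawable, keeping only the MRU few alive."""

    def __init__(self, create_surface, capacity=16):
        self._create = create_surface      # hypothetical surface factory
        self._capacity = capacity
        self._surfaces = OrderedDict()     # drawable id -> surface, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, drawable_id):
        if drawable_id in self._surfaces:
            # Reuse the existing surface and mark it most recently used.
            self._surfaces.move_to_end(drawable_id)
            self.hits += 1
            return self._surfaces[drawable_id]
        self.misses += 1
        surface = self._create(drawable_id)
        self._surfaces[drawable_id] = surface
        if len(self._surfaces) > self._capacity:
            # Drop the least recently used surface so stale ones do not pile up.
            self._surfaces.popitem(last=False)
        return surface


cache = SurfaceCache(lambda d: f"surface-for-{d}", capacity=2)
for drawable in (1, 2, 1, 3):              # the last lookup evicts drawable 2
    cache.get(drawable)
print(cache.hits, cache.misses)
```

Bounding the cache is what stands in for the missing lifetime information: a Drawable that is never used again simply ages out.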
A new -f option to cairo-perf reverts to a fast run
mode for quick performance overviews. The number of
milliseconds each iteration of a test is run for can
be overridden using the new CAIRO_PERF_ITERATION_MS
environment variable. The default remains 2000 ms/iter.
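
A minimal sketch of how such an override could be read, assuming that unset, non-numeric, or non-positive values fall back to the default. That fallback behaviour and the function name are assumptions for the example; only the variable name and the 2000 ms default come from the text above.

```python
import os

DEFAULT_ITERATION_MS = 2000


def iteration_ms(environ=os.environ):
    """Return the per-iteration run time in milliseconds."""
    value = environ.get("CAIRO_PERF_ITERATION_MS")
    if value is None:
        return DEFAULT_ITERATION_MS
    try:
        ms = int(value)
    except ValueError:
        # Assumed behaviour: ignore values that are not integers.
        return DEFAULT_ITERATION_MS
    return ms if ms > 0 else DEFAULT_ITERATION_MS


print(iteration_ms({"CAIRO_PERF_ITERATION_MS": "500"}))
```
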
Use the "mild outliers" method to remove exceptional speed-ups and slow-downs
from the graph, so that the majority of information is not lost by the
scaling. Add the timing labels to the bars so that the true factor is
always presented.
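
For reference, the conventional definition of a mild outlier is a value more than 1.5 times the interquartile range beyond the first or third quartile. The sketch below applies that rule to a list of change factors. It is assumed that the chart uses this standard definition; the exact quartile computation in cairo-perf-chart may differ, and the sample data is made up.

```python
import statistics


def mild_outlier_fences(values):
    """Return (low, high) fences at 1.5 * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def drop_mild_outliers(values):
    """Keep only the values that lie within the fences."""
    low, high = mild_outlier_fences(values)
    return [v for v in values if low <= v <= high]


# Hypothetical change factors; 12.0 would otherwise dominate the scale.
samples = [1.0, 1.1, 0.9, 1.2, 0.8, 1.05, 0.95, 12.0]
kept = drop_mild_outliers(samples)
print(kept)
```
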
Originally written by Vladimir Vukicevic to investigate using Skia for
Mozilla, it provides a nice integration with a rather interesting code
base. By hooking Skia underneath Cairo it allows us to directly compare
code paths... which is interesting.
[updated by Chris Wilson]
In order to handle 'cairo-perf-trace benchmark', we need to perform the
can_run? test on the directory name as opposed to the individual trace
names. Make it so.
I'd disabled this to look at cairo-qt performance, then forgot about it.
Be clean, clean up globals -- this should fix the huge performance loss
when running multiple backends in series that need separate font caches.
These traces run for much longer than the original synthetic benchmarks
and seek to replicate 'real-world' applications, so the warning that the
xserver and cairo-perf are not bound to any cpu is false.
cairo-perf-chart takes multiple runs (currently it is limited to
prefiltered data sets) and pretty-prints a chart showing performance
improvements/regressions (in either ASCII or HTML) along with a
cairo-perf-chart.png.
Use cairo_stroke() to perform the equivalent of
spiral-rect-(pix|non)align-evenodd-fill. A useful comparison of stroking
versus filling, as we can assume the composition costs are similar.
Oops, we were accumulating paths during each spiral iteration and so the
tests were getting slower and slower and slower...
[And fix a couple of other instances of path accumulation.]
Enabling 'FAST CLIP' appears to trigger an infinite loop, so disable it.
Enabling 'FAST FILL' has limited effect on performance, so disable it whilst
the basic QT surface is improved.
By ensuring that tests take longer than a couple of seconds we eliminate
systematic errors in our measurements. However, we also effectively
eliminate the synchronisation overhead. To compensate, we attempt to
estimate the overhead by reporting the difference between a single
instance and the minimum averaged instance.
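
The estimate described above amounts to a single subtraction. The sketch below assumes that a batch total is divided evenly by its loop count to get the averaged instance; the function name, parameters, and tick counts are hypothetical.

```python
def estimate_sync_overhead(single_instance_ticks, batch_totals, loops_per_batch):
    """Estimate the synchronisation overhead of one instance, in ticks.

    single_instance_ticks: ticks for one instance run and synchronised alone.
    batch_totals: total ticks for each batch of loops_per_batch instances.
    """
    averaged = [total / loops_per_batch for total in batch_totals]
    return single_instance_ticks - min(averaged)


# Made-up numbers: one lone instance versus three batches of 100.
overhead = estimate_sync_overhead(1500, [100000, 98000, 101000], 100)
print(overhead)
```
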
As the change and ranking is based on the min_ticks, and as this can
sometimes deviate wildly from median_ticks, include min_ticks in the
output.
In particular it helps to explain cases like:
xlib-rgba rectangles_similar-rgba-mag_source-512 10.13 88.41% -> 5.77 0.19%: 1.50x slowdown
which becomes
xlib-rgba rectangles_similar-rgba-mag_source-512 3.83 (10.13 88.41%) -> 5.75 (5.77 0.19%): 1.50x slowdown
(Considering the poor standard deviation on the initial measurement, this
is more likely a sampling error than a true regression.)
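
The arithmetic behind that example can be checked with a short sketch. It assumes that the leading figure on each side is min_ticks and that the bracketed pair is the median with its standard deviation; the helper name is hypothetical.

```python
def describe_change(old, new):
    """Format the ratio between two timings as a speedup or slowdown."""
    if new > old:
        return f"{new / old:.2f}x slowdown"
    return f"{old / new:.2f}x speedup"


# Based on min_ticks, as used for the change and ranking.
print(describe_change(3.83, 5.75))
# Based on the medians alone, the same pair would look like an improvement.
print(describe_change(10.13, 5.77))
```

The two calls disagree in direction, which is why showing both figures side by side makes the reported 1.50x easier to interpret.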
Move the large slowdowns to the end. This has two pleasing effects:
1. There is symmetry between large speedups at the top, and large
slowdowns at the bottom, with long bars -> short bars -> long bars.
2. After a cairo-perf-diff run the largest slowdowns are immediately
visible on the console. What better way to flag performance
regressions?
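
One way to get that ordering is to sort on the old/new ratio, largest first, so that speedups lead and the worst slowdowns come last. The tuple layout and the test names below are made up for the example.

```python
def order_for_report(changes):
    """Sort (name, old_ticks, new_ticks) with speedups first, slowdowns last."""
    return sorted(changes, key=lambda c: c[1] / c[2], reverse=True)


changes = [
    ("a", 10, 5),    # 2.00x speedup
    ("b", 10, 20),   # 2.00x slowdown
    ("c", 10, 9),    # small speedup
    ("d", 10, 11),   # small slowdown
]
ordered = [name for name, _old, _new in order_for_report(changes)]
print(ordered)
```
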
Add a CAIRO_PERF_OUTPUT environment variable to cause cairo-perf to first
generate an output image in order to manually check that the test is
functioning correctly. This needs to be automated, so that we have
absolute confidence that the performance tests are not broken - but isn't
that the role of the test suite? If we were ever to publish cairo-perf
results, I would want some means of verification that the test-suite had
first been passed.
The warning is repeated in the error message if we fail to find any
traces, and now that we search a path it is likely that some elements do
not exist. Thus we annoy the user with irrelevant, non-fatal warnings.
Still looking for suggestions for the most appropriate home for the system
wide cairo-traces dir...
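
The intended behaviour can be sketched as follows: skip missing path elements silently and leave any complaint to the caller, who reports once if nothing at all was found. The function name and the assumption that traces are files ending in .trace are illustrative, not taken from the cairo-perf-trace source.

```python
import os


def find_traces(search_path):
    """Collect trace files from every existing directory in search_path."""
    traces = []
    for directory in search_path:
        if not os.path.isdir(directory):
            continue  # a missing path element is not worth a warning
        for name in sorted(os.listdir(directory)):
            if name.endswith(".trace"):
                traces.append(os.path.join(directory, name))
    return traces


# Hypothetical search path; an empty result is reported once by the caller.
if not find_traces(["/nonexistent-cairo-traces-example"]):
    print("no traces found")
```
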
The original stroke only contains a single subpath. Self-intersection
removal particularly affects strokes with multiple curved segments, so add
a path that encompasses both straight edges and rounded corners.