Behdad wants to include the feature with 1.10, so we enable it as early as
possible in 1.9 dev cycle to generate as much feedback as possible.
The first change is to use "<cairo>" as being a name unlikely to clash
with any real font names.
This reverts commits:
a824d284be,
2922336855,
e0046aaf41,
f534bd549e.
This performance test relied on the recently-removed ability
to select the internal twin-based font family with a name of
"cairo".
Presumably, we'll want to bring this performance case back when
some other means of requesting that font face is added.
Generate a cairo-perf-diff graph for a series of commits in order to be
able to identify significant commits. Still very crude, but minimally
functional.
Add a new test case to Cairo for checking the performance of Cairo's
equivalent to GDK's gdk_pixbuf_composite_color() operation. That is an
operation that happens to be extremely useful when viewing or editing
transparent images so I think it is important that it is as fast as
possible.
Add the performance test case to compare the speed of filling a rounded
rectangle (one with camphered corners) as opposed to an ordinary
rectangle. Since the majority of the pixels are identical, ideally the two
cases would take similar times (modulo the additional overhead in the more
complex path).
In order to run under memfault, the framework is first extended to handle
running concurrent tests - i.e. multi-threading. (Not that this is a
requirement for memfault, instead it shares a common goal of storing
per-test data). To that end all the global data is moved into a per-test
context and the targets are adjusted to avoid overlap on shared, global
resources (such as output files and frame buffers). In order to preserve
the simplicity of the standard draw routines, the context is not passed
explicitly as a parameter to the routines, but is instead attached to the
cairo_t via the user_data.
For the masochist, to enable the tests to be run across multiple threads
simply set the environment variable CAIRO_TEST_NUM_THREADS to the desired
number.
In the long run, we can hope the need for memfault (runtime testing of
error paths) will be mitigated by static analysis. A promising candidate
for this task would appear to be http://hal.cs.berkeley.edu/cil/.
Convert the boilerplate specific flattened content value to the ordinary
CAIRO_CONTENT_COLOR_ALPHA for use with cairo_push_group_with_content() -
otherwise cairo rightfully flags an error and the test harness decides
that the similar surface is not available.
Immediately repeat the performance test against a similar surface to
ensure that they introduce no regressions. Primarily introduced to
sanity check the change to use XShmPixmaps instead of XPixmaps in the
xlib backend, but it should be generally useful.
Similar to cairo-test, we free any global memory used by cairo for its
caches through the debug interfaces. We do this so that valgrind does
not unnecessary warn about memory leaks for the cached data and any true
leak is then not lost in the noise.
cairo-perf and the X server should be bound to CPUs (either the same
or separate) on SMP systems. Not doing so causes random results when
the X server is moved to or from cairo-perf's CPU during the
benchmarks.
This test shows that drawing a 100x100 single-pixel wide box outline is
currently 5 to 16 times slower when using the natural cairo_stroke() as
compared to a rather awkward cairo_fill() of two rectangles.
[ # ] backend-content test-size min(ticks) min(ms) median(ms) stddev. iterations
[ 0] image-rgba box-outline-stroke-100 301321 0.218 0.219 0.39% 5
[ 1] image-rgba box-outline-fill-100 18178 0.013 0.013 0.43% 5
[ 0] xlib-rgba box-outline-stroke-100 379177 0.275 0.276 1.39% 6
[ 1] xlib-rgba box-outline-fill-100 83355 0.060 0.060 0.17% 5
The map for this test case was originally demonstrated as a
performance problem in this mozilla bug report:
A very slow SVG file with <path>s
https://bugzilla.mozilla.org/show_bug.cgi?id=332413
I obtained permission from the creator of the original file to
include the data here, (see comments in world-map.h for details).
We don't need this at this deep level since callers can now
implement this limiting manually since stats.iterations is
now returned. Also, this was interfering with the -i option
to cairo-perf anyway.
This new test case is the 0th polygon polygon from Zack Rusin's
recent cairorender program as made avaialable here:
http://ktown.kde.org/~zrusin/examples/cairorender.tar.bz2
This polygon contains about 1000 coordinates and looks like a
hand-drawn version of the word another.
The perf tree's sha1 is now in the cache file name, so that
if the performance suite itself ever changes then new data
will be generated rather than using stale stuff from the cache.
Also, we now use the src tree's sha1 rather than the commit's
so that commits that don't change the src directory are also
treated as identical, (which they really should be as far as
performance of the library itself is concerned).
Instead of just discarding the worst 15% of the results, we now
do IQR-based detection and elimination of outliers. Also, instead
of reporting mean times we now report minimum and median times.
We still do compute the mean and standard deviation for the
detection of when results seem stable for early bailout. And we
do still report the standard deviation.
A statistician might complain that it looks funny to report the
median and the standard deviation together, (instead of median
and average absolute deviation from the median, say), but I liked
the standard deviation since it is always larger, so it might
ensure better separatation if we use it to determine when two
sets of results are sufficiently different to be interesting.
This test is really just for hammering the double to fixed-point conversion
(in _cairo_fixed_from_double) that happens as doubles from API calls gets
translated into internal cairo fixed-point numbers.
This makes the entire performance test suite about 10 times faster
on my system. And I don't think that the results are significantly
worse, (many tests are stable after only 5 iterations while some
still run to 100 iterations without reaching our stability criteria).