A benchmark to test the speed of hash tables when inserting and
removing a huge number of elements.
Although hash tables were originally assumed to see few deletions, in
practice they are now used as caches in several places. This means that
they often have a fixed number of live elements and an element is
evicted whenever a new element is inserted (this happens explicitly for
cairo_cache_t objects, but also, for example, in scaled_font_map plus
its holdovers). This access pattern is very inefficient with the
current implementation.
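For illustration, a rough sketch of the churn pattern under test, written
against the internal cairo-hash-private.h interface; the entry type, sizes
and equality callback here are simplified assumptions, not the benchmark's
actual code:

#include "cairo-hash-private.h"

typedef struct {
    cairo_hash_entry_t base;   /* provides the hash value */
    int payload;
} entry_t;

/* For this sketch, equal hashes are treated as equal keys. */
static cairo_bool_t
entries_equal (const void *a, const void *b)
{
    return ((const cairo_hash_entry_t *) a)->hash ==
           ((const cairo_hash_entry_t *) b)->hash;
}

#define LIVE 1000        /* fixed number of live elements */
#define OPS  (16*1024)   /* total insertions performed */

static void
churn (entry_t *entries)
{
    cairo_hash_table_t *table = _cairo_hash_table_create (entries_equal);
    int i;

    for (i = 0; i < OPS; i++) {
        entries[i].base.hash = i;

        /* once the cache is full, evict the oldest element for
         * every new insertion */
        if (i >= LIVE)
            _cairo_hash_table_remove (table, &entries[i - LIVE].base);

        _cairo_hash_table_insert (table, &entries[i].base);
    }

    _cairo_hash_table_destroy (table);
}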
Reset the cairo_t to its initial state so that subsequent tests are not
affected by earlier ones.
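A minimal sketch of what such a reset could look like, using only public
cairo calls (the exact set of calls made by cairo-perf may differ):

static void
reset_cr (cairo_t *cr)
{
    /* return the context to a default-like state between tests */
    cairo_identity_matrix (cr);
    cairo_reset_clip (cr);
    cairo_new_path (cr);
    cairo_set_operator (cr, CAIRO_OPERATOR_OVER);
    cairo_set_source_rgb (cr, 0, 0, 0);
}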
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This patch has been generated by the following Coccinelle semantic patch:
// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it
@@
expression E;
@@
+ free (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
- free(E);
(
- E = NULL;
|
- E = 0;
)
...
- }
@@
expression E;
@@
+ free (E);
- if (unlikely (E != NULL)) {
- free (E);
- }
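For illustration, the semantic patch rewrites a (hypothetical) call site
such as

    if (unlikely (surface->snapshot != NULL)) {
        free (surface->snapshot);
        surface->snapshot = NULL;
    }

into the equivalent, shorter form, relying on free (NULL) being a no-op:

    free (surface->snapshot);
    surface->snapshot = NULL;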
A benchmark to test how close we get to reducing paint+clip to an ordinary
fill, and to check correctness.
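A rough sketch of the two operations being compared, using arbitrary
coordinates (this is not the benchmark's actual code): painting through a
rectangular clip should ideally cost no more than the equivalent plain fill.

/* paint through a rectangular clip ... */
cairo_save (cr);
cairo_rectangle (cr, 0, 0, 256, 256);
cairo_clip (cr);
cairo_paint (cr);
cairo_restore (cr);

/* ... which should reduce to the same work as an ordinary fill */
cairo_rectangle (cr, 0, 0, 256, 256);
cairo_fill (cr);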
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The performance tools build system for Win32 hasn't been maintained
for some time. The makefiles are now structured as in other
directories (Makefile.sources used by both Makefile.am and
Makefile.win32) and some additional code hides the OS-specific parts.
cairo-perf-trace uses cairo-hash.c, which calls _cairo_error.
Instead of redefining it in cairo-perf-trace.c, it can be abstracted
into a separate source file that is built directly into
cairo-perf-trace.
This avoids visibility issues when compiling cairo-perf-trace with a
statically linked cairo library on architectures which do not support
hidden visibility (for example, win32).
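A plausible shape for that separate source file, shown only as a sketch
(the real version may do more, e.g. record the error for debugging):

#include <cairo.h>

/* Local definition of _cairo_error so that objects such as
 * cairo-hash.c can be linked into cairo-perf-trace even when the
 * symbol is not reachable in a statically linked libcairo. */
cairo_status_t
_cairo_error (cairo_status_t status)
{
    return status;
}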
Benjamin just demonstrated this funky trick for generating pixel
outlines, and as no good deed should go unpunished, I've added his code
to the perf suite.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Currently we print the backend description before every test, which is
overly verbose. As the information doesn't^Wshouldn't change, simply
print it before running the first test of each target.
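One simple way to arrange that, sketched here with assumed names rather
than the actual cairo-perf code:

#include <stdio.h>
#include "cairo-boilerplate.h"

/* Remember the last target we described and only print the backend
 * description when it changes, i.e. before the first test of each
 * target. */
static const cairo_boilerplate_target_t *last_described;

static void
describe_target_once (const cairo_boilerplate_target_t *target)
{
    if (target == last_described)
        return;

    last_described = target;
    printf ("%s\n", target->name);
}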
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Make the loop count depend on the actual calibration_loops/calibration_time
ratio instead of calibration_loops/calibration_max_time.
This avoids some tests taking much less or much more than the wanted time
per iteration (I was seeing some tests take about 1 second and others
about 7 seconds when ms_per_iteration was 2000).
Spend 0.5-1 times the wanted iteration time in calibration to increase
the accuracy of the loop count. Just making the loop count the correct
ratio doesn't guarantee that the iteration time is accurate. By actually
measuring iteration times until one exceeds 1/4 of the wanted time, the
total sum is bound to be <= the wanted iteration time and the last
calibration time is between 1/4 and 1/2 of the wanted time, so it should
give a very accurate loop count.
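In rough terms, the calibration described above looks like this sketch;
the timer callback, names and rounding are assumptions, not the committed
code:

typedef double (*timed_run_t) (cairo_t *cr, unsigned loops);

static unsigned
calibrate_loops (timed_run_t run, cairo_t *cr, double ms_per_iteration)
{
    double wanted = ms_per_iteration / 1000.;  /* target seconds per iteration */
    unsigned loops = 1;
    double elapsed;

    /* Keep measuring (and growing the loop count) until a single
     * measurement exceeds 1/4 of the wanted time: the measurements
     * sum to at most the wanted time, and the final sample lies
     * between 1/4 and 1/2 of it, so extrapolating from that sample
     * gives an accurate loop count. */
    for (;;) {
        elapsed = run (cr, loops);
        if (elapsed >= wanted / 4)
            break;
        loops *= 2;
    }

    return (unsigned) (loops * (wanted / elapsed) + .5);
}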
The cairo-perf-diff-files tool would ignore perf reports with
just one test for no apparent reason. The traces take so long
that it's useful to be able to compare runs with just one trace.
Since this takes days to run now and should not find any bugs that are
not covered by the test-suite, it seems like a pointless exercise.
Especially as I am trying to make a release!
The OIL routines don't work as expected on MacOS X. The operating
system gives access to the timestamp counter through the function
mach_absolute_time.
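A minimal sketch of a timer built on that interface:

#include <stdint.h>
#include <mach/mach_time.h>

/* Current timestamp in nanoseconds: mach_absolute_time() ticks are
 * converted using the numer/denom ratio from mach_timebase_info(). */
static uint64_t
now_ns (void)
{
    static mach_timebase_info_data_t tb;

    if (tb.denom == 0)
        mach_timebase_info (&tb);

    return mach_absolute_time () * tb.numer / tb.denom;
}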
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Provide a hook for the test to be able to compute the number of ops per
second. For instance, the glyphs test uses it to report the number of
kiloglyphs per second Cairo is able to render.
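The shape of such a hook might look roughly like this; the names and
struct are hypothetical, not the actual cairo-perf API:

/* The test fills in how many primitives one run draws and what to
 * call them; the harness divides by the measured time per run. */
typedef struct {
    const char *unit;   /* e.g. "kiloglyphs" */
    double      count;  /* primitives drawn per run, in that unit */
} perf_ops_hook_t;

static double
ops_per_second (const perf_ops_hook_t *hook, double seconds_per_run)
{
    return hook->count / seconds_per_run;
}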
Real applications that control their Drawable externally to Cairo are
'disadvantaged' by cairo-perf-trace when it creates a similar surface
for each new instance of the same Drawable. The difficulty in
maintaining one perf surface for every application surface is that the
traces do not track lifetimes for the application surfaces, so we would
just accumulate stale surfaces. The surface cache takes a different
approach and returns the same surface for each active Drawable, and
maintains a hold-over of the 16 most recently used surfaces. This
achieves a 60-80% hit rate with firefox, which is probably as good as
can be expected. Obviously, for double-buffered applications we only
ever draw to freshly created surfaces (and Gtk+ bypasses cairo to do
the final copy -- the ideal application would just use a push-group
for double buffering, in which case we would capture and replay the
entire expose event).
To enable use of the surface cache whilst replaying, use -c:
./cairo-perf-trace -c firefox-talos-gfx
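A sketch of the caching idea, with simplified types standing in for the
real implementation: one surface per active Drawable, with the 16 most
recently used entries held over.

#include <string.h>
#include <cairo.h>

#define HOLDOVERS 16

typedef struct {
    unsigned long    drawable;  /* XID recorded in the trace */
    cairo_surface_t *surface;
} cache_slot_t;

static cache_slot_t cache[HOLDOVERS];

static cairo_surface_t *
lookup_surface (unsigned long drawable,
                cairo_surface_t *(*create) (unsigned long drawable))
{
    cache_slot_t tmp;
    int i;

    for (i = 0; i < HOLDOVERS; i++) {
        if (cache[i].drawable == drawable) {
            /* hit: move the slot to the front (most recently used) */
            tmp = cache[i];
            memmove (cache + 1, cache, i * sizeof (cache_slot_t));
            cache[0] = tmp;
            return cache[0].surface;
        }
    }

    /* miss: evict the least recently used slot and reuse it */
    if (cache[HOLDOVERS-1].surface != NULL)
        cairo_surface_destroy (cache[HOLDOVERS-1].surface);
    memmove (cache + 1, cache, (HOLDOVERS-1) * sizeof (cache_slot_t));
    cache[0].drawable = drawable;
    cache[0].surface = create (drawable);
    return cache[0].surface;
}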
A new -f option to cairo-perf reverts to a fast run
mode for quick performance overviews. The number of
milliseconds each iteration of a test is run for can
be overridden using the new CAIRO_PERF_ITERATION_MS
environment variable. The default remains 2000 ms/iter.
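A sketch of how that override could be read, assuming a plain getenv
lookup (the parsing in the actual patch may differ):

#include <stdlib.h>

/* Milliseconds to spend per iteration; CAIRO_PERF_ITERATION_MS
 * overrides the 2000 ms default. */
static unsigned
iteration_ms (void)
{
    const char *env = getenv ("CAIRO_PERF_ITERATION_MS");

    if (env != NULL) {
        int ms = atoi (env);
        if (ms > 0)
            return ms;
    }

    return 2000;
}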
Use "mild outliers" method to remove exceptional speed-ups and slow-downs
from the graph, so that the majority of information is not lost by the
scaling. Add the timing labels to the bars so that the true factor is
always presented.
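Assuming the conventional definition of "mild outliers" (Tukey's fences at
1.5 times the interquartile range), the test looks something like this
sketch; the quartile indexing here is approximate:

/* A value is a mild outlier if it lies outside
 * [Q1 - 1.5*IQR, Q3 + 1.5*IQR], where IQR = Q3 - Q1. */
static int
is_mild_outlier (const double *sorted_values, int n, double value)
{
    double q1 = sorted_values[n / 4];
    double q3 = sorted_values[(3 * n) / 4];
    double iqr = q3 - q1;

    return value < q1 - 1.5 * iqr || value > q3 + 1.5 * iqr;
}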
Originally written by Vladimir Vukicevic to investigate using Skia for
Mozilla, it provides a nice integration with a rather interesting code
base. By hooking Skia underneath Cairo it allows us to directly compare
code paths... which is interesting.
[updated by Chris Wilson]
In order to handle 'cairo-perf-trace benchmark', we need to perform the
can_run? test on the directory name as opposed to the individual trace
names. Make it so.
I'd disabled this to look at cairo-qt performance, then forgot about it.
Be clean, clean up globals -- this should fix the huge performance loss
when running multiple backends in series that each need separate font
caches.
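One way to do that kind of cleanup between targets is cairo's public hook
for dropping static data, shown here as a sketch (the commit's actual
cleanup may differ):

#include <cairo.h>

static void
cleanup_global_state (void)
{
    /* Drop cairo's static data, font caches included, so the next
     * backend starts from a clean slate. */
    cairo_debug_reset_static_data ();
}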
These traces run for much longer than the original synthetic benchmarks
and seek to replicate 'real-world' applications, so the warning that the
X server and cairo-perf are not bound to any CPU is a false alarm.
cairo-perf-chart takes multiple runs (currently it is limited to
prefiltered data sets) and pretty-prints a chart showing performance
improvements/regressions (in either ASCII or HTML) along with a
cairo-perf-chart.png