Having spent the last dev cycle looking at how we could specialize the
compositors for various backends, we once again look for the
commonalities in order to reduce the duplication. In part this is
motivated by the idea that spans is a good interface for both the
existent GL backend and pixman, and so they deserve a dedicated
compositor. xcb/xlib target an identical rendering system and so they
should be using the same compositor, and it should be possible to run
that same compositor locally against pixman to generate reference tests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
P.S. This brings massive upheaval (read breakage) I've tried delaying in
order to fix as many things as possible but now this one patch does far,
far, far too much. Apologies in advance for breaking your favourite
backend, but trust me in that the end result will be much better. :)
Microsoft C Compiler complains about:
hash-table.c(44) : error C2466: cannot allocate an array of constant
size 0
Adding an unused element makes it happy.
The cairo-missing library provides the functions which are needed in
order to correctly compile cairo (or its utilities) and which were not
found during configuration.
Fixes the build on MacOS X Lion, which failed because of collisons
between the cairo internal getline and strndup and those in libc:
cairo-analyse-trace.c:282: error: static declaration of ‘getline’ follows non-static declaration
/usr/include/stdio.h:449: error: previous declaration of ‘getline’ was here
cairo-analyse-trace.c:307: error: static declaration of ‘strndup’ follows non-static declaration
...
Add the cairo_time_t type (currently based on cairo_uint64_t) and use
it in cairo-observer and in the perf suite.
Fixes the build on MacOS X (for the src/ subdir) and Win32, whch
failed because they don't provide clock_gettime:
cairo-surface-observer.c:629: error: implicit declaration of function 'clock_gettime'
cairo-surface-observer.c:629: warning: nested extern declaration of 'clock_gettime'
cairo-surface-observer.c:629: error: 'CLOCK_MONOTONIC' undeclared (first use in this function)
...
In order for this to be effective on small system we also need to
disable the recording of the long traces which exhaust all memory...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We can use the elapsed time of the indiividual operations to profile the
synchronous throughput of a trace and eliminate all replay overhead. At
the cost of running the trace synchronously of course.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The immediate use of this is to print out the slowest operation of each
type in a replayable manner. A continuing demonstration of how we may
analyse traces...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the trace may leak surfaces over its lifetime, we are forced to
teardown the connection between runs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Another logging passthrough surface that records the style of operations
performed trying to categorise what is slow/fast/important.
In combination with perf/cairo-analyse-trace it is very useful for
understanding what a trace does. The next steps for this tool would be
to identify the slow operations that the trace does. Baby steps.
This should be generally useful in similar situations outside of perf/
and should be extensible to become an online performance probe.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I forgot to proof-read the patch before pushing and forgot I had left in
some damage from trying to get skia to link using libtool.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Still hopelessly broken. Requires compiling cairo to use static linking
and then still requires manual linkage to workaround libtool. Lots of
functionality is still absent - we need to either find analogues to some
Cairo operations or implement fallbacks - but it is sufficient to
investigate how Skia functions in direct comparison with Cairo for
tessellation/rasterisation.
Caveat emptor.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A variant of many-strokes tries to answer the question of how much
overhead is there in stroking, i.e. if we fill an equivalent path to a
set of strokes, do we see an equivalence in performance?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
An intersection variant to exercise the stroker with many, many lines. A
silly benchmark, but a popular one in the wild.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This fixes weird and occasional build failures when updating the source, e.g.:
cairo-perf-micro.o:(.rodata+0xb0): undefined reference to `hash_table'
Signed-off-by: Uli Schlachter <psychon@znc.in>
A benchmark to test the speed of hash tables when inserting and
removing a huge number of elements.
Although originally hash tables were assumed not to get many
deletions, in practice they are now being used as caches in multiple
places. This means that they often have a fixed number of live
elements and an element is evicted whenever a new element is inserted
(this happens explicitly for cairo_cache_t objects, but also, for
example, in scaled_font_map + holdovers). This access pattern is very
inefficient with the current implementation.
Reset the cairo_t to the initial state so that subsequent tests are not
affected by earlier tests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This patch has been generated by the following Coccinelle semantic patch:
// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it
@@
expression E;
@@
+ free (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
- free(E);
(
- E = NULL;
|
- E = 0;
)
...
- }
@@
expression E;
@@
+ free (E);
- if (unlikely (E != NULL)) {
- free (E);
- }
A benchmark to test how close we get to reducing paint+clip to an ordinary
fill, and to check correctness.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The performance tools build system for Win32 hasn't been maintained
for some time. The makefiles are now structured as in other
directories (Makefile.sources used by both Makefile.am and
Makefile.win32) and some additional code hides os-specific parts.
cairo-perf-trace uses cairo-hash.c, which calls _cairo_error.
Instead of redefining it in cairo-perf-trace.c it can be abstracted in
a separate source which is directly included in the build of
cairo-perf-trace.
This avoids visibility issues when compiling cairo-perf-trace with a
statically linked cairo library on architectures which do not support
hidden visibility (example: win32).
Benjamin just demonstrated this funky trick for generating pixel
outlines, and as no good deed should go unpunished, I've added his code
to the perf suite.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Currently we print the backend description before every time, which is
overly verbose. As the information doesn't^Wshouldn't change, simply
print it before running the first test of each target.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Make the loops count depend on the actual calibration_loops/calibration_time
instead of calibration_loops/calibration_max_time.
This avoids having some tests take much less/more than the wanted time per iteration
(I was having some tests taking about 1 second, other taking about 7 seconds when
the ms_per_iteration was 2000)
Spend 0.5-1 times the time wanted for each iteration in calibration, increase the
accuracy of loops count. Just making the loops count be the correct ratio doesn't
guarantee that the iteration time is accurate. By actually measuring iteration
times until it gets greater than 1/4 of the wanted time, the total sum is bound
to be <= the wanted iteration time and last calibration time is between 1/4 and
1/2 of the wanted time, so it should give a very accurate loop count.