Avoid calling libtool to link every single test case by building just one
binary from all the sources.
This binary is then given the task of choosing which tests to run (based on
user selection and individual test requirements), forking each test into its
own process, and accumulating the results.
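In outline, the fork-per-test loop amounts to something like the following
sketch (run_test() is just an illustrative callback, not the actual
boilerplate entry point):

    /* Sketch only: fork a test into its own process and collect its
     * exit status in the parent. */
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int
    run_test_forked (int (*run_test) (void))
    {
        pid_t pid = fork ();

        if (pid == 0)           /* child: run the test, report via exit code */
            _exit (run_test ());

        if (pid > 0) {          /* parent: wait and accumulate the result */
            int status;
            waitpid (pid, &status, 0);
            return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
        }

        return -1;              /* fork itself failed */
    }

A crash in one test then only takes down its own process, and the parent can
keep accumulating results.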
This target was added to the boilerplate during 1.8.1. It currently
shows many failures in the test suite. These failures likely fall
into three different classes:
* Tests needing new svg12-specific reference images
* Tests exercising bugs in librsvg
* Tests exercising existing cairo bugs
We haven't gone through the effort to separate these, but even for
the tests that are exercising actual cairo bugs, these are likely
bugs that existed in the cairo 1.8.0 release and not regressions.
Because of that, in this commit I'm conditionally disabling the
testing of the svg12 target. As soon as we increment the cairo
version to 1.9.0 or higher, this target will get re-enabled
automatically and we can begin the work to separate the tests as
described above and also fix the bugs.
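One plausible shape for that version check, shown only as a sketch (the exact
mechanism used in the boilerplate may differ):

    /* Sketch: keep the svg12 target out of the run until the version
     * is bumped past the 1.8.x stable series. */
    #include <cairo.h>

    #if CAIRO_VERSION < CAIRO_VERSION_ENCODE (1, 9, 0)
    #define CAIRO_TEST_DISABLE_SVG12 1  /* illustrative macro name */
    #endif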
As the SVG-to-image conversion itself goes through cairo, its output is
dependent upon changes within our library and so we cannot skip the
conversion even if the SVG file happens to match a previous run. Fortunately,
librsvg is quick enough that this is not a major issue.
For this, we extend the boilerplate get_image() routines to extract a
single page out of a paginated document and then proceed to manually
check each page of the fallback-resolution test.
(Well, that's the theory; in practice SVG doesn't support multiple pages,
so we just generate a new surface for each resolution. But the
infrastructure is in place so that we can automate other tests,
e.g. test/multi-pages.)
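In outline, the per-page check reads something like this sketch, where
get_image_for_page() and compare_with_reference() are hypothetical stand-ins
for the extended boilerplate routines:

    /* Sketch: extract and verify each page of a paginated result in turn. */
    static cairo_test_status_t
    check_all_pages (const cairo_boilerplate_target_t *target, int num_pages)
    {
        cairo_test_status_t result = CAIRO_TEST_SUCCESS;
        int page;

        for (page = 0; page < num_pages; page++) {
            cairo_surface_t *image = get_image_for_page (target, page);

            if (compare_with_reference (image, page) != 0)
                result = CAIRO_TEST_FAILURE;

            cairo_surface_destroy (image);
        }

        return result;
    }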
One possible cause of a read failure whilst converting the image is that the
external utility crashed. This information is important to the test suite,
as knowing which input causes the converter to crash is just as vital as
identifying a crash within the library itself.
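Telling a crash apart from an ordinary failure only requires inspecting the
child's wait status, along these lines:

    /* Sketch: classify the external converter's wait status so that a
     * crash is reported distinctly from a normal error exit. */
    #include <sys/wait.h>

    static const char *
    describe_converter_status (int status)
    {
        if (WIFSIGNALED (status))
            return "converter crashed";         /* the vital case */
        if (WIFEXITED (status) && WEXITSTATUS (status) != 0)
            return "converter reported an error";
        return "converter succeeded";
    }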
Compare the current output against a previous run to determine whether
anything has changed since last time, and only run it through imagediff if
it has. For the vector surfaces, we can check the vector output
first and potentially skip the rasterisation. On my machine this reduces
the time for a second run from 6 minutes to 2m30s. As most test output
remains unchanged most of the time, this seems to be a big win. On
Unix systems, hard linking is used to reduce the amount of storage space
required; others will see roughly a three-fold increase in the amount of
disk used. The directory continues to be a stress test for file selectors.
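The unchanged-output path amounts to comparing against the previous run's
file and, on Unix, hard linking rather than storing another copy; roughly
(files_equal() is an illustrative byte-wise comparison helper):

    /* Sketch: if the new output matches the cached copy from a previous
     * run, share the storage via a hard link and skip imagediff. */
    #include <stdio.h>
    #include <unistd.h>

    static int
    output_has_changed (const char *new_path, const char *cached_path)
    {
        if (files_equal (new_path, cached_path)) {
            remove (new_path);
            if (link (cached_path, new_path) == 0)
                return 0;       /* unchanged: no need to run imagediff */
        }
        return 1;               /* changed (or no cache): keep the new file */
    }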
In order to reduce the changes between runs, the current time is no longer
written to the PNG files (justified on the grounds that it exists only as a
debugging aid) and the boilerplate tweaks the PS surface so that the creation
date is fixed. To fully realise the benefits here, we need to strip the
creation time from all the reference images...
The biggest problem with using the caches is that different runs of the
test suite can go through different code paths, introducing potential
heisenbugs. If you suspect that caching is interfering with the test
results, use 'make -C test clean-caches check'.
In order to achieve substantial speed improvements, the external conversion
utilities are rewritten as a daemon that communicates with the test suite
over a local socket. This is faster as it avoids the libtool and dynamic
linker overhead for each invocation; the caches persist between tests, and
we no longer require a round trip through libpng.
The daemon is started automatically by the test suite; if communication
cannot be established, the test suite falls back to using a pipe to a normal
conversion utility. The daemon will then persist for 60 seconds waiting
for further connections.
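The client side of that handshake is nothing more exotic than a Unix-domain
connect() with a fallback; the socket path and utility name below are purely
illustrative:

    /* Sketch: try the conversion daemon first, fall back to piping
     * through the ordinary converter if nothing is listening. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    static FILE *
    open_converter (void)
    {
        struct sockaddr_un addr;
        int fd = socket (AF_UNIX, SOCK_STREAM, 0);

        memset (&addr, 0, sizeof (addr));
        addr.sun_family = AF_UNIX;
        strcpy (addr.sun_path, "./.convert-daemon");

        if (fd >= 0 &&
            connect (fd, (struct sockaddr *) &addr, sizeof (addr)) == 0)
            return fdopen (fd, "r+");           /* talk to the daemon */

        if (fd >= 0)
            close (fd);
        return popen ("./svg2png", "r");        /* fall back to a pipe */
    }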
Of course any memory leak (stares at poppler) is exacerbated.
As Behdad suggested, we can dramatically speed up the test suite by
short-circuiting the write to a PNG file that is only then immediately read
back in. So for the raster-based surfaces, we avoid the round trip through
libpng by implementing a new boilerplate method to directly extract the image
buffer from the test result. A secondary speedup is achieved by caching the
most recent reference image.
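For an image-backed target the short-circuit essentially hands back the
surface data instead of encoding it; a minimal sketch using only stock cairo
calls:

    /* Sketch: replay the test surface onto an image surface and return
     * that directly, instead of writing a PNG and decoding it again. */
    #include <cairo.h>

    static cairo_surface_t *
    get_image_result (cairo_surface_t *surface, int width, int height)
    {
        cairo_surface_t *image;
        cairo_t *cr;

        image = cairo_image_surface_create (CAIRO_FORMAT_ARGB32, width, height);
        cr = cairo_create (image);
        cairo_set_source_surface (cr, surface, 0, 0);
        cairo_paint (cr);
        cairo_destroy (cr);

        return image;           /* no round trip through libpng */
    }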
In order to run under memfault, the framework is first extended to handle
running concurrent tests - i.e. multi-threading. (Not that this is a
requirement for memfault; rather, the two share the common goal of storing
per-test data.) To that end, all the global data is moved into a per-test
context and the targets are adjusted to avoid overlap on shared, global
resources (such as output files and frame buffers). In order to preserve
the simplicity of the standard draw routines, the context is not passed
explicitly as a parameter to the routines, but is instead attached to the
cairo_t via the user_data.
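Attaching and retrieving the context then looks roughly like this
(test_context_t is an illustrative type name):

    /* Sketch: stash the per-test context on the cairo_t so the draw
     * routines can recover it without an extra parameter. */
    #include <cairo.h>

    typedef struct _test_context test_context_t;        /* illustrative */

    static const cairo_user_data_key_t context_key = { 0 };

    static void
    attach_context (cairo_t *cr, test_context_t *ctx)
    {
        cairo_set_user_data (cr, &context_key, ctx, NULL);
    }

    static test_context_t *
    get_context (cairo_t *cr)
    {
        return cairo_get_user_data (cr, &context_key);
    }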
For the masochist: to run the tests across multiple threads, simply set
the environment variable CAIRO_TEST_NUM_THREADS to the desired number.
In the long run, we can hope the need for memfault (runtime testing of
error paths) will be mitigated by static analysis. A promising candidate
for this task would appear to be http://hal.cs.berkeley.edu/cil/.
Convert the boilerplate-specific flattened content value to the ordinary
CAIRO_CONTENT_COLOR_ALPHA for use with cairo_push_group_with_content() -
otherwise cairo rightfully flags an error and the test harness decides
that the similar surface is not available.
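The translation itself is a one-case mapping before calling into cairo; the
flattened constant's name and value are shown here only schematically:

    /* Sketch: map the boilerplate-only "flattened" content value onto a
     * genuine cairo_content_t before handing it to cairo. */
    #include <cairo.h>

    /* illustrative stand-in for the boilerplate's flattened constant */
    #define CONTENT_COLOR_ALPHA_FLATTENED ((cairo_content_t) -1)

    static cairo_content_t
    real_content (cairo_content_t content)
    {
        if (content == CONTENT_COLOR_ALPHA_FLATTENED)
            return CAIRO_CONTENT_COLOR_ALPHA;
        return content;
    }

cairo_push_group_with_content() is then called with real_content (content)
rather than the raw boilerplate value.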
Testing win32-printing requires setting the default printer to
a PostScript level 3 color printer. The PostScript output is
saved to a file and converted to png using ghostscript.
As opposed to the CAIRO_TEST_TARGET env var, which lists the exact
targets to test, CAIRO_TEST_TARGET_EXCLUDE instead supplies a list of
targets to filter out of the testing set. This is useful in circumstances
where the build environment prevents testing of a target (for example,
there is no DirectFB support, or the glitz library is broken) but where
you still want to perform the minimal check that the code compiles.
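A minimal sketch of the exclusion check (the parsing details are
illustrative):

    /* Sketch: drop a target whose name appears in the comma/space
     * separated CAIRO_TEST_TARGET_EXCLUDE list. */
    #include <stdlib.h>
    #include <string.h>

    static int
    target_is_excluded (const char *name)
    {
        const char *env = getenv ("CAIRO_TEST_TARGET_EXCLUDE");
        char *copy, *tok;
        int excluded = 0;

        if (env == NULL)
            return 0;

        copy = strdup (env);
        for (tok = strtok (copy, " \t,"); tok != NULL; tok = strtok (NULL, " \t,")) {
            if (strcmp (tok, name) == 0) {
                excluded = 1;
                break;
            }
        }
        free (copy);

        return excluded;
    }

So something like CAIRO_TEST_TARGET_EXCLUDE="directfb glitz" make check runs
everything except those two targets.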