Avoid calling libtool to link every single test case, by building just one
binary from all the sources.
This binary is then given the task of choosing tests to run (based on user
selection and individual test requirement), forking each test into its own
process and accumulating the results.
Add the core support to cairo-test for running the test-suite under a
malloc fault injector. This commit contains the adjustments to
cairo_test_run() to repeat the test if it detects a failure due to fault
injection and complains if it detects unreported faults or memory leaks.
From bug 18010 (https://bugs.freedesktop.org/show_bug.cgi?id=18010),
in order to make flockfile() available we need to set _POSIX_C_SOURCE and
according to the man page, the appropriate feature check is
_POSIX_THREAD_SAFE_FUNCTIONS.
For this we extend the boilerplate get_image() routines to extract a
single page out of a paginated document and then proceed to manually
check each page of the fallback-resolution test.
(Well that's the theory, in practice SVG doesn't support multiple pages
and so we just generate a new surface for each resolution. But the
infrastructure is in place so that we can automate other tests,
e.g. test/multi-pages.)
If the vector surface matches the output from last time, then the
rasterisation is skipped - but we need to write the expected OUTPUT
filename to the log so that the image is referenced from index.html.
The reasoning behind the -25 testing is that we want to ensure
that cairo provides translation invariance. However, for
many vector backends we use external rasterizers that don't
necessarily provide that translation invariance.
So this testing makes a bunch of failures appear that we don't
really care about, (and we don't even have a mechanism to turn
them off with custom reference images). For the release, I'm
just turning this off.
After the release, I plan to turn this back on, and then we could
fix this by ensuring that the vector output itself is unaffected
by a device offset, or by moving away from external rasterizers,
(see Chris's micro-gs work to test PostScript with cairo-based
rasterization).
Delete the results of previous runs if the reference images are more
recent.
There's still potential error if the conversion utility or its required
libraries are modified...
Be explicit about handling cached FAIL images, instead of relying on the
sequences of failed matches as the files are an external resource and we
can not guarantee their individual accessibility.
Note this also changes the filename, so you may want to run:
$ find -name '*-last.*' -print | xargs rm
after this checkout.
Compare the current output against a previous run to determine if there
has been any change since last time, and only run through imagediff if
there has been. For the vector surfaces, we can check the vector output
first and potentially skip the rasterisation. On my machine this reduces
the time for a second run from 6 minutes to 2m30s. As most of the time,
most test output will remain unchanged, so this seems to be a big win. On
unix systems, hard linking is used to reduce the amount of storage space
required - others will see about a three-fold increase in the amount of
disk used. The directory continues to be a stress test for file selectors.
In order to reduce the changes between runs, the current time is no longer
written to the PNG files (justified by that it only exists as a debugging
aid) and the boilerplate tweaks the PS surface so that the creation date
is fixed. To fully realise the benefits here, we need to strip the
creation time from all the reference images...
The biggest problem with using the caches is that different runs of the
test suite can go through different code paths, introducing potential
Heisenbergs. If you suspect that caching is interfering with the test
results, use 'make -C test clean-caches check'.
Having included some extra details in the test output PNG filename, we
need to pass the extra information to
cairo_ref_name_for_test_target_format() in order to find the match.
As Behdad suggested, we can dramatically speed up the test suite by
short-circuiting the write to a png file, only to then immediately read it
back in. So for the raster based surfaces, we avoid the round-trip through
libpng by implementing a new boilerplate method to directly extract the image
buffer from the test result. A secondary speedup is achieved by caching the
most recent reference image.
Construct the test name to pass to the boilerplate creation routines such
that it uniquely identifies the test in terms of test, target, content and
pass (similar, offset, thread). This allows the vector targets to create
output different output files for each test, whereas before, later tests
would overwrite existing files making debugging more difficult.
In order to run under memfault, the framework is first extended to handle
running concurrent tests - i.e. multi-threading. (Not that this is a
requirement for memfault, instead it shares a common goal of storing
per-test data). To that end all the global data is moved into a per-test
context and the targets are adjusted to avoid overlap on shared, global
resources (such as output files and frame buffers). In order to preserve
the simplicity of the standard draw routines, the context is not passed
explicitly as a parameter to the routines, but is instead attached to the
cairo_t via the user_data.
For the masochist, to enable the tests to be run across multiple threads
simply set the environment variable CAIRO_TEST_NUM_THREADS to the desired
number.
In the long run, we can hope the need for memfault (runtime testing of
error paths) will be mitigated by static analysis. A promising candidate
for this task would appear to be http://hal.cs.berkeley.edu/cil/.
Some platforms and applications enable floating point exceptions, so it
seems sensible to run the test-suite with the most serious exceptions
enabled - divide by zero, invalid result and overflow.
Convert the boilerplate specific flattened content value to the ordinary
CAIRO_CONTENT_COLOR_ALPHA for use with cairo_push_group_with_content() -
otherwise cairo rightfully flags an error and the test harness decides
that the similar surface is not available.
This reverts commit 07122e64fa.
The extra testing did find a pdf bug, and that should be fixed,
but the extra maintenance burden of running another iteration
of all tests does not seem justfied at all---particularly since
it looks like dozens of new reference images would be needed
for the svg backend.
Also, the new "failures" of the image backend with this new
testing look like a misunderstanding of exactly what the new
testing is actually drawing.
Test surfaces using similar surfaces with both CONTENT_COLOR and
CONTENT_COLOR_ALPHA, if applicable. This seems justified by the apparent
bugs in the pdf backend when going from an ARGB32 similar surface to
a destination RGB24 surface as well as isolated bugs in the image
backend.
The original goal was to try and trick the test suite into producing
a xlib surface with mismatching Visual/XRenderPictFormat. This succeeds,
although with a little bit of brute force in the xlib backend, but the
search to reproduce a BadMatch error fruitless.
Add various test cases to exercise
_cairo_pattern_acquire_surface_for_surface(), most notably using similar
source surfaces to provide coverage of the non-image surface branch.
cairo_get_target() returns the original surface passed to
cairo_create(), and not the current destination as required when
testing drawing to the same surface using multiple contexts.
For completeness we also use the group target when creating similar
surfaces within the tests (to check that similar surfaces of similar
surfaces also work).
To correctly copy a surface onto the destination irrespective of its
content requires the SOURCE operator. Forgetting to do so here causes
uninitialized pixels to be mixed into the result and the failure of
many tests for the similar surface. Oops...
Allow using a previous test output directory as a source of
reference images. To make use of this, set the environment
variable 'CAIRO_REF_DIR' to point at an old test directory,
relative to the current test directory.
This is useful for testing backends when reference images haven't
been created yet, or which the current reference image structure
can't accomodate, like multiple font backends.
The test logs currently do not record the paths of
image output, the reference images tested against, and
the diffs created. This means that make-html.pl has to
duplicate the policy in cairo-test.c. Fix this by teaching
cairo-test.c to log the paths.
Having noticed strange discrepancies creeping into similar surfaces
whilst working on the xlib backend, I thought it wise to also run
the test harness against similar targets. For consistency, only
targets whose similar surface use the same backend are included.
This can be disabled by exporting CAIRO_TEST_IGNORE_SIMILAR=1.
This is necessary to ensure that limiting backends using
CAIRO_TEST_TARGET does not increase the number of tests failing,
which is a desirable invariant.