Handling clip as part of the surface state, as opposed to being part of
the operation state, is cumbersome and a hindrance to providing true proxy
surface support. For example, the clip must be copied from the surface
onto the fallback image, but this was forgotten, causing undue hassle in
each backend. Another example is the contortion the meta surface
endures to ensure the clip is correctly recorded. By contrast, passing the
clip along with the operation is quite simple and enables us to write
generic handlers for providing surface wrappers. (And in the future, we
should be able to write more esoteric wrappers, e.g. automatic 2x FSAA,
trivially.)
In brief, instead of the surface automatically applying the clip before
calling the backend, the backend can call into a generic helper to apply
clipping. For raster surfaces, clip regions are handled automatically as
part of the composite interface. For vector surfaces, a clip helper is
introduced to replay and call back into an intersect_clip_path() function
as necessary.
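For illustration, the vector-surface helper could replay the accumulated
clip paths through a backend-supplied callback along the lines of the
sketch below. The type and function names here (example_clip_path_t,
example_clip_replay, the intersect_clip_path callback signature) are
invented for this sketch and are not the actual cairo internals.

    #include <cairo.h>

    /* Hypothetical clip stack: each entry records one clip operation. */
    typedef struct example_clip_path {
        struct example_clip_path *prev;
        cairo_path_t *path;
        cairo_fill_rule_t fill_rule;
        double tolerance;
        cairo_antialias_t antialias;
    } example_clip_path_t;

    typedef cairo_status_t
    (*example_intersect_clip_path_func_t) (void *closure,
                                           cairo_path_t *path,
                                           cairo_fill_rule_t fill_rule,
                                           double tolerance,
                                           cairo_antialias_t antialias);

    /* Replay the clip from oldest to newest so that each intersection is
     * applied in the order the user established it. */
    static cairo_status_t
    example_clip_replay (example_clip_path_t *clip,
                         example_intersect_clip_path_func_t intersect_clip_path,
                         void *closure)
    {
        cairo_status_t status;

        if (clip == NULL)
            return CAIRO_STATUS_SUCCESS;

        status = example_clip_replay (clip->prev, intersect_clip_path, closure);
        if (status)
            return status;

        return intersect_clip_path (closure, clip->path,
                                    clip->fill_rule,
                                    clip->tolerance,
                                    clip->antialias);
    }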
Whilst this is not primarily a performance-related change (the change
should just move the computation of the clip from the moment it is applied
by the user to the moment it is required by the backend), it is important
to track any potential regression:
ppc:
Speedups
========
image-rgba evolution-20090607-0 1026085.22 0.18% -> 672972.07 0.77%: 1.52x speedup
▌
image-rgba evolution-20090618-0 680579.98 0.12% -> 573237.66 0.16%: 1.19x speedup
▎
image-rgba swfdec-fill-rate-4xaa-0 460296.92 0.36% -> 407464.63 0.42%: 1.13x speedup
▏
image-rgba swfdec-fill-rate-2xaa-0 128431.95 0.47% -> 115051.86 0.42%: 1.12x speedup
▏
Slowdowns
=========
image-rgba firefox-periodic-table-0 56837.61 0.78% -> 66055.17 3.20%: 1.09x slowdown
▏
Reading through the previous commit revealed that the arguments to
edge_compare_for_y_against_x were transposed, but the test-suite had
failed to detect it. This is because, in order to actually solve the
equation, we need a diagonal edge passing near an off-centre point of
interest, which was not among the test cases. So add some off-centre
tests to fully exercise the code.
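The sort of geometry involved can be sketched with a stand-in comparison
function; the names and signature below are assumptions, not the real
cairo routine. It reports which side of the edge the sample point lies on
by cross-multiplying, which is where a transposition of the arguments can
hide from symmetric test cases.

    #include <stdint.h>

    typedef struct { int32_t x, y; } example_point_t;

    /* Return <0, 0 or >0 depending on which side of the line through p1
     * and p2 the sample point (x, y) lies.  Cross-multiplication avoids
     * the division in the line equation. */
    static int
    example_edge_compare_for_y_against_x (const example_point_t *p1,
                                          const example_point_t *p2,
                                          int32_t y, int32_t x)
    {
        int64_t adx = (int64_t) p2->x - p1->x;
        int64_t ady = (int64_t) p2->y - p1->y;
        int64_t dx  = (int64_t) x - p1->x;
        int64_t dy  = (int64_t) y - p1->y;

        /* With the symmetric, centred geometry of the original test
         * cases, swapping y and x happens to give the same answer; a
         * diagonal edge compared against an off-centre point is needed
         * to tell the two apart. */
        int64_t lhs = adx * dy;
        int64_t rhs = ady * dx;

        return (lhs > rhs) - (lhs < rhs);
    }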
Avoid calling libtool to link every single test case by building just one
binary from all the sources.
This binary is then given the task of choosing tests to run (based on user
selection and individual test requirements), forking each test into its own
process and accumulating the results.
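A rough sketch of that fork-and-collect step, assuming a hypothetical
test_func_t and run_test_in_child() rather than the real harness API:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    typedef int (*test_func_t) (void);

    /* Run one test in its own process so that a crash in that test
     * cannot take down the single shared binary. */
    static int
    run_test_in_child (const char *name, test_func_t test)
    {
        pid_t pid = fork ();

        if (pid < 0) {
            perror ("fork");
            return -1;
        }

        if (pid == 0)
            _exit (test ()); /* child: the exit code carries the result */

        int status;
        waitpid (pid, &status, 0);
        if (WIFEXITED (status))
            return WEXITSTATUS (status);

        fprintf (stderr, "%s crashed (signal %d)\n", name, WTERMSIG (status));
        return -1;
    }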
In order to run under memfault, the framework is first extended to handle
running concurrent tests, i.e. multi-threading. (Not that this is a
requirement for memfault; rather, it shares the common goal of storing
per-test data.) To that end, all the global data is moved into a per-test
context and the targets are adjusted to avoid overlap on shared, global
resources (such as output files and frame buffers). In order to preserve
the simplicity of the standard draw routines, the context is not passed
explicitly as a parameter to the routines, but is instead attached to the
cairo_t via the user_data.
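The attachment itself can go through the public user-data hooks on the
cairo_t. In the sketch below, cairo_set_user_data() and
cairo_get_user_data() are the real public API, while the layout of the
test context and the accessor names are assumptions for illustration.

    #include <stdio.h>
    #include <cairo.h>

    /* Formerly-global, now per-test state (illustrative fields only). */
    typedef struct cairo_test_context {
        const char *test_name;
        FILE *log_file;
    } cairo_test_context_t;

    static const cairo_user_data_key_t test_context_key;

    static void
    test_context_attach (cairo_t *cr, cairo_test_context_t *ctx)
    {
        /* No destroy notifier: the harness owns the context's lifetime. */
        cairo_set_user_data (cr, &test_context_key, ctx, NULL);
    }

    static cairo_test_context_t *
    test_context_get (cairo_t *cr)
    {
        return cairo_get_user_data (cr, &test_context_key);
    }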
For the masochist, to enable the tests to be run across multiple threads,
simply set the environment variable CAIRO_TEST_NUM_THREADS to the desired
number.
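A minimal sketch of honouring that variable, with the per-thread body left
as a placeholder for the real per-test loop:

    #include <pthread.h>
    #include <stdlib.h>

    static void *
    run_tests_thread (void *arg)
    {
        /* Each thread works on its own per-test contexts, so no global
         * state is shared between the concurrent runs. */
        (void) arg;
        return NULL;
    }

    static void
    run_tests_concurrently (void)
    {
        const char *env = getenv ("CAIRO_TEST_NUM_THREADS");
        int num_threads = env != NULL ? atoi (env) : 1;
        if (num_threads < 1)
            num_threads = 1;

        pthread_t *threads = malloc (num_threads * sizeof (pthread_t));
        if (threads == NULL)
            return;

        for (int i = 0; i < num_threads; i++)
            pthread_create (&threads[i], NULL, run_tests_thread, NULL);
        for (int i = 0; i < num_threads; i++)
            pthread_join (threads[i], NULL);

        free (threads);
    }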
In the long run, we can hope the need for memfault (runtime testing of
error paths) will be mitigated by static analysis. A promising candidate
for this task would appear to be http://hal.cs.berkeley.edu/cil/.