This is an optimization the PS surface has been using to improve
printing speed and prevent printers from choking on large
images. Applying this optimzation to PDF prevents the same problem
occuring when the PDF is converted to PS.
Use "mild outliers" method to remove exceptional speed-ups and slow-downs
from the graph, so that the majority of information is not lost by the
scaling. Add the timing labels to the bars so that the true factor is
always presented.
We remember the location of the last insert as the next edge is likely to
be nearby. However, we need to be careful when the pointer rests upon the
HEAD and ensure that we begin the search from the appropriate end.
This branch brings self-intersection removal with virtually no
performance regression. (Compare with the initial implementation that
incurred a 5-10x slowdown due to having to tessellate whole strokes at a
time.) The importance of self-intersection removal is the improved visual
quality it brings - gone are those annoying sparkles on the outside of
rounded-rectangles for instance. Most of the performance overhead
associated with the self-intersection removal is avoided by switching from
trapezoids to spans for strokes. Obviously we are not able to do so for
the xlib backend as we do not yet have a polygon image type, and so the
tessellators are overhauled instead, along with more special casing for
frequent operations to avoid the increased complexity along the general
paths.
Speedups
========
xlib-rgba swfdec-youtube-0 11371.13 (11406.01 0.28%) -> 10450.00 (10461.84 0.66%): 1.09x speedup
▏
image-rgba firefox-talos-svg-0 73696.53 (73828.28 3.42%) -> 68324.30 (70269.79 1.36%): 1.08x speedup
▏
image-rgba swfdec-youtube-0 7843.08 (7873.89 2.57%) -> 7393.96 (7399.68 0.18%): 1.06x speedup
xvfb-rgba swfdec-youtube-0 9627.25 (9634.43 0.16%) -> 9020.55 (9040.97 0.27%): 1.07x speedup
▏
Slowdowns
=========
xvfb-rgba gnome-terminal-vim-0 7695.12 (7697.87 0.44%) -> 8569.45 (8588.29 0.19%): 1.11x slowdown
▏
xvfb-rgba swfdec-giant-steps-0 3811.77 (3815.06 0.23%) -> 4246.67 (4569.17 3.52%): 1.11x slowdown
▏
image-rgba gvim-0 7150.90 (7181.96 29.36%) -> 14641.04 (14651.36 0.11%): 2.05x slowdown
█
One method for overcoming these regressions is to reduce the complexity of
the polygons being fed into the tessellator (both in the number of edges
and intersections). This should be feasible by coupling into Jeff Muizelaar's
stroke-to-path work, which early indications suggest will bring a
significant performance improvement. On top of this, our span
implementation for the image backend is not as efficient as we would hope
for - and Joonas promises a much faster implementation soon.
is_rectangle() is far stricter than is_box(), and is only required for a
very limited set of operations (essentially were the rectangle must
conform to the motion as described by cairo_rectangle). For the general
case where we just want to know whether we have a single rectangular path
that covers a certain area, is_box() is sufficient.
Fix up the geometric clipper to handle intersecting a rectilinear path
with an arbitrary path and inspecting the result to see if it becomes a
a region.
When combining a clip-mask with a subsurface, as when used to combine with
the composite mask, we need to pass the destination surface offset to the
clip so that the paths can be corrected for the new surface.
This patch revises xlib so that it doesn't depend on having recent
Xrender headers to build. In particular, some definitions were added
to the private xrender header file, and an ifdef render version check
CAIRO_SURFACE_RENDER_SUPPORTS_OPERATOR was changed to a run-time
check using CAIRO_SURFACE_RENDER_HAS_PDF_OPERATORS.
A typo, I missed converting the user over to the freshly sorted list,
leaving it iterating over original but checking the sorted for termination
conditions.
Soeren was the first to report a clipping regression in the xlib backend
with strokes, and provided a test case to exercise the bug. This is an
extension of his test to provide coverage of different clipping and
stroking methods.
To correctly handle retessellating trapezods constructed from alternately
wound boxes, then we need to pass that information from the path to the
tessellator. We do this by switching the direction of the box if the first
edge is horizontal as opposed to vertical.
Currently the tracing code for glyphs constructs an temporary path in
order to replay and append to the output. This temporary allocation is
extremely wasteful as we can just directly append the glyph path to
the output path.
If the stroke is degenerate, i.e. the path consists only of a single
move-to and no edges, then the stroke may be visible due to end-capping
(as opposed to fills which are empty). So we also need to pad out the
extents around the current point for the degenerate case.
_cairo_path_fixed_is_box() is only called for filled paths and so must
handle the implicit close (which was already being correctly handled by
_cairo_path_fixed_iter_is_box).
Fixes test/implicit-close
By forgetting the implicit-close when checking for rectilinear paths, we
tried to feed the triangle (and other diagclose) into the specialised
rectilinear tesselators which completely mishandled that final edge.
This is a simple test that broke with the determination of rectilinearity
during path construction. I forgot the implicit close on fill and so the
ignored the final diagonal edge and failed to draw the triangle.
For the purposes of benchmarking it is useful to run cairo-perf against a
different library from the one it was compiled against. In order to do so,
we need to check that the runtime library contains the required entry
points for our targets - which we can check by using dlsym.
The assert() is only correct for the normal paths, but failed on the error
path. It has been run for long enough for me to be confident that the code
is self-consistent, so I think I can now safely remove it.
Innocuous warnings about the use of mismatching explicit casts (I'm really
not convinced by the merits of this particular compiler warning, but it
does cleanse the code slightly.)
Previously the pattern_acquire_surface routine only had to worry about
handling extend modes NONE or REPEAT and so the test for ! REPEAT
sufficed when what was actually intended was a test for NONE.
Always use spans, even for unaligned boxes. In the future (given a new
interface) we may want to emit the common unaligned box code more
efficient than a per-scanline computation -- but for now simply avoid the
requirements to write a temporary CPU buffer.
As we never return an error status during the path construction, we can
use the return value for the QPainterPath instead, greatly simplifying the
callers.
The higher level code ensures that the region of interest is trimmed to
our declared surface extents, so performing the intersection again is
redundant. Furthermore with the change in the clipping code, the
fallback region is no longer clipped, especially as the clip that is
currently set upon the DC is likely to be stale and incorrect for the
fallback.
Hopefully this resolves the assertion failure reported by Damian Frank,
http://lists.cairographics.org/archives/cairo/2009-August/018015.html
CC: Damian Frank <damian.frank@gmail.com>
Originally written by Vladimir Vukicevic to investigate using Skia for
Mozilla, it provides a nice integration with a rather interesting code
base. By hooking Skia underneath Cairo it allows us to directly compare
code paths... which is interesting.
[updated by Chris Wilson]