Having spent the last dev cycle looking at how we could specialize the
compositors for various backends, we once again look for the
commonalities in order to reduce the duplication. In part this is
motivated by the idea that spans is a good interface for both the
existent GL backend and pixman, and so they deserve a dedicated
compositor. xcb/xlib target an identical rendering system and so they
should be using the same compositor, and it should be possible to run
that same compositor locally against pixman to generate reference tests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
P.S. This brings massive upheaval (read breakage) I've tried delaying in
order to fix as many things as possible but now this one patch does far,
far, far too much. Apologies in advance for breaking your favourite
backend, but trust me in that the end result will be much better. :)
In step 1 of speeding up stroking, we introduce contours as a means for
tracking the connected edges around the stroke. By keeping track of
these chains, we can analyse the edges as we proceed and eliminate
redundant vertices speeding up rasterisation.
Coincidentally fixes line-width-tolerance (looks like a combination of
using spline tangent vectors and tolerance).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As handling joins/caps between line segments shorter than
half_line_width is tricky.
Rather than also fixing the bug in traps, remove that code. The plan is
to avoiding hitting the traps code, short-circuiting several steps along
the fast rectangular paths.
Fixes line-width-overlap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Step 1, fix the failings sighted recently by tracking clip-boxes as an
explicit property of the clipping and of composition.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Allow a backend to completely reimplement the Cairo API as it wants. The
goal is to pass operations to the native backends such as Quartz,
Direct2D, Qt, Skia, OpenVG with no overhead. And to permit complete
logging contexts, and whatever else the imagination holds. Perhaps to
experiment with double-paths?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Silence the warning:
cairo-path-stroke.c: In function '_cairo_stroker_add_caps':
cairo-path-stroke.c:861:29: warning: comparison between
'cairo_line_cap_t' and 'enum _cairo_line_join' [-Wenum-compare]
CAIRO_LINE_JOIN_ROUND and CAIRO_LINE_CAP_ROUND have the same value,
hence this defect went unnoticed so far.
_cairo_polygon_limit() had to be called immediately after
_cairo_polygon_init() (or never at all).
Merging the two calls is a simple way to enforce this rule.
Path are always interpreted in forward direction, so the ability of
interpreting in the opposite direction (which is very unlikely to be
useful at all) can be removed.
Use inline accessors to hide the flags in the code.
This ensures that flags that need additional computations (example:
is_rectilinear for the fill case) are always used correctly.
I updated the Free Software Foundation address using the following script.
for i in $(git grep Temple | cut -d: -f1 )
do
sed -e 's/59 Temple Place[, -]* Suite 330, Boston, MA *02111-1307[, ]* USA/51 Franklin Street, Suite 500, Boston, MA 02110-1335, USA/' -i "$i"
done
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=21356
This reverts commit 26e9f14906 on
cairo-path-stroke.
The changes in cairo-path-stroke are not needed anymore since dash
pattern approximation is now done in gstate before passing the dash
pattern to the backend.
As a simple step to ensure that we do not inadvertently modify (or at least
generate compiler warns if we try) user data, mark the incoming style
and matrices as constant.
The number of points in a triangle fan was miscomputed because
it was computing the number of line segments rather than points
in the fan. Now we include the final point of the fan correctly
in the count.
This fixes https://bugs.webkit.org/show_bug.cgi?id=33071 as
reported by Benjamin Otte. A derived test case was not added
to the cairo test suite since the bug is difficult to trigger in
a reliable way which causes visible results (as opposed to
silent heap corruption.)
The easiest way of triggering the bug is to stroke a line
using a large line width and round caps or joins.
Add some auxiliary functions to cairo-stroke-style to compute
properties of the dashed patterns (period, "on" coverage) and to
generate approximations when the dash pattern is sub-tolerance.
These functions are used in cairo-path-stroke to simplify dash
patterns stroked by cairo.
Fixes dash-infinite-loop
See http://bugs.freedesktop.org/show_bug.cgi?id=24702
Add an even simpler sweep-line tessellator for rectangular trapezoids (as
produced by the rectilinear stoker and box filler).
This is so simple it even outperforms pixman's region validation code for the
purposes of path-to-region conversion.
We were hitting an assertion attempting to eliminate intersections inside
the rectilinear tessellator for empty strokes. We can avoid this
assertion, by only marking the traps as having potential intersections iff
it is non-empty.
For the simple cases where the clip is an unaligned box (or boxes), apply
the clip directly to the geometry and avoid having to use an intermediate
clip-mask.
For the frequent cases where we know in advance that we are dealing with a
rectilinear path, but can not use the simple region code, implement a
variant of the Bentley-Ottmann tessellator. The advantages here are that
edge comparison is very simple (we only have vertical edges) and there are
no intersection, though possible overlaps. The idea is the same, maintain
a y-x sorted queue of start/stop events that demarcate traps and sweep
through the active edges at each event, looking for completed traps.
The motivation for this was noticing a performance regression in
box-fill-outline with the self-intersection work:
1.9.2 to HEAD^: 3.66x slowdown
HEAD^ to HEAD: 5.38x speedup
1.9.2 to HEAD: 1.57x speedup
The cause of which was choosing to use spans instead of the region handling
code, as the complex polygon was no longer being tessellated.
We refactor the surface fallbacks to convert full strokes and fills to the
intermediate polygon representation (as opposed to before where we
returned the trapezoidal representation). This allow greater flexibility
to choose how then to rasterize the polygon. Where possible we use the
local spans rasteriser for its increased performance, but still have the
option to use the tessellator instead (for example, with the current
Render protocol which does not yet have a polygon image).
In order to accommodate this, the spans interface is tweaked to accept
whole polygons instead of a path and the tessellator is tweaked for speed.
Performance Impact
==================
...
Still measuring, expecting some severe regressions.
...
Handling clip as part of the surface state, as opposed to being part of
the operation state, is cumbersome and a hindrance to providing true proxy
surface support. For example, the clip must be copied from the surface
onto the fallback image, but this was forgotten causing undue hassle in
each backend. Another example is the contortion the meta surface
endures to ensure the clip is correctly recorded. By contrast passing the
clip along with the operation is quite simple and enables us to write
generic handlers for providing surface wrappers. (And in the future, we
should be able to write more esoteric wrappers, e.g. automatic 2x FSAA,
trivially.)
In brief, instead of the surface automatically applying the clip before
calling the backend, the backend can call into a generic helper to apply
clipping. For raster surfaces, clip regions are handled automatically as
part of the composite interface. For vector surfaces, a clip helper is
introduced to replay and callback into an intersect_clip_path() function
as necessary.
Whilst this is not primarily a performance related change (the change
should just move the computation of the clip from the moment it is applied
by the user to the moment it is required by the backend), it is important
to track any potential regression:
ppc:
Speedups
========
image-rgba evolution-20090607-0 1026085.22 0.18% -> 672972.07 0.77%: 1.52x speedup
▌
image-rgba evolution-20090618-0 680579.98 0.12% -> 573237.66 0.16%: 1.19x speedup
▎
image-rgba swfdec-fill-rate-4xaa-0 460296.92 0.36% -> 407464.63 0.42%: 1.13x speedup
▏
image-rgba swfdec-fill-rate-2xaa-0 128431.95 0.47% -> 115051.86 0.42%: 1.12x speedup
▏
Slowdowns
=========
image-rgba firefox-periodic-table-0 56837.61 0.78% -> 66055.17 3.20%: 1.09x slowdown
▏
Ensure that no assumptions are made that a small allocation will succeed
by manually injecting faults when we may be simply allocating from an
embedded memory pool.
The main advantage in manual fault injection is improved code coverage -
from within the test suite most allocations are handled by the embedded
memory pools.
The spline decomposition code allocates and stores points in a temporary
buffer which is immediately consumed by the caller. If the caller supplies
a callback that handles each point computed along the spline, then we can
use the point immediately and avoid the allocation.
Avoid the overhead of sorting the rectangle in
_cairo_traps_tessellate_convex_quad() by constructing the trap directly
from the line segment. This also has secondary effects in only passing
the non-degenerate trap to _cairo_traps_add_trap().
For rectilinear Hilbert curves this makes the rectilinear stoker over 4x
faster.
The rectilinear stroke finalized the cairo_traps_t passed to it - which
was then subsequently used without re-initialized. So instead of
finalizing the structure, just remove any traps that we may have added
(leaving the limits and memory intact).
These two functions were hiding away some important details
about strictness of inequalities. Also, the callers differ
on the strictness they need. Everything is cleaner and more
flexible by making the callers just call _cairo_slope_compare
directly.
The optimization to avoid sqrt() for horizontal/vertical lines in
_compute_normalized_device_slope was causing us to return a negative
magnitude with a positive slope for left-to-right and bottom-to-top
lines, instead of always returning a positive magnitude and a slope
with an appropriate sign.