The old implementation was very naive: it generated one XRender glyph element
per glyph, that is, it positioned every glyph individually. This was
raised here:
http://lists.freedesktop.org/archives/cairo/2006-December/008835.html
The new implementation is a free rewriting of the Xft logic, that is,
compressing glyphs with "natural" advance into elements, but with various
optimizations and improvements.
In short, it works like this: the glyphs are looped over, skipping those that
are not desired, and each glyph's offset from the "current position" is
computed. Whenever a glyph has a non-zero offset from the current position, a
new element is started. The glyph and element counts are used to compute the
request size in the render protocol. Whenever the request size may exceed the
max request size, or at the end, the glyphs are flushed. For this to work, we
now set non-zero glyph advances when sending glyphs to the server.
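As a rough illustration of the grouping rule described above (not the actual
cairo-xlib code), the sketch below counts how many elements a glyph run would
need. The sketch_glyph_t type, its fields, and sketch_count_elements() are
made up for this example. Skipping undesired glyphs and flushing at the max
request size are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical glyph record: an integer device position plus the advance
 * that the server is assumed to already know for this glyph. */
typedef struct {
    int x, y;           /* requested position */
    int x_advance;      /* natural advance registered with the glyph */
    int y_advance;
} sketch_glyph_t;

/* Count the elements needed for a run of glyphs. A new element starts
 * whenever a glyph does not sit exactly at the current position. */
static size_t
sketch_count_elements (const sketch_glyph_t *glyphs, size_t num_glyphs)
{
    size_t i, num_elts = 0;
    int cur_x = 0, cur_y = 0;

    assert (glyphs != NULL || num_glyphs == 0);

    for (i = 0; i < num_glyphs; i++) {
        int dx = glyphs[i].x - cur_x;
        int dy = glyphs[i].y - cur_y;

        /* The first glyph always opens an element; later glyphs only do
         * so when they are offset from the current position. */
        if (i == 0 || dx != 0 || dy != 0)
            num_elts++;

        /* The server moves the pen by the glyph's own advance. */
        cur_x = glyphs[i].x + glyphs[i].x_advance;
        cur_y = glyphs[i].y + glyphs[i].y_advance;
    }

    return num_elts;
}
```

With naturally advancing text most glyphs land exactly on the current
position, so the element count stays far below the glyph count.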
Notable optimizations and improvements include:
- Reusing the input glyph array (with double glyph positions) as a working
  array to compute glyph offsets.
- Reusing the input glyph array as the output glyph-index array to be passed
  to XRender.
- Marking glyphs to be skipped as such, avoiding a copy of the glyph array,
  which is what the old code was doing.
- Skipping glyphs with positions "out-of-range". That is, those with positions
  that would cause an overflow in XRender's glyph offset calculations.
On my Fedora desktop (Pentium 4) and on a Nokia 770, it shows a 6% speedup on
the timetext test.
This is done in cairo_scaled_glyph_t->x/y_advance. The value is mostly useful
for raster backends, for example to set as default advance of a glyph, and
later on optimize glyph positionings that use the default advance.
We duplicate the incoming glyph array for two reasons: 1) applying
transformations, and 2) to let the lower level functions have a glyph array
they can modify. By using a 2KB array on the stack we can avoid malloc() for
requests of fewer than 100 glyphs. The size of the array can be tuned by
setting CAIRO_STACK_BUFFER_SIZE.
This is the suggested size in bytes of buffers allocated on the stack per
function, mostly used for glyph rendering. We typically use a local buffer on
the stack to avoid mallocing for small requests. Requests that do not fit are
malloc()ed automatically. The default value should be enough for about a
100-glyph cairo_show_glyphs() operation.
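The stack-buffer-with-malloc-fallback pattern described above could look
roughly like the following. This is only a sketch: SKETCH_STACK_BUFFER_SIZE,
sketch_glyph_t and sketch_sum_translated_x() are hypothetical stand-ins, and
the translation plus sum is just a placeholder for real work done on the
private copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_STACK_BUFFER_SIZE 2048   /* hypothetical size, in bytes */

/* Hypothetical glyph record, similar in spirit to a glyph with an index
 * and a double-precision position. */
typedef struct {
    unsigned long index;
    double x, y;
} sketch_glyph_t;

/* Copy the glyphs into a private working array, shift the copy by tx and
 * sum the shifted x positions. The copy lives on the stack when it fits
 * and is malloc()ed otherwise. Returns 0 on success, -1 if malloc fails.
 * *used_heap reports which path was taken. */
static int
sketch_sum_translated_x (const sketch_glyph_t *glyphs, size_t n,
                         double tx, double *sum_out, int *used_heap)
{
    sketch_glyph_t stack_buf[SKETCH_STACK_BUFFER_SIZE / sizeof (sketch_glyph_t)];
    sketch_glyph_t *work = stack_buf;
    double sum = 0.0;
    size_t i;

    assert (sum_out != NULL && used_heap != NULL);

    *used_heap = 0;
    if (n > sizeof (stack_buf) / sizeof (stack_buf[0])) {
        /* Too big for the stack buffer: fall back to the heap. */
        work = malloc (n * sizeof (sketch_glyph_t));
        if (work == NULL)
            return -1;
        *used_heap = 1;
    }

    if (n > 0)
        memcpy (work, glyphs, n * sizeof (sketch_glyph_t));

    /* The copy may be modified freely; the caller's array is untouched. */
    for (i = 0; i < n; i++) {
        work[i].x += tx;
        sum += work[i].x;
    }

    if (work != stack_buf)
        free (work);

    *sum_out = sum;
    return 0;
}
```

How many glyphs fit in the stack buffer depends on the size of the glyph
struct on the target platform.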
The rule is: cairo_glyph_t* is always passed as const for measurement
purposes. This was not reflected in our public API previously. Fixed.
Showing glyphs used to have cairo_glyph_t* always as const. With this
changed, it is only const on cairo_t and cairo_gstate_t operations.
cairo_surface_t, cairo_scaled_font_t, and individual backends receive
cairo_glyph_t* as non-const. The desired semantics is that they may modify
the contents of the array as long as they do not return
CAIRO_STATUS_UNSUPPORTED. This makes it possible to avoid copying the glyph
array again and again, and edit it in-place. Backends are in fact free to use
the array as a generic buffer as they see fit.
A nice side effect of this new approach is that the valid input range
was expanded back to (INT_MIN, INT_MAX]. No performance regressions observed.
Also included is documentation about the internal mysteries of _cairo_lround,
as previously promised.
Pass cairo_ft_options_t around by pointer, not by value. That's what we do
with cairo_font_options_t anyway, and there is no reason to not do the same
here. (This also makes the -Waggregate-return warnings go away.)
This patch removes the guard bits from the tessellator internal
coordinates and reworks the input validation to make sure that the
tessellator code should never die on an assert. When the extent of a
polygon exceeds a width or height of 2^31-1, then the rightmost
(resp. bottommost) points are clamped to within 2^31-1 of the leftmost
(resp. topmost) point of the polygon. The clamping produces bad
rendering for really large polygons, and needs to be fixed in a saner
manner.
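The clamping rule can be sketched for a single axis as below. This is an
illustration under assumptions, not the tessellator's code:
sketch_clamp_coord() is a made-up helper, coordinates are taken to be 32-bit
integers, and min stands for the leftmost (or topmost) coordinate of the
polygon.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v so that it lies within 2^31-1 of min, where min is the smallest
 * coordinate on this axis. The difference is computed in 64 bits because
 * it can be as large as 2^32-1 for 32-bit inputs. */
static int32_t
sketch_clamp_coord (int32_t v, int32_t min)
{
    int64_t extent;

    assert (v >= min);

    extent = (int64_t) v - (int64_t) min;
    if (extent > INT32_MAX)
        return (int32_t) ((int64_t) min + INT32_MAX);

    return v;
}
```

Points that get clamped are moved, which is why really large polygons render
badly with this approach.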
Cleaned up as per
http://lists.freedesktop.org/archives/cairo/2006-December/008806.html
This patch improves the translation invariance of the tessellator
by offsetting all input coordinates to be nonnegative and paves
the way for future optimisations using the coordinate range.
Also changes the assertions to make sure that it is safe to add
the guard bits. This needs to be changed to do something sensible
about input coordinates that are too large instead of croaking.
The plan is to steal the guard bits from the least significant
instead of the most significant user bits, and having all coordinates
nonnegative will make the rounding involved there easier.
The cairo_in_fill() function sometimes gives false positives
when it samples a point on the edge of an empty trapezoid.
This patch alleviates the bug (but doesn't fix it completely),
for the common(?) case where the left and right edges of the
empty trapezoid have equal top and bottom points.
Fixes the regression exhibited by the test fill-missed-stop,
where the tessellator would sometimes extend a trapezoid
too far below the end of the right edge.
Fixes the regression fill-degenerate-sort-order, where
confusion arises in the event order for collinear edges.
Also fixes (or at least hides) the issues with zrusin-another
sometimes generating different trapezoids depending on the
state of the random number generator in cairo-skiplist.c.
This patch removes a redundant call to skip_list_find()
that was being used to detect duplicate intersection events.
Instead, skip_list_insert() now takes an additional parameter
letting it know what to do with duplicates.
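The idea of folding the duplicate check into the insert can be shown with a
much simpler container. The sketch below uses a sorted int array as a
stand-in for the skip list. sketch_sorted_insert() and its unique flag are
hypothetical and do not mirror the real skip_list_insert() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Insert value into the sorted array a, which holds *n elements and is
 * assumed to have room for one more. If unique is non-zero and the value
 * is already present, nothing is inserted. Returns the index of the
 * existing or newly inserted element. The single search below serves both
 * the duplicate check and the insertion, so no separate find is needed. */
static size_t
sketch_sorted_insert (int *a, size_t *n, int value, int unique)
{
    size_t i = 0, j;

    assert (a != NULL && n != NULL);

    /* One search: find the first element that is not less than value. */
    while (i < *n && a[i] < value)
        i++;

    if (unique && i < *n && a[i] == value)
        return i;               /* duplicate: leave the array alone */

    /* Shift the tail up by one slot and drop the value in. */
    for (j = *n; j > i; j--)
        a[j] = a[j - 1];
    a[i] = value;
    (*n)++;

    return i;
}
```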
This fixes the failures of the new tessellator with the 3 tests:
bitmap-font, rectangle-rounding-error, and close-path.
The problem was that identical edges from separate polygons
were not being added to the event queue (because of a check
that was actually only intended to prevent an intersection
event from being scheduled multiple times).
This is the implementation as it cooked in the new-tessellator branch
available from:
git://people.freedesktop.org/~cworth/cairo
The file here comes from commit eee4faf79900be2c5fda1fddd49737681a9e37d6 in
that branch. It's sitting here not hooked up to anything in cairo yet,
and still with a main function with test cases, etc.
The files here are copied directly from the standalone skiplist module
available from:
git clone git://cworth.org/~cworth/skiplist
In particular the files come from the double branch and the following
commit on that branch:
8b5a439c68e220cf1514d9b3141a1dbdce8af585
Also of interest is the skiplist module hosted by Keith Packard, which is
the original implementation on which these files were based.
Since cworth/skiplist branched off from keithp's, Keith has also
now implemented a doubly-linked variant which might be interesting for
further simplification of the code. See:
git clone git://keithp.com/git/skiplist
and the double-link branch there.
After changing _cairo_gstate_show_glyphs and _cairo_gstate_glyph_path to use
this function, we see a significant speedup due to the elimination of redundant
FP calculations.
An innocent-looking loop like this:

    for (j = 0; j <= last; j++)
        something();

cannot be optimized, because it may loop forever!
If last is MAXINT, the condition j <= last is always true and the loop never
ends. A better way to write it is:

    for (j = 0; j < last+1; j++)
        something();

In this case, if last is MAXINT, the loop will never run. Not correct, but
better than looping forever.
Still better would be to correctly handle the MAXINT case (even though it
doesn't make any sense to show MAXINT number of glyphs in one operation!) To
do that, we can use the fact that the input num_glyphs is a signed int. If
there is one good thing about using signed int as input length, it's that you
can use an unsigned looping variable to avoid looping forever. That is
exactly what this patch does.
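A minimal sketch of the unsigned-counter trick, assuming a non-negative
signed bound as input. sketch_count_range() is a made-up helper that simply
counts iterations; the start parameter only exists so that the behaviour
near INT_MAX can be exercised cheaply.

```c
#include <assert.h>
#include <limits.h>

/* Count the iterations of an inclusive loop from first to last. Both
 * bounds are non-negative signed ints, but the counter is unsigned, so it
 * can step one past INT_MAX and the loop always terminates. */
static unsigned int
sketch_count_range (int first, int last)
{
    unsigned int j, count = 0;

    assert (first >= 0 && last >= first);

    for (j = (unsigned int) first; j <= (unsigned int) last; j++)
        count++;

    return count;
}
```

With a signed int counter the same loop could never exit when last is
INT_MAX, since incrementing past INT_MAX overflows.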