The design is for the user to create a cairo_font_options_t object with
cairo_font_options_create() and then is free to use it with any Cairo
operation. This requires us to check when we may be about to overwrite
the read-only nil object.
This allows for the surface acquired from the pattern to have the
same content. In particular, in a case such as cairo_paint_with_alpha
we can now acquire an A8 mask surface instead of an ARGB32 mask
surface which can be rendered much more efficiently. This results
in a 4x speedup when using the OVER operator with the recently
added paint-with-alpha test:
Speedups
========
image-rgb paint-with-alpha_image_rgb_over-256 2.25 -> 0.60: 4.45x speedup
███▌
It does slowdown the same test when using the SOURCE operator, but
I don't think we care. Performing SOURCE with a mask is already a very
slow operation, (hitting compositeGeneral), so the slowdown here is
likely from having to convert from A8 back to ARGB32 before the
generalized compositing. So if someone cares about this slowdown,
(though SOURCE with cairo_paint_with_alpha doesn't seem extremely
useful), they will probably be motivated enough to contribute a
customized compositing function to replace compositeGeneral in which
case this slowdown should go away:
image-rgba paint-with-alpha_image_rgb_source-256 3.84 -> 8.86%: 1.94x slowdown
█
This is just multiplication after all, so there's nothing that can fail.
And we can get rid of a lot of useless error-checking code this way.
The corrected functions are:
_cairo_gstate_user_to_device
_cairo_gstate_user_to_device_distance
_cairo_gstate_device_to_user
_cairo_gstate_device_to_user_distance
Now that we have the warn_unused_result attribute enabled, (thanks
Chris!), it's actually harmful to have a function return an
uncoditional value of CAIRO_STATUS_SUCCESS. The harm is that
it would force lots of unnecessary error-checking paths that
just add clutter.
It is much better to simply give a function that cannot fail
a return type of void.
The idiom for cairo.c is to do
cr->status = _cairo_op ();
if (cr->status) _cairo_set_error (cr, cr->status);
Unfortunately a trivial mistake for a _cairo_op () is to call a cairo_op ()
and forget to check cr->status but return CAIRO_STATUS_SUCCESS which will
mask the earlier error.
Obviously this is a bug in the lower level but the impact can be reduced
by chaning cairo.c to use a local status variable for its return:
cairo_status_t status = _cairo_op ();
if (status) _cairo_set_error (cr, cr->status);
Previously, the convention was that static ones started with cairo_, but
renamed to start with _cairo_ when they were needed from other files and
became cairo_private instead of static...
This is error prune indeed, and two symbols were already violating. Now
all nil objects start with _cairo_.
Frequently cairo_set_source_rgb[a]() is used to replace the current
solid-pattern source with a new one of a different colour. The current
pattern is very likely to be unshared and unmodified and so it is likely
just to be immediately freed [or rather simply moved to recently freed
cache]. However as the last active pattern it is likely to cache-warm and
suitable to satisfy the forthcoming allocation. So by setting the current
pattern to 'none' we can move the pattern to the freed list before we
create the new pattern and hopefully immediately reuse it.
This is more natural since cr->path can be used as if it was a pointer.
This means, for example, if we move on to making it a pointer, most of
the code using it does not need any change. So we get some level of
encapsulation of implementation details, if you prefer the terminology :).
This means, we have to malloc only one buffer, not two. Worst case
is that one always draws curves, which fills the arg (point) buffer
six times faster than op buffer. But that's not a big deal since
each op takes 1 byte, while each point takes 8 bytes. So op space
is cheap to spare, so to speak (about 10% memory waste at worst).
We do this by including an initial op and arg buf in cairo_path_fixed_t,
so for small paths we don't have to alloc those buffers.
The way this is done is a bit unusual. Specifically, using an array of
length one instead of a normal member:
- cairo_path_op_buf_t *op_buf_head;
+ cairo_path_op_buf_t op_buf_head[1];
Has the advantage that read-only use of the buffers does not need any
change as arrays act like pointers syntactically. All manipulation code
however needs to be updates, which the patch supposed does. Still, there
seems to be bugs remaining as cairo-perf quits with a Bad X Request error
with this patch.
Previously, an INVALID_RESTORE error would leave cr->gstate as NULL,
(which is generally impossible/invalid). This seems safe enough as
most cairo functions check cr->status first and bail if anything
looks fishy.
However, the many cairo_get functions happily march along in spite
of any current error. We could instrument all of those functions to
check for the error status and return some dummy value in that case.
But it's much easier to get the same basic effect by simply creating
a non-NULL cr->gstate which will hold all those dummy values, and
we can eliminate the crashes without having to touch up every
cairo_get function.
This fixes the bug reported here:
evolution crash to _cairo_gstate_backend_to_user()
https://bugs.freedesktop.org/show_bug.cgi?id=9906
It also eliminates the crash that was added to the nil-surface test
with the previous commit.
There was some leftover cut-and-paste description of get_font_face
in the documentation for get_scaled_font. That turned out to be a
good thing as it alerted me to the fact that the get_font_face
documentation was stale as well.
Add description of the 'nil' object return values, rather than NULL.
Previous commit broke cairo_surface_finish, since it was checking for
ref_count == CAIRO_REF_COUNT_INVALID and bailing. But, that condition
was reached from destroy, so finish was bailing out early.
user_data setters/getters were added to public refcounted objects
that were missing them (cairo_t, pattern, scaled_font). Also,
a refcount getter (cairo_*_get_reference_count) was added to all
public refcounted objects.
Make these functions consistent with other cairo_get functions
by making cairo_get_dash_count return the count directly, and
removing the cairo_status_t return value from cairo_get_dash.
This custom stroking code allows backends to use optimized region-based
drawing operations for rectilinear strokes. This results in a 5-25x
performance improvement when drawing rectilinear shapes:
image-rgb box-outline-stroke-100 0.18 -> 0.01: 25.58x speedup
████████████████████████▋
image-rgba box-outline-stroke-100 0.18 -> 0.01: 25.57x speedup
████████████████████████▋
xlib-rgb box-outline-stroke-100 0.49 -> 0.06: 8.67x speedup
███████▋
xlib-rgba box-outline-stroke-100 0.22 -> 0.04: 5.39x speedup
████▍
In other words, using cairo_stroke instead of cairo_fill to draw the
same shape was 5-15x slower before, but is 1.2-2x faster now.
The rule is: cairo_glyph_t* is always passed as const for measurement
purposes. This was not reflected in our public api previously. Fixed
Showing glyphs used to have cairo_glyph_t* always as const. With this
changed, it is only const on cairo_t and cairo_gstate_t operations.
cairo_surface_t, cairo_scaled_font_t, and individual backends receive
cairo_glyph_t* as non-const. The desired semantics is that they may modify
the contents of the array as long as they do not return
CAIRO_STATUS_UNSUPPORTED. This makes it possible to avoid copying the glyph
array again and again, and edit it in-place. Backends are in fact free to use
the array as a generic buffer as they see fit.
A nice side effect of this new approach is that the valid input range
was expanded back to (INT_MIN, INT_MAX]. No performance regressions observed.
Also included is documentation about the internal mysteries of _cairo_lround,
as previously promised.
The following documented symbols were missing this tag:
cairo_clip_extents
cairo_copy_clip_rectangles
CAIRO_STATUS_INVALID_INDEX
cairo_rectangle_t
cairo_rectangle_list_t
One of these functions was already documented to be doing this, and
the other one should have been. Now the documentation and behavior
for both are consistent, (and the path-data test case verifies this).