Allocate subsequent path bufs twice as large as the previous buf,
whilst still embedding a small initial buf into cairo_path_fixed_t
that handles the most frequent usage.
These were recently added, (as part of sparse integration?), but they
break boilerplate which reaches into at least cairo-types-private.h
and cairo-scaled-font-private.h. But boilerplate cannot see cairoint.h
or else it gets the internal sybol renaming, (with the INT_ prefix),
and then all the test suite tests refuse to link.
If this change reverts some recently-added functionality, (or
cleanliness), then we'll just need to find some other way to add that
back again without the breakage.
This means, we have to malloc only one buffer, not two. Worst case
is that one always draws curves, which fills the arg (point) buffer
six times faster than op buffer. But that's not a big deal since
each op takes 1 byte, while each point takes 8 bytes. So op space
is cheap to spare, so to speak (about 10% memory waste at worst).
We do this by including an initial op and arg buf in cairo_path_fixed_t,
so for small paths we don't have to alloc those buffers.
The way this is done is a bit unusual. Specifically, using an array of
length one instead of a normal member:
- cairo_path_op_buf_t *op_buf_head;
+ cairo_path_op_buf_t op_buf_head[1];
Has the advantage that read-only use of the buffers does not need any
change as arrays act like pointers syntactically. All manipulation code
however needs to be updates, which the patch supposed does. Still, there
seems to be bugs remaining as cairo-perf quits with a Bad X Request error
with this patch.
This custom stroking code allows backends to use optimized region-based
drawing operations for rectilinear strokes. This results in a 5-25x
performance improvement when drawing rectilinear shapes:
image-rgb box-outline-stroke-100 0.18 -> 0.01: 25.58x speedup
████████████████████████▋
image-rgba box-outline-stroke-100 0.18 -> 0.01: 25.57x speedup
████████████████████████▋
xlib-rgb box-outline-stroke-100 0.49 -> 0.06: 8.67x speedup
███████▋
xlib-rgba box-outline-stroke-100 0.22 -> 0.04: 5.39x speedup
████▍
In other words, using cairo_stroke instead of cairo_fill to draw the
same shape was 5-15x slower before, but is 1.2-2x faster now.