An innocent-looking loop like this:
    for (j = 0; j <= last; j++)
        something();
cannot be optimized, because it may loop forever!
If last is MAXINT, the condition j <= last is always true and the loop
never ends. A safer way to write it is:
    for (j = 0; j < last+1; j++)
        something();
In this case, if last is MAXINT, last+1 overflows (in practice wrapping to
a negative value) and the loop never runs. Not correct, but better than
looping forever.
Still better would be to correctly handle the MAXINT case (even though it
doesn't make any sense to show MAXINT number of glyphs in one operation!). To
do that, we can use the fact that the input num_glyphs is signed. If
there is one good thing about using a signed int as the input length, it's
that you can use an unsigned looping variable to avoid looping forever. That
is exactly what this patch does.
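
A minimal sketch of the unsigned-counter idea, not the code from the patch itself. The names visit_inclusive and calls are hypothetical, and the sketch assumes the usual platforms where UINT_MAX is larger than INT_MAX. It takes a starting index so the INT_MAX case can be exercised without two billion iterations.

```c
#include <limits.h>

/* Hypothetical stand-in for the loop body; it only counts its calls. */
static unsigned int calls;
static void something (void) { calls++; }

/* Visit every index in [first, last] inclusive and return how many times
 * the body ran. Both bounds are signed, like num_glyphs in the text. */
unsigned int
visit_inclusive (int first, int last)
{
    unsigned int j;

    calls = 0;
    if (first < 0 || last < first)
        return 0;

    /* (unsigned int) last is at most INT_MAX, so j can step one past it
     * without wrapping (assuming UINT_MAX > INT_MAX) and the condition
     * eventually becomes false, even when last == INT_MAX. */
    for (j = (unsigned int) first; j <= (unsigned int) last; j++)
        something ();

    return calls;
}
```

With a signed j, the same loop called as visit_inclusive (INT_MAX - 2, INT_MAX) would have no valid way to terminate; with the unsigned counter it runs exactly three times.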
Optimizes EXTEND_REPEAT, especially when DDBs are in use, through the
use of PatBlt or by manually expanding out the repeated blits (up to a
limit). Will still fall back to the fallback code as necessary.
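
The "expand the repeated blits up to a limit" idea can be sketched roughly as follows. This is an illustration only, not the win32 backend code: plan_repeat_blits, tile_pos_t, and the max_blits parameter are hypothetical, and the sketch ignores any pattern offset.

```c
/* Hypothetical destination position of one tile blit. */
typedef struct { int x, y; } tile_pos_t;

/* Fill `out` with the top-left corner of every tile needed to cover a
 * dst_w x dst_h area with a tile_w x tile_h source. Returns the number of
 * blits, or -1 if more than max_blits would be needed (or the input is
 * degenerate), in which case the caller would use its fallback path. */
int
plan_repeat_blits (int dst_w, int dst_h, int tile_w, int tile_h,
                   tile_pos_t *out, int max_blits)
{
    int nx, ny, ix, iy, n = 0;

    if (dst_w <= 0 || dst_h <= 0 || tile_w <= 0 || tile_h <= 0 ||
        max_blits <= 0)
        return -1;

    /* Number of tiles per row and column, rounded up. */
    nx = (dst_w - 1) / tile_w + 1;
    ny = (dst_h - 1) / tile_h + 1;

    /* Equivalent to nx * ny > max_blits, but cannot overflow. */
    if (nx > max_blits / ny)
        return -1;

    for (iy = 0; iy < ny; iy++) {
        for (ix = 0; ix < nx; ix++) {
            out[n].x = ix * tile_w;
            out[n].y = iy * tile_h;
            n++;
        }
    }
    return n;
}
```

The caller is assumed to clip each blit to the destination area, since the last row and column of tiles may extend past it.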
Make sure that all operations are correct (the operations chosen
are listed in cairo-win32-surface.c); in particular, deal with the extra
byte present in FORMAT_RGB24 surfaces correctly.
Also adds support for calling StretchDIBits to draw RGB24
cairo_image_surfaces directly.
It turns out that all of the callers want a box anyway, so this
simplifies the code in addition to being more honest to the name.
(For those new to the convention, a "box" is an (x1,y1),(x2,y2)
pair while a "rectangle" is an (x,y),(width,height) pair.)
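
For readers who want the convention spelled out, here is a small sketch of converting between the two representations. The types box_t and rectangle_t are simplified, hypothetical stand-ins using plain ints; they are not cairo's own structures.

```c
/* A box stores two corners; a rectangle stores an origin and a size. */
typedef struct { int x1, y1, x2, y2; } box_t;
typedef struct { int x, y, width, height; } rectangle_t;

/* (x, y), (width, height)  ->  (x1, y1), (x2, y2) */
box_t
box_from_rectangle (rectangle_t r)
{
    box_t b = { r.x, r.y, r.x + r.width, r.y + r.height };
    return b;
}

/* (x1, y1), (x2, y2)  ->  (x, y), (width, height) */
rectangle_t
rectangle_from_box (box_t b)
{
    rectangle_t r = { b.x1, b.y1, b.x2 - b.x1, b.y2 - b.y1 };
    return r;
}
```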
Optimize show glyphs by looking for strings of glyphs from the same subset
and using the xyshow operator to display them. As a further optimization, the
xshow and yshow operators are used for displaying horizontal and vertical text.
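
One way the operator choice could be made is sketched below. This is an assumption about the selection logic, not the code from the PS backend: choose_show_operator, show_op_t, and glyph_pos_t are hypothetical names, and positions are compared exactly rather than with a tolerance.

```c
#include <stddef.h>

typedef enum { OP_XSHOW, OP_YSHOW, OP_XYSHOW } show_op_t;

/* Hypothetical glyph position in user space. */
typedef struct { double x, y; } glyph_pos_t;

/* Pick the cheapest PostScript show variant for a run of glyphs that
 * all come from the same font subset:
 *   - same y for every glyph  -> xshow (only x displacements needed)
 *   - same x for every glyph  -> yshow (only y displacements needed)
 *   - otherwise               -> xyshow (both displacements) */
show_op_t
choose_show_operator (const glyph_pos_t *pos, size_t n)
{
    int same_x = 1, same_y = 1;
    size_t i;

    for (i = 1; i < n; i++) {
        if (pos[i].x != pos[0].x)
            same_x = 0;
        if (pos[i].y != pos[0].y)
            same_y = 0;
    }

    if (same_y)
        return OP_XSHOW;
    if (same_x)
        return OP_YSHOW;
    return OP_XYSHOW;
}
```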
The bug was exposed by the recent addition of the paint-repeat test.
The ps output was crashing various interpreters by using infinite
extents for repeating patterns. Fixing that was easy enough, but
the offset of the repeating pattern was still being lost. The fix
for both involved imitating the style of emit_surface_pattern as
it exists in cairo-pdf-surface.c, (though the details are quite
different due to differences in the models of PS and PDF).
This broke with the clone_similar optimization in
8d7a02ed58. The optimization added an
interest rectangle to clone_similar, but with a repeating source
pattern, the interest rectangle might not intersect the extents of the
surface at all.
The test suite caught this with the trap-clip case.
The fix here is to clone the entire surface if the pattern has an
extend mode of REPEAT.
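
The decision described above can be sketched as follows. The names region_to_clone, rect_t, and extend_t are hypothetical and only model the idea; they are not the signatures used by clone_similar.

```c
typedef struct { int x, y, width, height; } rect_t;
typedef enum { EXTEND_NONE, EXTEND_REPEAT } extend_t;

/* Decide which part of the source surface to clone. For a repeating
 * pattern the interest rectangle may lie entirely outside the surface
 * extents (it lands on a repeated copy), so the whole surface is cloned.
 * Otherwise only the intersection with the surface is needed. */
rect_t
region_to_clone (rect_t surface, rect_t interest, extend_t extend)
{
    rect_t empty = { 0, 0, 0, 0 };
    rect_t r;
    int x1, y1, x2, y2;

    if (extend == EXTEND_REPEAT)
        return surface;

    x1 = interest.x > surface.x ? interest.x : surface.x;
    y1 = interest.y > surface.y ? interest.y : surface.y;
    x2 = interest.x + interest.width < surface.x + surface.width
         ? interest.x + interest.width : surface.x + surface.width;
    y2 = interest.y + interest.height < surface.y + surface.height
         ? interest.y + interest.height : surface.y + surface.height;

    if (x2 <= x1 || y2 <= y1)
        return empty;   /* no overlap: nothing to clone */

    r.x = x1;
    r.y = y1;
    r.width = x2 - x1;
    r.height = y2 - y1;
    return r;
}
```

With a 10x10 surface and an interest rectangle at (25,25), the non-repeating case yields an empty region, which is exactly the situation that went wrong for repeating sources.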
This broke with the clone_similar optimization in
8d7a02ed58
The optimization added an interest rectangle to clone_similar,
but the acquire_surface path was neglecting to transform its
rectangle by the pattern matrix.
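
Transforming a rectangle by an affine matrix means transforming its four corners and taking their bounding box. A rough sketch, with hypothetical names (matrix_t, extents_t, transform_rect_bounds) and a matrix layout that follows the xx, yx, xy, yy, x0, y0 field convention of cairo_matrix_t:

```c
/* Affine matrix: x' = xx*x + xy*y + x0,  y' = yx*x + yy*y + y0. */
typedef struct { double xx, yx, xy, yy, x0, y0; } matrix_t;
typedef struct { double x1, y1, x2, y2; } extents_t;

static void
transform_point (const matrix_t *m, double x, double y,
                 double *tx, double *ty)
{
    *tx = m->xx * x + m->xy * y + m->x0;
    *ty = m->yx * x + m->yy * y + m->y0;
}

/* Bounding box of the rectangle (x, y, w, h) after transformation by m. */
extents_t
transform_rect_bounds (const matrix_t *m,
                       double x, double y, double w, double h)
{
    const double cx[4] = { x, x + w, x,     x + w };
    const double cy[4] = { y, y,     y + h, y + h };
    extents_t e;
    double tx, ty;
    int i;

    transform_point (m, cx[0], cy[0], &tx, &ty);
    e.x1 = e.x2 = tx;
    e.y1 = e.y2 = ty;

    for (i = 1; i < 4; i++) {
        transform_point (m, cx[i], cy[i], &tx, &ty);
        if (tx < e.x1) e.x1 = tx;
        if (tx > e.x2) e.x2 = tx;
        if (ty < e.y1) e.y1 = ty;
        if (ty > e.y2) e.y2 = ty;
    }
    return e;
}
```

Skipping this step is harmless only when the pattern matrix is the identity, which may be why the omission was easy to miss.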
The test suite did catch this, but apparently we were too
distracted by the performance improvements to notice. Only
backends other than image that implemented clone_similar
would be affected by the bug, (which meant I only saw xlib
failures in my testing).
This fixes bug #8711
This corrects most of the changes in the clone_similar commit. There is
still a problem in _cairo_glitz_surface_set_image: it will crash if the
source region is outside the image extents.
The previous changes in _cairo_glitz_surface_get_image cause the tests
clip-fill-rule-pixel-aligned and clip-fill-rule to fail with a pretty
crash; this fixes that.
This fixes a huge performance bug (entire image was being pushed to X
server in order to copy a tiny piece of it). I see up to 50x improvement
from subimage_copy (which was designed to expose this problem) but also
a 5x improvement in some text performance cases.
xlib-rgba subimage_copy-512 3.93 2.46% -> 0.07 2.71%: 52.91x faster
███████████████████████████████████████████████████▉
xlib-rgb subimage_copy-512 4.03 1.97% -> 0.09 2.61%: 44.74x faster
███████████████████████████████████████████▊
xlib-rgba subimage_copy-256 1.02 2.25% -> 0.07 0.56%: 14.42x faster
█████████████▍
xlib-rgba text_image_rgb_over-256 63.21 1.53% -> 11.87 2.17%: 5.33x faster
████▍
xlib-rgba text_image_rgba_over-256 62.31 0.72% -> 11.87 2.82%: 5.25x faster
████▎
xlib-rgba text_image_rgba_source-256 67.97 0.85% -> 16.48 2.23%: 4.13x faster
███▏
xlib-rgba text_image_rgb_source-256 68.82 0.55% -> 16.93 2.10%: 4.07x faster
███▏
xlib-rgba subimage_copy-128 0.19 1.72% -> 0.06 0.85%: 3.10x faster
██▏
The trick for this was to carefully ensure that the pen always has
at least 4 vertices. There was a previous attempt at this in the
code already but the test case had a combination of matrix and radius
that resulted in a value that was just able to sneak past the previous
check.
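
One robust way to enforce such a minimum is to clamp the final integer vertex count rather than test an intermediate value such as the radius. The sketch below is an assumption about the shape of the fix, not the patch itself; clamp_pen_vertices is a hypothetical helper, and rounding up to an even count is likewise an assumption.

```c
/* Given the vertex count computed from the radius, matrix and tolerance,
 * return a count that is even and never smaller than 4. Clamping the
 * final integer leaves no combination of inputs that can slip past. */
int
clamp_pen_vertices (int computed)
{
    /* Assumed requirement: keep the number of vertices even. */
    if (computed % 2)
        computed++;

    /* Always have at least 4 vertices. */
    if (computed < 4)
        computed = 4;

    return computed;
}
```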