Toolkits like GTK+ almost always set a simple rectangular clip mask before
any cairo operation, so avoid the allocation for this simple case by
embedding a small number of XRectangles into the surface structure.
Due to the interaction between multiple threads showing glyphs and
asynchronous CloseDisplays, it is possible for a font to maintain a
cairo_xlib_screen_info_t beyond the CloseDisplay. The simple solution
is to add a reference count in order to track the lifetime of the
cairo_xlib_screen_info_t correctly.
These were found during a cairo_static pass on an alternative branch...
A critical one in particular was setting the have added glyph flag to
TRUE even if _cairo_xlib_surface_add_glyph() fails. This can cause an
application crash due to a RenderBadGlyph error later when the scaled
font is cleaned and we attempt to remove the glyph.
_cairo_pattern_release_surface() asserts that it is passed a pattern
surface. This itself is bad as breaks the symmetry with
_cairo_pattern_acquire_surface under() error conditions, however reorder
the cleanup to avoid this assertion.
Detect when a substitute image surface is returned for a solid pattern,
and avoid mixed image/xlib composite operations. This can happen for example
if there is a resource allocation failure during creating a similar surface.
The original test for 'if (surface->visual)' dates back to a very old
assumption that if the xlib surface was created with an XRenderFormat
that the surface->visual field would be set to NULL. This assumption
was broken years ago with the following commit:
0c05b23b31
This fixes the crash reported here:
BadMatch when running gnome-terminal with the murrine-0.51 gtk engine
https://bugs.freedesktop.org/show_bug.cgi?id=10250
Use the new hook functions to register a callback for xlib to clear
the private glyph data when the display is closed. In order to do this
we need to reset the glyph cache inside the generic scaled font as well.
This reverts the following commits:
2715f2098167e3b3c53b
See this thread for an analysis of the problems it caused:
http://lists.freedesktop.org/archives/cairo/2007-February/009825.html
In short, a single cache for all backends doesn't work, as one thread
using any backend can cause an unused xlib pattern to be evicted from
the cache, and trigger an xlib call while the display is being used
from another thread. Xlib is not prepared for this.
Two drawables can be used in an X and Render operation only if they share
the same screen. Previously we were only checking for the same display
in is_compatible. Check for the same screen now.
We use a small cache of size 16 for surfaces created for solid patterns.
This mainly helps with the X backends where we don't have to create a
pattern for every operation, so we save a lot on X traffic. Xft uses a
similar cache, so cairo's text rendering traffic with the xlib backend
now completely matches that of Xft.
The cache uses an static index variable, which itself acts like a cache of
size 1, remembering the most recently used solid pattern. So repeated
lookups for the same pattern hit immediately. If that fails, the cache is
searched linearly, and if that fails too, a new surface is created and a
random member of the cache is evicted.
Only surfaces that are "compatible" are used. The definition of compatible
is backend specific. For the xlib backend, it means that the two surfaces
are allocated on the same display. Implementations for compatibility are
provided for all backends that it makes sense.
The old implementation was a very naive one that used to generate one XRender
glyph element per glyph. That is, position glyphs individually. This was
raised here:
http://lists.freedesktop.org/archives/cairo/2006-December/008835.html
The new implmentation is a free rewriting of the Xft logic, that is,
compressing glyphs with "natural" advance into elements, but with various
optimizations and improvements.
In short, it works like this: glyphs are looped over, skipping those that are
not desired, and computing offset from "current position". Whenever a glyph
has non-zero offsets from the current position, a new element should be
started. All these are used to compute the request size in the render
protocol. Whenever the request size may exceed the max request size, or at
the end, glyphs are flushed. For this to work, we now set non-zero glyph
advances when sending glyphs to the server.
Notable optimizations and improvements include:
- Reusing the input glyph array (with double glyph positions) as a working
array to compute glyph offsets.
- Reusing the input glyph array as the output glyph-index array to be passed
to XRender.
- Marking glyphs to be skipped as so, avoiding a copy of the glyph array,
which is what the old code was doing.
- Skip glyphs with positions "out-of-range". That is, those with positions
that would cause an overflow in Xrender's glyph offset calculations.
On my Fedora desktop on Pentium 4, and on a Nokia 770, it shows a 6% speedup on
the timetext test.
The rule is: cairo_glyph_t* is always passed as const for measurement
purposes. This was not reflected in our public api previously. Fixed
Showing glyphs used to have cairo_glyph_t* always as const. With this
changed, it is only const on cairo_t and cairo_gstate_t operations.
cairo_surface_t, cairo_scaled_font_t, and individual backends receive
cairo_glyph_t* as non-const. The desired semantics is that they may modify
the contents of the array as long as they do not return
CAIRO_STATUS_UNSUPPORTED. This makes it possible to avoid copying the glyph
array again and again, and edit it in-place. Backends are in fact free to use
the array as a generic buffer as they see fit.
This fixes a huge performance bug (entire image was being pushed to X
server in order to copy a tiny piece of it). I see up to 50x improvement
from subimage_copy (which was designed to expose this problem) but also
a 5x improvement in some text performance cases.
xlib-rgba subimage_copy-512 3.93 2.46% -> 0.07 2.71%: 52.91x faster
███████████████████████████████████████████████████▉
xlib-rgb subimage_copy-512 4.03 1.97% -> 0.09 2.61%: 44.74x faster
███████████████████████████████████████████▊
xlib-rgba subimage_copy-256 1.02 2.25% -> 0.07 0.56%: 14.42x faster
█████████████▍
xlib-rgba text_image_rgb_over-256 63.21 1.53% -> 11.87 2.17%: 5.33x faster
████▍
xlib-rgba text_image_rgba_over-256 62.31 0.72% -> 11.87 2.82%: 5.25x faster
████▎
xlib-rgba text_image_rgba_source-256 67.97 0.85% -> 16.48 2.23%: 4.13x faster
███▏
xlib-rgba text_image_rgb_source-256 68.82 0.55% -> 16.93 2.10%: 4.07x faster
███▏
xlib-rgba subimage_copy-128 0.19 1.72% -> 0.06 0.85%: 3.10x faster
██▏
Basically, it's evil to write a loop like:
while ((c -= 4) > 0) {
...
}
for one reason that doesn't work if c is unsigned. And when c is signed, if
for some reason c is about -MAXINT, then it will overflow and not work as
expected.
It's much safer (and more gcc warning friendly) to rewrite it as:
unsigned int c;
while (c >= 4) {
...
c -= 4;
}
Behdad chased this bug down when looking into bug #7593. This
bug is what finally motivated us to figure out how to get -Wextra
(for the "always true" comparisons of unsigned variables against
negative values).
This approach to fixing the bug is valid since there is code in pixman
for rendering to BGR images, (which is why cairo 1.0 worked with BGR X
servers for example). But, since we don't want to advertise additional
image formats we implement this through a new cairo_internal_format_t.
This is rather fragile since we don't want to leak any internal formats
nor do we ever want an internal format to be used somewhere a real
format is expected, (and trigger a CAIRO_FORMAT_VALID assertion failure).
More comments than code are added here to help compensate for the
fragility and to give some guidance in fixing this mess in a better way
in the future.
There appears to be a bug in some X servers which is triggered by
rendering 1-bit glyphs with zero size via the functions
XRenderAddGlyphs and XRenderCompositeText8 (and likely its variants).
We avoid this bug by making a copy of the glyphs array which does not
include any of the size-zero glyphs so that the X server never sees them.
This is an attempt to fix the following bug:
http://bugzilla.gnome.org/show_bug.cgi?id=332266
With the recent rewrite of the device-offset code, which pushed things
from the gstate to the surface layer, the 16-bit limitations on coordinates
which previously applied to device space only, have lately been applying to
user space. This commit moves the device_transform back up above the conversion
from floating-point to fixed-point values so that once again the limitation
only applies to device space.
When accessing the underlying drawable etc of an xlib surface, it is
also helpful to be able to get the width and height without a server
round trip. This patch provides those functions.
When the glyph format does not match the font format, the glyph will
be rendered incorrectly. Setting the {x, y}_offset correctly when
converting glyph format fix that.
This is a step toward allowing device scaling in addition to device offsets.
So far, the scale values are still always 1.0 so only the translation is
actually being used. But most of the code is in place for doing scaling as
well and it just needs to be hooked up.
There are some fragile parts in this code, all of which involve using the
translation without the scale, (so grep for device_transform.x0 or
device_transform->x0). Some of these are likely bugs that will hopefully
be obvious once we start using the scale. Others are OK if only because
we 'know' that we aren't ever setting device scaling on a surface that
has a device offset (we only set device scaling on surfaces we create
internally and we don't export device scaling to the user).
All of these fragile parts in the code have been marked with comments of
the form: XXX: FRAGILE.