In http://bugs.freedesktop.org/show_bug.cgi?id=13699, Pavel Vozenilek
reports a duplicate define for computing the appropriate length for an
on-stack array. The macro in question, and a few other places, was
performing CAIRO_STACK_BUF_SIZE/sizeof(stack[0]) so we can simplify
them slightly by using a common macro.
We maintain three Xrender glyphsets per scaled font, one for each of A1, A8,
and ARGB32. This is required to correctly support fonts with bitmaps for
some glyphs but not all.
Do not rely on the assumption that if the destination has render support
then the source has it as well - breaks when the boilerplate disables
render support for a surface.
Similarly do not set the XRender attributes on the source surface
unless it actually has a xrender_format.
The reasoning is that right now, applications render glyphs to images,
upload it to the X server, and keep a local copy in the cache. The X
server works hard to reuse glyph renderings, by hashing glyph images and
reusing them. So we are wasting memory in cairo apps that don't use the
glyph surface after uploading to the server, which is the case if you
don't use the glyph in an image surface. The patch does not release the
glyph surface if it already existed in the cache, so, worst case
scenario is that we render the glyph twice, if you first use it with
xlib, then with image surface. That effect should be negligible.
_xrender_format_to_content() was using the channel offset to determine
whether the format supported a content type.
For example, the XRenderPictFormat for the A8 format looks like:
direct.alpha = 0; direct.alphaMask = 0xff;
direct.red = 0; direct.redMask = 0x00;
direct.green = 0; direct.greenMask = 0x00;
direct.blue = 0; direct.blueMask = 0x00;
which _xrender_format_to_content() matched as CAIRO_CONTENT_COLOR.
Switch to using the channel masks for deducing content type.
The VendorString parsing (to detect broken Xserver versions) was being
performed for each surface creation, but as it is a display invariant
we can save a small amount of work by storing the result on the
cairo_xlib_display_t.
Instead of simply ignoring the error that may occur when we upload the
destination image to the xlib surface (via XPutImage) record the error
on the xlib surface.
The xlib surface creation routines will eventually attempt to lock the
global _cairo_xlib_display_mutex. Under the default environment this is
a non-issue as the CAIRO_MUTEX_INITIALIZE/FINALIZE become no-ops under
pthreads. However, for the sake of correctness (i.e. to silence the
lockdep debugger!) insert a call to initialize the global mutexes at the
start of the public entry points.
Every time we assign or return a hard-coded error status wrap that value
with a call to _cairo_error(). So the idiom becomes:
status = _cairo_error (CAIRO_STATUS_NO_MEMORY);
or
return _cairo_error (CAIRO_STATUS_INVALID_DASH);
This ensures that a breakpoint placed on _cairo_error() will trigger
immediately cairo detects the error.
This reverts commit 919bea6dbb.
Sadly as Behdad points out some backends do modify the glyph array and,
for example cairo-xlib-surface, hide this from the compiler with some
evil casts.
Skip the memory duplication of the incoming glyphs if we do not need
to transform them into the backend coordinate system.
As a consequence we need to constify the glyphs passed to the backend
functions.
Previously, the code was just using cairo_format_t which is much more limited
than the formats supported by pixman, (so many "odd" X server visuals would
just fall over).
Seems like all over the code, we have been using negated device_offset
values for glyph surfaces. Here is all the math(!):
A device_transform converts from device space (a conceptual space) to
surface space. For simple cases of translation only, it's called a
device_offset and is public API (cairo_surface_[gs]et_device_offset).
A possibly better name for those functions could have been
cairo_surface_[gs]et_origing. So, that's what they do: they set where
the device-space origin (0,0) is in the surface. If the origin is inside
the surface, device_offset values are positive. It may look like this:
Device space:
(-x,-y) <-- negative numbers
+----------------+
| . |
| . |
|......(0,0) <---|-- device-space origin
| |
| |
+----------------+
(width-x,height-y)
Surface space:
(0,0) <-- surface-space origin
+---------------+
| . |
| . |
|......(x,y) <--|-- device_offset
| |
| |
+---------------+
(width,height)
In other words: device_offset is the coordinates of the device-space
origin relative to the top-left of the surface.
We use device offsets in a couple of places:
- Public API: To let toolkits like Gtk+ give user a surface that
only represents part of the final destination (say, the expose
area), but has the same device space as the destination. In these
cases device_offset is typically negative. Example:
application window
+---------------+
| . |
| (x,y). |
|......+---+ |
| | | <--|-- expose area
| +---+ |
+---------------+
In this case, the user of cairo API can set the device_space on
the expose area to (-x,-y) to move the device space origin to that
of the application window, such that drawing in the expose area
surface and painting it in the application window has the same
effect as drawing in the application window directly. Gtk+ has
been using this feature.
- Glyph surfaces: In most font rendering systems, glyph surfaces
have an origin at (0,0) and a bounding box that is typically
represented as (x_bearing,y_bearing,width,height). Depending on
which way y progresses in the system, y_bearing may typically be
negative (for systems similar to cairo, with origin at top left),
or be positive (in systems like PDF with origin at bottom left).
No matter which is the case, it is important to note that
(x_bearing,y_bearing) is the coordinates of top-left of the glyph
relative to the glyph origin. That is, for example:
Scaled-glyph space:
(x_bearing,y_bearing) <-- negative numbers
+----------------+
| . |
| . |
|......(0,0) <---|-- glyph origin
| |
| |
+----------------+
(width+x_bearing,height+y_bearing)
Note the similarity of the origin to the device space. That is
exactly how we use the device_offset to represent scaled glyphs:
to use the device-space origin as the glyph origin.
Now compare the scaled-glyph space to device-space and surface-space
and convince yourself that:
(x_bearing,y_bearing) = (-x,-y) = - device_offset
That's right. If you are not convinced yet, contrast the definition
of the two:
"(x_bearing,y_bearing) is the coordinates of top-left of the
glyph relative to the glyph origin."
"In other words: device_offset is the coordinates of the
device-space origin relative to the top-left of the surface."
and note that glyph origin = device-space origin.
So, that was the bug. Fixing it removed lots of wonders and magic
negation signs.
The way I discovered the bug was that in the user-font API, to make
rendering the glyph from meta-surface to an image-surface work I had
to do:
cairo_surface_set_device_offset (surface, -x_bearing, -y_bearing);
_cairo_meta_surface_replay (meta_surface, surface);
cairo_surface_set_device_offset (surface, x_bearing, y_bearing);
This suggested that the use of device_offset for glyph origin is
different from its use for rendering with meta-surface. This reminded
me of the large comment in the xlib backend blaming XRender for having
weird glyph space, and of a similar problem I had in the PS backend
for bitmap glyph positioning (see d47388ad75)
...those are all fixed now.
This patch introduces three macros: _cairo_malloc_ab,
_cairo_malloc_abc, _cairo_malloc_ab_plus_c and replaces various calls
to malloc(a*b), malloc(a*b*c), and malloc(a*b+c) with them. The macros
return NULL if int overflow would occur during the allocation. See
CODING_STYLE for more information.
This reverts commit 7016614dd9.
We want to avoid any negative performance impacts due to extra calls
to XSync. The fact that X errors can be missed with this appraoch is
undesirable of course---a proper fix will likely involve moving to
XCB which will hopefully allow us to do the error-checking the way
we want without any performance penalty.
This eliminates X errors propagated from cairo due to cleaning up
Render Pictures after the application had already destroyed the
Drawable they reference. (It would be nice if the X server wouldn't
complain that some cleanup work is already done, but there you
have it.)
This fix has been verified to fix the bug causing OpenOffice.org to
crash as described here:
XError on right click menus in OOo.
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=243811
And unlike other proposed fixes for this bug, this fix does not
introduce any new calls to XSync, (and thereby avoids performance
concerns from those).
Call XSync before ignoring errors from XGetImage to avoid hiding
unassociated errors. Similarly, call XSync before reinstalling the old
error handler to ensure no errors creep out of the ignored section.
When an xlib surface is used as the source of a draw operation this
will cause the contents of child windows to be included in the source
data. The semantics of drawing to xlib surfaces are unchanged (ie:
draws are still clipped by child windows overlapping the destination
window).
After introducing a work queue for deferred destruction of X resource
my firefox crashes over and over again because XRenderFreeGlyphs is trying
to free a non-exist glyph (already freed). The problematic call sequence is
something like below:
XRenderAddGlyphs (20990204, 20069)
XRenderAddGlyphs (20990204, 20069)
XRenderFreeGlyphs (20990204, 20069)
XRenderFreeGlyphs (20990204, 20069)
You can see the two add/free glyphs is interlaced. And obviously, we'll crash
at the last one. To fix this bug, we must be ensure here's no pending work
to free the glyph that we want to sent.
The status return from _cairo_xlib_screen_put_gc() indicates the failure
to queue a job to free an old GC - the current GC is always transferred
away from the caller, so always nullify it in the surface.
Sun never released a version of Solaris with Xorg with the buggy repeat
problem. This patch was only needed for development versions of Solaris
Nevada (roughly builds 25-30). The latest S10U release and Nevada releases
have a fixed Xorg. So no users should ever encounter this bug.
We need to remove this test because Xsun has the same VendorString and a lower
VendorRelease number so it falsely triggers buggy_repeat to be turned on.
They just added Xrender support recently to Xsun, so this wasn't an issue
before recently.
Original work by Jorn Baayen <jorn@openedhand.com>,
2715f20981
We use a small cache of size 16 for surfaces created for solid patterns.
This mainly helps with the X backends where we don't have to create a
pattern for every operation, so we save a lot on X traffic. Xft uses a
similar cache, so cairo's text rendering traffic with the xlib backend
now completely matches that of Xft.
The cache uses an static index variable, which itself acts like a cache of
size 1, remembering the most recently used solid pattern. So repeated
lookups for the same pattern hit immediately. If that fails, the cache is
searched linearly, and if that fails too, a new surface is created and a
random member of the cache is evicted.
A cached surface can only be reused if it is similar to the destination.
In order to check for similar surfaces a new test is introduced for the
backends to determine that the cached surface is as would be returned by
a _create_similar() call for the destination and content.
As surfaces are in general complex encapsulation of graphics state we
only return unshared cached surfaces and reset them (to clear any error
conditions and graphics state). In practice this makes little difference
to the efficacy of the cache during various benchmarks. However, in order
to transparently share solid surfaces it would be possible to implement a
COW scheme.
Cache hit rates: (hit same index + hit in cache) / lookups
cairo-perf: (42346 + 28480) / 159600 = 44.38%
gtk-theme-torturer: (3023 + 3502) / 6528 = 99.95%
gtk-perf: (8270 + 3190) / 21504 = 53.29%
This translates into a reduction of about 25% of the XRENDER traffic during
cairo-perf.
By deferring the issuing of the X requests to set the clip mask we can
theoretically avoid some redundant requests, but primarily we remove
another path where X requests are emitted.
Due to caching, destruction of X11 resources may occur outside of a
usable X11 context. To avoid this, we defer the destruction onto a work
queue which will be run the next time we try to use the X11 connection
on behalf of the user (at which point we must have a usable X11 context!)
or we are closing the Display.
Due to the nature of the reference counting, an X resource may be
destroyed later than anticipated and possibly from a different thread
than the original context. This becomes an issue for applications that
carefully manage their single X connection from a single thread and do
not use locking and are then suprised when cairo triggers X traffic when
performing work for a different part of the application in another thread.
Previously, we stored the per-display attributes inside a special
screen=NULL _cairo_xlib_screen_info_t. Now we keep track of known X
displays and store the screen information beneath the display structure
alongside the per-display hooks.
_cairo_pattern_acquire_surfaces() may substitute an image surface for
either the source or the mask should the backend not support creation
of similar scratch surfaces or an error occurs during creation. For
composition we require xlib surfaces and so we must trigger the
fallback path if this happens.
If we create the Pixmap whilst constructing a similar xlib surface, then
it our responsibility to free the Pixmap should we fail to allocate the
surface.
This allows for the surface acquired from the pattern to have the
same content. In particular, in a case such as cairo_paint_with_alpha
we can now acquire an A8 mask surface instead of an ARGB32 mask
surface which can be rendered much more efficiently. This results
in a 4x speedup when using the OVER operator with the recently
added paint-with-alpha test:
Speedups
========
image-rgb paint-with-alpha_image_rgb_over-256 2.25 -> 0.60: 4.45x speedup
███▌
It does slowdown the same test when using the SOURCE operator, but
I don't think we care. Performing SOURCE with a mask is already a very
slow operation, (hitting compositeGeneral), so the slowdown here is
likely from having to convert from A8 back to ARGB32 before the
generalized compositing. So if someone cares about this slowdown,
(though SOURCE with cairo_paint_with_alpha doesn't seem extremely
useful), they will probably be motivated enough to contribute a
customized compositing function to replace compositeGeneral in which
case this slowdown should go away:
image-rgba paint-with-alpha_image_rgb_source-256 3.84 -> 8.86%: 1.94x slowdown
█
Toolkits like GTK+ almost always set a simple rectangular clip mask before
any cairo operation, so avoid the allocation for this simple case by
embedding a small number of XRectangles into the surface structure.
Due to the interaction between multiple threads showing glyphs and
asynchronous CloseDisplays, it is possible for a font to maintain a
cairo_xlib_screen_info_t beyond the CloseDisplay. The simple solution
is to add a reference count in order to track the lifetime of the
cairo_xlib_screen_info_t correctly.