In a discussion on IRC, attention was drawn to a dubious comment in
_cairo_xlib_show_glyphs() - the precise details of which have passed
out of the collective memory.
Handling clip as part of the surface state, as opposed to being part of
the operation state, is cumbersome and a hindrance to providing true proxy
surface support. For example, the clip must be copied from the surface
onto the fallback image, but this was forgotten causing undue hassle in
each backend. Another example is the contortion the meta surface
endures to ensure the clip is correctly recorded. By contrast passing the
clip along with the operation is quite simple and enables us to write
generic handlers for providing surface wrappers. (And in the future, we
should be able to write more esoteric wrappers, e.g. automatic 2x FSAA,
trivially.)
In brief, instead of the surface automatically applying the clip before
calling the backend, the backend can call into a generic helper to apply
clipping. For raster surfaces, clip regions are handled automatically as
part of the composite interface. For vector surfaces, a clip helper is
introduced to replay and callback into an intersect_clip_path() function
as necessary.
Whilst this is not primarily a performance related change (the change
should just move the computation of the clip from the moment it is applied
by the user to the moment it is required by the backend), it is important
to track any potential regression:
ppc:
Speedups
========
image-rgba evolution-20090607-0 1026085.22 0.18% -> 672972.07 0.77%: 1.52x speedup
▌
image-rgba evolution-20090618-0 680579.98 0.12% -> 573237.66 0.16%: 1.19x speedup
▎
image-rgba swfdec-fill-rate-4xaa-0 460296.92 0.36% -> 407464.63 0.42%: 1.13x speedup
▏
image-rgba swfdec-fill-rate-2xaa-0 128431.95 0.47% -> 115051.86 0.42%: 1.12x speedup
▏
Slowdowns
=========
image-rgba firefox-periodic-table-0 56837.61 0.78% -> 66055.17 3.20%: 1.09x slowdown
▏
Before attempting to even set the attributes on the source Picture, we
ensure that it exists. So remove the redundant safe-guards to do nothing
if it doesn't exist.
We always query an xrender_format for a Visual upon surface creation, so
checking again in create_similar() is redundant. (It also interferes with
disabling XRender...)
Shrink the overall size of the per-screen GC cache, but allow multiple GCs
per depth, as it quite common to need up to two temporary GCs along some
drawing paths. Decrease the number of GCs we obtain in total by returning
clean (i.e. a GC without a clip set) back to the screen pool after use.
Compensate for the increased number of put/get by performing the query
using atomic operations where available. So overall we see a dramatic
reduction on the numbers of XCreateGC and XFreeGC, of even greater benefit
for RENDER-less servers.
Written by Vladimir Vukicevic to enable integration with Qt embedded
devices, this backend allows cairo code to target QPainter, and use
it as a source for other cairo backends.
This imports the sources from mozilla-central:
http://mxr.mozilla.org/mozilla-central/find?text=&kind=text&string=cairo-qpainter
renames them from cairo-qpainter to cairo-qt, and integrates the patch
by Oleg Romashin:
https://bugs.freedesktop.org/attachment.cgi?id=18953
And then attempts to restore 'make check' to full functionality.
However:
- C++ does not play well with the PLT symbol hiding, and leaks into the
global namespace. 'make check' fails at check-plt.sh
- Qt embeds a GUI into QApplication which it requires to construct any
QPainter drawable, i.e. used by the boilerplate to create a cairo-qt
surface, and this leaks fonts (cairo-ft-fonts no less) causing assertion
failures that all cairo objects are accounted for upon destruction.
[Updated by Chris Wilson]
Acked-by: Jeff Muizelaar <jeff@infidigm.net>
Acked-by: Carl Worth <cworth@cworth.org>
Most drivers and the X server used to have incorrect RepeatPad/RepeatReflect
implementations, forcing cairo to fall back to client-side software rendering,
which is painfully slow due to pixmaps being transfered over the wire. These
issues are mostly fixed in the drivers (with the exception of radeonhd, whose
developers didn't respond) and the RepeatPad software fallback is implemented
correctly as of pixman-0.15.0, so this patch will hand off composite operations
with EXTEND_PAD/EXTEND_REFLECT source patterns to XRender.
There is no way to detect whether the X server or the drivers use a
broken Render implementation, we make a guess based on the server
version: It's probably safe to assume that 1.7 X servers will use
fixed drivers and a recent enough version of pixman.
This patch adds more implementation of the snapshot method. For
surface types where acquire_source_image is already making a copy
of the bits, doing another one as is the case for the fallback
implementation is a waste.
Allow the caller to choose whether or not various conversions take place.
The first flag is used to disable the expansion of reflected patterns into a
repeating surface.
Damian Frank noted
[http://lists.cairographics.org/archives/cairo/2009-May/017095.html]
a performance problem with an older XServer with an
unaccelerated composite - similar problems will be seen with non-XRender
servers which will trigger extraneous fallbacks. The problem he found was
that painting an ARGB32 image onto an RGB24 destination window (using
SOURCE) was going via the RENDER protocol and not core. He was able to
demonstrate that this could be worked around by declaring the pixel data as
an RGB24 image. The issue is that the image is uploaded into a temporary
pixmap of matching depth (i.e. 32 bit for ARGB32 and 24 bit for RGB23
data), however the core protocol can only blit between Drawables of
matching depth - so without the work-around the Drawables are mismatched
and we either need to use RENDER or fallback.
This patch adds a content mask to _cairo_surface_clone_similar() to
provide the extra bit of information to the backends for when it is
possible for them to drop channels from the clone. This is used by the
xlib backend to only create a 24 bit source when blitting to a Window.
Simply request a surface with a similar content to the source image when
uploading pixel data. Failing to do so prevents using a 16-bit (or
otherwise non-standard pixman image format) window as a source - in fact
it will trigger an infinite recursion.
Eliminate the extremely short-lived and oft unnecessary heap allocation
of the region by first checking to see whether the clip exceeds the
surface bounds and only then intersect the clip with a local
stack-allocated region.
Specifically,
cairo_region_union_rect -> cairo_region_union_rectangle
cairo_region_create_rect -> cairo_region_create_rectangle
Also delete cairo_region_clear() which is not that useful.
Usually, rectangles are more useful than boxes, so regions should only
expose rectangles in their public API.
Specifically,
_cairo_region_num_boxes becomes _cairo_region_num_rectangles
_cairo_region_get_box becomes _cairo_region_get_rectangle
Remove the cairo_box_int_t type
The _cairo_region_get_boxes() interface was difficult to use and often
caused unnecessary memory allocation. With _cairo_region_get_box() it
is possible to access the boxes of a region without allocating a big
temporary array.
Adds an error code replacing CAIRO_STATUS_NO_MEMORY in one case where it
is not really appropriate. CAIRO_STATUS_INVALID_SIZE is used by several
backends that do not support image sizes beyond 2^15 pixels on each side.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we are dithering on the Xlib backend we can not simply repaint the
surface used for a solid pattern and must recreate it from scratch.
However, for ordinary XRender usage we do not want to have to pay that
price - so query the backend to see if we can reuse the surface.
A surface will have the chance to use span rendering at cairo_fill()
time by creating a renderer for a specific combination of
pattern/dst/op before the path is scan converted. The protocol is to
first call check_span_renderer() to see if the surface wants to render
with spans and then later call create_span_renderer() to create the
renderer for real once the extents of the path are known.
No backends have an implementation yet.
Allow the user to pass in a pre-allocated array and use it if the number
of boxes permits. This eliminates the frequent allocations during clipping
by toolkits.
Ginn Chen reported a regression with Firefox where "the whole area of web
page is transparent until it redraws", and bisected it to the change to
lazily clear the clip.
The bug would appears to be when we have an inconsistent GC clip - i.e.
the clip on the surface has been cleared, but we have not yet used and
thus cleared the GC, so that we did not mark the GC as having a clip set
when we freed it.
Add a "cairo_rectangle_int_t *extents" argument to to the following
backend functions:
paint
mask,
stroke
fill
show_glyphs
show_text_glyphs
This will be used to pass the extents of each operation computed by
the analysis surface to the backend. This is required for implementing
EXTEND_PAD.
libXrender amalgamates sequences of XRenderFillRectangle() into a single
XRenderFillRectangles request (when permissible). Since it is common for a
cairo application to draw rectangles individually in order to exploit fast
paths within cairo [rectilinear fills], it is a reasonably common pattern.
Constructing the font options cause the initialisation of Xlc and invoke
several round-trips to the X server, significantly delaying the creation
of the first surface. By deferring that operation until the first use of
fonts then we avoid that overhead for very simple applications (like the
test suite) and should improve start-up latency for larger application.
Only copy the pattern if we need to modify it, e.g. preserve a copy in a
snapshot or a soft-mask, or to modify the matrix. Otherwise we can
continue to use the original pattern and mark it as const in order to
generate compiler warnings if we do attempt to write to it.
Adrian Johnson discovered cases where we mistakenly compared the result
of unsigned arithmetic where we need signed quantities. Look for similar
cases in the users of cairo_rectangle_int_t.
If the image was opaque with no alpha channel, we filled the output alpha
with 0. Typically, the destination surface for dithering is an RGB window,
so this bug went unnoticed. However, test/xlib-expose-event is an example
where we generate an intermediate alpha-only pixmap for use as a stencil
and this was failing as the mask was left completely transparent. The
simple solution is to ensure that for opaque images, the output alpha is
set to the maximum permissible value.
After discussing the scaled font locking with Behdad, it transpired that it
is not sufficient for a font to be locked for the lifetime of a scaled glyph,
but that the scaled font's glyph cache must be frozen for the glyph'
lifetime. If the cache is not frozen, then there is a possibility that the
glyph may be evicted before the reference goes out of scope i.e. the glyph
becomes invalid whilst we are trying to use it.
Since the freezing of the cache is the stronger barrier, we remove the
locking/unlocking of the mutex from the backends and instead move the
mutex acquisition into the freeze/thaw routines. Then update the rule on
acquiring glyphs to enforce that the cache is frozen and review the usage
of freeze/thaw by all the backends to ensure that the cache is frozen for
the lifetime of the glyph.
Bug 9102 cairo doesn't support 24 bits per pixel mode on X11
(https://bugs.freedesktop.org/show_bug.cgi?id=9102)
is a reminder that that we need to support many obscure XImage formats.
With Carl's and Behdad's work to support psuedocolor we have a mechanism
in place to handle any format that is not natively handled by pixman. The
only piece we were missing was extending the swapper to handle all-known
formats and putting in defensive checks that pixels were correctly aligned
in accordance with pixman's requirements.