When using fonts, circular references are established between the holdover
font caches and the interpreter, and these need manual intervention via
cairo_script_interpreter_finish() to break.
Most drivers and the X server used to have incorrect RepeatPad/RepeatReflect
implementations, forcing cairo to fall back to client-side software rendering,
which is painfully slow due to pixmaps being transferred over the wire. These
issues are mostly fixed in the drivers (with the exception of radeonhd, whose
developers didn't respond) and the RepeatPad software fallback is implemented
correctly as of pixman-0.15.0, so this patch will hand off composite operations
with EXTEND_PAD/EXTEND_REFLECT source patterns to XRender.
There is no way to detect whether the X server or the drivers use a
broken Render implementation, so we make a guess based on the server
version: it's probably safe to assume that 1.7 X servers will use
fixed drivers and a recent enough version of pixman.
Whilst waiting for the fontmap lock on destruction, another thread may not
only have resurrected the font but also destroyed it, acquired the lock
first and inserted it into the holdovers before the first thread resumes. So
check that the font is not already in the holdovers array before
inserting.
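As a rough illustration of the check described above, here is a minimal
sketch. The types font_t and font_map_t, the MAX_HOLDOVERS capacity and
both function names are hypothetical stand-ins, not cairo's actual
structures, and the caller is assumed to hold the fontmap lock.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOLDOVERS 256   /* hypothetical capacity */

/* Hypothetical stand-ins for the font and font-map structures. */
typedef struct { int id; } font_t;

typedef struct {
    font_t *holdovers[MAX_HOLDOVERS];
    size_t  num_holdovers;
} font_map_t;

static bool
holdovers_contains (const font_map_t *map, const font_t *font)
{
    for (size_t i = 0; i < map->num_holdovers; i++)
        if (map->holdovers[i] == font)
            return true;
    return false;
}

/* Call with the font-map lock held.  Returns true if the font was
 * inserted, false if it was already present or the array is full. */
static bool
holdovers_insert_once (font_map_t *map, font_t *font)
{
    if (holdovers_contains (map, font))
        return false;   /* another thread got here first */
    if (map->num_holdovers == MAX_HOLDOVERS)
        return false;   /* a real implementation would make room instead */
    map->holdovers[map->num_holdovers++] = font;
    return true;
}
```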
Waiting for a long running benchmark can be very annoying, especially if
you just want a rough-and-ready result. So hook into SIGINT and stop the
current benchmark (after the end of the iteration) on the first ^C. A
second ^C within the same iteration will kill the program as before.
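One way such a hook could look, as a sketch only: the handler sets a flag
that the benchmark loop polls between iterations, and restores the default
action so that a second ^C terminates the process. The names
stop_requested, benchmark_install_sigint() and benchmark_should_stop() are
invented for this example.

```c
#include <signal.h>

/* Set by the handler; polled by the benchmark loop between iterations. */
static volatile sig_atomic_t stop_requested = 0;

static void
on_sigint (int signum)
{
    stop_requested = 1;
    /* Restore the default action so a second ^C kills the program. */
    signal (signum, SIG_DFL);
}

/* Arm the handler before starting a benchmark. */
static void
benchmark_install_sigint (void)
{
    stop_requested = 0;
    signal (SIGINT, on_sigint);
}

/* Check after each iteration; the current iteration always completes. */
static int
benchmark_should_stop (void)
{
    return stop_requested != 0;
}
```

A loop of the form "do { iteration (); } while (! benchmark_should_stop ());"
then stops at the end of the iteration in which the first ^C arrived.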
_font_map_release_face_lock_held() was being called unconditionally during
_cairo_ft_font_reset_static_data(). This presents two problems. The first
is that we call FT_Done_Face() on an object not owned by cairo, and the
second is that the bookkeeping is then incorrect which will trigger an
assert later.
To save typing when creating macro-benchmarks, --profile disables
mark-dirty and caller-info and compresses the trace using LZMA. Not for
computers short on memory!
Rewrite a few error strings so that they more closely match the
documentation. Where they differ, I believe I have chosen the more
informative combination of the two texts.
An issue occurred when using subpixel antialiasing with user-fonts and
XRender - the glyphs were transparent, as demonstrated by the font-view
example.
The problem lies in that enabling subpixel antialiasing triggers use of an
ARGB32 image surface for rendering the glyph, but the default colour is
black (so the only information is in the alpha-channel). Given an ARGB32
glyph XRender treats it as a per-channel mask, but since the R,G,B
channels were uniformly zero, the glyph is rendered as transparent.
Fix this by setting the initial colour to white before rendering the image
surface for a user-font glyph, which generates the appropriate gray-level
mask by default.
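The arithmetic behind this can be sketched as follows, assuming the usual
premultiplied ARGB32 layout. The helper names are made up for the example
and the code only illustrates the pixel values, not the actual rendering
path: with a black source the colour channels of the mask are zero
whatever the coverage, while a white source copies the coverage into every
channel.

```c
#include <stdint.h>

/* Pack a premultiplied ARGB32 pixel from a non-premultiplied colour
 * (r, g, b in 0..255) and a coverage value alpha (0..255). */
static uint32_t
premultiplied_argb32 (uint8_t r, uint8_t g, uint8_t b, uint8_t alpha)
{
    uint32_t pr = (uint32_t) (r * alpha + 127) / 255;
    uint32_t pg = (uint32_t) (g * alpha + 127) / 255;
    uint32_t pb = (uint32_t) (b * alpha + 127) / 255;

    return ((uint32_t) alpha << 24) | (pr << 16) | (pg << 8) | pb;
}

/* The red channel, as a per-channel mask consumer would read it. */
static uint8_t
mask_red (uint32_t pixel)
{
    return (uint8_t) ((pixel >> 16) & 0xff);
}
```

For half coverage (alpha 0x80), black gives 0x80000000, whose colour
channels mask everything out, whereas white gives 0x80808080.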
The PDF surface now keeps track of all the patterns it is embedding in
a hash table keyed by the unique_id returned by
_cairo_surface_get_unique_id().
This patch adds more implementations of the snapshot method. For
surface types where acquire_source_image is already making a copy
of the bits, doing another one, as is the case for the fallback
implementation, is a waste.
Provide a mechanism for backends to attach and remove snapshots. This can
be used by backends to provide a cache for _cairo_surface_clone_similar(),
or by the meta-surfaces to only emit a single pattern for each unique
snapshot.
In order to prevent stale data being returned upon a snapshot operation,
if the surface is modified (via the 5 high level operations, and on
notification of external modification) we break the association with any
current snapshot of the surface and thus preserve the current data for
their use.
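The detach-on-modify idea can be sketched like this. It is a simplified
model under stated assumptions, not cairo's implementation: surface_t, its
fields and the three functions are hypothetical, a surface holds at most
one snapshot, and the snapshot borrows the surface's pixels until the
surface is about to change.

```c
#include <stdlib.h>
#include <string.h>

typedef struct surface {
    unsigned char  *data;
    size_t          size;
    struct surface *snapshot;   /* attached snapshot, or NULL */
    int             owns_data;  /* non-zero if data must be freed */
} surface_t;

/* The snapshot shares the surface's pixels for as long as it is attached. */
static void
surface_attach_snapshot (surface_t *surface, surface_t *snapshot)
{
    snapshot->data = surface->data;
    snapshot->size = surface->size;
    snapshot->snapshot = NULL;
    snapshot->owns_data = 0;
    surface->snapshot = snapshot;
}

/* Give the snapshot a private copy of the current pixels and break the
 * association.  Returns 0 on success, -1 if the copy cannot be allocated. */
static int
surface_detach_snapshot (surface_t *surface)
{
    surface_t *snapshot = surface->snapshot;
    unsigned char *copy;

    if (snapshot == NULL)
        return 0;
    copy = malloc (surface->size);
    if (copy == NULL)
        return -1;
    memcpy (copy, surface->data, surface->size);
    snapshot->data = copy;
    snapshot->owns_data = 1;
    surface->snapshot = NULL;
    return 0;
}

/* Stand-in for any modifying operation: detach first, then draw. */
static int
surface_fill (surface_t *surface, unsigned char value)
{
    if (surface_detach_snapshot (surface) != 0)
        return -1;
    memset (surface->data, value, surface->size);
    return 0;
}
```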
Use 'cairo-perf -v -r' to get both the summary output and the raw
values. This gives a progress report whilst benchmarking, which is very
reassuring with long-running tests.
There are synchronisation issues with similar surfaces (as only the
original target surface is synced) which interfere with making
performance comparisons. (There still may be some value should you be aware
of the limitations...)
Use the new API Behdad exposed in 1.8 to precompute a glyph string using
Cairo and then benchmark cairo_show_glyphs(). This is then equivalent to
the text benchmark but without the extra step of converting to glyphs on
every call to cairo_show_text(), i.e. it shows the underlying glyph
rendering performance.
The glyph advance cache was only enabled for glyph indices < 256,
causing a large number of misses for non-ASCII text. Improve this by
simply applying the modulus of the index to select the cache slot - which
may cause some glyph advances to be overwritten and re-queried, but
improves the hit rate.
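A minimal sketch of such a direct-mapped cache follows. The slot count,
the type names and query_advance() (a stand-in for the expensive backend
query) are assumptions made for the example. Each slot remembers which
glyph index it holds, so two indices that collide simply overwrite each
other.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADVANCE_CACHE_SIZE 256   /* assumed slot count */

typedef struct {
    bool     valid;
    uint32_t glyph_index;
    double   advance;
} advance_slot_t;

typedef struct {
    advance_slot_t slots[ADVANCE_CACHE_SIZE];
    unsigned long  hits, misses;
} advance_cache_t;

/* Hypothetical stand-in for the expensive font-backend query. */
static double
query_advance (uint32_t glyph_index)
{
    return 0.5 * (double) glyph_index;
}

static double
cached_advance (advance_cache_t *cache, uint32_t glyph_index)
{
    /* The modulus selects the slot; any glyph index may be cached. */
    advance_slot_t *slot = &cache->slots[glyph_index % ADVANCE_CACHE_SIZE];

    if (slot->valid && slot->glyph_index == glyph_index) {
        cache->hits++;
        return slot->advance;
    }

    /* Miss: re-query and overwrite whatever occupied the slot. */
    cache->misses++;
    slot->valid = true;
    slot->glyph_index = glyph_index;
    slot->advance = query_advance (glyph_index);
    return slot->advance;
}
```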
Allow the caller to choose whether or not various conversions take place.
The first flag is used to disable the expansion of reflected patterns into a
repeating surface.
We can defer taking the cairo_scaled_font_map_lock until we drop the
last reference to the scaled font, so long as we double check the reference
count after waiting for the lock and do not make assumptions about
unreferenced fonts during construction. This is significant as even
acquiring the uncontended cairo_scaled_font_map_lock during
cairo_scaled_font_destroy() was showing up as a couple of percent on text
heavy profiles (e.g. gnome-terminal).
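The pattern can be sketched with C11 atomics and a pthread mutex. This is
an illustration under assumptions, not the cairo code: font_t,
in_holdovers and the function names are hypothetical, and "retiring" the
font stands in for moving it to the holdovers.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a scaled font. */
typedef struct {
    atomic_int ref_count;
    bool       in_holdovers;   /* protected by font_map_lock */
} font_t;

static pthread_mutex_t font_map_lock = PTHREAD_MUTEX_INITIALIZER;

static void
font_reference (font_t *font)
{
    atomic_fetch_add (&font->ref_count, 1);
}

/* Returns true if this call retired the font to the holdovers. */
static bool
font_destroy (font_t *font)
{
    bool retired = false;

    /* Fast path: not the last reference, so the lock is never taken. */
    if (atomic_fetch_sub (&font->ref_count, 1) != 1)
        return false;

    pthread_mutex_lock (&font_map_lock);
    /* Double check: whilst we waited for the lock another thread may
     * have resurrected the font, or already retired it. */
    if (atomic_load (&font->ref_count) == 0 && ! font->in_holdovers) {
        font->in_holdovers = true;
        retired = true;
    }
    pthread_mutex_unlock (&font_map_lock);

    return retired;
}
```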
When observing applications two patterns emerge. The first is due to
Pango, which wraps each glyph run within a context save/restore. This
causes the scaled font to be evicted after every run and reloaded on the
next. This is caught by the MRU slot on the cairo_scaled_font_map and
prevents a relatively costly traversal of the hash table and holdovers.
The second pattern is by applications that directly manage the rendering
of their own glyphs. The prime example of this is gnome-terminal/vte. Here
the application frequently alternates between a few scaled fonts - which
requires a hash table retrieval every time.
By introducing an MRU slot on the gstate we are able to directly recover
the scaled font around 90% of the time.
Of 110,000 set-scaled-fonts:
   4,000 were setting the current font
  96,000 were setting to the previous font
   2,500 were recovered from the MRU on the cairo_scaled_font_map
   7,500 needed a hash retrieval
which compares to ~106,000 hash lookups without the additional MRU slot on
the gstate.
This translates to an elapsed time saving of ~5% when replaying a
gnome-terminal trace using the drm backend.
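A sketch of the single-entry MRU slot described above, simplified to
pointer identity. The types, the field names and the slow_lookups counter
(standing in for the fall-through to the font map and hash table) are
assumptions for illustration only.

```c
#include <stddef.h>

typedef struct { int id; } scaled_font_t;   /* hypothetical stand-in */

typedef struct {
    scaled_font_t *font;            /* current font */
    scaled_font_t *previous_font;   /* MRU slot */
    unsigned long  slow_lookups;    /* times we fell through to the font map */
} gstate_t;

static void
gstate_set_scaled_font (gstate_t *gstate, scaled_font_t *font)
{
    if (font == gstate->font)
        return;                     /* setting the current font: nothing to do */

    if (font != gstate->previous_font)
        gstate->slow_lookups++;     /* stand-in for the font-map lookup */

    /* Either way the old current font becomes the previous one, so
     * alternating between two fonts never leaves the fast path. */
    gstate->previous_font = gstate->font;
    gstate->font = font;
}
```

An application alternating between two fonts, as in the second pattern,
pays for a lookup only the first time each font is set.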
The structure is already exposed, so just expose the
constructors/destructors in order to enable caches to be embedded and
remove a superfluous malloc.