_cairo_clip_get_surface() now returns a borrowed reference to the cached
surface on the clip, so we must not destroy it - as Carlos pointed out
when he hit the assert:
12:55 < KaL> ickle: cairo-surface.c:595: cairo_surface_destroy:
Assertion `((*&(&surface->ref_count)->ref_count) > 0)' failed.
12:56 < KaL> ickle: trying to render any pdf file with poppler glib demo
after installing cairo from git master
13:00 < KaL> ickle: well, it seems it has nothing ot do with poppler,
since it crashes in clearlooks src/clearlooks_draw.c:347
_cairo_pattern_is_opaque() now takes the extents over which the
operation is defined so that we make exclude the clear pixels that
surround EXTEND_NONE surfaces when determining opacity. In order to take
full advantage of this we need to start performing an extents query on
the operation and pass that down to the analysis...
This patch however is just the quick compile fix to pass a NULL and
restore the old behaviour.
Fixes (with previous commit):
Bug 26197 - Cairo doesn't build on windows
http://bugs.freedesktop.org/show_bug.cgi?id=26197
Force the subsurface extents to be inside the target extents and
compose subsubsurfaces offsets correctly.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
The PDF file referenced by bug 26186 contains a Type 3 font with non
identity font matrix and a "1/2" glyph created by drawing the "1" and
"2" from a Type 1 font. This combination exposed a bug in the font
scale and glyph position in _cairo_type3_glyph_surface_show_glyphs
when printing user font glyphs.
Since the spans rework, we were emitting half a primitive at a time,
and if we flushed our VBO full of quads out halfway through, we could
end up dropping the primitive and then out of phase.
When HAS_ATOMIC_OPS is not defined, cairo-image-surface.c does not
compile because _pixman_white_image calls _pixman_image_for_solid
which gets defined only later in the code.
The low-level surface composite interface will disappear in the near
future and results in much more ugly code than calling the high level
interface - so use it when flattening images into the page background.
On my Core2, the library version of lround() is faster than our
hand-rolled non-floating point implementation. So only enable our code
if we are trying to minimise the number of floating point operations --
even then, it would worth investigating the library performance first.
[Just a reminder that optimisation choices will change over time as our
hardware and software evolves.]
Still an experimental backend, it's now a little too late to stabilise
for 1.10, but this should represent a major step forward in its feature
set and an attempt to catch up with all the bug fixes that have been
performed on xlib. Notably not tested yet (and expected to be broken)
are mixed-endian connections and low bitdepth servers (the dithering
support has not been copied over for instance). However, it seems robust
enough for daily use...
Of particular note in this update is that the xcb surface is now capable
of subverting the xlib surface through the ./configure --enable-xlib-xcb
option. This replaces the xlib surface with a proxy that forwards all
operations to an equivalent xcb surface whilst preserving the cairo-xlib
API that is required for compatibility with the existing applications,
for instance GTK+ and Mozilla. Also you can experiment with enabling a
DRM bypass, though you need to be extremely foolhardy to do so.
As proof-of-principle add the nearly working demonstrations of using DRM
to render directly with the GPU bypassing both RENDER and GL for
performance whilst preserving high quality rendering.
The basis behind developing these chip specific backends is that this is
the idealised interface that we desire for this chips, and so a target
for cairo-gl as we continue to develop both it and our GL stack.
Note that this backends do not yet fully pass the test suite, so only
use if you are brave and willing to help develop them further.
Write a dedicated compositor for pixman so that we avoid the
middle-layer syndrome of surface-fallback. The major upshot of this
rewrite is that the image surface is now several times quicker for glyph
compositing, which dramatically improves performance for text rendering
by firefox and friends. It also uses a couple of the new scan
convertors, such as the rectangular scan converter for rectilinear
paths.
Speedups
========
image-rgba firefox-talos-gfx-0 342050.17 (342155.88 0.02%) -> 69412.44 (69702.90 0.21%): 4.93x speedup
███▉
image-rgba vim-0 97518.13 (97696.23 1.21%) -> 30712.63 (31238.65 0.85%): 3.18x speedup
██▏
image-rgba evolution-0 69927.77 (110261.08 19.84%) -> 24430.05 (25368.85 1.89%): 2.86x speedup
█▉
image-rgba poppler-0 41452.61 (41547.03 2.51%) -> 21195.52 (21656.85 1.08%): 1.96x speedup
█
image-rgba firefox-planet-gnome-0 217512.61 (217636.80 0.06%) -> 123341.02 (123641.94 0.12%): 1.76x speedup
▊
image-rgba swfdec-youtube-0 41302.71 (41373.60 0.11%) -> 31343.93 (31488.87 0.23%): 1.32x speedup
▍
image-rgba swfdec-giant-steps-0 20699.54 (20739.52 0.10%) -> 17360.19 (17375.51 0.04%): 1.19x speedup
▎
image-rgba gvim-0 167837.47 (168027.68 0.51%) -> 151105.94 (151635.85 0.18%): 1.11x speedup
▏
image-rgba firefox-talos-svg-0 375273.43 (388250.94 1.60%) -> 356846.09 (370370.08 1.86%): 1.05x speedup
Discard a redundant clear as the image surface is guaranteed to return
a cleared surface that meets pixman/xlib requirements for alignment, and
more importantly add the ComponentAlpha flag on the pixman image
generated as appropriate.
Revamp clipping in preparation for the removal of the low-level interface
and promote backend to use the higher levels. The principle here is that
the higher level interface gives the backend more scope for choosing
better performing primitives.
Frequently we only need the coarse path bounds, so avoid walking over
the list of points once more as we can cheaply track the extents during
construction.
By preallocating in our data segment a couple of solid patterns for the
stock colours, it becomes more convenient when using those in surface
operations, such as when clearing.
An issue that we currently have is that we have a pessimistic
false-positive rate when determining whether glyphs within a string
overlap. By using the tight bounds, the overlap detection is arguably
less accurate presuming pixel-aligned opacity masks but we make the
trade-off for performance.
Having added a specialised scan converter on the premise that it should
be better at handling rounded rectangles, ensure that they are indeed
rendered correctly.
Enable origin tracking by default for make check-valgrind. This is
slower and consumes more memory than regular valgrind, but the
additional information provided about the source of the uninitialised
data is often invaluable.
This is a highly specialised scan converter for the relatively common
case of where the input geometry is known to be a series of rectangles.
Generally not device aligned (or else we would most likely have chosen
an even higher performance path that does not require a coverage mask),
this optimised converter can simply compute the analytical coverage by
utilising a special case Bentley-Ottmann intersection finder.
This variant uses the Bentley-Ottmann algorithm to only maintain the
active edge list upon edge events and so can efficiently skip areas
where no change occurs. This means that it can be much quicker than the
Tor algorithm (which is still used to compute the coverages from the
active edges) for geometries consisting of long straight lines with few
intersections. However due to the computational overhead of the
Bentley-Ottmann event processing, for dense curvy paths, simply updating
the active edge list in sync with computing the coverages is a win. Due
to advantageous adaptive step size, the scan converter can be run at a
much higher subsampling with little extra overhead compared with Tor,
currently it uses a 256x256 subsampling grid to avoid any impedance
mismatch with path precision.
Given the current status of implementations, this scan converter [botor]
is likely to be advantage where detecting large regions of unchanged
span data will result in improved performance, for instance the drm
backends which convert the scan data into rectangles.