If the pattern is for example a repeating 0x0 image, then treat it as
having zero extents.
This should workaround the bug presented here:
https://bugs.freedesktop.org/show_bug.cgi?id=24693
Attached PDF crashes evince with a Floating point exception
Within our code base we carried a few hacks to utilize the component
alpha capabilities of pixman, whilst not supporting the concept for our
own masks. Thus we were setting it upon the pixman_image_t that we
passed around through code that was blissfully unaware and indeed the
component-alpha property was forgotten (e.g. upgrading glyph masks).
The real issue is that without explicit support that a pattern carries
subpixel masking information, that information is lost when using that
pattern with composite. Again we can look at the example of compositing
a sub-pixel glyph mask onto a remote xlib surface for further failure.
Gah, that was a horrible mistake. It was a flawed hack to create Pixmaps
of the correct depth when cloning patterns for blitting to the xlib
backend. However, it had the nasty side-effect of discarding alpha when
targeting Window surfaces. The correct solution is to simply correct the
Pixmap of the desired depth and render a matching pattern onto the
surface - i.e. a reversal the current acquire -> clone. See the
forthcoming revised xcb backend on how I should have done it originally.
As noted in the comments we could also compute the pattern extents for
gradients with CAIRO_EXTEND_NONE under certain circumstances, i.e.
radial gradients and device axis aligned linear gradients.
Previously the pattern_acquire_surface routine only had to worry about
handling extend modes NONE or REPEAT and so the test for ! REPEAT
sufficed when what was actually intended was a test for NONE.
Handling clip as part of the surface state, as opposed to being part of
the operation state, is cumbersome and a hindrance to providing true proxy
surface support. For example, the clip must be copied from the surface
onto the fallback image, but this was forgotten causing undue hassle in
each backend. Another example is the contortion the meta surface
endures to ensure the clip is correctly recorded. By contrast passing the
clip along with the operation is quite simple and enables us to write
generic handlers for providing surface wrappers. (And in the future, we
should be able to write more esoteric wrappers, e.g. automatic 2x FSAA,
trivially.)
In brief, instead of the surface automatically applying the clip before
calling the backend, the backend can call into a generic helper to apply
clipping. For raster surfaces, clip regions are handled automatically as
part of the composite interface. For vector surfaces, a clip helper is
introduced to replay and callback into an intersect_clip_path() function
as necessary.
Whilst this is not primarily a performance related change (the change
should just move the computation of the clip from the moment it is applied
by the user to the moment it is required by the backend), it is important
to track any potential regression:
ppc:
Speedups
========
image-rgba evolution-20090607-0 1026085.22 0.18% -> 672972.07 0.77%: 1.52x speedup
▌
image-rgba evolution-20090618-0 680579.98 0.12% -> 573237.66 0.16%: 1.19x speedup
▎
image-rgba swfdec-fill-rate-4xaa-0 460296.92 0.36% -> 407464.63 0.42%: 1.13x speedup
▏
image-rgba swfdec-fill-rate-2xaa-0 128431.95 0.47% -> 115051.86 0.42%: 1.12x speedup
▏
Slowdowns
=========
image-rgba firefox-periodic-table-0 56837.61 0.78% -> 66055.17 3.20%: 1.09x slowdown
▏
Observe that patterns are not altered during an operation and so we are
safe to use the data from the original pattern without copying. (This is
enforced through the declaration that the backends operate on constant
patterns which are not allowed to be referenced or destroyed.)
Allow the caller to choose whether or not various conversions take place.
The first flag is used to disable the expansion of reflected patterns into a
repeating surface.
Some backends are quite constrained with surface sizes and so trigger
fallbacks when asked to clone large images. To avoid this we attempt
to trim ROIs (as these are often limited to the destination image, and
so can be accommodated by the hardware). This patch allows trimming
REPEAT sources both horizontally and vertically independently.
As an aide to tracking down the source of uninitialised reads, run
VALGRIND_CHECK_MEM_IS_DEFINED() over the contents of image surfaces at the
boundary between backends, e.g. upon setting a glyph image or acquiring
a source image.
Damian Frank noted
[http://lists.cairographics.org/archives/cairo/2009-May/017095.html]
a performance problem with an older XServer with an
unaccelerated composite - similar problems will be seen with non-XRender
servers which will trigger extraneous fallbacks. The problem he found was
that painting an ARGB32 image onto an RGB24 destination window (using
SOURCE) was going via the RENDER protocol and not core. He was able to
demonstrate that this could be worked around by declaring the pixel data as
an RGB24 image. The issue is that the image is uploaded into a temporary
pixmap of matching depth (i.e. 32 bit for ARGB32 and 24 bit for RGB23
data), however the core protocol can only blit between Drawables of
matching depth - so without the work-around the Drawables are mismatched
and we either need to use RENDER or fallback.
This patch adds a content mask to _cairo_surface_clone_similar() to
provide the extra bit of information to the backends for when it is
possible for them to drop channels from the clone. This is used by the
xlib backend to only create a 24 bit source when blitting to a Window.
Ensure that no assumptions are made that a small allocation will succeed
by manually injecting faults when we may be simply allocating from an
embedded memory pool.
The main advantage in manual fault injection is improved code coverage -
from within the test suite most allocations are handled by the embedded
memory pools.
Assert that the pattern is one of the four known types, and return an
error so that the compiler knows that the local variable can not be used
uninitialised.
A new meta-surface backend for serialising drawing operations to a
CairoScript file. The principal use (as currently envisaged) is to provide
a round-trip testing mechanism for CairoScript - i.e. we can generate
script files for every test in the suite and check that we can replay them
with perfect fidelity. (Obviously this does not provide complete coverage
of CairoScript's syntax, but should give reasonable coverage over the
operators.)
pixman limits the src] co-ordinates (and thus [xy]_offset] to 16bits,
so we need to be careful how much of the translation vector to push into
[xy]_offset. Since the range is the same for both, split the integer
component between the matrix and the offset.
test/scale-offset* now at least shows the source image, even if it is
misplaced.
Exploit the auxiliary offset vector in the attributes to reduce
likelihood of range overflow in the translation components when converting
the pattern matrix to fixed-point pixman_matrix_t.
An example of this is bug 9148
Bug 9148 - invalid rendering when painting large scaled-down surfaces
(https://bugs.freedesktop.org/show_bug.cgi?id=9148)
but the issue is perhaps even more likely with high resolution fallback
images.
In order to workaound a directfb bug, tweak the reflect->repeat pattern so
that it covers the destination rectangle. Although the number of paint()
increases, the number of read/written pixels remain the same so that
performance should not deteriorate, but instead be improved by using a
cloned source. The early return of the REFLECT surface is discarded so
that the latter optimisations for surface sources can be applied. One side
effect of this is that acquire_source_image() is removed due to its lax
reference counting which thereby exposes the ROI optimisations for image
destinations as well.
Avoid allocating a default source pattern by using the static black pattern
object. The one complication is that we need to ensure that the static
pattern does leak to the application, so we replace it with an allocated
solid pattern within _cairo_gstate_get_source().
Only copy the pattern if we need to modify it, e.g. preserve a copy in a
snapshot or a soft-mask, or to modify the matrix. Otherwise we can
continue to use the original pattern and mark it as const in order to
generate compiler warnings if we do attempt to write to it.
Adrian Johnson discovered cases where we mistakenly compared the result
of unsigned arithmetic where we need signed quantities. Look for similar
cases in the users of cairo_rectangle_int_t.
As is so often the case, reading the commit log gives you fresh insight in
the problem - often called confessional debugging...
We can simplify the problem by ignoring attr->[xy]_offset, for the time
being, and focus on computing the correct matrix. This is comparatively
simple as all we need to do is perform the appropriate rounding on the
translation vector.
A complication I realised after pushing 3eb4bc3 was handling larger
sampled areas. Extending the test case revealed that the optimization
was broken for anything but the identity transform (after removing the
translation). Correctness first, leaving the "pixel-exact" solution for
interested reader...
As identified in bug 15479,
Unpredictable performance of cairo-xlib with non-integer translations of a
source surface pattern
(https://bugs.freedesktop.org/show_bug.cgi?id=15479),
source surfaces with a fractional translation hit slow paths for some
drivers, causing seemingly random performance variations. As a work-around
Owen Taylor proposed that cairo could convert non-integer translations on
NEAREST sources patterns to their integer equivalents.
The messy detail involved here is replicating the rounding mode used by
pixman for the sample offset, but otherwise the conversion is fairly
trivial.
As proof-of-principle, compute a scale factor to avoid overflow when
converting a linear pattern to pixman_fixed_t. Fixes test/huge-pattern,
but the principle should be extended to handle more cases of overflow.