Remember to check for a supported render version before making a
FillRectangle request, and fall back to the core protocol where possible
instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
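
The version check described above can be sketched as follows. This is a
minimal, self-contained illustration rather than cairo's actual code:
display_info_t, fill_path_t and choose_fill_path are hypothetical names,
and the assumption that FillRectangle needs Render 0.1 or later should
be checked against the Render protocol specification.

```c
#include <stdbool.h>

/* Hypothetical record of what the server reported for Render. */
typedef struct {
    bool has_render;
    int  render_major;
    int  render_minor;
} display_info_t;

typedef enum {
    FILL_WITH_RENDER,   /* issue a Render FillRectangle request */
    FILL_WITH_CORE,     /* use a core-protocol rectangle fill */
    FILL_UNSUPPORTED    /* caller must pick another fallback */
} fill_path_t;

static bool
render_at_least (const display_info_t *d, int major, int minor)
{
    if (! d->has_render)
        return false;

    return d->render_major > major ||
           (d->render_major == major && d->render_minor >= minor);
}

static fill_path_t
choose_fill_path (const display_info_t *d, bool core_can_express_fill)
{
    /* Assumption: FillRectangle is available from Render 0.1 onwards. */
    if (render_at_least (d, 0, 1))
        return FILL_WITH_RENDER;

    /* Only some fills (for example a plain solid colour) map onto core. */
    if (core_can_express_fill)
        return FILL_WITH_CORE;

    return FILL_UNSUPPORTED;
}
```

The point of the three-way result is that the core protocol is only
used "where possible"; anything else still needs a different fallback.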
Before rendering into the mask, we should first check whether the
subsequent call to composite the mask will trigger a fallback. In that
case, we should fall back earlier and do the operation in place.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When using a texture surface the depth/stencil buffer is private to
cairo so we can rely on the fact that any previously painted clip is
still valid.
We also only scissor when there's a previously painted clip on the
stencil buffer, otherwise we disable the scissor test. This fixes a few
test cases.
After 5e9083f882 there's no need to set a
clip on the cairo_gl_composite_t when masking. Clips are converted to
traps and rendered directly when masking now.
Otherwise, the join thinks it starts and ends in exactly the same
direction and eliminates the round capping.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Stop trying to work around the destroy-callback requiring the font mutex
as we already hold the mutex whilst cleaning up the font caches.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If, when acquiring a GL device, the same GL context, surface, and
display combination is already active outside of Cairo, do not ask EGL
or GLX to change the current context as that may cause a flush on some
drivers. Also do not unset the context when releasing the device for the
same reason.
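
The acquire/release behaviour described above can be modelled without
any GL headers. The sketch below is an assumption-laden illustration:
binding_t, device_t, device_acquire and device_release are hypothetical
names, and the switches counter stands in for real calls that change the
current context (which is what EGL or GLX would be asked to do).

```c
/* Hypothetical triple identifying what is current on this thread. */
typedef struct {
    const void *display;
    const void *surface;
    const void *context;
} binding_t;

typedef struct {
    binding_t wanted;      /* what the device renders with */
    int       was_current; /* remembered at acquire, used at release */
    int       switches;    /* counts simulated context changes */
} device_t;

static int
binding_matches (const binding_t *a, const binding_t *b)
{
    return a->display == b->display &&
           a->surface == b->surface &&
           a->context == b->context;
}

static void
device_acquire (device_t *dev, binding_t *current)
{
    dev->was_current = binding_matches (current, &dev->wanted);
    if (! dev->was_current) {
        /* Only now would the real code change the current context. */
        *current = dev->wanted;
        dev->switches++;
    }
}

static void
device_release (device_t *dev, binding_t *current)
{
    if (! dev->was_current) {
        /* We made it current, so we unset it again. */
        binding_t none = { 0, 0, 0 };
        *current = none;
        dev->switches++;
    }
    /* Otherwise leave the application's context untouched. */
}
```

When the application already has the matching combination current, an
acquire/release pair performs no context change at all in this model.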
Allow the inplace span compositor to operate on wider images than its
temporary buffer by allocating a scanline mask.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
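
One way to let a fixed temporary buffer cope with wider rows is to fall
back to a heap allocation only when the row does not fit. The following
is a sketch of that general technique, not the compositor's real code;
scanline_mask_t, STACK_MASK_SIZE and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define STACK_MASK_SIZE 2048  /* hypothetical size of the fixed buffer */

typedef struct {
    uint8_t  stack_buf[STACK_MASK_SIZE];
    uint8_t *mask;      /* points at stack_buf or at a heap block */
    int      on_heap;
} scanline_mask_t;

/* Returns 0 on success, -1 if the heap allocation failed. */
static int
scanline_mask_init (scanline_mask_t *m, size_t width)
{
    m->mask = m->stack_buf;
    m->on_heap = 0;

    if (width > sizeof (m->stack_buf)) {
        m->mask = malloc (width);
        if (m->mask == NULL)
            return -1;
        m->on_heap = 1;
    }

    return 0;
}

static void
scanline_mask_fini (scanline_mask_t *m)
{
    if (m->on_heap)
        free (m->mask);
    m->mask = NULL;
    m->on_heap = 0;
}
```

Narrow images keep the cheap path with no allocation, and only wide
images pay for one scanline-sized allocation.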
Sadly we cannot check ahead of acquiring the lock whether we hold the
lock. Just have to rely on lockdep.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Both to make sure we do not leak the memory, and also to prevent
_cairo_xlib_surface_put_shm() from operating upon the finished shm
surface after the display is closed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58253
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we need to reap the global cache, this will call back into the scaled
font to free the glyph page. We therefore need to be careful not to run
concurrently with a user adding to the glyph page, ergo we need locking.
To complicate matters we need to be wary of a lock-inversion as we hold
the scaled_font lock whilst thawing the global cache. We prevent the
deadlock by careful ordering of the thaw-unlock and by inspecting the
current frozen state of the scaled-font before releasing the glyph
page.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In the case of a recording surface we may recurse into the global glyph
cache so we need to be careful and stage the ordering of how we free the
glyphs. So first we finish any information and surfaces from the scaled
font glyph cache (and so triggering recursion into other scaled fonts)
and then take the global cache and remove our pages.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54950
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The freeze/thaw routines have a side-effect of managing the global glyph
cache in addition to taking the mutex on the font. If we don't call
them, we may end up indefinitely keeping the global glyph cache frozen
(effectively leaking glyphs to the maximum of all open fonts) and
triggering asserts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
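
The need to keep freeze and thaw balanced on every exit path can be
illustrated with a small model. This is only a sketch under stated
assumptions: glyph_cache_t, font_t, font_freeze, font_thaw and
with_glyphs are hypothetical stand-ins, and a simple counter replaces
the real mutex and global cache.

```c
#include <assert.h>

/* Hypothetical global cache: frozen while any font holds a freeze. */
typedef struct {
    int freeze_count;
} glyph_cache_t;

typedef struct {
    int            locked;  /* stands in for the font mutex */
    glyph_cache_t *global;
} font_t;

static void
font_freeze (font_t *f)
{
    assert (! f->locked);
    f->locked = 1;              /* take the font "mutex" */
    f->global->freeze_count++;  /* side effect on the global cache */
}

static void
font_thaw (font_t *f)
{
    assert (f->locked);
    f->global->freeze_count--;  /* lets the global cache be reaped */
    f->locked = 0;
}

/* Every exit path, including the error path, goes through thaw. */
static int
with_glyphs (font_t *f, int simulate_error)
{
    int status = 0;

    font_freeze (f);
    if (simulate_error) {
        status = -1;
        goto out;
    }
    /* ... look up and use glyphs here ... */
out:
    font_thaw (f);
    return status;
}
```

If the error path returned without calling font_thaw, freeze_count in
this model would stay above zero forever, which mirrors the stuck
global cache described above.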
But note we can only do the exchange if they do indeed match and
there are no other references (the objects are only on the stack).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Speedups
========
firefox-paintball 59462.09 -> 40928.76: 1.45x speedup
firefox-fishtank 43687.33 -> 34627.78: 1.26x speedup
firefox-tron 52526.00 -> 45754.73: 1.15x speedup
However in order to avoid a regression with firefox-talos-svg we need to
prevent splitting up the scanline when using a gradient source.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
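
The ratios quoted above can be reproduced by dividing the old figure by
the new one. The helper below is a trivial sketch; speedup is a
hypothetical name, and it assumes the figures are elapsed times where
lower is better.

```c
/* Ratio of the old measurement to the new one; >1 means faster. */
static double
speedup (double before, double after)
{
    return before / after;
}
```

For example, speedup (59462.09, 40928.76) gives roughly 1.45, matching
the firefox-paintball line.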
We were open-coding the functionality of map-to-image inside the source
creation routines, so refactor to actually use map-to-image instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Be more strict with when we mark the pixmap as active so that we only
wait for the actual XCopyArea involving the pixmap to complete.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Try using the lighter-weight LZO decompressor in an effort to speed up
replays (at the cost of making the bound traces slightly larger).
Presuming that, despite the change in file size (from -1% to +10%), the
file data remains in the readahead buffer cache, replays see a
performance improvement of between 5% and 10%.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When clearing a GL surface, set is_clear to true, and when mapping to an
image, handle is_clear like surfaces without modification. Additionally,
explicitly clear surfaces created via cairo_surface_create_similar.
Writing to the stencil buffer can be expensive, so when using the
stencil buffer for clipping only clear the clip extent. When using the
stencil buffer to prevent overlapping rendering during stroking, only
clear the approximate stroke extents.
s/CAIRO_GOBJECT_TYPE_HNT_METRICS/CAIRO_GOBJECT_TYPE_HINT_METRICS/
However, as we have already released the broken headers, we need to
preserve that mistake in case applications are already using it. Since it
is just a #define, there is little associated cost with carrying both
the incorrect spelling and the corrected define.
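
Carrying both spellings can look like the sketch below. To keep it
compilable on its own, the real GType getter is replaced by a
hypothetical stand-in (example_gtype_t and
example_hint_metrics_get_type); the exact form used in the released
header may differ.

```c
/* Stand-in for the real type getter, so this sketch compiles alone. */
typedef unsigned long example_gtype_t;

static example_gtype_t
example_hint_metrics_get_type (void)
{
    return 42; /* arbitrary placeholder value */
}

/* Corrected spelling. */
#define CAIRO_GOBJECT_TYPE_HINT_METRICS example_hint_metrics_get_type ()

/* Misspelt name kept so that existing applications still compile. */
#define CAIRO_GOBJECT_TYPE_HNT_METRICS CAIRO_GOBJECT_TYPE_HINT_METRICS
```

Because the old name simply expands to the new one, both resolve to the
same value and there is no runtime cost.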
Whilst it cannot handle self-intersecting strokes (which includes the
antialias region of neighbouring lines and joints), it is about 3x
faster to use than the more robust algorithm. As some backends delegate
the rendering, the quality may still be preserved and so they should be
responsible for choosing the appropriate method for generation of the
stroke geometry.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In theory this should just save a single copy, however PutImage will
break up requests into a series of scanline requests, which is less
efficient than the single-shot transfer provided by ShmPutImage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
In order to overload the emitters in future to provide specialised
routines for the common types of operands, begin by switching the
current users over to a vfunc interface.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
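
A vfunc interface of the kind described above might look like the
following sketch. Everything here is hypothetical (emitter_t, the vertex
layouts and the solid-source specialisation are illustrative only); it
shows how callers go through a function pointer so that a specialised
routine can be swapped in later without touching them.

```c
#include <stddef.h>

typedef struct emitter emitter_t;

struct emitter {
    /* vfunc: append one vertex to the buffer */
    void (*emit_vertex) (emitter_t *e, float x, float y);
    float  vb[64];  /* small fixed buffer, enough for this sketch */
    size_t n;       /* number of floats written so far */
};

/* Generic path: position plus per-vertex source coordinates. */
static void
emit_vertex_generic (emitter_t *e, float x, float y)
{
    e->vb[e->n++] = x;
    e->vb[e->n++] = y;
    e->vb[e->n++] = x; /* placeholder source coordinate */
    e->vb[e->n++] = y; /* placeholder source coordinate */
}

/* Hypothetical specialisation: a solid source needs no coordinates. */
static void
emit_vertex_solid (emitter_t *e, float x, float y)
{
    e->vb[e->n++] = x;
    e->vb[e->n++] = y;
}

static void
emitter_init (emitter_t *e, int source_is_solid)
{
    e->n = 0;
    e->emit_vertex = source_is_solid ? emit_vertex_solid
                                     : emit_vertex_generic;
}

/* Callers only ever use the vfunc, never a concrete routine. */
static void
emit_rect (emitter_t *e, float x1, float y1, float x2, float y2)
{
    e->emit_vertex (e, x1, y1);
    e->emit_vertex (e, x2, y1);
    e->emit_vertex (e, x2, y2);
    e->emit_vertex (e, x1, y2);
}
```

With this shape, emit_rect writes 16 floats through the generic emitter
and 8 through the solid one, while its own code stays identical.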
Our lazy event mechanism is sufficient for tracking when to reuse shm
memory, and the events are not necessary for the ShmPut/ShmGetImage paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>