pixman_format_t is a simple structure used in short-term allocations and
suitable for on-stack allocation.
Killing the pixman_format_create()/pixman_format_destroy() pairs avoids
around 6% of the allocations during cairo-perf (e.g. 426,158 allocs out
of a total of 7,063,469).
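A minimal sketch of the difference, assuming a made-up example_format_t (the real pixman_format_t has different fields, and all example_* names below are hypothetical). The heap variant costs one malloc/free per use; the stack variant costs none:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for pixman_format_t; the real layout differs. */
typedef struct {
    int bits_per_pixel;
    unsigned int alpha_mask;
} example_format_t;

/* Old style: every short-lived format costs one heap allocation. */
example_format_t *
example_format_create (int bpp, unsigned int alpha_mask)
{
    example_format_t *format = malloc (sizeof *format);
    if (format == NULL)
        return NULL;
    format->bits_per_pixel = bpp;
    format->alpha_mask = alpha_mask;
    return format;
}

void
example_format_destroy (example_format_t *format)
{
    free (format);
}

/* New style: the caller owns the storage, typically on its stack. */
void
example_format_init (example_format_t *format, int bpp, unsigned int alpha_mask)
{
    assert (format != NULL);
    format->bits_per_pixel = bpp;
    format->alpha_mask = alpha_mask;
}

/* Returns the depth, or -1 if the allocation failed. */
int
example_depth_via_heap (void)
{
    int depth = -1;
    example_format_t *format = example_format_create (32, 0xff000000u);
    if (format != NULL) {
        depth = format->bits_per_pixel;
        example_format_destroy (format);
    }
    return depth;
}

/* Same result with no allocation and no failure path. */
int
example_depth_via_stack (void)
{
    example_format_t format;
    example_format_init (&format, 32, 0xff000000u);
    return format.bits_per_pixel;
}
```

Besides saving the allocation, the stack variant has no out-of-memory path for the caller to handle.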
The attribute was introduced with gcc-3.4, but the ability to suppress
warnings from misapplied attributes (-Wno-attributes) was only introduced
later. Without the suppression, gcc will emit tens of warnings for each
compilation, completely drowning the real errors that the programmer
must see.
By unexporting these functions we have exact control over their call sites
and so can convert the initial guards into asserts. The two functions then
return success unconditionally, and hence can be converted to return
void.
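As an illustration of the guard-to-assert conversion, here is a sketch with a hypothetical example_pen_t and hypothetical function names (not the functions this change touched):

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    EXAMPLE_STATUS_SUCCESS = 0,
    EXAMPLE_STATUS_INVALID
} example_status_t;

/* Hypothetical object; stands in for whatever the functions operate on. */
typedef struct {
    int width;
} example_pen_t;

/* Before: exported, so it must guard against bad input and report it. */
example_status_t
example_pen_set_width_checked (example_pen_t *pen, int width)
{
    if (pen == NULL || width < 0)
        return EXAMPLE_STATUS_INVALID;
    pen->width = width;
    return EXAMPLE_STATUS_SUCCESS;
}

/* After: internal only. Every call site is known to pass valid input,
 * so the guards become asserts and there is no status left to return. */
void
example_pen_set_width (example_pen_t *pen, int width)
{
    assert (pen != NULL);
    assert (width >= 0);
    pen->width = width;
}
```

Callers of the void version no longer need an error path that could never be taken.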
This adds a compiler check that the function result is used by the caller
and enables it by default for all cairo_private functions and for public
API that returns a cairo_status_t.
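A sketch of how such a check can be wired up with gcc's warn_unused_result attribute. The macro and function names here are hypothetical, not cairo's own; the version test reflects the gcc-3.4 requirement:

```c
#include <assert.h>

/* Expands to the attribute on gcc >= 3.4 and to nothing elsewhere. */
#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))
#define EXAMPLE_WARN_UNUSED __attribute__((__warn_unused_result__))
#else
#define EXAMPLE_WARN_UNUSED
#endif

typedef enum {
    EXAMPLE_STATUS_SUCCESS = 0,
    EXAMPLE_STATUS_INVALID
} example_status_t;

/* The attribute goes on the declaration that callers see. */
EXAMPLE_WARN_UNUSED example_status_t
example_reserve (int *capacity, int wanted);

example_status_t
example_reserve (int *capacity, int wanted)
{
    if (capacity == 0 || wanted < 0)
        return EXAMPLE_STATUS_INVALID;
    if (wanted > *capacity)
        *capacity = wanted;
    return EXAMPLE_STATUS_SUCCESS;
}
```

With this in place, a bare call such as `example_reserve (&cap, 8);` draws a warning on supporting compilers, while `status = example_reserve (&cap, 8);` does not.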
It has been discussed that, to extend the warnings to all functions, a
new function type could be introduced to cover static functions:
cairo_static. This has not been done at the present time in order to
minimise the churn and focus on the more common errors.
In order to reduce the warning spew generated by gcc for invalid use of
this attribute, -Wno-attributes is added to CFLAGS. This has the
unfortunate side-effect of masking future warnings for all attributes -
be warned!
Most of the time pixman_region_init is called without any extents, and
followed by a pixman_region_union_rect, used to initialize
rectangular regions. pixman_region_union_rect is not that cheap, but
the sequence is called quite often. So it should be worth introducing
a specialized and fast function for this sequence.
This introduces pixman_region_init_rect. This new function makes
_cairo_region_init_from_rectangle obsolete.
Also removes the extent argument from pixman_region_init as it was
called with NULL most of the time. A pixman_region_init_with_extents
is added for the general case.
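The idea can be sketched with a much simplified, hypothetical region type (real pixman regions also carry a rectangle list, and the example_* names are illustrative only). Initializing straight from a rectangle skips the general union machinery:

```c
#include <assert.h>

typedef struct {
    int x1, y1, x2, y2;
} example_box_t;

/* Hypothetical, simplified region: just extents plus an empty flag. */
typedef struct {
    example_box_t extents;
    int is_empty;
} example_region_t;

/* Plain init: an empty region with no extents argument. */
void
example_region_init (example_region_t *region)
{
    assert (region != 0);
    region->extents.x1 = region->extents.y1 = 0;
    region->extents.x2 = region->extents.y2 = 0;
    region->is_empty = 1;
}

/* Specialized init: build a single-rectangle region in one step,
 * instead of init followed by a union with the rectangle. */
void
example_region_init_rect (example_region_t *region,
                          int x, int y,
                          unsigned int width, unsigned int height)
{
    assert (region != 0);
    region->extents.x1 = x;
    region->extents.y1 = y;
    region->extents.x2 = x + (int) width;
    region->extents.y2 = y + (int) height;
    region->is_empty = (width == 0 || height == 0);
}
```

A caller that previously wrote an init followed by a union now makes a single call, for example `example_region_init_rect (&region, 10, 20, 30, 40);`.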
Thanks to Thomas Klausner for passing the report along.
This fixes the following bug report:
hidden attribute does not work with Solaris ld
https://bugs.freedesktop.org/show_bug.cgi?id=10227
And as Behdad points out, an even better fix would be to
move checks for supported visibility attribute to configure.
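A sketch of a preprocessor-level guard of this kind. The exact condition below is an assumption made for illustration (the macro and function names are hypothetical too), and a configure-time check, as suggested above, would be more robust than testing predefined macros:

```c
#include <assert.h>

/* Assumed condition: only use hidden visibility with gcc >= 4 on ELF
 * targets that are not Solaris; everywhere else expand to nothing. */
#if defined(__GNUC__) && (__GNUC__ >= 4) && defined(__ELF__) && !defined(__sun)
#define EXAMPLE_PRIVATE __attribute__((__visibility__("hidden")))
#else
#define EXAMPLE_PRIVATE
#endif

/* Internal helper: usable across the library's own source files,
 * but not exported from the shared object where the attribute applies. */
EXAMPLE_PRIVATE int example_internal_helper (int x);

/* Public entry point: keeps default visibility. */
int example_public_entry (int x);

EXAMPLE_PRIVATE int
example_internal_helper (int x)
{
    return x * 2;
}

int
example_public_entry (int x)
{
    return example_internal_helper (x) + 1;
}
```

Where the attribute is unsupported the macro disappears, so the code still builds; the helper is then merely exported rather than hidden.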
Previously the gradient walker was doing excessive resets, (such
as on every pixel in constant-colored regions or outside the
gradient with CAIRO_EXTEND_NONE). Don't do that.
The previous implementation fell apart quite badly when neither radius
value was equal to 0.0. I derived the math from scratch, (much thanks to
Vincent Torri <vtorri@univ-evry.fr> for guiding me to a simpler derivation
than I was doing originally), and it's working much better now without
being any slower, (in fact, cairo-perf shows speedup of 1.05x to 1.58x on
my laptop here).
This work also provides groundwork for defining the behavior of radial
gradients where neither circle is wholly contained within the other, (though
we haven't done that definition yet---it will require a new test case and a
very little bit of work on the implementation).
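For reference, one way to set up the general two-circle problem is sketched below. The notation is an assumption made here and is not necessarily the derivation used in the implementation. Interpolate the center and radius linearly between the circles $(c_1, r_1)$ and $(c_2, r_2)$, and for a point $p$ look for the parameter $t$ whose interpolated circle passes through $p$:

```latex
\begin{align*}
  \Delta c &= c_2 - c_1, \qquad \Delta r = r_2 - r_1, \qquad \Delta p = p - c_1 \\
  \lVert \Delta p - t\,\Delta c \rVert^2 &= (r_1 + t\,\Delta r)^2 \\
  A\,t^2 - 2B\,t + C &= 0, \quad\text{where} \\
  A &= \lVert \Delta c \rVert^2 - \Delta r^2 \\
  B &= \Delta p \cdot \Delta c + r_1\,\Delta r \\
  C &= \lVert \Delta p \rVert^2 - r_1^2 \\
  t &= \frac{B \pm \sqrt{B^2 - A\,C}}{A} \qquad (A \neq 0)
\end{align*}
```

None of these terms vanish merely because both radii are nonzero, so that case needs no special handling. Which root to take (for instance, one with $r_1 + t\,\Delta r \ge 0$) is a separate choice that this sketch does not settle.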
This is a fix for the following bug report:
Radial Gradients with nonzero inner radius misplace stops
https://bugs.freedesktop.org/show_bug.cgi?id=7685
Since these makefiles were last updated some new source
files have been added and one renamed. In addition, a "clean" rule
needed to be added to the pixman makefile. And the "clean" rule in the
main cairo makefile wasn't working properly for me.
The warning appears all over the place when the code converts from ullong to __m64.
The conversion is done with a C idiom: 1) take a pointer to the value, 2)
cast it to the pointer type of the target, 3) dereference it.
That is "*(__m64*)(&m)" in this case. This is necessary (as opposed to just
casting to the target type) because the two types may not be "compatible" from the
compiler's point of view. An example of incompatible types is a struct
versus anything else.
The "dereferencing type-punned pointer will break strict-aliasing rules" warning
from gcc means exactly: "some code may be assuming that pointers with different
types do not compare equal (or otherwise share the same target object). If
you cast a pointer to a different type and dereference it, that may happen
here." However, in our use case, it's clear that the compiler cannot make any
false assumptions. So we just go ahead and hide the warning by using an
intermediate cast to "void *". Since the compiler does not make any aliasing
assumptions about generic pointers, it will not warn either. (Though the
problems, if any, will still occur. So this is not an ideal solution to this
problem and should be used very carefully, so as not to hide useful warnings
when things go wrong on some weird architecture.)
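A sketch of the idiom, using a hypothetical two-word struct in place of __m64 so that it builds without MMX headers. The memcpy variant is an addition for comparison, not something this change uses; it gets the same bytes across without relying on the compiler's aliasing behaviour:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __m64: 64 bits, not "compatible" with uint64_t. */
typedef struct {
    uint32_t w0, w1;   /* the two 32-bit words, in memory order */
} example_pair_t;

/* The idiom described above: the intermediate (void *) cast silences the
 * type-punning warning but does not make the access any safer. */
example_pair_t
example_pun_via_void (uint64_t m)
{
    assert (sizeof (example_pair_t) == sizeof (uint64_t));
    return *(example_pair_t *) (void *) &m;
}

/* Comparison only: copying the bytes is well defined for any pair of
 * equally sized objects, and compilers usually optimize the call away. */
example_pair_t
example_pun_via_memcpy (uint64_t m)
{
    example_pair_t pair;
    assert (sizeof pair == sizeof m);
    memcpy (&pair, &m, sizeof pair);
    return pair;
}
```

Both functions reinterpret the same eight bytes, so the values of w0 and w1 depend on the byte order of the machine.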
Another solution would have been to use gcc's "may_alias" type attribute,
but trying to define a may_alias version of __m64 hit a bug in gcc. That is,
try replacing "__m64" with "m64" and define:
typedef __m64 m64 __attribute__((may_alias));
and see it fail to compile. This seems to be because of the special vector
type that __m64 has.
The implementation of _FbOnes in iccolor.c would not work correctly on
64-bit longs. Fortunately, it's only used on integers, so make that
explicit in the declaration.
Use an inline function for the gcc builtin implementation to make sure
that it's never used with arguments of incorrect size.
There is no __INT_MIN__ in gcc 4.1.1, but it's not an issue now because
the argument is 32-bit.
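A sketch of the approach, with hypothetical names and a textbook bit-counting routine rather than the actual _FbOnes body. The inline wrapper pins the argument type to 32 bits, so the builtin can never be handed a wider value by accident:

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count. The masks assume exactly 32 bits,
 * which is why the parameter type is spelled out explicitly. */
uint32_t
example_ones_portable (uint32_t mask)
{
    mask = mask - ((mask >> 1) & 0x55555555u);
    mask = (mask & 0x33333333u) + ((mask >> 2) & 0x33333333u);
    mask = (mask + (mask >> 4)) & 0x0f0f0f0fu;
    return (mask * 0x01010101u) >> 24;
}

/* Typed wrapper: unlike a macro, the function signature converts any
 * argument to uint32_t before the builtin sees it. */
static inline uint32_t
example_ones (uint32_t mask)
{
#if defined(__GNUC__)
    return (uint32_t) __builtin_popcount (mask);
#else
    return example_ones_portable (mask);
#endif
}
```

A macro expanding directly to the builtin would accept a long without complaint and silently drop its upper bits on 64-bit platforms; the function form makes the 32-bit conversion explicit.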
Signed-off-by: Pavel Roskin <proski@gnu.org>
The patch implements a few more operations with special-cased MMX
code. On my laptop, applying the patch to cairo speeds up the
benchmark (rendering page 14 of a PDF file[*]) from 20.9 seconds
to 14.9 seconds, which is an improvement of 28.6%.
[*] http://people.redhat.com/jakub/prelink.pdf
This also benefits the recently added unaligned_clip perf case:
image-rgb unaligned_clip-100 0.11 -> 0.06: 1.65x speedup
▋
image-rgba unaligned_clip-100 0.11 -> 0.06: 1.64x speedup
▋
We update the test suite reference images where needed, (pdiff
avoided a few, but most still needed updating). We take advantage
of the need for new reference images to shrink some of the giant
tests to speed them up a bit.
This optimization provides a 2x improvement in linear gradient
generation performance (numbers from an x86 laptop):
image-rgb paint_linear_rgba_source-512 26.13 -> 11.13: 2.35x speedup
█▍
image-rgb paint_linear_rgba_source-256 6.47 -> 2.76: 2.34x speedup
█▍
image-rgba paint_linear_rgb_over-256 6.51 -> 2.86: 2.28x speedup
█▎
image-rgb paint_linear_rgba_over-512 28.62 -> 13.70: 2.09x speedup
█▏
image-rgba fill_linear_rgb_over-256 3.24 -> 1.94: 1.66x speedup
▋
image-rgb stroke_linear_rgba_over-256 5.68 -> 4.10: 1.39x speedup
▍