Since the objects can be shared and may be in use simultaneously across
multiple threads, setting the status needs to be atomic. We use a
locked compare and exchange in order to avoid overwriting an existing
error - that is we use an atomic operation that only sets the new status
value if the current value is CAIRO_STATUS_SUCCESS.
Test for the availability of the Intel __sync_* atomic primitives and
use them to define a few operations useful for reference counting -
providing a generic interface that may be targeted at more architectures
in the future. If no atomic primitives are available, use a mutex based
variant. If the contention on that mutex is too high, we can consider
using an array of mutexes using the address of the atomic variable as
the hash.