By ensuring that tests take longer than a couple of seconds we eliminate
systematic errors in our measurements. However, we also effectively
eliminate the synchronisation overhead. To compensate, we attempt to
estimate the overhead by reporting the difference between a single
instance and the minimum averaged instance.
Generate a cairo-perf-diff graph for a series of commits in order to be
able to identify significant commits. Still very crude, but minimally
functional.