New article on the multiple breakpoint problem in homogenization

Sunday, 24 March 2013

An interesting paper by Ralf Lindau and me on the multiple breakpoint problem has just appeared in a Special issue on homogenization of the open access Quarterly Journal of the Hungarian Meteorological Service "Időjárás".

Multiple break point problem

Long instrumental time series contain non-climatological changes, called inhomogeneities, for example because of relocations or changes in the instrumentation. To study real changes in the climate more accurately, these inhomogeneities need to be detected and removed in a data processing step called homogenization, which is also called segmentation in statistics.

Statisticians have worked a lot on the detection of a single break point in data. Unfortunately, long climate time series typically contain more than one break point. There are two ad hoc methods to deal with this.

The most used method is the hierarchical one: first detect the largest break, then redo the detection on the two subsections, and so on until no more breaks are found or the segments become too short. A variant is the semi-hierarchical method, in which previously detected breaks are retested and removed if they are no longer significant. For example, SNHT uses a semi-hierarchical scheme, and so does the pairwise homogenization algorithm of NOAA, which uses SNHT for detection.
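
The hierarchical idea can be sketched in a few lines of Python. This is only an illustration, not SNHT or the NOAA algorithm: the function names, the `threshold` on the explained sum of squares and the `min_len` parameter are assumptions made for this example, and a real method would use a proper significance test instead of a fixed threshold.

```python
def best_single_break(x, min_len):
    """Return (position, gain) of the single split that explains most variance.

    `gain` is the between-segment sum of squares of the two-step function.
    Position k means the break lies between x[k-1] and x[k].
    """
    n = len(x)
    total = sum(x)
    mean = total / n
    best_pos, best_gain = None, 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += x[k - 1]
        # Skip splits that would leave a segment shorter than min_len.
        if k < min_len or n - k < min_len:
            continue
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        gain = k * (left_mean - mean) ** 2 + (n - k) * (right_mean - mean) ** 2
        if gain > best_gain:
            best_pos, best_gain = k, gain
    return best_pos, best_gain


def hierarchical_breaks(x, threshold, min_len=5, offset=0):
    """Detect the largest break, then recurse into the two subsections."""
    pos, gain = best_single_break(x, min_len)
    # Stop when the segment is too short or the best break is too small.
    # `threshold` is a hypothetical placeholder for a significance test.
    if pos is None or gain < threshold:
        return []
    left = hierarchical_breaks(x[:pos], threshold, min_len, offset)
    right = hierarchical_breaks(x[pos:], threshold, min_len, offset + pos)
    return left + [offset + pos] + right


# Noise-free toy series with two jumps, purely for illustration.
series = [0] * 20 + [2] * 20 + [5] * 20
print(hierarchical_breaks(series, threshold=1.0))
```

Note that every detection step only looks for one break at a time, which is exactly the limitation the simultaneous methods discussed below try to avoid.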

The second ad hoc method is to detect the breaks on a moving window. This window should be long enough for sensitivity, but should not be too long because that increases the chance of two breaks in the window. In the Special issue there is an article by José A. Guijarro on this method, which is used for his homogenization method CLIMATOL.
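
A moving-window detection can be sketched as follows. This is not the CLIMATOL algorithm; it is a minimal example that assumes a simple statistic (the absolute difference between the mean before and the mean after each position) and a hypothetical fixed `threshold`. The names `window_statistic`, `window_breaks` and `half` are made up for this sketch.

```python
from statistics import fmean


def window_statistic(x, half):
    """Absolute mean difference between the `half` values before and after t."""
    stats = {}
    for t in range(half, len(x) - half + 1):
        before = x[t - half:t]
        after = x[t:t + half]
        stats[t] = abs(fmean(after) - fmean(before))
    return stats


def window_breaks(x, half, threshold):
    """Positions where the statistic exceeds the threshold and is a local peak."""
    stats = window_statistic(x, half)
    breaks = []
    for t, s in stats.items():
        # Compare with all positions whose window overlaps the one at t.
        neighbours = [stats[u] for u in range(t - half, t + half + 1) if u in stats]
        if s >= threshold and s == max(neighbours):
            breaks.append(t)
    return breaks


# Noise-free toy series with one jump of size 3 at position 30.
series = [0] * 30 + [3] * 30
print(window_breaks(series, half=10, threshold=1.0))
```

The parameter `half` shows the trade-off mentioned above: a larger value averages over more data and is thus more sensitive, but it also makes it more likely that a second break falls inside the window and distorts the two means.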

While these two ad hoc methods work reasonably, detecting all breaks simultaneously is more powerful. This can be performed as an exhaustive search of all possible combinations (used by the homogenization method MASH). With on average one break per 15 to 20 years, the number of breaks and thus combinations can get very large. Modern homogenization methods consequently use an optimization method called dynamic programming (used by the homogenization methods PRODIGE, ACMANT and HOMER).
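
To get a feeling for the numbers, and for how dynamic programming avoids the exhaustive search, here is a small sketch. It is not the code of PRODIGE, ACMANT or HOMER; it assumes the simplest setting, in which the best combination of a given number of breaks is the one that minimizes the sum of squared deviations from the segment means. The function names are made up for this example.

```python
from math import comb, inf


def n_combinations(n_values, n_breaks):
    """Number of ways to place n_breaks breaks between n_values data points."""
    return comb(n_values - 1, n_breaks)


def optimal_breaks(x, n_breaks):
    """Best positions for n_breaks breaks, found by dynamic programming.

    Returns (sorted break positions, minimal within-segment sum of squares).
    """
    n = len(x)
    # Prefix sums give the sum of squares of any segment in constant time.
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        seg = s1[j] - s1[i]
        return (s2[j] - s2[i]) - seg * seg / (j - i)

    n_seg = n_breaks + 1
    # best[s][j]: minimal cost of splitting x[:j] into s segments.
    best = [[inf] * (n + 1) for _ in range(n_seg + 1)]
    prev = [[0] * (n + 1) for _ in range(n_seg + 1)]
    best[0][0] = 0.0
    for s in range(1, n_seg + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = best[s - 1][i] + cost(i, j)
                if c < best[s][j]:
                    best[s][j] = c
                    prev[s][j] = i
    # Walk back through the stored split points to recover the breaks.
    breaks, j = [], n
    for s in range(n_seg, 1, -1):
        j = prev[s][j]
        breaks.append(j)
    return sorted(breaks), best[n_seg][n]


print(n_combinations(100, 5))  # combinations for 5 breaks in 100 annual values
print(optimal_breaks([0] * 20 + [2] * 20 + [5] * 20, n_breaks=2))
```

The exhaustive search has to evaluate every one of these combinations, whereas the dynamic programming loop above needs a number of steps that grows only with the number of breaks times the square of the series length.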

All the mentioned homogenization methods have been compared with each other on a realistic benchmark dataset by the COST Action HOME. In the corresponding article (Venema et al., 2012) you can find references to all the mentioned methods. The results of this benchmarking showed that multiple breakpoint methods were clearly the best. However, this is not only because of the elegant solution to the multiple breakpoint problem; these methods also had other advantages.

Sketch illustrating the internal and external variance.

Optimal number of breaks

To determine the optimal number of breaks, the variance explained by the breaks is utilized. More precisely, you compute the mean of every segment, and the variance of the resulting step function is used as a measure for the size of the breaks. In the paper this is called the external variance, and the variance around the means of the homogeneous subperiods is called the internal variance. The sum of the external and the internal variance is constant, so maximizing the external variance is the same as minimizing the internal variance. Because the internal variance is a sum of separate contributions from the individual segments, dynamic programming can be applied.
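
The decomposition is easy to check numerically. The sketch below assumes that both variances are normalized by the total number of values; the exact normalization used in the paper may differ, and the function name `variance_split` is made up for this example.

```python
def variance_split(x, breaks):
    """Return (external, internal) variance of x for the given break positions.

    A break at position k lies between x[k-1] and x[k].
    """
    n = len(x)
    mean = sum(x) / n
    edges = [0] + sorted(breaks) + [n]
    external = internal = 0.0
    for a, b in zip(edges, edges[1:]):
        seg = x[a:b]
        seg_mean = sum(seg) / len(seg)
        # External: variance of the step function of segment means.
        external += len(seg) * (seg_mean - mean) ** 2
        # Internal: variance around the mean of each segment.
        internal += sum((v - seg_mean) ** 2 for v in seg)
    return external / n, internal / n


x = [1, 2, 3, 7, 8, 9]
total = sum((v - sum(x) / len(x)) ** 2 for v in x) / len(x)
for breaks in ([], [3], [2, 4]):
    ext, inter = variance_split(x, breaks)
    # The sum equals the total variance, whatever the break positions are.
    print(breaks, ext, inter, ext + inter, total)
```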

The more breaks are inserted, the larger the external variance is. Thus you need a criterion to determine the statistical significance of an additional break. As a step towards such a criterion, the new paper derives an analytical function for the average external variance of multiple break points in white noise. Statistically formulated: normal white noise is the null hypothesis; for a combination of breaks to be statistically significant, the external variance needs to be larger than one would expect to obtain in normal white noise.
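
The behaviour under the null hypothesis can also be explored with a small Monte Carlo experiment. The sketch below is not the analytical function of the paper; it simply generates normal white noise, finds for each number of breaks the combination with the largest external variance (with a compact dynamic programming search) and averages the explained fraction of the variance. The series length, the number of simulations and the function names are arbitrary choices for this example.

```python
import random


def max_external_fraction(x, max_breaks):
    """Largest fraction of the variance explained by k breaks, k = 0..max_breaks."""
    n = len(x)
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def sse(i, j):
        # Sum of squared deviations of x[i:j] from its own mean.
        seg = s1[j] - s1[i]
        return (s2[j] - s2[i]) - seg * seg / (j - i)

    total = sse(0, n)
    # best[j]: minimal internal sum of squares of x[:j] for the current k.
    best = [sse(0, j) if j else 0.0 for j in range(n + 1)]
    fractions = [0.0]
    for k in range(1, max_breaks + 1):
        new = [float("inf")] * (n + 1)
        for j in range(k + 1, n + 1):
            new[j] = min(best[i] + sse(i, j) for i in range(k, j))
        best = new
        fractions.append(1.0 - best[n] / total)
    return fractions


def mean_external_fraction(n=50, max_breaks=3, n_sim=200, seed=1):
    """Average explained fraction in white noise for 0..max_breaks breaks."""
    rng = random.Random(seed)
    sums = [0.0] * (max_breaks + 1)
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for k, f in enumerate(max_external_fraction(x, max_breaks)):
            sums[k] += f
    return [s / n_sim for s in sums]


for k, f in enumerate(mean_external_fraction()):
    print(k, round(f, 3))
```

Even though the simulated series contain no breaks at all, the explained fraction grows with every added break, which is why an additional break has to explain more than this baseline before it can be called significant.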

The main fundamental finding of our paper is that the external variance has a beta distribution. Of more practical nature is the finding that the penalty function used by PRODIGE, ACMANT and HOMER to determine the number of breaks is not optimal. For a typical case with 100 years of data, this penalty function works well, but for very short or long time series it needs to be improved.

Our paper does not specify a new stopping criterion yet. The analytical relationship is for the average amount of external variance, that is, it gives a value that is exceeded about 50% of the time. In statistics it is customary to call a result statistically significant if it exceeds the value that is exceeded only 5% of the time, and for this value we can currently only give numerical results.
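
The difference between the average and the 5% level can be illustrated numerically for the simplest case of one break. This sketch does not reproduce the numerical results of the paper; the series length, the number of simulations and the simple empirical quantile are assumptions made for this example.

```python
import random


def best_single_break_fraction(x):
    """Largest fraction of the variance that a single break can explain."""
    n = len(x)
    total = sum(x)
    mean = total / n
    tss = sum((v - mean) ** 2 for v in x)
    left_sum = 0.0
    best = 0.0
    for k in range(1, n):
        left_sum += x[k - 1]
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        ext = k * (left_mean - mean) ** 2 + (n - k) * (right_mean - mean) ** 2
        best = max(best, ext)
    return best / tss


def null_mean_and_quantile(n=100, n_sim=1000, level=0.95, seed=1):
    """Mean and empirical quantile of the explained fraction in white noise."""
    rng = random.Random(seed)
    values = sorted(
        best_single_break_fraction([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_sim)
    )
    mean = sum(values) / n_sim
    # Simple empirical quantile: the value exceeded by about (1 - level) of the runs.
    quantile = values[min(int(level * n_sim), n_sim - 1)]
    return mean, quantile


mean, q95 = null_mean_and_quantile()
print("average:", round(mean, 3), " exceeded in 5% of the cases:", round(q95, 3))
```

A break that explains just a bit more than the average is thus far from significant; a stopping criterion needs the upper tail of the distribution, not only its mean.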

Furthermore, it is not clear whether the traditional 5% level is optimal for homogenization. This 5% level is more or less arbitrary. It was chosen to avoid every second experiment refuting the theory of gravity, which would result in too many faulty press releases. With the 5% level, one needs to do 20 experiments for one faulty press release. For a given amount of data, you have to make a compromise between the ability to detect true breaks and the number of false breaks. If one is very strict about not detecting false breaks, the downside is that one will detect fewer true breaks. Detecting few false breaks and detecting many true breaks are both desirable. It would be very coincidental if the traditional 5% level were the optimal compromise.

Special issue

This Special issue contains many more interesting papers. It is an offspring of the COST Action HOME: Advances in homogenization methods of climate series: an integrated approach (COST-ES0601). UPDATE: in this post, you can find links to all the articles and a short discussion.

References

Lindau, R. and V. Venema. On the multiple breakpoint problem and the number of significant breaks in homogenization of climate records. Időjárás, Quarterly Journal of the Hungarian Meteorological Service, 117, no. 1, pp. 1-34, 2013.
Venema, V., O. Mestre, E. Aguilar, I. Auer, J.A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C.N. Williams, M.J. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, Ch. Gruber, M. Prohom Duran, T. Likso, P. Esteban, Th. Brandsma. Benchmarking homogenization algorithms for monthly data. Climate of the Past, 8, pp. 89-115, doi: 10.5194/cp-8-89-2012, 2012.

Variable Variability
