Tuesday 10 January 2012

New article: Benchmarking homogenisation algorithms for monthly data

The main paper of the COST Action HOME on homogenisation of climate data has been published today in Climate of the Past. This post briefly describes the problem of inhomogeneities in climate data and how such data problems are corrected by homogenisation. The main part explains the topic of the paper: a new blind validation study of homogenisation algorithms for monthly temperature and precipitation data, in which the most widely used and best-performing algorithms participated.


To study climatic variability the original observations are indispensable, but not directly usable. In addition to real climate signals they may also contain non-climatic changes. Corrections to the data are needed to remove these non-climatic influences; this is called homogenisation.

The best-known non-climatic change is the urban heat island effect. The temperature in cities can be warmer than in the surrounding countryside, especially at night. Thus as cities grow, one may expect that temperatures measured in cities become higher. On the other hand, many stations have been relocated from cities to nearby, typically cooler, airports.

Other non-climatic changes can be caused by changes in measurement methods. Meteorological instruments are typically installed in a screen to protect them from direct sun and wetting. In the 19th century it was common to use a metal screen on a north-facing wall. However, the building may warm the screen, leading to higher temperature measurements. When this problem was realised, the so-called Stevenson screen was introduced, typically installed in gardens, away from buildings. This is still the most common weather screen, with its characteristic double-louvre door and walls. Nowadays automatic weather stations, which reduce labor costs, are becoming more common; they protect the thermometer with a number of white plastic cones. This necessitated a change from manually read liquid-in-glass thermometers to automated electrical resistance thermometers, which tends to reduce the recorded temperature values.

One way to study the influence of changes in measurement techniques is by making simultaneous measurements with historical and current instruments, procedures or screens. This picture shows three meteorological shelters next to each other in Murcia (Spain). The rightmost shelter is a replica of the Montsouri screen, in use in Spain and many European countries in the late 19th century and early 20th century. In the middle is a Stevenson screen equipped with automatic sensors; at the left, a Stevenson screen equipped with conventional meteorological instruments.
Picture: Project SCREEN, Center for Climate Change, Universitat Rovira i Virgili, Spain.

A further example of a change in measurement method is that the precipitation amounts observed in the early instrumental period (roughly before 1900) are biased about 10% low compared with today. At the time, instruments were installed on rooftops to ensure that the gauge was never shielded from the rain, but it was later found that, due to the turbulent flow of the wind over roofs, some rain droplets and especially snowflakes did not fall into the opening. Consequently, measurements are nowadays performed closer to the ground.


To reliably study the real development of the climate, non-climatic changes have to be removed. For this, the small differences between a station and its direct neighbours are utilized. In this way non-climatic changes in a single station (usually shelter and instrument changes or station relocations) can be seen much more clearly than in the record of that station by itself, where they are masked by the strong natural climatic variability. This method does not work when changes are applied to a whole country's network at once. Such extensive changes are less problematic, however, because they are typically well documented.
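The idea of comparing a station with a neighbour can be sketched in a few lines. The following toy example, with made-up numbers and a crude shift-in-mean statistic (not one of the Action's actual algorithms), shows how the shared climate signal cancels in the difference series, leaving the non-climatic jump exposed:

```python
import numpy as np

rng = np.random.default_rng(42)

n = 120  # ten years of monthly anomalies
climate = 0.5 * np.sin(2 * np.pi * np.arange(n) / 12)  # shared regional signal

# Neighbour: regional climate plus independent measurement noise.
neighbour = climate + rng.normal(0, 0.2, n)
# Candidate: same climate, plus a 1 degree jump (e.g. a relocation) at month 60.
candidate = climate + rng.normal(0, 0.2, n)
candidate[60:] += 1.0

# The shared climate signal cancels in the difference series,
# leaving mostly the non-climatic jump plus noise.
diff = candidate - neighbour

def find_break(d, margin=12):
    """Scan candidate breakpoints and return the one with the largest
    standardised shift in the mean (a crude two-sample statistic)."""
    best_k, best_t = None, -np.inf
    for k in range(margin, len(d) - margin):
        left, right = d[:k], d[k:]
        se = np.sqrt(left.var(ddof=1) / len(left) + right.var(ddof=1) / len(right))
        t = abs(left.mean() - right.mean()) / se
        if t > best_t:
            best_k, best_t = k, t
    return best_k

print(find_break(diff))  # detected break position, close to month 60
```

Real homogenisation methods refine this idea considerably, for example by handling multiple breaks and by not trusting the neighbour to be homogeneous either.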
UPDATE (17 Jan. 2012): There is now a longer post with more information on inhomogeneities and homogenisation.

Meteorological window suggested by the Italian Central Office for Meteorology and Climate in 1879 (Tacchini, 1879). In the last decades of the 19th century most Italian observations were performed in urban environments, in screens located outside a north-facing window on the highest floor of a "meteorological tower". The purpose of using such towers was to perform observations above the level of the roofs of the surrounding buildings. Picture: Michele Brunetti, ISAC-CNR, Bologna, Italy.


To study the performance of the various homogenisation methods, the COST Action HOME has performed a test with artificial climate data. The advantage of artificial data is that the non-climatic changes are known to those who created the data. The artificial data used mimics climatic networks and their data problems with unprecedented realism.

For me it was interesting to see how my algorithm for generating surrogate three-dimensional cloud fields (the Iterative Amplitude Adjusted Fourier Transform algorithm, IAAFT) could be applied to climate research. This algorithm can produce non-Gaussian data with arbitrary temporal variability and with cross-correlations between the stations. It turned out that the added realism of surrogate climate data matters: homogenisation is more difficult with surrogate data than with Gaussian white noise.
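For the curious, the core IAAFT iteration for a single time series can be sketched in a few lines of NumPy. This is a minimal single-series version for illustration; the benchmark data required a multivariate variant that also preserves the cross-correlations between stations.

```python
import numpy as np

def iaaft(x, n_iter=100, seed=0):
    """Generate a surrogate series that reproduces the value distribution
    of x exactly and its amplitude (power) spectrum approximately,
    with otherwise randomised phases."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    amplitudes = np.abs(np.fft.rfft(x))  # target power spectrum
    sorted_x = np.sort(x)                # target value distribution
    s = rng.permutation(x)               # random starting point
    for _ in range(n_iter):
        # Step 1: impose the target Fourier amplitudes, keep current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(amplitudes * np.exp(1j * phases), n=len(x))
        # Step 2: impose the target distribution by rank-ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s
```

Because the last step is always the rank-ordering, the surrogate's distribution matches the original exactly, while the spectral match improves with each iteration; this is what allows non-Gaussian data with realistic temporal correlations.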

The artificial data may have a warming, a cooling or no trend, to ensure objective testing of the methods.

The main novelty is that the test was blind. In other words, while homogenising the data the scientists did not know which station contained which non-climatic problem. The artificial data were generated and the analysis of results was performed by independent researchers, who did not homogenise the data themselves. The Action asked me to write this paper, as I could be an impartial arbiter not having worked on homogenisation before. Consequently, the COST Action is sure that the results are an honest appraisal of the true power of homogenisation algorithms. For some more general thoughts on benchmarking, see my post on benchmarking as a community activity.

Some people who remain skeptical of climate change claim that the adjustments climatologists apply to the data, to correct for the issues described above, lead to overestimates of global warming. The results clearly show that homogenisation improves the quality of temperature records and makes the estimates of climatic trends more accurate.

The photo on the left shows an open shelter for meteorological instruments at the edge of the school square of the primary school of La Rochelle, in 1910. La Rochelle is a coastal city in western France and a seaport on the Bay of Biscay. On the right one sees the current situation: a Stevenson-like screen located closer to the ocean, along the Atlantic shore, in a place named "Le bout blanc". Behind the fence you can see the water of the port. Picture: Olivier Mestre, Meteo France, Toulouse, France.

Methodological advances in homogenisation

In the past it was customary in homogenisation to compare a station with its neighbours by creating a reference time series from the average of multiple neighbouring stations. Due to the averaging, the influence of random non-climatic factors is strongly reduced. Thus if a jump was found in the difference time series of a station with its reference, the jump was assumed to be in the station, not in the reference, which was assumed to be homogeneous. In recent years climatologists and statisticians have worked on advanced statistical methods that do not need the assumption that the reference is homogeneous. The traditional methods reduced the influence of non-climatic factors on the temperature measurements, but the complex modern methods clearly improved the data much more. This finding could only be reached using benchmark data simulating complete networks with realistic non-climatic problems. Thus we can now recommend with confidence that climatologists should use such new methods. These recommendations are not only based on the numerical results, but also on our mathematical understanding of the algorithms.
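Why averaging neighbours suppresses random noise can be seen in a small toy example (made-up numbers, not from the paper): the independent noise of m neighbours shrinks by roughly a factor of the square root of m in the composite reference, which is what made the traditional reference-series approach workable at all.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 240, 8  # 20 years of monthly values, 8 neighbouring stations

# Shared low-frequency regional climate signal (a slow random walk).
climate = np.cumsum(rng.normal(0, 0.05, n))

# Each neighbour sees the same climate plus its own measurement noise.
neighbours = climate + rng.normal(0, 0.3, (m, n))

# Composite reference: the average over all neighbours.
reference = neighbours.mean(axis=0)

# Residual noise in a single neighbour vs. in the averaged reference:
print(np.std(neighbours[0] - climate))  # about 0.3
print(np.std(reference - climate))      # about 0.3 / sqrt(8), roughly 0.11
```

The catch, as the text notes, is that the reference itself is built from inhomogeneous stations; the averaging dilutes their breaks but does not remove them, which is the assumption the modern methods drop.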

Open-access publishing

The scientific article with 31 authors describing this study has been published today in the journal Climate of the Past. This international journal is an open-access and open-review journal of the European Geosciences Union. The articles of open-access journals can be freely read by anyone; the costs of publication are borne by the authors. Open-access publishing makes it easier for researchers, also from poorer countries, to stay up to date and to participate in science. The general public can also profit from open-access publishing, as access to the primary source can make the public debate on current scientific issues in newspapers and blogs more informed. Especially for this topic, we felt it was important that everyone can read the article. Besides many EGU journals, last week the meteorological journal Tellus also joined the open-access movement.

Climate of the Past is also an open-review journal. This new way of reviewing scientific articles is public: everyone can respond to the initial draft of the paper, and everyone can read these comments as well as those of the official peer reviewers of the manuscript.

International Surface Temperature Initiative

The International Surface Temperature Initiative (ISTI) is working on an open and transparent framework for creating and hosting global temperature datasets. Its main feature will be provenance, meaning that every temperature value can be traced back to its origin. The database will contain digital images of the records, the keyed numbers, the temperature values in a common format, as well as quality-controlled and homogenised data. To be able to study the performance of the software performing all these steps, a similar artificial temperature benchmark dataset will be generated, but this time a global one, building upon the groundwork the COST Action laid with regional climate networks.

For more information

Venema, V., O. Mestre, E. Aguilar, I. Auer, J.A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C.N. Williams, M. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, Ch. Gruber, M. Prohom Duran, T. Likso, P. Esteban, Th. Brandsma. Benchmarking homogenization algorithms for monthly data. Climate of the Past, 8, pp. 89-115, 2012.

If you would like to analyse the data used in this study, please go to this page for a link to the data as well as to documents that describe the dataset and data formats in detail.

The homepage of the Action HOME contains, amongst others, a bibliography with most if not all articles on the homogenisation of climate networks.

If you are interested in homogenisation, please send me an e-mail and I will put you on our email distribution list.

More posts on homogenisation

Statistical homogenisation for dummies
A primer on statistical homogenisation with many pictures.
Homogenisation of monthly and annual data from surface stations
A short description of the causes of inhomogeneities in climate data (non-climatic variability) and how to remove it using the relative homogenisation approach.
HUME: Homogenisation, Uncertainty Measures and Extreme weather
Proposal for future research in homogenisation of climate network data.
Investigation of methods for hydroclimatic data homogenization
An example of the daily misinformation spread by the blog Watts Up With That? In this case about homogenisation.
What distinguishes a benchmark?
Main answer: benchmarking is a community effort.
