Thursday, 23 April 2020

Correcting inhomogeneities when all breaks are perfectly known

Much of the scientific literature on the statistical homogenization of climate data is about the detection of breaks, especially the literature before 2012. Much of the more recent literature studies complete homogenization algorithms. That leaves a gap for the study of correction methods.

Spoiler: if we know all the breaks perfectly, the correction method removes trend biases from a climate network perfectly. The outcome I found most surprising is that in this case the size of the breaks is irrelevant for how well the correction method works; what matters is the noise.

This post is about a study filling this gap by Ralf Lindau and me. The post assumes you are familiar with statistical homogenization; if not, you can find more information here. For correction you naturally need information on the breaks. To study correction in as much isolation as possible, we have assumed that all breaks are known. That is naturally quite theoretical, but it makes it possible to study the correction method in detail.

The correction method we have studied is a so-called joint correction method, meaning that the corrections for all stations in a network are computed in one go. The somewhat unfortunate name ANOVA is typically used for this correction method. The equations are the same as those of the ANOVA test, but the application is quite different, so I find the name confusing.

This correction method makes three assumptions: 1) all stations have the same regional climate signal; 2) every station has its own break signal, a step function whose step positions are given by the known breaks; 3) every station also has its own measurement and weather noise. The algorithm computes the values of the regional climate signal and the levels of the step functions by minimizing this noise. So in principle the method is a simple least-squares regression, but with many more coefficients than when you use it to compute a linear trend.
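As an illustration, the joint correction can be written down as one least-squares problem. The sketch below (in Python, with made-up break positions, a made-up climate signal and my own variable names; it is not the code of the study) sets up one unknown per year for the regional climate signal and one level per homogeneous segment, solves the system, and corrects each station relative to its last segment:

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_stations = 100, 3

# Illustrative truth (all numbers here are invented for the sketch):
climate = 0.01 * np.arange(n_years)          # shared regional climate signal
breaks = [[30, 70], [50], [20, 60, 80]]      # known break positions per station

# Break signal: a step function per station, break sizes with variance one
steps = np.zeros((n_stations, n_years))
for s, pos in enumerate(breaks):
    for p in pos:
        steps[s, p:] += rng.normal(0, 1)

noise = rng.normal(0, 1, (n_stations, n_years))
data = climate + steps + noise               # observed series

# One homogeneous segment = the stretch between two known breaks
segments = []                                # (station, start, stop)
for s, pos in enumerate(breaks):
    edges = [0] + list(pos) + [n_years]
    segments += [(s, a, b) for a, b in zip(edges[:-1], edges[1:])]

# Design matrix: one column per year (climate) + one per segment (level)
A = np.zeros((n_stations * n_years, n_years + len(segments)))
y = data.ravel()
for s in range(n_stations):
    A[s * n_years:(s + 1) * n_years, :n_years] = np.eye(n_years)
for j, (s, a, b) in enumerate(segments):
    A[s * n_years + a:s * n_years + b, n_years + j] = 1.0

# Minimizing the noise = ordinary least squares (the system is rank
# deficient, as a constant can be moved between climate and levels;
# lstsq returns the minimum-norm solution, and within a station only
# level differences matter for the corrections)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
est_levels = coef[n_years:]

# Correct every station to the level of its last segment
last_level = {}
for j, (s, a, b) in enumerate(segments):
    last_level[s] = est_levels[j]            # ends with the last segment
corrected = data.copy()
for j, (s, a, b) in enumerate(segments):
    corrected[s, a:b] -= est_levels[j] - last_level[s]
```

Correcting towards the last segment is one common convention; it keeps the most recent part of each series unchanged.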

Three steps

In this study we compute the errors after correction in three ways, one after another. To illustrate this, let's start simple and simulate 1000 networks of 10 stations with 100 years/values. In the first examples below these stations have exactly five breaks, whose sizes are drawn from a normal distribution with variance one. The noise, simulating measurement uncertainties and differences in local weather, also has a variance of one. That is quite noisy for annual temperature averages in Europe, but such noise levels occur earlier in the climate record and in other regions. Also, to keep it simple, there is no net trend bias yet.
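The simulation setup described above can be sketched as follows; the random seed, break placement and loop structure are my own choices for illustration, not the study's code:

```python
import numpy as np

rng = np.random.default_rng(42)
n_networks, n_stations, n_years, n_breaks = 1000, 10, 100, 5

# Inserted inhomogeneities: per station a step function with five breaks
# at random positions; break sizes are drawn from N(0, 1)
inhom = np.zeros((n_networks, n_stations, n_years))
for i in range(n_networks):
    for s in range(n_stations):
        pos = rng.choice(np.arange(1, n_years), size=n_breaks, replace=False)
        for p in pos:
            inhom[i, s, p:] += rng.normal(0, 1)

# Measurement and weather noise, also with variance one;
# no climate signal and no net trend bias yet
noise = rng.normal(0, 1, inhom.shape)
data = inhom + noise
```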

The figure to the right is a scatterplot with theoretically 1000*10*100=1 million yearly temperature averages as they were simulated (on the x-axis) and after correction (y-axis).

Within the plots we show some statistics. On the top left these are: 1) the mean of x, i.e. the mean of the inserted inhomogeneities; 2) the variance of the inserted inhomogeneities x; 3) the mean of the computed corrections y; and 4) the variance of the corrections.

In the lower right, 1) the correlation (r) is shown and 2) the number of values (n). For technical reasons, we only show a sample of the 1 million points in the scatterplot, but these statistics are based on all values.

The results look encouraging: they show a high correlation of 0.96, and the points scatter nicely around the x = y line.

The second step is to look at the trends of the stations. There is one trend per station, so we have 1000*10 = 10,000 of them; see the figure to the right. The trend is computed in the standard way using least-squares linear regression. Trends would normally have the unit °C per year or per century. Here we multiplied the trend by the period, so the values are the total change due to the trend and have the unit °C.
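The trend computation is thus a least-squares fit whose slope is multiplied by the period. A minimal sketch with a made-up toy series (multiplying by the 99-year span is one possible convention for "the period"):

```python
import numpy as np

rng = np.random.default_rng(1)
n_years = 100
years = np.arange(n_years)

# Toy station series: slope 0.005 °C/yr (0.5 °C over the record) plus
# variance-one noise; the numbers are invented for this sketch
series = 0.005 * years + rng.normal(0, 1, n_years)

slope = np.polyfit(years, series, 1)[0]   # °C per year
total_change = slope * (n_years - 1)      # total change over the record, °C
```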

The values again scatter beautifully around x=y and the correlation is as high as before: 0.95.

The final step is to compute the 1000 network trends. The result is shown below. The averaging over 10 stations reduces the noise in the scatterplot and the values again scatter beautifully around the x = y line. The correlation is now smaller, but still decent: 0.81. Remember that we started with quite noisy data, where the noise was as large as the break signal.

The remaining error

In the next step, rather than plotting the network trend after correction on the y-axis, we plotted the difference between this trend and the inserted network mean trend, which is the trend error remaining after correction. This is plotted in the left panel below. In this case the uncertainty after correction is half the uncertainty before correction, in terms of the printed variances. It is more typical to express uncertainties as standard deviations; then the remaining trend error is 71%. Furthermore, the averages are basically zero, so no bias is introduced.
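The step from the halved variance to the 71% figure is simply the square root of the variance ratio:

```python
import math

# Variance after correction is half the variance before correction;
# expressed as a standard deviation, the remaining trend error is
# sqrt(1/2) of the original error
remaining_fraction = math.sqrt(0.5)
print(f"{remaining_fraction:.0%}")  # 71%
```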

With a signal having as much break variance as noise variance from the measurement and weather differences between the stations, the correction algorithm naturally cannot reconstruct the original inhomogeneities perfectly, but it does so decently and its errors have nice statistical properties.

Now if we increase the variance of the break signal by a factor of two, we get the result shown in the right panel. Comparing the two panels, it is striking that the trend error after correction is the same: it does not depend on the break signal; only the noise determines how accurate the trends are. In case of large break signals this is nice, but if the break signal is small, it also means that the correction can increase the random trend error. That could be the case in regions where the networks are sparse and the difference time series between two neighboring stations are consequently quite noisy.

Large-scale trend biases

This was all quite theoretical, as the networks did not have a bias in their trends. They did have a random trend error due to the inserted inhomogeneities, but averaging over many such networks of 10 stations, the trend error would tend to zero. If that were the case in reality, not many people would work on statistical homogenization. The main aim is to reduce the uncertainties in large-scale trends due to (possible) large-scale biases in those trends.

Such large-scale trend biases can be caused by changes in the thermometer screens used, the transition from manual to automatic observations, urbanization around the stations or relocations of stations to better sites.

If we add a trend bias to the inserted inhomogeneities and correct the data with the joint correction method, we find the result to the right. We inserted a large trend bias of 0.9 °C in all networks, and after correction it was completely removed. This again does not depend on the size of the bias or the variance of the break signal.

However, all this is only true if all breaks are known. Before I write a post about the more realistic case where the breaks are not perfectly known, I will first have to write a post about how well we can detect breaks. That will be the next homogenization post.

Some equations

Next to these beautiful scatterplots, the article has equations for each of the three steps mentioned above: 1) from the inserted breaks and noise to what this means for the station data, 2) how this affects the station trend errors, and 3) how this results in network trends.

With equations for the influence of the size of the break signal (the standard deviation of the breaks) and the noise of the difference time series (the standard deviation of the noise), one can then compute how the trend errors before and after correction depend on the signal-to-noise ratio (SNR), which is the standard deviation of the breaks divided by the standard deviation of the noise. There is also a clear dependence on the number of breaks.

Whether the network trend errors increase or decrease due to the correction method is determined by a quite simple expression: 6 times the SNR divided by the number of breaks. So if the SNR is one, as in the initial example of this post, and the number of breaks is 6 or smaller, the correction improves the trend error, while with more than 6 breaks the correction adds a random trend error. This simple equation ignores a weak dependence of the results on the number of stations in the networks.
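This rule of thumb can be written as a one-line check; treat it as a paraphrase, since the exact form, including the boundary case and the weak dependence on network size, is in the paper:

```python
# Rule of thumb: the joint correction reduces the random network trend
# error when 6 * SNR / n_breaks is at least one (paraphrase of the
# criterion stated above, not the paper's exact derivation)
def correction_improves_trend(snr: float, n_breaks: int) -> bool:
    return 6.0 * snr / n_breaks >= 1.0

print(correction_improves_trend(1.0, 5))  # True: few breaks, trends improve
print(correction_improves_trend(1.0, 8))  # False: correction adds random error
```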

Further research

I started by saying that correction methods were a research gap, but homogenization algorithms have many more steps beyond detection and correction, which should also be studied in isolation where possible to gain a better understanding.

For example, the computation of a composite reference. The selection of reference stations. The combination of statistical homogenization with metadata on documented changes in the measurement setup. And so on. The last chapter of the draft guidance on homogenization describes research needs, including research on homogenization methods. There are still a lot of interesting and important questions.

Other posts in this series

Part 5: Statistical homogenization under-corrects any station network-wide trend biases

Part 4: Break detection is deceptive when the noise is larger than the break signal

Part 3: Correcting inhomogeneities when all breaks are perfectly known

Part 2: Trend errors in raw temperature station data due to inhomogeneities

Part 1: Estimating the statistical properties of inhomogeneities without homogenization


Lindau, R. and V. Venema, 2018: On the reduction of trend errors by the ANOVA joint correction scheme used in homogenization of climate station records. International Journal of Climatology, 38, pp. 5255–5271.
