
Sunday, May 1, 2016

Christy and McNider: Time Series Construction of Summer Surface Temperatures for Alabama

John Christy and Richard McNider have a new paper in the AMS Journal of Applied Meteorology and Climatology called "Time Series Construction of Summer Surface Temperatures for Alabama, 1883–2014, and Comparisons with Tropospheric Temperature and Climate Model Simulations". Link: Christy and McNider (2016).

This post gives just a few quick notes on the methodological aspects of the paper.
1. They select data with a weak climatic temperature trend.
2. They select data with a large cooling bias due to improvements in radiation protection of thermometers.
3. They developed a new homogenization method using an outdated design and did not test it.

Weak climatic trend

Christy and McNider wrote: "This is important because the tropospheric layer represents a region where responses to forcing (i.e., enhanced greenhouse concentrations) should be most easily detected relative to the natural background."

The trend in the troposphere should be a few percent stronger than at the surface, mainly in the tropics. What is mainly interesting, however, is that they see a strong trend as a reason to prefer tropospheric temperatures, because when it comes to the surface they select the period and variable with the smallest temperature trend: the daily maximum temperatures in summer.

The trend in winter due to global warming should be 1.5 times the trend in summer and the trend in the night time minimum temperatures is stronger than the trend in the day time maximum temperatures, as discussed here. Thus Christy and McNider select the data with the smallest trend for the surface. Using their reasoning for the tropospheric temperatures they should prefer night time winter temperatures.

(And their claim on the tropospheric temperatures is not right because whether a trend can be detected does not only depend on the signal, but also on the noise. The weather noise due to El Niño is much stronger in the troposphere and the instrumental uncertainties are also much larger. Thus the signal to noise ratio is smaller for the tropospheric temperatures, even if the tropospheric record were as long as the surface observations.

Furthermore, I am somewhat amused that there are still people interested in the question whether global warming can be detected.)

[UPDATE. Tamino shows that within the USA, Alabama happens to be the region with the least warming. The more so for the maximum temperature. The more so for the summer temperature.]

Cooling bias

Then they used data with a very large cooling bias due to improvements in the protection of the thermometer against (solar and infra-red) radiation. Early thermometers were not protected as well against solar radiation and typically recorded too high temperatures. Early thermometers also recorded too cool minimum temperatures; the thermometer should not see the cold sky, otherwise it radiates out to it and cools. The warming bias in the maximum temperature is larger than the cooling bias in the minimum temperature, thus the mean temperature still has some bias, but less than the maximum temperature.

Due to this reduction in the radiation error summer temperatures have a stronger cooling bias than winter temperatures.

The warming effect of early measurements on the annual means is probably about 0.2 to 0.3°C. In the maximum temperature it will be a lot higher, and in the summer maximum temperature it will again be a lot higher.

That is why most climatologists use the annual means. Homogenization can improve climate data, but it cannot remove all biases. Thus it is good to start with data that has least bias. Much better than starting with a highly biased dataset like Christy and McNider did.

Statistical homogenization removes biases by comparing a candidate station to its neighbour. The stations need to be close enough together so that the regional climate can be assumed to be similar in both stations. The difference between two stations is then weather noise and inhomogeneities (non-climatic changes due to changes in the way temperature was measured).

If you want to be able to see the inhomogeneities, you thus need well-correlated neighbours with as little weather noise as possible. By using only the maximum temperature, rather than the mean temperature, you increase the weather noise. By using the monthly means in summer, rather than the annual means or at the very least the summer means, you increase the weather noise. By going back in time more than a century you increase the noise, because there were fewer stations to compare with at the time.
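To make the signal-to-noise argument concrete, here is a minimal sketch with synthetic data (not the authors' code or any real station) of how a difference series is built and why noisier input makes a break of a given size harder to see:

```python
# A minimal sketch of relative homogenization: the candidate is compared to a
# composite of neighbours, and any break must be visible above the weather
# noise left in that difference series. All numbers below are synthetic.
import numpy as np

rng = np.random.default_rng(42)
n_years = 60

# Shared regional climate signal plus independent weather noise per station.
regional = rng.normal(0.0, 0.8, n_years)              # common interannual variability
candidate = regional + rng.normal(0.0, 0.5, n_years)  # candidate station
neighbours = [regional + rng.normal(0.0, 0.5, n_years) for _ in range(5)]

# Insert an inhomogeneity (e.g. a screen change) of 0.3 degC in the candidate.
candidate[30:] += 0.3

# Difference series: the regional climate cancels, leaving noise + inhomogeneity.
reference = np.mean(neighbours, axis=0)
difference = candidate - reference

print("std of candidate series:  %.2f degC" % candidate.std())
print("std of difference series: %.2f degC" % difference.std())
# The 0.3 degC break is only detectable if it stands out against the
# difference-series noise; noisier data (monthly summer maxima, sparse early
# networks) makes this harder.
```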

They keyed part of the data themselves, mainly for the period before 1900, from the paper records. It sounds as if they performed no quality control on these values (to detect measurement errors). This will also increase the noise.

With such a low signal to noise ratio (inhomogeneities that are small relative to the weather noise in the difference time series), the estimated date of the breaks they still found will have a large uncertainty. It is thus a pity that they purposefully did not use information from station histories (metadata) to get the date of the breaks right.

Homogenization method

They developed their own homogenization method and tested it only on a noise signal with one break in the middle. Real series have multiple breaks, in the USA typically every 15 years. Furthermore, the reference series also has breaks.

The method uses the detection equation from the Standard Normal Homogeneity Test (SNHT), but with different significance levels. Furthermore, for some reason it does not use the hierarchical splitting of SNHT to deal with multiple breaks, but detects on a window in which it is assumed there is only one break. However, if you select the window too long it will contain more than one break, and if you select it too short the method will have no detection power. You would thus theoretically expect the use of a window for detection to perform very badly, and this is also what we found in a numerical validation study.
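For reference, below is a minimal sketch of the classic single-break SNHT detection statistic that such methods build on. This is the textbook form, not Christy and McNider's implementation, and it assumes exactly one break in the tested series, which is exactly why the choice between hierarchical splitting and an ad hoc window matters.

```python
# Classic single-break SNHT statistic on a difference series (textbook form).
import numpy as np

def snht_statistic(diff_series):
    """Return (T_max, break_index) for a difference series."""
    z = (diff_series - diff_series.mean()) / diff_series.std(ddof=1)
    n = len(z)
    t = np.empty(n - 1)
    for k in range(1, n):
        z1 = z[:k].mean()          # mean of standardized values before the split
        z2 = z[k:].mean()          # mean after the split
        t[k - 1] = k * z1**2 + (n - k) * z2**2
    k_best = int(np.argmax(t)) + 1
    return t[k_best - 1], k_best

# Example: noise with a 0.5 degC shift after year 40.
rng = np.random.default_rng(0)
series = rng.normal(0.0, 0.3, 80)
series[40:] += 0.5
t_max, k = snht_statistic(series)
print("T_max = %.1f at position %d" % (t_max, k))
# T_max is compared against a significance threshold that depends on n; with
# multiple breaks in the candidate or the reference, the single-break
# assumption behind this statistic no longer holds.
```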

I see no real excuse not to use better homogenization methods (ACMANT, PRODIGE, HOMER, MASH, Craddock). These are built to take into account that the reference station also has breaks and that a series will have multiple breaks; no need for ad hoc windows.

If you design your own homogenization method, it is good scientific practice to test it first, to study whether it does what you hope it does. There is, for example, the validation dataset of the COST Action HOME. Using that immediately allows you to compare your skill to the other methods. Given the outdated design principles, I am not hopeful the Christy and McNider homogenization method would score above average.

Conclusions

These are my first impressions on the homogenization method used. Unfortunately I do not have the time at the moment to comment on the non-methodological parts of the paper.

If there are no knowledgeable reviewers available in the USA, it would be nice if the AMS would ask European researchers, rather than some old professor who in the 1960s once removed an inhomogeneity from his dataset. Homogenization is a specialization; it is not trivial to make data better, and it really would not hurt if the AMS asked for expertise from Europe when American experts are busy.

Hitler is gone. The EGU general assembly has a session on homogenization, the AGU does not. The EMS has a session on homogenization, the AMS does not. EUMETNET organizes data management workshops, a large part of which is about homogenization; I do not know of an American equivalent. And we naturally have the Budapest seminars on homogenization and quality control. Not Budapest, Georgia, nor Budapest, Missouri, but Budapest, Hungary, Europe.



Related reading

Tamino: Cooling America. Alabama compared to the rest of contiguous USA.

HotWhopper discusses further aspects of this paper and some differences between the paper and the press release. Why nights can warm faster than days - Christy & McNider vs Davy 2016

Early global warming

Statistical homogenisation for dummies

Tuesday, August 11, 2015

History of temperature scales and their impact on the climate trends

Guest post by Peter Pavlásek of the Slovak Institute of Metrology. Metrology, not meteorology: metrologists are the scientists who work on making measurements more precise by developing highly accurate standards and thus make experimental results better comparable.

Since the beginning of climate observations, temperature has been an important quantity that needed to be measured, as its values affect every aspect of human society. Therefore its precise and reliable determination has always been important. Of course, the ability to measure temperature precisely depends strongly on the measuring sensor and method. To determine how precisely a sensor measures temperature, it needs to be calibrated against a temperature standard. As science progressed, new temperature scales were introduced and the previous temperature standards naturally changed. In the following sections we will look at the importance of temperature scales throughout history and their impact on the evaluation of historical climate data.

The first definition of a temperature standard was created in 1889. At the time thermometers were ubiquitous, and had been used for centuries; for example, they had been used to document the ocean and air temperatures now included in historical records. Metrological temperature standards are based on state transitions of matter (under defined conditions and matter composition) that generate a precise and highly reproducible temperature value: for example, the melting of ice, the freezing of pure metals, etc. Multiple standards can be used as the base for a temperature scale by creating a set of defined temperature points along the scale. An early definition of a temperature scale was invented by the medical doctor Sebastiano Bartolo (1635-1676), who was the first to use melting snow and the boiling point of water to calibrate his mercury thermometers. In 1694 Carlo Renaldini, mathematician and engineer, suggested using the ice melting point and the boiling point of water, dividing the interval between these two points into 12 degrees by applying marks on a glass tube containing mercury. Réaumur divided the scale into 80 degrees, while the modern division of roughly 100 degrees was adopted by Anders Celsius in 1742. Common to all the scales was the use of phase transitions as anchor points, or fixed points, to define intermediate temperature values.

It was not until 1878 that the first standardized mercury-in-glass thermometers were introduced, as accompanying instruments for the metre prototype, to correct for thermal expansion of the length standard. These special thermometers were constructed to guarantee a reproducibility of measurement of a few thousandths of a degree. They were calibrated at the Bureau International des Poids et Mesures (BIPM), established after the signing of the Convention du Mètre in 1875. The first reference temperature scale was adopted by the 1st Conférence générale des poids et mesures (CGPM) in 1889. It was based on constant-volume gas thermometry, and relied heavily on the work of Chappuis at the BIPM, who had used the technique to link the readings of the very best mercury-in-glass thermometers to absolute (i.e. thermodynamic) temperatures.

Meanwhile, the work of Hugh Longbourne Callendar and Ernest Howard Griffiths on the development of platinum resistance thermometers (PRTs) laid the foundations for the first practical scale. In 1913, after a proposal from the main institutes of metrology, the 5th CGPM encouraged the creation of a thermodynamic International Temperature Scale (ITS) with associated practical realizations, thus merging the two concepts. The development was halted by World War I, but the discussions resumed in 1923, when platinum resistance thermometers were well developed and could be used to cover the range from –38 °C, the freezing point of mercury, to 444.5 °C, the boiling point of sulphur, using a quadratic interpolation formula that included the boiling point of water at 100 °C. In 1927 the 7th CGPM adopted the International Temperature Scale of 1927, which extended the use of PRTs down to -183 °C. The main intention was to overcome the practical difficulties of the direct realization of thermodynamic temperatures by gas thermometry, and the scale was a universally acceptable replacement for the various existing national temperature scales.

In 1937 the CIPM established the Consultative Committee on Thermometry (CCT). Since then the CCT has taken all initiatives in matters of temperature definition and thermometry, including, in recent years, issues concerning the environment, climate and meteorology. It was in fact the CCT that in 2010, shortly after the BIPM-WMO workshop on “Measurement Challenges for Global Observing Systems for Climate Change Monitoring”, submitted the recommendation CIPM (T3 2010), encouraging National Metrology Institutes to cooperate with the meteorology and climate communities in establishing traceability for those thermal measurements of importance for detecting climate trends.

The first revision of the 1927 ITS took place in 1948, when extrapolation below the oxygen point to –190 °C was removed from the standard, since it had been found to be an unreliable procedure. The IPTS-48 (with “P” now standing for “practical”) extended down only to –182.97 °C. It was also decided to drop the name "degree Centigrade" for the unit and replace it by degree Celsius. In 1954 the 10th CGPM finally adopted a proposal that Kelvin had made a century before, namely that the unit of thermodynamic temperature be defined in terms of the interval between absolute zero and a single fixed point. The fixed point chosen was the triple point of water, which was assigned the thermodynamic temperature of 273.16 °K, or equivalently 0.01 °C, and replaced the melting point of ice. Work continued on helium vapour pressure scales, and in 1958 and 1962 the efforts were concentrated on low temperatures below 0.9 K. In 1964 the CCT defined the reference function “W” for interpolating the PRT readings between all the new low-temperature fixed points, from 12 K to 273.16 K, and in 1966 further work on radiometric, noise, acoustic and magnetic thermometry prepared the CCT for a new scale definition.

In 1968 the second revision of the ITS was delivered: both thermodynamic and practical units were defined to be identical and equal to 1/273.16 of the thermodynamic temperature of the triple point of water. The unit itself was renamed "the kelvin" in place of "degree Kelvin" and designated "K" in place of "°K". In 1976 further considerations and results at low temperatures between 0.5 K and 30 K were included in the Provisional Temperature Scale, EPT-76. Meanwhile several NMIs continued the work to better define the fixed-point values and the PRTs' characteristics. The International Temperature Scale of 1990 (ITS-90) came into effect on 1 January 1990, replacing the IPTS-68 and the EPT-76, and is still adopted today to guarantee traceability of temperature measurements. Among the main features of the ITS-90, with respect to the 1968 scale, are the use of the triple point of water (273.16 K), rather than the freezing point of water (273.15 K), as a defining point; closer agreement with thermodynamic temperatures; and improved continuity and precision.

It follows that any temperature measurement made before 1927 is impossible to trace to an international standard, except for a few nations with a well-defined national definition. Later on, during the evolution of both the temperature unit and the associated scales, changes have been introduced to improve the realization and measurement accuracy.

With each redefinition of the practical temperature scale since the original scale of 1927, the BIPM published official transformation tables to enable conversion between the old and the revised temperature scale (BIPM, 1990). Because of the way the temperature scales have been defined, they really represent an overlap of multiple temperature ranges, each of which may have its own interpolating instrument, fixed points or mathematical equations describing the instrument response. A consequence of this complexity is that no simple mathematical relation can be constructed to convert temperatures acquired according to older scales into the modern ITS-90 scale.
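As an illustration of how such a conversion is applied in practice, here is a minimal sketch with entirely hypothetical correction values; the real conversion uses the official BIPM tables and the equation in Pavlasek et al. (2015), not the toy numbers below.

```python
# A minimal sketch: pick the correction table for the scale in force when the
# reading was taken and interpolate it to the observed temperature.
# All table values below are placeholders for illustration only.
import numpy as np

# Hypothetical tables: observed temperature (degC) -> correction to add to
# obtain ITS-90 (degC).
TEMPS = np.array([-50.0, -25.0, 0.0, 25.0, 50.0])
CORRECTIONS = {
    "pre-1968": np.array([0.020, 0.012, 0.000, -0.008, -0.015]),
    "1968-1989": np.array([0.008, 0.004, 0.000, -0.003, -0.006]),
}

def to_its90(temperature_c, year):
    """Convert a historical reading to ITS-90 using the era's (toy) table."""
    if year >= 1990:
        return temperature_c                      # already on ITS-90
    era = "pre-1968" if year < 1968 else "1968-1989"
    correction = np.interp(temperature_c, TEMPS, CORRECTIONS[era])
    return temperature_c + correction

print(to_its90(29.5, 1955))   # warm summer day, larger (hypothetical) correction
print(to_its90(-2.0, 1975))   # cool winter day, smaller (hypothetical) correction
```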

As an example of the effect of temperature scale alterations, let us examine the correction of the daily mean temperature record at Brera, Milano, in Italy from 1927 to 2010, shown in Figure 1. The figure illustrates the consequences of the temperature scale changes and the correction that needs to be applied to convert the historical data to the current ITS-90. The introduction of new temperature scales in 1968 and 1990 is clearly visible as discontinuities in the magnitude of the correction, with significantly larger corrections for data prior to 1968. As can be seen in Figure 1, the correction cycles with the seasonal changes in temperature; the higher summer temperatures require a larger correction.


Figure 1. Example corrections for the weather station at Brera, Milano in Italy. The values are computed for the daily average temperature. The magnitude of the correction cycles with the annual variations in temperature: the inset highlights how the warm summer temperatures are corrected much more (downward) than the cool winter temperatures.

For the same reason the corrections will differ between locations. The daily average temperature at the Milano station typically approaches 30 °C on the warmest summer days, while it may fall slightly below freezing in winter. In a different location, with larger differences between typical summer and winter temperatures, the corrections might oscillate around 0 °C, and a more stable climate might see smaller corrections overall: at Utsira, a small island off the south-western coast of Norway, the summertime corrections are typically 50% below the values for Brera. Figure 2 shows the magnitude of the corrections for specific historical temperatures.


Figure 2. The corrections in °C that need to be applied to historical temperatures in the range from -50 °C to +50 °C, depending on the period in which the historical data were measured.

The uncertainty in the temperature readings from any individual thermometer is significantly larger than the corrections presented here. Furthermore, even for the limited timespan since 1927 a typical meteorological weather station has seen many changes which may affect the temperature readings. Examples include instrument replacement; instrument relocations; screens may be rebuilt, redesigned or moved; the schedule for readings may change; the environment close to the station may become more densely populated and therefore enhance the urban heat island effect; and manually recorded temperatures may suffer from unconscious observer bias (Camuffo, 2002; Bergstrøm and Moberg, 2002; Kennedy, 2013). Despite the diligent quality control employed by meteorologists during the reconstruction of long records, every such correction also has an uncertainty associated with it. Thus, for an individual instrument, and perhaps even an individual station, the scale correction is insignificant.

On the other hand, more care is needed for aggregate data. The scale correction represents a bias which is equal for all instruments, regardless of location and use, and simply averaging data from multiple sources will not eliminate it. The scale correction is smaller than, but of the same order of magnitude as, the uncertainty components claimed for monthly average global temperatures in the HadCRUT4 dataset (Morice et al., 2012). To evaluate the actual value of the correction for the global averages would require a recalculation of all the individual temperature records. However, the correction does not alter the warming trend: if anything it would exacerbate it slightly. Time averaging or averaging over multiple instruments has been claimed to lower the temperature uncertainty to around 0.03 °C (for example in Kennedy (2013) for aggregate records of sea surface temperature). In our opinion, such claims for the uncertainty need to consider the scale correction to be credible.
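A minimal numerical sketch of that point, with synthetic numbers: averaging many instruments shrinks independent random errors, but a bias that is common to all instruments survives the average.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stations = 1000
true_value = 15.0          # degC
common_bias = 0.01         # degC, identical for all instruments (e.g. scale)
random_error_std = 0.5     # degC, independent per instrument

readings = true_value + common_bias + rng.normal(0.0, random_error_std, n_stations)

mean_error = readings.mean() - true_value
print("random error of one reading ~ %.2f degC" % random_error_std)
print("error of the %d-station mean = %.3f degC" % (n_stations, mean_error))
# The mean error converges to the common bias (~0.01 degC here), not to zero,
# however many stations are averaged.
```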

Scale correction for temperatures earlier than 1927 is harder to assess. Without an internationally accepted and widespread calibration reference it is impossible to construct a simple correction algorithm, but there is reason to suspect that the corrections become more important for older parts of the instrumental record. Quantifying the correction would entail close scrutiny of the old calibration practices, and hinges on available contemporary descriptions. Conspicuous errors can be detected, such as the large discrepancy that Burnette et al. (2010) found in the 1861 records from Fort Riley, Kansas. In that case the decision to correct the dubious values was corroborated by metadata describing a change of observer; however, this also illustrates the calibration pitfall when no widespread temperature standard was available. One would expect that many more instruments were slightly off, and the question is whether this introduced a bias or just random fluctuations which can be averaged away when producing regional averages.

Whether the relative importance of the scale correction increases further back in time remains an open question. The errors from other sources such as the time schedule for the measurements also become more important and harder to account for, such as the transformation from old Italian time to modern western European time described in (Camuffo, 2002).

This brief overview of the history of temperature scales has shown what impact these changes have on historical temperature data. As discussed earlier, the corrections originating from the temperature scale changes are small compared with other factors. Even so, they should not be ignored, as their magnitude is far from negligible. More details on this issue, and the conversion equation that converts historical temperature data from 1927 up to 1989 to the current ITS-90, can be found in Pavlasek et al. (2015).



Related reading

Why raw temperatures show too little global warming

Just the facts, homogenization adjustments reduce global warming

References

Camuffo, Dario, 2002: Errors in early temperature series arising from changes in style of measuring time, sampling schedule and number of observations. Climatic change, 53, pp. 331-352.

Bergstrøm, H. and A. Moberg, 2002: Daily air temperature and pressure series for Uppsala (1722-1998). Climatic change, 53, pp. 213-252.

Kennedy, John J., 2013: A review of uncertainty in in situ measurements and data sets of sea surface temperature. Reviews of Geophysics, 52, pp. 1-32.

Morice, C.P., et al., 2012: Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set. Journal of Geophysical Research, 117, pp. 1-22.

Burnette, Dorian J., David W. Stahle, and Cary J. Mock, 2010: Daily-Mean Temperature Reconstructed for Kansas from Early Instrumental and Modern Observations. Journal of Climate, 23, pp. 1308-1333.

Pavlasek P., A. Merlone, C. Musacchio, A.A.F. Olsen, R.A. Bergerud, and L. Knazovicka, 2015: Effect of changes in temperature scales on historical temperature data. International Journal of Climatology, doi: 10.1002/joc.4404.

Saturday, June 6, 2015

No! Ah! Part II. The return of the uncertainty monster



Some may have noticed that a new NOAA paper on the global mean temperature has been published in Science (Karl et al., 2015). It is minimally different from the previous one. The reason the press is interested, the reason this is a Science paper, and the reason the mitigation sceptics are not happy at all, is that due to these minuscule changes the data no longer shows a "hiatus"; no statistical analysis is needed any more. That such paltry changes make so much difference shows the overconfidence of people talking about the "hiatus" as if it were a thing.

You can see the minimal changes, mostly less than 0.05°C, both warmer and cooler, in the top panel of the graph below. I made the graph extra large, so that you can see the differences. The thick black line shows the new assessment and the thin red line the previously estimated global temperature signal.



It reminds me of the time when a (better) interpolation of the data gap in the Arctic (Cowtan and Way, 2014) made the long-term trend almost imperceptibly larger, but changed the temperature signal enough to double the warming during the "hiatus". Again we see a lot of whining from people who should not have built their political case on such a fragile feature in the first place. And we will see a lot more. And after that they will continue to act as if the "hiatus" is a thing. After a few years of this dishonest climate "debate", I would be very surprised if they suddenly looked at all the data and made a fair assessment of the situation.

The most paradoxical are the mitigation sceptics who react by claiming that scientists are not allowed to remove biases due to changes in the way temperature was measured. Without accounting for the fact that old sea surface temperature measurements were biased to be too cool, global warming would be larger. Previously I explained the reasons why raw data shows more warming, and you can see the effect in the bottom panel of the above graph. The black line shows NOAA's current best estimate for the temperature change, the thin blue (?) line the temperature change in the raw data. Only alarmists would prefer the raw temperature trend.



The trends over a number of periods are depicted above; the circles are the old dataset, the squares the new one. You can clearly see differences between the trends for the various short periods. Shifting the period by only two years creates a large trend difference, which is another way to demonstrate that this feature is not robust.
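As a simple illustration of how fragile such short-period trends are, here is a sketch with synthetic data (not the NOAA series): shifting the start year of a short window by a couple of years moves the trend by an amount comparable to the "hiatus" signal itself.

```python
import numpy as np

rng = np.random.default_rng(7)
years = np.arange(1970, 2015)
# ~0.17 degC/decade underlying warming plus ENSO-like year-to-year noise.
temps = 0.017 * (years - years[0]) + rng.normal(0.0, 0.12, len(years))

for start in (1998, 2000, 2002):
    mask = years >= start
    slope = np.polyfit(years[mask], temps[mask], 1)[0] * 10.0
    print("trend %d-2014: %+.2f degC/decade" % (start, slope))
# Short windows drift by of order 0.1 degC/decade just from the noise,
# comparable to the short-term signal being argued about.
```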

The biggest change in the dataset is that NOAA now uses the raw data of the land temperature database of the International Surface Temperature Initiative (ISTI). (Disclosure: I am a member of the ISTI.) This dataset contains many more stations than the previously used Global Historical Climatology Network (GHCNv3) dataset. (The land temperatures were homogenized with the same Pairwise Homogenization Algorithm (PHA) as before.)

The new trend in the land temperature is a little larger over the full period; see both graphs above. This was to be expected. The ISTI dataset contains many more stations and is now similar to that of Berkeley Earth, which already had a somewhat stronger temperature trend. Furthermore, we know that there is a cooling bias in the land surface temperatures; with more stations it is easier to see data problems by comparing stations with each other, and relative homogenization methods can remove a larger part of this trend bias.

However, the largest trend changes in recent periods are due to the oceans; the Extended Reconstructed Sea Surface Temperature (ERSST v4) dataset. Zeke Hausfather:
They also added a correction for temperatures measured by floating buoys vs. ships. A number of studies have found that buoys tend to measure temperatures that are about 0.12 degrees C (0.22 F) colder than is found by ships at the same time and same location. As the number of automated buoy instruments has dramatically expanded in the past two decades, failing to account for the fact that buoys read colder temperatures ended up adding a negative bias in the resulting ocean record.
It is not my field, but if I understand it correctly, other ocean datasets, COBE2 and HadSST3, already took these biases into account. Thus the difference between these datasets must have another reason, and understanding these differences would be interesting. And NOAA does not yet interpolate over the data gap in the Arctic, which would be expected to make its recent trends even stronger, just as it did for Cowtan and Way. They are working on that; the triangles in the above graph are with interpolation. Thus the recent trend is currently still understated.
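A back-of-the-envelope sketch of the mechanism in the quote above: the 0.12°C ship-buoy offset is taken from the quote, while the buoy fractions below are purely illustrative, not the real observation shares.

```python
# If buoys read ~0.12 degC cooler than ships and the (illustrative) buoy share
# of observations grows over time, the uncorrected blend acquires a spurious
# cooling drift even when the true temperature is flat.
ship_buoy_offset = -0.12          # degC, buoys relative to ships (from the quote)
buoy_fraction_1995 = 0.1          # illustrative shares, not the real numbers
buoy_fraction_2014 = 0.7

bias_1995 = buoy_fraction_1995 * ship_buoy_offset
bias_2014 = buoy_fraction_2014 * ship_buoy_offset
spurious_change = bias_2014 - bias_1995

print("blend bias in 1995: %+.3f degC" % bias_1995)
print("blend bias in 2014: %+.3f degC" % bias_2014)
print("spurious cooling over the period: %+.3f degC" % spurious_change)
# Correcting buoys to the ship reference (or vice versa) removes this drift,
# which is why the adjustment increases the recent trend.
```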

Personally, I would be most interested in understanding the differences that are important for long-term trends, like those shown below in two graphs prepared by Zeke Hausfather. That is hard enough, and such questions are more likely answerable. The recent differences between the datasets are even tinier than the tiny "hiatus" itself; I have no idea whether they can be understood.





I need some more synonyms for tiny or minimal, but the changes are really small. They are well within the statistical uncertainty computed from the year-to-year fluctuations. They are well within the uncertainty due to the fact that we do not have measurements everywhere and need to interpolate. The latter is the typical confidence interval you see in historical temperature plots. For most datasets the confidence interval does not include the uncertainty due to imperfectly removed biases. (HadCRUT does this partially.)

This uncertainty becomes relatively more important on short time scales (and for smaller regions); on long time scales and for large regions (global) many biases compensate each other. For land temperatures a 15-year period is especially dangerous; that is about the typical period between two inhomogeneities (non-climatic changes).

The recent period is in addition especially tricky. We are right in an important transition from manual observations with thermometers in Stevenson screens to automatic weather stations. Not only the measurement principle is different, but also the siting. On top of this, it is difficult to find and remove inhomogeneities near the end of a series, because the computed mean after the inhomogeneity is based on only a few values and has a large uncertainty.

You can get some idea of how large this uncertainty is by comparing the short-term trends of two independent datasets. Ed Hawkins has compared the new USA NOAA data and the current UK HadCRUT4.3 dataset at Climate Lab Book and presented these graphs:



By request, he kindly computed the difference between these 10-year trends, shown below. They suggest that if you are interested in short-term trends smaller than 0.1°C per decade (say, the "hiatus"), you should study whether your data quality is good enough to interpret the variability as being due to the climate system. The variability should be large enough or have a stronger regional pattern (say, El Niño).

If the variability you are interested in is only somewhat bigger than 0.1°C, you probably want to put in some work. Both datasets are based on much of the same data and use similar methods. For homogenization of surface stations, we know that it can reduce biases, but not fully remove them. Thus part of the bias will be the same for all datasets that use statistical homogenization. The difference shown below is therefore an underestimate of the uncertainty, and analytic work will be needed to compute the real uncertainty due to data quality.
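A minimal sketch of this kind of comparison, using two synthetic stand-in series rather than NOAA and HadCRUT4: compute rolling 10-year trends in each dataset and look at their difference as a (lower-bound) indication of structural uncertainty.

```python
import numpy as np

def rolling_decadal_trends(years, temps, window=10):
    """Ordinary least-squares trends (degC/decade) over a sliding window."""
    trends = []
    for i in range(len(years) - window + 1):
        slope = np.polyfit(years[i:i + window], temps[i:i + window], 1)[0]
        trends.append(slope * 10.0)
    return np.array(trends)

rng = np.random.default_rng(3)
years = np.arange(1950, 2015)
truth = 0.015 * (years - years[0])
dataset_a = truth + rng.normal(0.0, 0.08, len(years))   # synthetic "dataset A"
dataset_b = truth + rng.normal(0.0, 0.08, len(years))   # synthetic "dataset B"

diff = rolling_decadal_trends(years, dataset_a) - rolling_decadal_trends(years, dataset_b)
print("typical 10-yr trend difference: %.2f degC/decade" % np.std(diff))
# Because the real datasets share much of the same raw data and use similar
# methods, the true structural uncertainty is larger than such a difference.
```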



[UPDATE. I thought I had an interesting new angle, but now see that Gavin Schmidt, director of NASA GISS, has been saying this in newspapers since the start: “The fact that such small changes to the analysis make the difference between a hiatus or not merely underlines how fragile a concept it was in the first place.”]

Organisational implications

To reduce the uncertainties due to changes in the way we measure climate we need to make two major organizational changes: we need to share all climate data with each other to better study the past, and for the future we need to build up a climate reference network. These are, unfortunately, not things climatologists can do alone; they need action by politicians and support from their voters.

To quote from my last post on data sharing:
We need [to share all climate data] to see what is happening to the climate. We already had almost a degree of global warming and are likely in for at least another one. This will change the sea level, the circulation, precipitation patterns. This will change extreme and severe weather. We will need to adapt to these climatic changes and to know how to protect our communities we need climate data. ...

To understand climate, we need a global overview. National studies are not enough. To understand changes in circulation, interactions with mountains and vegetation, to understand changes in extremes, we need spatially resolved information and not just a few stations. ...

To reduce the influence of measurement errors and non-climatic changes (inhomogeneities) on our (trend) assessments we need dense networks. These errors are detected and corrected by comparing one station to its neighbours. The closer the neighbours are, the more accurate we can assess the real climatic changes. This is especially important when it comes to changes in severe and extreme weather, where the removal of non-climatic changes is very challenging. ... For the best possible data to protect our communities, we need dense networks, we need all the data there is.
The main governing body of the World Meteorological Organization (WMO) is meeting right now, until Friday next week (12 June). They are debating a resolution on climate data exchange. To show your support for the free exchange of climate data, please retweet or favourite the tweet below.

We are conducting a (hopefully) unique experiment with our climate system. Future generations of climatologists would not forgive us if we did not observe as well as we can how our climate is changing. To make expensive decisions on climate adaptation, mitigation and burden sharing, we need reliable information on climatic changes: only piggy-backing on meteorological observations is not good enough. We can improve data using homogenization, but homogenized data will always have much larger uncertainties than truly homogeneous data, especially when it comes to long-term trends.

To quote my virtual boss at the ISTI Peter Thorne:
To conclude, worryingly not for the first time (think tropospheric temperatures in late 1990s / early 2000s) we find that potentially some substantial portion of a model-observation discrepancy that has caused a degree of controversy is down to unresolved observational issues. There is still an undue propensity for scientists and public alike to take the observations as a 'given'. As [this study by NOAA] attests, even in the modern era we have imperfect measurements.

Which leads me to a final proposition for a more scientifically sane future ...

This whole train of events does rather speak to the fact that we can and should observe in a more sane, sensible and rational way in the future. There is no need to bequeath onto researchers in 50 years time a similar mess. If we instigate and maintain reference quality networks that are stable SI traceable measures with comprehensive uncertainty chains such as USCRN, GRUAN etc. but for all domains for decades to come we can have the next generation of scientists focus on analyzing what happened and not, depressingly, trying instead to inevitably somewhat ambiguously ascertain what happened.
Building up such a reference network is hard because we will only see the benefits much later. But already now, after about 10 years, the USCRN provides evidence that the siting of stations is in all likelihood not a large problem in the USA. The US reference network, with stations at perfectly sited locations not affected by urbanization or micro-siting problems, shows about the same trend as the homogenized historical USA temperature data. (The reference network even has a somewhat larger, though non-significant, trend.)

A number of scientists are working on trying to make this happen. If you are interested, please contact me or Peter. We will have to design such reference networks, show how much more accurate they would make climate assessments (together with the existing networks) and then lobby to make it happen.



Further reading

Metrologist Michael de Podesta seems to agree with the above post and wrote about the overconfidence of the mitigation sceptics in the climate record.

Zeke Hausfather: Whither the pause? NOAA reports no recent slowdown in warming. This post provides a comprehensive and (I think) very readable overview of the NOAA article.

A similar well-informed article can be found on Ars Technica: Updated NOAA temperature record shows little global warming slowdown.

If you read the HotWhopper post, you will get the most scientific background, apart from reading the NOAA article itself.

Peter Thorne of the ISTI on The Karl et al. Science paper and ISTI. He gives more background on the land temperatures and makes a case for global climate reference networks.

Ed Hawkins compares the new NOAA dataset with HadCRUT4: Global temperature comparisons.

Gavin Schmidt, as a climate modeller, explains how well the new dataset fits the climate projections: NOAA temperature record updates and the ‘hiatus’.

Chris Merchant found about the same recent trend in his satellite sea surface temperature dataset and writes: No slowdown in global temperature rise?

HotWhopper discusses the most egregious errors of the first two WUWT posts on Karl et al. and an unfriendly email from Anthony Watts to NOAA. I hope HotWhopper is not planning any holidays; these will be busy times. Peter Thorne has the real back story.

NOAA press release: Science publishes new NOAA analysis: Data show no recent slowdown in global warming.

Thomas R. Karl, Anthony Arguez, Boyin Huang, Jay H. Lawrimore, James R. McMahon, Matthew J. Menne, Thomas C. Peterson, Russell S. Vose, Huai-Min Zhang, 2015: Possible artifacts of data biases in the recent global surface warming hiatus. Science. doi: 10.1126/science.aaa5632.

Boyin Huang, Viva F. Banzon, Eric Freeman, Jay Lawrimore, Wei Liu, Thomas C. Peterson, Thomas M. Smith, Peter W. Thorne, Scott D. Woodruff, and Huai-Min Zhang, 2015: Extended Reconstructed Sea Surface Temperature Version 4 (ERSST.v4). Part I: Upgrades and Intercomparisons. Journal of Climate, 28, pp. 911–930, doi: 10.1175/JCLI-D-14-00006.1.

Rennie, Jared, Jay Lawrimore, Byron Gleason, Peter Thorne, Colin Morice, Matthew Menne, Claude Williams, Waldenio Gambi de Almeida, John Christy, Meaghan Flannery, Masahito Ishihara, Kenji Kamiguchi, Albert Klein Tank, Albert Mhanda, David Lister, Vyacheslav Razuvaev, Madeleine Renom, Matilde Rusticucci, Jeremy Tandy, Steven Worley, Victor Venema, William Angel, Manola Brunet, Bob Dattore, Howard Diamond, Matthew Lazzara, Frank Le Blancq, Juerg Luterbacher, Hermann Maechel, Jayashree Revadekar, Russell Vose, Xungang Yin, 2014: The International Surface Temperature Initiative global land surface databank: monthly temperature data version 1 release description and methods. Geoscience Data Journal, 1, pp. 75–102, doi: 10.1002/gdj3.8.

Wednesday, April 15, 2015

Why raw temperatures show too little global warming

In the last few months I have written several posts on why raw temperature observations may show too little global warming. Let's put it all in perspective.

People who have followed the climate "debate" have probably heard of two potential reasons why raw data shows too much global warming: urbanization and the quality of the siting. These are the two non-climatic changes that mitigation sceptics promote, claiming that they are responsible for a large part of the observed warming in the global mean temperature records.

If you only know of biases producing a trend that is artificially too strong, it may come as a surprise that the raw measurements actually have too small a trend and that removing non-climatic changes increases the trend. For example, in the Global Historical Climatology Network (GHCNv3) of NOAA, the land temperature change since 1880 is increased by about 0.2°C by the homogenization method that removes non-climatic changes. See figure below.

(If you also consider the adjustments made to ocean temperatures, the net effect of the adjustments is that they make the global temperature increase smaller.)


The global mean temperature estimates from the Global Historical Climate Network (GHCNv3) of NOAA, USA. The red curve shows the global average temperature in the raw data. The blue curve is the global mean temperature after removing non-climatic changes. (Figure by Zeke Hausfather.)

The adjustments are not always that "large". The Berkeley Earth group makes much smaller adjustments. The global mean temperature of Berkeley Earth is shown below. However, as noted by Zeke Hausfather in the comments below, even the curve where the method did not explicitly detect breakpoints partially homogenizes the data, because the method penalises stations that have a very different trend than their neighbours. After removal of non-climatic changes, Berkeley Earth comes to a similar climatic trend as seen in GHCNv3.


The global mean temperature estimates from the Berkeley Earth project (previously known as BEST), USA. The blue curve is computed without using their method to detect breakpoints, the red curve the temperature after adjusting for non-climatic changes. (Figure by Steven Mosher.)

Let's go over the reasons why the temperature trend may show too little warming.
Urbanization and siting
Urbanization warms the location of a station, but these stations also tend to move away from the centre to better locations. What matters is where the stations were at the beginning of the observations and where they are now: how much too warm the station was at the start and how much too warm it is at the end. This effect has been studied a lot, and urban stations seem to have about the same trend as their surrounding (more) rural stations.
 
A recent study for two villages showed that the current location of the weather station is half a degree centigrade cooler than the centre of the village. Many stations started in villages (or cities), thermometers used to be expensive scientific instruments operated by highly educated people and they had to be read daily. Thus the siting of many stations may have improved, which would lead to a cooling bias.
 
When a city station moves to an airport, which happened a lot around WWII, this takes the station (largely) out of the urban heat island. Furthermore, cities are often located near the coast and in valleys. Airports may thus often be located at a higher altitude. Both reasons could lead to a considerable cooling for the fraction of stations that moved to airports.
 
Changes in thermometer screens
During the 20th century the Stevenson screen was established as the dominant thermometer screen. This screen protected the thermometer much better against radiation (solar and heat) than earlier designs. The deficiencies of earlier measurement methods artificially warmed the temperatures measured in the 19th century.
 
Some claim that earlier Stevenson screens were painted with inferior paints. The sun consequently heats up the screen more, which again heats the incoming air. The introduction of modern durable white paints may thus have produced a cooling bias.
 
Currently we are in a transition to Automatic Weather Stations. This can show large changes in either direction for the network they are introduced in. What the net global effect is, is not clear at this moment.
 
Irrigation
Irrigation on average decreases the 2m-temperature by about 1 degree centigrade. At the same time, irrigation has spread enormously during the last century. People preferentially live in irrigated areas and weather stations serve agriculture. Thus it is possible that there is a higher likelihood that weather stations are erected in irrigated areas than elsewhere. In this case irrigation could lead to a spurious cooling trend. For suburban stations an increase of watering gardens could also produce a spurious cooling trend.
It is understandable that in the past the focus was on urbanization as a non-climatic change that could make the warming in the climate records too strong. Then the focus was on whether climate change was happening (detection). To make a strong case, science had to show that even the minimum climatic trend was too large to be due to chance.

Now that we know that the Earth is warming, we no longer just need a minimum estimate of the temperature trend, but the best estimate of the trend. For a realistic assessment of models and impacts we need the best estimate of the trend, not just the minimum possible trend. Thus we need to understand the reasons why raw records may show too little warming and quantify these effects.

Just because the mitigation skeptics are talking nonsense about the temperature record does not mean that there are no real issues with the data and it does not mean that statistical homogenization can remove trend errors sufficiently well. This is a strange blind spot in climate science. As Neville Nicholls, one of the heroes of the homogenization community, writes:
When this work began 25 years or more ago, not even our scientist colleagues were very interested. At the first seminar I presented about our attempts to identify the biases in Australian weather data, one colleague told me I was wasting my time. He reckoned that the raw weather data were sufficiently accurate for any possible use people might make of them.
One wonders how this colleague knew this without studying it.

The reasons for a cooling bias have been studied much too little. At this time we cannot tell how important each reason is. Any of these reasons is potentially important enough to explain the 0.2°C per century trend bias found in GHCNv3, especially in the light of the large range of possible values, a range that we often cannot even estimate at the moment. In fact, all the above-mentioned reasons together could explain a much larger trend bias, which could dramatically change our assessment of the progress of global warming.

The fact is that we cannot quantify the various cooling biases at the moment and it is a travesty that we can't.


Other posts in this series

Irrigation and paint as reasons for a cooling bias

Temperature trend biases due to urbanization and siting quality changes

Changes in screen design leading to temperature trend biases

Temperature bias from the village heat island

Saturday, April 4, 2015

Irrigation and paint as reasons for a cooling bias

Irrigation pump in India 1944

In previous posts on reasons why raw temperature data may show too little global warming I have examined improvements in the siting of stations, improvements in the protection of thermometers against the sun, and moves of urban stations to better locations, in particular to airports. This post will be about the influence of irrigation and watering, as well as improvements in the paints used for thermometer screens.

Irrigation and watering

Irrigation can decrease air temperature by up to 5 degrees and typically decreases the temperature by about 1°C (Cook et al., 2014). Because of irrigation more solar energy is used for evaporation and for transpiration by the plants, rather than for warming of the soil and air.

Over the last century we have seen a large 5 to 6 fold global increase in irrigation; see graph below.



The warming by the Urban Heat Island (UHI) is real. The reason we speak of a possible trend bias due to increases in the UHI is that an urban area has a higher probability of siting a weather station than rural areas. If only for the simple reason that that is where people live and want information on the weather.

The cooling due to increases in irrigation is also real. It seems a reasonable assumption that an irrigated area again has a higher probability of siting a weather station: people are more likely to live in irrigated areas, and many weather stations are deployed to serve agriculture. While urbanization is a reason for stations to move to better locations, irrigation is no reason for a station to move away; maybe even on the contrary.

The author of the above dataset showing increases in irrigation, Stefan Siebert, writes: "Small irrigation areas are spread across almost all populated areas of the world." You can see this strong relation between irrigation and population on a large scale in the map below. It seems likely that this is also true on local scales.



Many stations are also in suburbs and these are likely watered more than they were in the past when water (energy) was more expensive or people even had to use hand pumps. In the same way as irrigation, watering could produce a cool bias due to more evaporation. Suburbs may thus be even cooler than the surrounding rural areas if there is no irrigation. Does anyone know of any literature about this?

I know of one station in Spain where the ground is watered to comply with WMO guidelines that weather stations should be installed on grass. The surrounding is dry and bare, but the station is lush and green. This could also cause a temperature trend bias under the reasonable assumption that this is a new idea. If anyone knows more about such stations, please let me know.



From whitewash to latex paint

The maintenance of the weather station can also be important. Over the years better materials and paints may have been used for thermometer screens. If this makes the screens whiter, they heat up less and they heat up the air flowing through the louvres less. More regular cleaning and painting would have the same effect. It is possible that this has improved when climate change made weather services aware that high measurement accuracies are important. Unfortunately, it is also possible that good maintenance is nowadays seen as inefficient.

The mitigation skeptics somehow thought that the effect would go in the other direction: that the bad paints used in the past would cause a cooling bias, rather than a warming bias. Something about infra-red albedo, although most materials used have about the same infra-red albedo and the infra-red radiation fluxes are much smaller than the solar fluxes.

Anthony Watts started a paint experiment in his back garden in July 2007. The first picture below shows three Stevenson screens: a bare one, a screen with modern latex paint, and one with whitewash, a chalk paint that quickly fades.



Already 5 months later in December 2007, the whitewash had deteriorated considerably; see below. This should lead to a warm bias for the whitewash screen, especially in summer.

Anthony Watts:
Compare the photo of the whitewash paint screen on 7/13/07 when it was new with one taken today on 12/27/07. No wonder the NWS dumped whitewash as the spec in the 70’s in favor of latex paint. Notice that the Latex painted shelter still looks good today while the Whitewashed shelter is already deteriorating.

In any event the statement of Patrick Michaels “Weather equipment is very high-maintenance. The standard temperature shelter is painted white. If the paint wears or discolors, the shelter absorbs more of the sun’s heat and the thermometer inside will read artificially high.” seems like a realistic statement in light of the photos above.
I have not seen any data from this experiment beyond a plot with one day of temperatures, from a day one month after the start, showing no clear differences between the Stevenson screens. They were all up to 1°C warmer than the modern ventilated automatic weather station when the sun was shining. (That the most modern, ventilated measurement had a cool bias was not emphasized in the article, as you can imagine.) Given that Anthony Watts maintains a stealth political blog against mitigation of climate change, I guess we can conclude that he probably did not like the results, namely that the old whitewash screen was warmer, and did not want to publish them.

We may be able to make a rough estimate of the size of the effect by looking at another experiment with a bad screen. In sunny Italy, Giuseppina Lopardo and colleagues compared two aged, yellowed and cracked screens of unventilated automatic weather stations, which should have been replaced long ago, with a good new screen. The picture to the right shows the screen after 3 years. They found a difference of 0.25°C after 3 years and 0.32°C after 5 years.

The main caveat is that the information on the whitewash comes from Anthony Watts. It may thus well be misinformation that the American Weather Bureau used whitewash in the past. Lacquer paints are probably as old as 8000 years, and I see no reason to use whitewash for a small and important weather screen. If anyone has a reliable source about the paints used in the past, either inside or outside the USA, I would be very grateful.



Related posts

Changes in screen design leading to temperature trend biases

Temperature bias from the village heat island

Temperature trend biases due to urbanization and siting quality changes

Climatologists have manipulated data to REDUCE global warming

Homogenisation of monthly and annual data from surface stations

References

Cook, B.I., S.P. Shukla, M.J. Puma, L.S. Nazarenko, 2014: Irrigation as an historical climate forcing. Climate Dynamics, doi: 10.1007/s00382-014-2204-7.

Siebert, Stefan, Jippe Hoogeveen, Petra Döll, Jean-Marc Faurès, Sebastian Feick and Karen Frenken, 2006: The Digital Global Map of Irrigation Areas – Development and Validation of Map Version 4. Conference on International Agricultural Research for Development. Tropentag 2006, University of Bonn, October 11-13, 2006.

Siebert, S., Kummu, M., Porkka, M., Döll, P., Ramankutty, N., and Scanlon, B.R., 2015: A global data set of the extent of irrigated land from 1900 to 2005. Hydrology and Earth System Sciences, 19, pp. 1521-1545, doi: 10.5194/hess-19-1521-2015.

See also: Zhou, D., D. Li, G. Sun, L. Zhang, Y. Liu, and L. Hao (2016), Contrasting effects of urbanization and agriculture on surface temperature in eastern China, J. Geophys. Res. Atmos., 121, doi: 10.1002/2016JD025359.

Tuesday, March 31, 2015

Temperature trend biases due to urbanization and siting quality changes

The temperature in urban areas can be several degrees higher than their surrounding due to the Urban Heat Island (UHI). The additional heat stress is an important medical problem and studied by bio-meteorologists. Many urban geographers study the UHI and ways to reduce the heat stress. Their work suggests that the UHI is due to a reduction in evaporation from bare soil and vegetation in city centers. The solar energy that is not used for evaporation goes into warming of the air. In case of high-rise buildings there are, in addition, more surfaces and thus more storage of heat in the buildings during the day, which is released during the night. High-rise buildings also reduce radiative cooling (infrared) at night because the surface sees a smaller part of the cold sky. Recent work suggests that cities also influence convection (often visible as cumulus (towering) clouds).

To study changes in the temperature, a constant UHI bias is no problem. The problem is an increase in urbanization. For some city stations this can be clearly seen in a comparison with nearby rural stations. A clear example is the temperature at the station in Tokyo, where the temperature since 1920 rises faster than in surrounding stations.



Scientists like to make a strong case, thus before they confidently state that the global temperature is increasing, they have naturally studied the influence of urbanization in detail. An early example is Joseph Kincer of the US Weather Bureau (HT @GuyCallendar) who studied the influence of growing cities in 1933.

While urbanization can be clearly seen for some stations, the effect on the global mean temperature is small. The Fourth Assessment Report from the IPCC, states the following.
Studies that have looked at hemispheric and global scales conclude that any urban-related trend is an order of magnitude smaller than decadal and longer time-scale trends evident in the series (e.g., Jones et al., 1990; Peterson et al., 1999). This result could partly be attributed to the omission from the gridded data set of a small number of sites (<1%) with clear urban-related warming trends. ... Accordingly, this assessment adds the same level of urban warming uncertainty as in the TAR: 0.006°C per decade since 1900 for land, and 0.002°C per decade since 1900 for blended land with ocean, as ocean UHI is zero.
Next to the removal of urban stations, the influence of urbanization is reduced by the statistical removal of non-climatic changes (homogenization). The most overlooked aspect, however, may be that urban stations often do not stay at the same location: they are relocated when the surroundings are no longer seen as suitable, when the meteorological offices can no longer pay the rent, or when the offices are moved to airports to help with air traffic safety.

Thus urbanization leads not only to a gradual increase in temperature, but also to downward jumps. Such a non-climatic change often looks like an (irregular) sawtooth. This can lead to artificial trends in both directions; see the sketch below. In the end, what counts is how strong the UHI was in the beginning and how strong it is now.
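To make the sawtooth effect concrete, here is a minimal simulation sketch (my own illustration with made-up numbers, not based on any real station): a station with a steadily growing UHI that is partly removed by two relocations. Depending on the UHI at the start and at the end of the series, the fitted trend can end up too high or too low.

```python
# Minimal sketch with made-up numbers: a growing UHI interrupted by relocations
# produces a sawtooth, which biases the linear trend of the "observed" series.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1900, 2021)
true_climate = 0.01 * (years - 1900)        # assumed real warming: 0.1 °C per decade

uhi = 0.02 * (years - 1900)                 # assumed gradually growing UHI (°C)
for reloc_year in (1940, 1980):             # each relocation removes ~80% of the UHI
    uhi[years >= reloc_year] -= 0.8 * uhi[years == reloc_year][0]

observed = true_climate + uhi + rng.normal(0.0, 0.2, years.size)   # plus weather noise

obs_trend = 10 * np.polyfit(years, observed, 1)[0]
print(f"true trend: 0.10 °C/decade, observed with UHI sawtooth: {obs_trend:.2f} °C/decade")
```

In this toy example the UHI is larger at the end than at the beginning, so the sawtooth produces a warm trend bias; with an earlier, stronger UHI and a late relocation the sign would be reversed.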



The first post of this series was about a new study that showed that even villages have a small "urban heat island". For a village in Sweden (Haparanda) and one in Germany (Geisenheim), the study found that the current location of the weather station is about 0.5°C (1°F) colder than the village center. For cities you would expect a larger effect.

Around the Second World War many city stations were moved to airports, which largely takes the stations out of the urban heat island. Comparing the temperature trends of stations that are currently at airports with those of non-airport stations, a number of people have found that this effect is only about 0.1°C for the airport stations, which would suggest that it is not important for the entire dataset.

This 0.1°C sounds rather small to me. If we have urban heat islands of multiple degrees and people worry about small increases in the urban heat island effect, then taking a station (mostly) out of the heat island should lead to a strong cooling. Furthermore, cities are often in valleys or at the coast, and the later-built airports are thus often at a higher, and therefore cooler, location.

A preliminary study by citizen scientist Caerbannog suggests that airport relocations can explain a considerable part of the adjustments. These calculations need to be performed more carefully, and we need to understand why the apparently small difference for airport stations translates into a considerable effect on the global mean. A more detailed scientific study on relocations to airports is unfortunately still missing.
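The basic comparison is simple enough to sketch. Assuming a hypothetical long-format table `stations` with the columns "station_id", "is_airport", "year" and "temperature" (this is not the GHCN format or Caerbannog's code, just an illustration of the idea), the average trend of airport and non-airport stations could be compared like this:

```python
# Hedged sketch: compare the average temperature trend of airport and
# non-airport stations, given a hypothetical table of annual mean temperatures.
import numpy as np
import pandas as pd

def mean_trend_by_airport(stations: pd.DataFrame) -> dict:
    """Average least-squares trend (°C per decade) for airport vs non-airport stations."""
    trends = {True: [], False: []}
    for (_station, is_airport), grp in stations.groupby(["station_id", "is_airport"]):
        slope = np.polyfit(grp["year"], grp["temperature"], 1)[0]   # °C per year
        trends[bool(is_airport)].append(10 * slope)
    return {flag: float(np.mean(vals)) for flag, vals in trends.items() if vals}
```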

The period in which the bias in GHCNv3 increases also corresponds to the period around the Second World War in which many stations were relocated to airports; see the figure below. Finally, the temperature trend bias in the raw GHCNv3 data is larger than the bias in the Berkeley Earth dataset, which also suggests that airport relocations could be important: airport stations make up a rather large fraction of GHCNv3 and are overrepresented in it.



Together with some colleagues, I have started the Parallel Observations Science Team (POST) within the International Surface Temperature Initiative. There are some people interested in using parallel measurements (simultaneous measurements in cities and at airports) to study the influence of these relocations. There seems to be more data than one may think. We are, however, still looking for a lead author (hint).

If the non-climatic change due to airport relocations is different from (likely larger than) the change implemented in GHCNv3, that would give us an estimate of how well homogenization methods can reduce trend biases. Williams, Menne, and Thorne (2012) showed that homogenization can reduce trend errors and improve trend estimates, but also that part of the bias remains.
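To illustrate what "reducing a trend bias" means in practice, here is a minimal toy sketch of relative homogenization with a single break (my own example, not the pairwise algorithm of Williams, Menne and Thorne): the candidate series is compared with a reference built from neighbours, the most likely break point in the difference series is located, and the earlier segment is shifted to the level of the later one.

```python
# Toy sketch of relative homogenization with one break (not the NOAA pairwise
# homogenization algorithm): locate the break in the candidate-minus-reference
# series and adjust the early segment to the level of the recent segment.
import numpy as np

def homogenize_single_break(candidate: np.ndarray, reference: np.ndarray) -> np.ndarray:
    diff = candidate - reference
    n = diff.size
    # break position that maximizes the jump between the two segment means
    scores = [abs(diff[:k].mean() - diff[k:].mean()) for k in range(5, n - 5)]
    k = 5 + int(np.argmax(scores))
    adjustment = diff[k:].mean() - diff[:k].mean()
    corrected = candidate.copy()
    corrected[:k] += adjustment          # bring the early years to the recent level
    return corrected
```

Real algorithms have to handle multiple breaks, noisy references and significance testing, which is where part of the bias can remain.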



In the 19th century and earlier, thermometers were expensive scientific instruments and meteorological observations were made by educated people: apothecaries, teachers, clergymen, and so on. These people lived in the city. Many stations have subsequently been moved to better and colder locations. Whether urbanization produces a cold or a warm bias is thus an empirical and historical question. The evidence seems to show that on average the effect is small. It would be valuable if the effects of urbanization and relocations were studied together. That may lead to an understanding of this paradox.



Related posts

Changes in screen design leading to temperature trend biases

Temperature bias from the village heat island

Climatologists have manipulated data to REDUCE global warming

Homogenisation of monthly and annual data from surface stations

Sunday, February 8, 2015

Changes in screen design leading to temperature trend biases

In the lab, temperature can be measured with amazing accuracy. Outside, exposed to the elements, measuring the temperature of the air is much harder. For example, if the temperature sensor gets wet, due to rain or dew, the evaporation cools the sensor. The largest cause of exposure errors is solar and heat radiation. For these reasons, thermometers need to be protected from the elements by a screen. Changes in the radiation error are an important source of non-climatic changes in station temperature data, and innovations leading to reductions in these errors are a major source of temperature trend biases.

A wall measurement at the Mathematical Tower in Kremsmünster. You mainly see the bright board that protects the instruments against rain; it is on the first floor, at the base of the window, a little to the right of the entrance.

History

The history of changes in exposure is different in every country, but in broad lines follows this pattern. In the beginning thermometers were installed in unheated rooms or in front of a window of an unheated room on the North (poleward) side of a building.

When this was found to lead to too high temperatures, a period of innovation and diversity started. For example, small metal cages were added to the North-wall measurements. More importantly, free-standing structures were designed: stands, shelters, houses and screens. In the Commonwealth the Glaisher (Greenwich) stand was prevalent. It has a vertical wooden board, a small roof and sides, but it is fully open at the front, and in summer it has to be rotated to ensure that no direct sun reaches the thermometer.

Shelters were built with larger roofs and sides, but still open at the front and the bottom, for example the Montsouris and Wild screens. Sometimes even small houses or garden sheds were built, in the tropics with a thick thatched roof.

In the end, the [[Stevenson screen]] (Cotton Region Shelter) won the day. This screen is closed on all sides. It has double Louvre walls, double boards as a roof and a board as the bottom. (Early designs sometimes did not have a bottom.)

In recent decades there has been a move to Automatic Weather Stations (AWS), which do not have a normal (liquid-in-glass) thermometer, but an electrical resistance temperature sensor, typically screened by multiple round plastic cones. These instruments are sometimes mechanically ventilated, which reduces radiation errors during calm weather. Some countries have installed their automatic sensors in Stevenson screens to reduce the non-climatic change.


The photo on the left shows an open shelter for meteorological instruments at the edge of the school square of the primary school of La Rochelle, in 1910. On the right one sees the current situation: a Stevenson-like screen located closer to the ocean, along the Atlantic shore, at a place named "Le bout blanc". Picture: Olivier Mestre, Meteo France, Toulouse, France.

Radiation errors

To understand when and where the temperature measurements are most biased, we need to understand how solar and heat radiation leads to measurement errors.

The temperature sensor should have the temperature of the air and should thus not be warmed or cooled by solar or heat radiation. The energy exchange between the sensor and the air due to ventilation should therefore be large relative to the radiative exchanges. One of the reasons why temperature measurements outside are so difficult is that these are conflicting requirements: closing the screen to radiation also limits the air flow. However, with a smart design, mechanical ventilation and small sensors this conflict can be partially resolved.
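As a rough sketch of this trade-off (my own simplification, not a formula from a specific paper), the steady-state energy balance of the sensor can be written as

```latex
% Simplified steady-state energy balance of a temperature sensor (sketch).
\[
  Q_{\mathrm{rad}} = h\,\bigl(T_{\mathrm{sensor}} - T_{\mathrm{air}}\bigr)
  \quad\Longrightarrow\quad
  \Delta T = T_{\mathrm{sensor}} - T_{\mathrm{air}} = \frac{Q_{\mathrm{rad}}}{h},
\]
```

where Q_rad is the net radiation absorbed by the sensor (solar gains minus emitted heat radiation) and h is the convective heat-transfer coefficient, which increases with ventilation. Closing the screen reduces Q_rad, but it also reduces h; mechanical ventilation increases h without letting in more radiation, which is why it helps most during calm weather.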

For North-wall observations, direct solar radiation on the sensor was sometimes a problem during sunrise and sunset. In addition, the sun may heat the wall below the thermometer and warm the rising air. Even in Stevenson screens some solar radiation still gets into the screen. Furthermore, the sun shining on the screen warms it, which can then warm the air flowing through the screen. For this reason it is important that the screen is regularly painted white and cleaned.

Scattered solar radiation (from clouds, vegetation and the surface) is important for older screens, which are open at the front. The open front also leads to a direct cooling of the sensor as it emits heat radiation. The net heat radiation loss is especially large when the back radiation of the atmosphere is low, thus when there are no clouds and the air is dry. Because warm air can contain more humidity, these effects are generally also largest when it is cold.

Because older screens did not have a bottom, a hot surface below the screen could be a problem during the day and a cold surface during the night. This especially happens when the soil is dry and bare.

All these effects are most clearly seen when the wind is calm.

In conclusion, we expect the cooling bias at night to be largest when the weather is calm and cloud free and the air is dry (cold). We also expect the warming bias during the day to be largest when the weather is calm and cloud free. In addition, we can get a warm bias when the soil is dry and bare, and in summer during sunrise and sunset.

Thus, all things being equal, the radiation error is expected to be largest in subtropical, tropical and continental climates and small in maritime, moderate and cold climates.


Schematic drawing of the various factors that can lead to radiation errors.

Parallel measurements

We know how large these effects are from parallel measurements, where an old and a new measurement set-up are compared side by side. Unfortunately, there are not that many parallel measurements for the transition to Stevenson screens, and most of them come from North-West Europe, a maritime, moderate or cold climate where the effects are expected to be small. These are described in a wonderful review article by David Parker (1994), who concludes that in the mid-latitudes the past warm bias will be smaller than 0.2°C. In the following, I will have a look at the parallel measurements outside of this region.

In the tropics, the bias can be larger. Parker also describes two parallel measurements comparing a tropical thatched house with a Stevenson screen, one in India and one in Ceylon (Sri Lanka). Both show a bias of about 0.4°C. The bias naturally depends on the design; a comparison of a normal Stevenson screen with one with a thatched roof in Samoa shows almost no difference.


This picture shows three meteorological shelters next to each other in Murcia (Spain). The rightmost shelter is a replica of the Montsouris (French) screen, in use in Spain and many European countries in the late 19th and early 20th century. In the middle is a Stevenson screen equipped with automatic sensors; leftmost, a Stevenson screen equipped with conventional meteorological instruments.
Picture: Project SCREEN, Center for Climate Change, Universitat Rovira i Virgili, Spain.


Recently, two beautiful studies were made with modern automatic equipment to study the influence of the screens. With automatic sensors you can make measurements every 10 minutes, which helps in understanding the reasons for the differences. In Spain they built two replicas of the French screen used around 1900. One was installed in [[La Coruna]] (more Atlantic) and one in [[Murcia]] (more Mediterranean). They showed that the old measurements had a temperature bias of about 0.3°C; the Mediterranean location had, as expected, a somewhat larger bias than the Atlantic one (Brunet et al., 2011).

The second modern study was in Austria, at the Mathematical Tower in Kremsmünster (depicted at the top of this post). This North-wall measurement was compared to a Stevenson screen (Böhm et al., 2010). It showed a temperature bias of about 0.2°C. The wall was oriented North-North-East and during sunrise in summer the sun could shine on the instrument.

For both the Spanish and the Austrian examples it should be noted that small modern sensors were used. It is possible that the radiation errors would have been larger had the original thermometers been used.

Comparing a Wild screen with a Stevenson screen at the astronomical observatory in [[Basel]], Switzerland, Renate Auchmann and Stefan Brönnimann (2012) found clear signs of radiation errors, but the annual mean temperature was somehow not biased.


Parallel measurement with a Wild screen and a Stevenson screen in Basel, Switzerland.
In [[Adelaide]], Australia, we have a beautiful long parallel measurement of the Glaisher (Greenwich) stand with a Stevenson screen (Cotton Region Shelter). It covers 61 complete years (1887-1947) and shows that the historical Glaisher stand recorded on average 0.2°C higher temperatures; see the figure with the annual cycle below. The negative bias in the minimum temperature at night is almost constant throughout the year; the positive bias in the maximum temperature is larger and strongest in summer. Radiation errors thus affect not only the mean, but also the size of the annual cycle. They will also affect the daily cycle, as well as the weather variability and extremes in the temperature record.

The exact size of the bias of this parallel measurement has a large uncertainty: it varies considerably from year to year, and the data themselves show clear inhomogeneities. For such old measurements, the exact measurement conditions are hard to ascertain.

The annual cycle of the temperature difference between a Glaisher stand and a Stevenson screen, for both the daily maximum and the daily minimum temperature. (Figure 1 from Nicholls et al., 1996.)
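For parallel measurements like this, the annual cycle of the difference is a simple aggregation. Here is a hedged sketch, assuming a hypothetical file adelaide_parallel.csv with daily maxima and minima from both screens (this is not the actual Nicholls et al. dataset):

```python
# Hedged sketch: annual cycle of the difference between two parallel series,
# assuming hypothetical columns "date", "tmax_glaisher", "tmax_stevenson",
# "tmin_glaisher" and "tmin_stevenson".
import pandas as pd

df = pd.read_csv("adelaide_parallel.csv", parse_dates=["date"])
df["dtmax"] = df["tmax_glaisher"] - df["tmax_stevenson"]
df["dtmin"] = df["tmin_glaisher"] - df["tmin_stevenson"]

annual_cycle = df.groupby(df["date"].dt.month)[["dtmax", "dtmin"]].mean()
print(annual_cycle)    # mean difference (°C) per calendar month
```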

Conclusions

Our understanding of the measurements and the limited evidence from parallel measurements suggest that there is a bias of a few tenths of a degree Celsius in observations made before the introduction of Stevenson screens. The [[Stevenson screen]] was designed in 1864; most countries switched in the decades around 1900, but some countries did not switch until the 1960s.

In the last few decades there has been a new transition to automatic weather stations (AWS). Some countries have installed the automatic probes in Stevenson screens, but most have installed single-unit AWS with multiple plastic cones as a screen. The smaller probe and mechanical ventilation could make the radiation errors smaller, but depending on the design possibly more radiation also gets into the screen, and the maintenance may be worse now that the instrument is no longer visited daily. A review article on this topic is still dearly missing.

Last month we founded the Parallel Observations Science Team (POST) as part of the International Surface Temperature Initiative (ISTI) to gather and analyze parallel measurements and see how these transitions affect the climate record (not only with respect to the mean, but also for changes in the daily and annual cycles, weather variability and weather extremes). Theo Brandsma will lead our study on the transition to Stevenson screens and Enric Aguilar the one on the transition from conventional observations to automatic weather stations. If you know of any dataset and/or want to collaborate, please contact us.

Acknowledgement

With some colleagues I am working on a review paper on inhomogeneities in the distribution of daily data. This work, especially with Renate Auchmann, has greatly helped me understand radiation errors. Mistakes in this post are naturally my own. More on non-climatic changes in daily data later.



Further reading

A beautiful "must-read" article on temperature screens by Stephen Burt: What do we mean by ‘air temperature’? Measuring temperature is not as easy as you may think.

Just the facts, homogenization adjustments reduce global warming: The adjustments to the land surface temperature increase the trend, but the adjustments to the sea surface temperature decrease the trend.

Temperature bias from the village heat island

A database with parallel climate measurements describes the database we want to build with parallel measurements

A database with daily climate data for more reliable studies of changes in extreme weather gives somewhat more background

Statistical homogenisation for dummies

New article: Benchmarking homogenisation algorithms for monthly data

References

Auchmann, R., and S. Brönnimann, 2012: A physics-based correction model for homogenizing sub-daily temperature series. Journal of Geophysical Research, 117, D17119, doi: 10.1029/2012JD018067.

Böhm, R., P.D. Jones, J. Hiebl, D. Frank, M. Brunetti, and M. Maugeri, 2010: The early instrumental warm-bias: a solution for long central European temperature series 1760–2007. Climatic Change, 101, no. 1-2, pp. 41-67, doi: 10.1007/s10584-009-9649-4.

Brunet, M., J. Asin, J. Sigró, M. Bañón, F. García, E. Aguilar, J.E. Palenzuela, T.C. Peterson, and P. Jones, 2011: The minimization of the screen bias from ancient Western Mediterranean air temperature records: an exploratory statistical analysis. International Journal of Climatology, 31, pp. 1879–1895, doi: 10.1002/joc.2192.

Nicholls, N., R. Tapp, K. Burrows, and D. Richards, 1996: Historical thermometer exposures in Australia. International Journal of Climatology, 16, pp. 705-710, doi: 10.1002/(SICI)1097-0088(199606)16:6<705::AID-JOC30>3.0.CO;2-S.

Parker, D.E., 1994: Effects of changing exposure of thermometers at land stations. International Journal of Climatology, 14, pp. 1–31, doi: 10.1002/joc.3370140102.