Tuesday, 11 August 2015

History of temperature scales and their impact on the climate trends

Guest post by Peter Pavlásek of the Slovak Institute of Metrology. Metrology, not meteorology: metrologists are the scientists who work on making measurements more precise by developing highly accurate standards, and thus make experimental results easier to compare.

Since the beginning of climate observations, temperature has been an important quantity to measure, as its values affect every aspect of human society. Its precise and reliable determination has therefore always been important. Of course, the ability to measure temperature precisely depends strongly on the sensor and the method. To determine how precisely a sensor measures temperature, it needs to be calibrated against a temperature standard. As science progressed, new temperature scales were introduced and the previous temperature standards changed. In the following sections we will look at the importance of temperature scales throughout history and their impact on the evaluation of historical climate data.

The first definition of a temperature standard was created in 1889. At the time thermometers were ubiquitous and had been used for centuries; for example, they had been used to document the ocean and air temperatures now included in historical records. Metrological temperature standards are based on state transitions of matter (under defined conditions and matter composition) that generate a precise and highly reproducible temperature value, for example the melting of ice or the freezing of pure metals. Multiple standards can be used as the basis for a temperature scale by creating a set of defined temperature points along the scale. An early temperature scale was devised by the medical doctor Sebastiano Bartolo (1635-1676), who was the first to use melting snow and the boiling point of water to calibrate his mercury thermometers. In 1694 Carlo Renaldini, a mathematician and engineer, suggested using the ice melting point and the boiling point of water and dividing the interval between these two points into 12 degrees, applying marks on a glass tube containing mercury. Réaumur divided the scale into 80 degrees, while the modern division into roughly 100 degrees was adopted by Anders Celsius in 1742. Common to all the scales was the use of phase transitions as anchor points, or fixed points, to define intermediate temperature values.

It was not until 1878 that the first standardized mercury-in-glass thermometers were introduced, as accompanying instruments for the metre prototype, to correct for the thermal expansion of the length standard. These special thermometers were constructed to guarantee a measurement reproducibility of a few thousandths of a degree. They were calibrated at the Bureau International des Poids et Mesures (BIPM), established after the signing of the Convention du Mètre in 1875. The first reference temperature scale was adopted by the 1st Conférence générale des poids et mesures (CGPM) in 1889. It was based on constant-volume gas thermometry and relied heavily on the work of Chappuis at the BIPM, who had used the technique to link the readings of the very best mercury-in-glass thermometers to absolute (i.e. thermodynamic) temperatures.

Meanwhile, the work of Hugh Longbourne Callendar and Ernest Howard Griffiths on the development of platinum resistance thermometers (PRTs) laid the foundations for the first practical scale. In 1913, after a proposal from the main institutes of metrology, the 5th CGPM encouraged the creation of a thermodynamic International Temperature Scale (ITS) with associated practical realizations, thus merging the two concepts. The development was halted by World War I, but the discussions resumed in 1923, when platinum resistance thermometers were well developed and could be used to cover the range from –38 °C, the freezing point of mercury, to 444.5 °C, the boiling point of sulphur, using a quadratic interpolation formula that included the boiling point of water at 100 °C. In 1927 the 7th CGPM adopted the International Temperature Scale of 1927, which extended the use of PRTs down to –183 °C. The main intention was to overcome the practical difficulties of the direct realization of thermodynamic temperatures by gas thermometry, and the scale was a universally acceptable replacement for the various existing national temperature scales.
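The quadratic interpolation mentioned above can be illustrated with a minimal sketch. It assumes the form R(t) = R0 (1 + A t + B t²) and three calibration points: the boiling points of water and sulphur named in the text, plus the ice point at 0 °C, which is added here as an assumption. The coefficient values used to generate the synthetic readings are illustrative only and are not taken from the 1927 scale; all function names are hypothetical.

```python
import math

# Calibration temperatures in degrees Celsius. The steam and sulphur points
# are the values quoted in the text; the ice point is an added assumption.
ICE_POINT = 0.0
STEAM_POINT = 100.0
SULPHUR_POINT = 444.5

def fit_quadratic(r_ice, r_steam, r_sulphur):
    """Solve R(t) = R0 * (1 + A*t + B*t**2) for R0, A, B from three readings."""
    r0 = r_ice  # at t = 0 the formula reduces to R = R0
    t1, t2 = STEAM_POINT, SULPHUR_POINT
    w1 = r_steam / r0 - 1.0    # equals A*t1 + B*t1**2
    w2 = r_sulphur / r0 - 1.0  # equals A*t2 + B*t2**2
    # Cramer's rule for the 2x2 linear system in A and B.
    det = t1 * t2**2 - t2 * t1**2
    a = (w1 * t2**2 - w2 * t1**2) / det
    b = (t1 * w2 - t2 * w1) / det
    return r0, a, b

def temperature_from_resistance(r, r0, a, b):
    """Invert the quadratic to get the temperature for a measured resistance."""
    w = r / r0 - 1.0
    if b == 0.0:
        return w / a
    # Root of B*t**2 + A*t - w = 0 that tends to w/A when B is small.
    return (-a + math.sqrt(a * a + 4.0 * b * w)) / (2.0 * b)

# Synthetic thermometer with illustrative (assumed) coefficients.
TRUE_R0, TRUE_A, TRUE_B = 100.0, 3.9083e-3, -5.775e-7

def synthetic_resistance(t):
    return TRUE_R0 * (1.0 + TRUE_A * t + TRUE_B * t * t)

r0, a, b = fit_quadratic(
    synthetic_resistance(ICE_POINT),
    synthetic_resistance(STEAM_POINT),
    synthetic_resistance(SULPHUR_POINT),
)
t_check = temperature_from_resistance(synthetic_resistance(25.0), r0, a, b)
print(f"A = {a:.6e}, B = {b:.6e}, recovered temperature = {t_check:.4f} °C")
```

Three fixed points determine the three unknowns exactly, which is why a scale of this kind is defined by its fixed points together with the interpolating instrument and formula.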

In 1937 the CIPM established the Consultative Committee on Thermometry (CCT). Since then the CCT has taken all initiatives in matters of temperature definition and thermometry, including, in recent years, issues concerning the environment, climate and meteorology. It was in fact the CCT that in 2010, shortly after the BIPM-WMO workshop on “Measurement Challenges for Global Observing Systems for Climate Change Monitoring”, submitted the recommendation CIPM (T3 2010), encouraging National Metrology Institutes to cooperate with the meteorology and climate communities in establishing traceability for those thermal measurements that are important for detecting climate trends.

The first revision of the 1927 ITS took place in 1948, when extrapolation below the oxygen point to –190 °C was removed from the standard, since it had been found to be an unreliable procedure. The IPTS-48 (with “P” now standing for “practical”) extended down only to –182.97 °C. It was also decided to drop the name "degree Centigrade" for the unit and replace it with "degree Celsius". In 1954 the 10th CGPM finally adopted a proposal that Kelvin had made a century earlier, namely that the unit of thermodynamic temperature be defined in terms of the interval between absolute zero and a single fixed point. The fixed point chosen was the triple point of water, which was assigned the thermodynamic temperature of 273.16 °K, or equivalently 0.01 °C, and which replaced the melting point of ice. Work continued on helium vapour pressure scales, and in 1958 and 1962 the efforts were concentrated on low temperatures below 0.9 K. In 1964 the CCT defined the reference function “W” for interpolating PRT readings between all the new low-temperature fixed points, from 12 K to 273.16 K, and in 1966 further work on radiometry and on noise, acoustic and magnetic thermometry led the CCT to begin preparing a new scale definition.

In 1968 the second revision of the ITS was delivered: the thermodynamic and practical units were defined to be identical and equal to 1/273.16 of the thermodynamic temperature of the triple point of water. The unit itself was renamed "the kelvin" in place of "degree Kelvin" and designated "K" in place of "°K". In 1976 further considerations and results at low temperatures between 0.5 K and 30 K were included in the Provisional Temperature Scale, EPT-76. Meanwhile several National Metrology Institutes (NMIs) continued the work to better define the fixed-point values and the characteristics of PRTs. The International Temperature Scale of 1990 (ITS-90) came into effect on 1 January 1990, replacing the IPTS-68 and the EPT-76, and is still used today to guarantee the traceability of temperature measurements. Among the main improvements of ITS-90 over the 1968 scale are the use of the triple point of water (273.16 K), rather than the freezing point of water (273.15 K), as a defining point; closer agreement with thermodynamic temperatures; and improved continuity and precision.
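The relation between the kelvin and the degree Celsius that follows from these definitions can be written out as a small calculation. This is only a sketch: the two constants are the values quoted in the text, and the function and variable names are hypothetical.

```python
# Thermodynamic temperature assigned to the triple point of water, in kelvin.
TRIPLE_POINT_OF_WATER_K = 273.16
# Zero of the Celsius scale expressed in kelvin.
CELSIUS_ZERO_K = 273.15

def kelvin_to_celsius(t_kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return t_kelvin - CELSIUS_ZERO_K

def celsius_to_kelvin(t_celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_celsius + CELSIUS_ZERO_K

# Under the 1968 definition one kelvin is this fraction of the
# triple-point temperature.
ONE_KELVIN_AS_FRACTION = 1.0 / TRIPLE_POINT_OF_WATER_K

triple_point_c = kelvin_to_celsius(TRIPLE_POINT_OF_WATER_K)
print(f"Triple point of water: {triple_point_c:.2f} °C")
```

A temperature interval has the same size in both units; only the zero point differs.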

It follows that any temperature measurement made before 1927 is impossible to trace to an international standard, except in the few nations that had a well-defined national scale. Later on, during the evolution of both the temperature unit and the associated scales, changes were introduced to improve the realization and the measurement accuracy.

With each redefinition of the practical temperature scale since the original scale of 1927, the BIPM published official transformation tables to enable conversion between the old and the revised temperature scale (BIPM, 1990). Because of the way the temperature scales have been defined, they really represent an overlap of multiple temperature ranges, each of which may have its own interpolating instrument, fixed points or mathematical equations describing the instrument response. A consequence of this complexity is that no simple mathematical relation can be constructed to convert temperatures acquired according to older scales into the modern ITS-90 scale.

As an example of the effect of temperature scale alterations, let us examine the correction of the daily mean temperature record at Brera, Milano in Italy from 1927 to 2010, shown in Figure 1. The figure illustrates the consequences of the temperature scale changes and the correction that needs to be applied to convert the historical data to the current ITS-90. The introduction of new temperature scales in 1968 and 1990 is clearly visible as discontinuities in the magnitude of the correction, with significantly larger corrections for data prior to 1968. As Figure 1 shows, the correction cycles with the seasonal changes in temperature: the higher summer temperatures require a larger correction.

Figure 1. Example corrections for the weather station at Brera, Milano in Italy. The values are computed for the daily average temperature. The magnitude of the correction cycles with the annual variations in temperature: the inset highlights how the warm summer temperatures are corrected much more (downward) than the cool winter temperatures.

For the same reason the corrections will differ between locations. The daily average temperature at the Milano station typically approaches 30 °C on the warmest summer days, while it may fall slightly below freezing in winter. In a different location with larger differences between typical summer and winter temperatures the corrections might oscillate around 0 °C, and a more stable climate might see smaller corrections overall: at Utsira, a small island off the south-western coast of Norway, the summertime corrections are typically 50% below the values for Brera. Figure 2 shows the magnitude of the corrections for specific historical temperatures.

Figure 2. The corrections in °C that need to be applied to historical temperatures in the range from –50 °C to +50 °C, depending on the time period in which the historical data were measured.
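The way the correction grows with temperature can be sketched for the 1968-to-1990 step alone. The snippet below uses the linear approximation t68 = 1.00024 × t90, a shortcut common in oceanography that is reasonable only over roughly –2 °C to 40 °C. It is not the conversion used for the figures above, it does not cover the 1927 and 1948 scales, and the function names are hypothetical.

```python
# Linear approximation between IPTS-68 and ITS-90 Celsius temperatures,
# t68 = 1.00024 * t90. This is an assumption of the sketch, not the
# official BIPM transformation tables.
SCALE_FACTOR_68_TO_90 = 1.00024

def its90_from_ipts68(t68_celsius):
    """Approximate ITS-90 temperature for a reading on the IPTS-68 scale."""
    return t68_celsius / SCALE_FACTOR_68_TO_90

def correction(t68_celsius):
    """Correction in °C to add to an IPTS-68 reading to express it on ITS-90."""
    return its90_from_ipts68(t68_celsius) - t68_celsius

for t in (0.0, 10.0, 20.0, 30.0):
    print(f"{t:5.1f} °C -> correction {correction(t) * 1000:+.1f} mK")
```

With this approximation a 30 °C summer reading is lowered by about 0.007 °C while a reading at 0 °C is unchanged, which is the kind of seasonal cycling of the correction described for Brera.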

The uncertainty in the temperature readings from any individual thermometer is significantly larger than the corrections presented here. Furthermore, even in the limited timespan since 1927 a typical meteorological weather station has seen many changes which may affect the temperature readings. Instruments may be replaced or relocated; screens may be rebuilt, redesigned or moved; the schedule for readings may change; the environment close to the station may become more densely populated and therefore enhance the urban heat island effect; and manually recorded temperatures may suffer from unconscious observer bias (Camuffo, 2002; Bergstrøm and Moberg, 2002; Kennedy, 2013). Despite the diligent quality control employed by meteorologists during the reconstruction of long records, every such correction also has an uncertainty associated with it. Thus, for an individual instrument, and perhaps even an individual station, the scale correction is insignificant.

On the other hand, more care is needed for aggregate data. The scale correction represents a bias which is equal for all instruments, regardless of location and use, and simply averaging data from multiple sources will not eliminate it. The scale correction is smaller than, but of the same order of magnitude as, the uncertainty components claimed for monthly average global temperatures in the HadCRUT4 dataset (Morice et al., 2012). Evaluating the actual value of the correction for the global averages would require a recalculation of all the individual temperature records. However, the correction does not alter the warming trend: if anything, it would exacerbate it slightly. Time averaging or averaging over multiple instruments has been claimed to lower the temperature uncertainty to around 0.03 °C (for example in Kennedy (2013) for aggregate records of sea surface temperature). In our opinion, such uncertainty claims need to consider the scale correction to be credible.
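Why averaging does not remove a shared bias can be seen in a small simulation. The numbers below (a common bias of –0.005 °C and a random error of 0.2 °C per instrument) are hypothetical values chosen for illustration and are not taken from any dataset; the function name is likewise hypothetical.

```python
import random
import statistics

def simulate_mean_error(n_instruments, random_sd, common_bias, seed=0):
    """Mean error of n instruments that share one bias plus independent noise."""
    rng = random.Random(seed)
    errors = [common_bias + rng.gauss(0.0, random_sd) for _ in range(n_instruments)]
    return statistics.fmean(errors)

COMMON_BIAS = -0.005  # hypothetical scale bias shared by all instruments, in °C
RANDOM_SD = 0.2       # hypothetical independent error per instrument, in °C

for n in (10, 1000, 100000):
    mean_error = simulate_mean_error(n, RANDOM_SD, COMMON_BIAS)
    print(f"{n:6d} instruments: mean error {mean_error:+.4f} °C")
```

As the number of instruments grows, the random part of the mean error shrinks roughly as one over the square root of the count, while the mean settles on the common bias rather than on zero.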

Scale correction for temperatures earlier than 1927 is harder to assess. Without an internationally accepted and widespread calibration reference it is impossible to construct a simple correction algorithm, but there is reason to suspect that the corrections become more important for older parts of the instrumental record. Quantifying the correction would entail close scrutiny of the old calibration practices, and hinges on available contemporary descriptions. Conspicuous errors can be detected, such as the large discrepancy which Burnette et al. found in the 1861 records from Fort Riley, Kansas (Burnette et al., 2010). In that case the decision to correct the dubious values was corroborated by metadata describing a change of observer; however, this also illustrates the calibration pitfalls when no widespread temperature standard was available. One would expect that many more instruments were slightly off, and the question is whether this introduced a bias or just random fluctuations which can be averaged away when producing regional averages.

Whether the relative importance of the scale correction increases further back in time remains an open question. The errors from other sources also become more important and harder to account for; one example is the measurement time schedule, such as the transformation from old Italian time to modern western European time described by Camuffo (2002).

This brief overview of the history of temperature scales has shown the impact these changes have on historical temperature data. As discussed earlier, the corrections originating from the temperature scale changes are small compared with other factors. Even though the values of the correction are small, they should not be ignored, as their magnitude is far from negligible. More details about this issue, and the conversion equation that makes it possible to convert any historical temperature data from 1927 up to 1989 to the current ITS-90, can be found in the publication of Pavlasek et al. (2015).

Related reading

Why raw temperatures show too little global warming

Just the facts, homogenization adjustments reduce global warming


Camuffo, Dario, 2002: Errors in early temperature series arising from changes in style of measuring time, sampling schedule and number of observations. Climatic change, 53, pp. 331-352.

Bergstrøm, H. and A. Moberg, 2002: Daily air temperature and pressure series for Uppsala (1722-1998). Climatic change, 53, pp. 213-252.

Kennedy, John J., 2013: A review of uncertainty in in situ measurements and data sets of sea surface temperature. Reviews of geophysics, 52, pp. 1-32.

Morice, C.P., et al., 2012: Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set. Journal of geophysical research, 117, pp. 1-22.

Burnette, Dorian J., David W. Stahle, and Cary J. Mock, 2010: Daily-Mean Temperature Reconstructed for Kansas from Early Instrumental and Modern Observations. Journal of Climate, 23, pp. 1308-1333.

Pavlasek P., A. Merlone, C. Musacchio, A.A.F. Olsen, R.A. Bergerud, and L. Knazovicka, 2015: Effect of changes in temperature scales on historical temperature data. International Journal of Climatology, doi: 10.1002/joc.4404.


  1. Thank you for this.

    One quibble. The use of triple points solves the man on Mars problem (e.g. how do you tell the Martian Standards Institute how to calibrate their thermometers). Freezing points are fairly but not completely independent of pressure over a large range, a triple point is completely fixed by the composition of the pure material

  2. Eli, why do you call it a "quibble"?

    Do you mean that that should have been explained or is there an error somewhere? In the latter case, please be a bit more specific.
