Be careful with the new daily temperature dataset from Berkeley

Wednesday, 5 March 2014

The Berkeley Earth Surface Temperature (BEST) project now also provides daily temperature data. On the one hand, this is an important improvement: we now have a global dataset with homogenized daily data. On the other hand, there is a reason why climatologists had not yet published a global daily dataset. Homogenization of daily data is difficult. The data provided by Berkeley is likely better than raw data, but still insufficient for robust conclusions about changes in extreme weather and weather variability.

The new dataset is introduced by Zeke Hausfather and Robert Rohde on Real Climate:

Daily temperature data is an important tool to help measure changes in extremes like heat waves and cold spells. To date, only raw quality controlled (but not homogenized) daily temperature data has been available through GHCN-Daily and similar sources. Using this data is problematic when looking at long-term trends, as localized biases like station moves, time of observation changes, and instrument changes can introduce significant biases.

For example, if you were studying the history of extreme heat in Chicago, you would find a slew of days in the late 1930s and early 1940s where the station currently at the Chicago O’Hare airport reported daily max temperatures above 45 degrees C (113 F). It turns out that, prior to the airport’s construction, the station now associated with the airport was on the top of a black roofed building closer to the city. This is a common occurrence for stations in the U.S., where many stations were moved from city cores to newly constructed airports or wastewater treatment plants in the 1940s. Using the raw data without correcting for these sorts of bias would not be particularly helpful in understanding changes in extremes.

The post explains in more detail how the BEST daily method works and presents some beautiful visualizations and videos of the data. Worth reading in detail.

Daily homogenization

If I understand the homogenization procedure of BEST correctly, it is based on their methods for the monthly mean temperature, and these only account for non-climatic changes (inhomogeneities) in the mean temperature.

The move from a black roof in a city to an airport is also a good example of how more than the mean can change. The black roof will show more variability, because on hot sunny days the warm bias is larger than on windy cloudy days. Thus part of this variability is variability in insolation and wind.

The urban heat island (UHI) could also be a source of variability: the UHI is strongest on days without wind and clouds. Thus part of the variability in observed temperature will be due to variability in wind and clouds.

A nice illustration of the problem can be found in a recent article by Blair Trewin. He compares the temperature distributions of two stations, one in a city near the coast and one at an airport further inland. In the past the station was in the city; nowadays it is at the airport. The modern measurements in the city that are shown below were made to study the influence of this change.

For this plot he computed the 0th to the 100th percentile. The 50th percentile is the median: 50% of the data has a lower value. The 10th percentile is the value below which 10% of the data falls, and so on. The 0th and 100th percentiles in this plot are the minimum and maximum. What is displayed is the temperature difference between the two stations at each percentile. On average the difference is about 2°C; the airport is warmer. However, for the higher percentiles (95th) the difference is much larger. Trewin explains this by cooling of the city station by a land-sea circulation (sea breeze) often seen on hot summer days. For the highest percentiles (99th), the difference becomes smaller again because offshore winds override the sea breeze.
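As a rough illustration of how such a percentile comparison can be computed, here is a minimal sketch in Python. The two series are synthetic Gaussian noise that I made up for this purpose, not Trewin's measurements, and the function names are hypothetical. The synthetic airport is simply given a larger spread, so the sketch will not reproduce the sea breeze effect at the highest percentiles.

```python
import random

def percentile(sorted_values, p):
    """Return the p-th percentile (0 to 100) by linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction

def percentile_differences(series_a, series_b, percentiles=range(0, 101)):
    """Difference series_a minus series_b at each requested percentile."""
    a = sorted(series_a)
    b = sorted(series_b)
    return {p: percentile(a, p) - percentile(b, p) for p in percentiles}

# Synthetic daily maxima (assumption): ten years at each site.
rng = random.Random(42)
city = [rng.gauss(25.0, 4.0) for _ in range(3650)]
airport = [rng.gauss(27.0, 5.5) for _ in range(3650)]

diff = percentile_differences(airport, city)
for p in (0, 10, 50, 95, 100):
    print(f"{p:3d}th percentile: airport minus city = {diff[p]:+.1f} °C")
```

With real parallel measurements one would plot `diff` against the percentile, which is the kind of curve described above.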

Clearly, if you homogenized this time series for the transition from the coast to the inland site by correcting only the mean, you would still have a large inhomogeneity in the higher percentiles, which would still lead to spurious non-climatic trends in hot weather.

Thus we would need a bias correction of the complete probability distribution and not just its mean.
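To make the difference between the two kinds of correction concrete, here is a small sketch that compares a mean-only adjustment with a simple empirical quantile matching. This is not the BEST algorithm and not Trewin's method, just an illustration under strong assumptions: the data is synthetic, the parallel measurements are treated as two independent samples of each site's distribution, and all names are hypothetical.

```python
import bisect
import random
import statistics

def mean_adjust(values, parallel_old, parallel_new):
    """Shift values by the mean difference between the parallel series."""
    shift = statistics.fmean(parallel_new) - statistics.fmean(parallel_old)
    return [v + shift for v in values]

def quantile_adjust(values, parallel_old, parallel_new):
    """Map each value to the same empirical quantile of the new site."""
    old_sorted = sorted(parallel_old)
    new_sorted = sorted(parallel_new)
    n_old, n_new = len(old_sorted), len(new_sorted)
    adjusted = []
    for v in values:
        # Number of old-site parallel values at or below v.
        count = bisect.bisect_right(old_sorted, v)
        # Same relative rank in the new-site sample, clipped to valid indices.
        index = min(n_new - 1, max(0, count * n_new // n_old - 1))
        adjusted.append(new_sorted[index])
    return adjusted

def p95(values):
    """95th percentile (the 95th of the 99 cut points)."""
    return statistics.quantiles(values, n=100)[94]

# Synthetic data (assumption): the airport is warmer and more variable.
rng = random.Random(0)
parallel_city = [rng.gauss(25.0, 4.0) for _ in range(3650)]
parallel_airport = [rng.gauss(27.0, 5.5) for _ in range(3650)]
historic_city = [rng.gauss(25.0, 4.0) for _ in range(3650)]

target = p95(parallel_airport)
residual_mean = target - p95(
    mean_adjust(historic_city, parallel_city, parallel_airport))
residual_quantile = target - p95(
    quantile_adjust(historic_city, parallel_city, parallel_airport))

print(f"remaining jump in 95th percentile, mean-only: {residual_mean:+.2f} °C")
print(f"remaining jump in 95th percentile, quantile:  {residual_quantile:+.2f} °C")
```

In this toy setting the mean-only adjustment leaves a clear jump in the 95th percentile, while the quantile matching removes most of it. Real data is harder, because the parallel period is short and the relationship between the sites depends on the weather situation.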

Or we should homogenize the indices we are interested in, for example percentiles or the number of days above 40°C. The BEST algorithm, being fully automatic, could be well suited for such an approach.
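As a small sketch of what such an index looks like in practice, the snippet below turns daily maximum temperatures into an annual count of days above a threshold. The resulting annual series is what one would then homogenize instead of the daily values. The function name and the handful of temperature values are made up for illustration.

```python
from datetime import date

def hot_days_per_year(daily_max, threshold=40.0):
    """Count the days per year with a maximum strictly above threshold.

    daily_max is an iterable of (date, temperature in °C) pairs.
    """
    counts = {}
    for day, tmax in daily_max:
        # Make sure years without any hot day still appear with a zero.
        counts.setdefault(day.year, 0)
        if tmax > threshold:
            counts[day.year] += 1
    return counts

# Made-up example values (assumption), not real observations.
records = [
    (date(2013, 1, 4), 41.2),
    (date(2013, 1, 5), 39.8),
    (date(2013, 1, 18), 45.8),
    (date(2014, 1, 2), 38.0),
    (date(2014, 1, 16), 43.1),
]

print(hot_days_per_year(records))  # {2013: 2, 2014: 1}
```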

Gridding and kriging

Another problem I see is the use of the interpolation method kriging to bring the data onto a regular grid. The number of stations available to estimate the daily mean of a grid box determines its uncertainty and thus also how much this value fluctuates. It will be hard to distinguish changes in weather variability from changes in the error of this estimate due to changes in the configuration of the station network.

This problem can go both ways. If you have many stations in a grid box, additional stations reduce the uncertainty in the estimate of the grid box mean. An increase in the number of stations would then lead to a spurious decrease in variability and fewer extremes.

If there are fewer stations than grid boxes, the method performs an interpolation. Interpolation smooths a field. An empty grid box is estimated as the mean of many faraway surrounding stations, which gives quite smooth values. When a new station appears in this grid box, the grid box mean will for a large part be determined by this relatively noisy single measurement. This would thus give a spurious increase in variability.
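Both directions of this effect can be illustrated with a toy simulation. It is only a sketch under simplifying assumptions: every station reading is a shared regional signal plus independent local noise, and the grid box estimate is a plain average of the available stations, which is a crude stand-in for kriging. The standard deviations and station counts are arbitrary choices of mine.

```python
import random
import statistics

def gridbox_estimate(regional, n_stations, local_sd, rng):
    """Plain mean of n_stations readings of the same regional signal."""
    readings = [regional + rng.gauss(0.0, local_sd) for _ in range(n_stations)]
    return statistics.fmean(readings)

def simulate(n_before, n_after, n_days=5000, signal_sd=3.0, local_sd=2.0, seed=1):
    """Standard deviation of the daily grid box estimate before and after
    a change in the number of stations. The simulated climate (signal_sd)
    is identical in both periods."""
    rng = random.Random(seed)
    before = [gridbox_estimate(rng.gauss(0.0, signal_sd), n_before, local_sd, rng)
              for _ in range(n_days)]
    after = [gridbox_estimate(rng.gauss(0.0, signal_sd), n_after, local_sd, rng)
             for _ in range(n_days)]
    return statistics.pstdev(before), statistics.pstdev(after)

# Network inside the grid box grows from 1 to 10 stations.
grow_before, grow_after = simulate(1, 10)
# Estimate from 8 surrounding stations is replaced by 1 station in the box.
new_before, new_after = simulate(8, 1)

print(f"1 -> 10 stations: sd {grow_before:.2f} -> {grow_after:.2f}")
print(f"8 -> 1 station:   sd {new_before:.2f} -> {new_after:.2f}")
```

The variability of the simulated weather does not change at all, yet the variability of the gridded estimate goes down in the first case and up in the second, purely because the network changed.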

The number of stations varies considerably in time; see figure below. Thus this could be a serious source of error, especially for daily data where the variability is high and the spatial correlations are relatively low.

Thus I feel it is safer to analyze changes in extremes and weather variability on station data and avoid the additional problems of gridded datasets, especially at daily scales.

Using this dataset will in general be better than using raw data and it is great to have a global dataset. But please be careful and compare your results with those derived from carefully homogenized regional daily datasets. These methods are also still in their beginning stages, but if they can be applied, they should produce more reliable data.

Statistically interesting problems: correction methods in homogenization

HUME: Homogenisation, Uncertainty Measures and Extreme weather

A database with daily climate data for more reliable studies of changes in extreme weather

Introduction to series on weather variability and extreme events (part 1)

On the importance of changes in weather variability for changes in extremes (part 2, important further posts in this series are unfortunately still missing.)

Reference

Trewin, B. A daily homogenized temperature data set for Australia. Int. J. Climatol., 33, pp. 1510–1529, doi: 10.1002/joc.3530, 2013.

6 comments:

Unknown, Wednesday, 5 March 2014 at 22:32:00 GMT
Thanks for your thoughts on this new dataset. I read about it in the morning and was immediately curious about your opinion. Glad to be able to read it in the evening. :)
Victor Venema, Wednesday, 5 March 2014 at 22:36:00 GMT
I had planned on doing something useful this evening. :) But well, it is my topic and I could thus write this post relatively fast.

The philosophy post took me a few weeks.

Any other wishes? ;)

Will work on the weather variability series soon again. Those posts are also more difficult and take time.
Gregor Vertacnik, Thursday, 6 March 2014 at 12:09:00 GMT
For the highest percentiles (99th), the difference becomes smaller again because then also the airport is influence by the air-sea circulation.

I think the statement should be "reversed". I would say the difference becomes smaller when there is strong and hot wind blowing towards sea - this brings hot inland conditions right to the coast.

Another issue may arise when merging the series of neighbour stations that are very prone to local influences. Imagine stations that measure most of their highest temperatures in different synoptic (weather-type) situations (for example due to orographic effects or sea-land contrast). A trend in the frequency of such situations automatically changes the distribution, but differently for each of the stations. Merging the two series just by matching the distribution percentiles would thus produce an artificial trend. However, I'm not aware of any such case in the literature - maybe the effect is negligible.
Victor Venema, Thursday, 6 March 2014 at 16:06:00 GMT
Hi Gregor, you are right. Blair Trewin wrote: "This mostly occurs with extreme high maxima at near-coastal locations (Figure 1), where large differences between sites on 95th percentile days collapse to near zero on the very hottest days when offshore winds override the sea breeze."

Will correct the text. Thank you.

Another issue may arise when merging the series of neighbour stations that are very prone to local influences.

I think you are right here as well. Once you start thinking in terms of variability and correlations with other stations, variables and weather situations, the homogenization of daily data becomes incredibly complicated.
Harry Davidson, Thursday, 23 June 2022 at 08:49:00 BST
I have long thought that the whole methodology of constructing averages, with 'homogenization', across wildly varying dataset input is deeply flawed. One basic objection is that if you don't have data, then you don't have it. There is no valid way to magic it into existence by averaging or any other process. So data from the past that isn't there, just isn't there. This is even worse in modern datasets where all the data for an area is missing for a period because the instruments were 'offline' and the data is infilled by averaging the surrounding stations (The Australian BoM does a lot of this).
IMHO, the only intellectually valid approach for determining shifts in the climate is to look at the historical record for stations that are rural, have remained unaffected by UHI and have solid record keeping going back through decades. That gives a reliable sample of trend that can then be unified using Bayesian techniques.
All averaging and homogenization currently in use is so vulnerable to confirmation bias as to be entirely meaningless, and then you have the increase in standard error that homogenization brings. That is *never* discussed.

Variable Variability
