Sunday, 25 January 2015

We have a new record

Daily Mail with a stupid headline: Data: Gavin Schmidt, of Nasa's Goddard Institute for Space Studies, admits there's a margin of error. Schmidt looks appropriate in the photo.
The look of Gavin Schmidt accurately portrays my feelings for the Daily Mail.
It seems the word record has a new meaning.

2014 was a record warm year for the global temperature datasets maintained by the Americans: NOAA, GISS and BEST, as well as for the Japanese dataset. For HadCRUT from the UK it is not yet clear which year will be highest.*

[UPDATE: data is now in: HadCRUT4 global temperature anomalies:
2014 0.563°C
2010 0.555°C
I could imagine that this is too close to call; the value for 2014 could still change with new data coming in.]

The method of Cowtan and Way (C&W) is expected to see 2014 as the second warmest year. [It now does.]

(The method of C&W is currently seen as the most accurate method, at least for short-term trends; it makes recent temperature estimates more accurate using satellite tropospheric temperatures to fill the gaps between the temperature stations.)

Up to now I had always thought that you set a record when you get the largest or lowest value, whichever is hardest. The world record in the marathon is the fastest time in an official race. The world's best football player is the one getting the most votes from sports journalists. And so on.

Climate change, however, has a special place in the heart of some Americans. These people do not find the question of whether 2014 was a record in the datasets, the normal definition, an interesting one. Rather, they claim that you are only allowed to call a year a record if you are sure that it had the highest value of the unknown actual global mean temperature. That is not the same.

Last September a new marathon world record was set in Berlin. Dennis Kimetto set the world record with a time of 2:02:57, while the runner-up in the same race, Emmanuel Mutai, set the world's second-best time with 2:03:13. Two records in one race! Clearly the conditions were ideal (the temperature, the wind, the flat course). Had other good runners participated in this race, they may well have been faster.

Should we call it a record? According to the traditional definition, Kimetto ran fastest and has a record.

According to the new definition, we cannot be sure that Kimetto is really the fastest marathon runner in the world and we do not know what the world record is. Still, newspapers around the world simply wrote about the record as if it were a fact.

When Cristiano Ronaldo was voted world footballer of the year 2014 with 37.66% of the votes, the BBC simply headlined: Cristiano Ronaldo wins Ballon d’Or over Lionel Messi & Manuel Neuer.

According to the traditional definition, Ronaldo is fairly seen as the best football player. According to the new definition, we cannot tell who the best football player is. He had such a small percentage of the votes, and journalists clearly are error prone and have a bias for forwards and against keepers.

In the sports cases it is clear that the probabilities are low, but they are hard to quantify. In the case of the global mean temperature we can quantify them, and statistics is fun. All American groups were very active in communicating the probability that the global mean temperature itself was the highest in 2014. An interesting piece of information for the science nerd, but one that may have wrong-footed some people.




And just for the funsies.


* Interesting that Germany, France and China do not have their own global temperature datasets. Okay, Germany makes an effort not to look like a world power, but one would have expected France to have one. China has been making a considerable effort in homogenization lately and already has a large network. I would not be surprised if they had their own global dataset soon, maybe using the raw data collection of the International Surface Temperature Initiative.

[UPDATE. I swear, I did not know, but Ronan Connolly pointed me to a new article on a Chinese global dataset. :) It integrates the long series of four other global datasets: CRUTEM3, GHCN-V3, GISTEMP and Berkeley.]



More information

A Deeper Look: 2014′s Warming Record and the Continued Trend Upwards
An informative article by Zeke Hausfather puts the 2014 record into perspective. The trend is important.

How ‘Warmest Ever’ Headlines and Debates Can Obscure What Matters About Climate Change
Andrew C. Revkin with a long piece with a similar opinion.

Thoughts on 2014 and ongoing temperature trends
The article by Gavin Schmidt at RealClimate is very informative, but more technical. For someone who likes stats. He begins with some media critique: for the media a record is clearly an important hook. (They want news.)

12 comments:

metzomagic said...

Great article, Victor. Just the right amount of snark for a scientist, too :-)

Sou said...

Very good, Victor. You nailed it well. I notice that it took nothing at all to convince the unconvinceable that 2014 could not have been a record. Oddly enough many of those same people are convinced that 1998 was a record. NASA said so. Some have even convinced themselves that some record high temperature set back in the 1930s still holds. Even though I don't expect there would have been nearly as many high quality stations in the world at that time.

Give it another three or four years and deniers will be arguing that 2014 was a record that hasn't yet been broken :)

John Russell (@JohnRussell40) said...

Excellent, Victor.

Interesting that when the ball is on the other foot, those in denial seem to believe, without even the tiniest trace of scepticism, that Antarctic sea ice extent is always at record levels. :-)

Victor Venema said...

metzomagic, thanks. This is the second version. I started from scratch. The first version could have been on HotWhopper.

Sou and John, yes, reading WUWT & Co. you sure get the impression that there is a system to what counts as a valid record or, more generally, what counts as good science.

ligne said...

Sou: "Give it another three or four years and deniers will be arguing that 2014 was a record that hasn't yet been broken :) "

three or four years? come on! we both know that, if 2015 turns out to be less warm than 2014, they'll be bleating on about how an ice age is starting :-)

Patrice Monroe said...

Hi,
I am wondering whether the likelihood of a record year among N records should normally decrease with the number of years. When I simulated how often the highest measured value is indeed the true highest value (the sd was 0.7 for the value and 0.5 for the measurement error, both normally distributed), I got the following results (number of years, fraction of cases in which the true highest value was measured as highest):
2 , 0.801146
10 , 0.532256
20 , 0.46076
50 , 0.384286
100 , 0.338422
135 , 0.32037
200 , 0.30028
1000 , 0.232482
10000 , 0.165234
I am not sure, but can we conclude that a 38% likelihood was actually not that small?

Victor Venema said...

Patrice Monroe, I guess I probably do not understand what you computed.

Would your computation suggest that after drawing 10,000 random numbers (normally distributed, no trend), the probability of a new record is still 16.52%? That still every 6 years we would see a new record? Sounds intuitively a bit high to me.
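The intuition behind this question can be checked with a few lines of code. For independent draws from one continuous distribution without a trend, each of the n values is equally likely to be the largest, so the chance that the latest draw is a new record is 1/n; after 10,000 draws that would be 0.01%. The sketch below is only an illustration under those assumptions; the function name, the number of trials and the seed are made up for this example.

```python
import numpy as np

def prob_latest_is_record(n, n_trials=50_000, seed=0):
    """Estimate the chance that the last of n trend-free draws is the maximum.

    Assumption: independent, identically distributed normal values.
    """
    rng = np.random.default_rng(seed)
    draws = rng.normal(size=(n_trials, n))
    # The latest value sets a new record when it is the largest of the row.
    return float(np.mean(draws.argmax(axis=1) == n - 1))

if __name__ == "__main__":
    for n in (2, 10, 50):
        # Simulated frequency next to the theoretical value 1/n.
        print(n, round(prob_latest_is_record(n), 3), round(1 / n, 3))
```

The simulated frequencies should scatter closely around 1/n, which is a different quantity from the one computed in the table above.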

Patrice Monroe said...

Oh, sorry, ok, what I've tried to do is the following:
Let SA be the set of N data points, distributed around a mean with sd=0.07 (resembling the interannual variability in the actual, not measured, temperatures). For SA, the record is determined (let's say it is the data point with index ia). Then I constructed a set of measurement errors (SE) with sd=0.05 (resembling the uncertainty in the measurements) and added SE to SA, giving me the measured set (MS). For MS, I selected the index of its highest value (index im) and tested whether ia and im are equal (so calculating how many times the actual highest value was also measured as such).
The purpose of this was to see how 38% compares to this ratio. I suspect that when you have to determine the likelihood of a record in a large data set, one would expect the likelihood to decrease as the number of data points participating in the "race" increases.
So if I am correct (and I am extremely skeptical of my knowledge), for 135 years of data with the given variability (and a normal distribution), there is about a 32 percent chance that the actual record would be measured as such (this is not the probability of a record). So I guess (and am unsure this is OK) that a 38% likelihood is not that small for 135 years of data.

Patrice Monroe said...

And of course, I have assumed no trends and no changes in variability in both measured and actual data sets.
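The experiment described in these comments can be sketched as a small Monte Carlo simulation. This is not the commenter's own code, only a reconstruction under the stated assumptions: normally distributed true values with sd=0.07, independent normal measurement errors with sd=0.05, no trend and constant variability. The function name, the number of trials and the seed are hypothetical choices for this illustration.

```python
import numpy as np

def prob_true_record_identified(n_years, sd_true=0.07, sd_error=0.05,
                                n_trials=20_000, seed=0):
    """Fraction of trials in which the year with the highest true value
    is also the year with the highest measured value."""
    rng = np.random.default_rng(seed)
    # SA: true values without trend and with constant variability.
    true_values = rng.normal(0.0, sd_true, size=(n_trials, n_years))
    # MS: true values plus independent measurement errors (SE).
    measured = true_values + rng.normal(0.0, sd_error,
                                        size=(n_trials, n_years))
    # ia: index of the true maximum; im: index of the measured maximum.
    ia = true_values.argmax(axis=1)
    im = measured.argmax(axis=1)
    return float(np.mean(ia == im))

if __name__ == "__main__":
    for n in (2, 10, 20, 50, 100, 135):
        print(n, round(prob_true_record_identified(n), 3))
```

Only the ratio of the two standard deviations matters here, because scaling all values by a constant does not change which one is largest. The fractions should come out near the numbers listed in the table above, decreasing as the number of years grows.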

pinroot said...

So what type of equipment are they using that can measure the global temperature to within 1/1000 of a degree?

Anomaly:
2014 0.563°C
2010 0.555°C
Difference: 0.008°C

According to this (https://www.google.com/url?sa=t&rct=j&q=&esrc=s&frm=1&source=web&cd=2&cad=rja&uact=8&ved=0CCUQFjAB&url=https%3A%2F%2Fwww.wmo.int%2Fpages%2Fprog%2Fwww%2FIMOP%2Fpublications%2FIOM-104_TECO-2010%2FP5_16_Moore_UK.doc&ei=GsDLVM6IKfOHsQTpo4KgAw&usg=AFQjCNHuwx6Idvh94y32zlyLmcfsgVl0aQ&bvm=bv.84607526,d.cWc): The highest accuracy required for climatological temperature measurements within the screen is 0.1˚C. So how do you get 1/1000 degree accuracy in a global temperature when your equipment has 0.1 degree accuracy? I guess you can color me skeptical.

Victor Venema said...

The confidence interval of an average goes down with the square root of the number of measurements. With 300 days a year and 1000 stations that gives an error of 0.1°C/sqrt(300*1000) = 0.0002°C.

In other words, the reading accuracy is not important. The confidence interval due to sampling the surface (and thus needing to interpolate) is much larger.
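The arithmetic in this reply can be reproduced as follows. It is a minimal sketch that assumes the individual readings have independent errors of equal size, which is the assumption behind the square-root rule; the function name is made up for this illustration.

```python
import math

def standard_error_of_mean(reading_accuracy, n_measurements):
    """Uncertainty of an average of independent measurements.

    Assumption: every measurement has the same, independent error.
    """
    return reading_accuracy / math.sqrt(n_measurements)

if __name__ == "__main__":
    # 300 days a year times 1000 stations, as in the comment above.
    n = 300 * 1000
    print(f"{standard_error_of_mean(0.1, n):.4f}")  # prints 0.0002
```

The unrounded value is about 0.00018°C, far below the 0.008°C difference between 2014 and 2010, which is why the reading accuracy is not the limiting factor.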

Anyway, the idea of this post is exactly that it is strange to consider such uncertainties. It is never done for any other record. Just for the temperature record of 2014. Not even for the temperature record of the 1930s in the USA. Mitigation skeptics always call that a record and never mention how uncertain it is.

Victor Venema said...

Forgot to add a link for more information on the confidence interval for an average.