A month ago the New York Times insulted its subscribers with a climate change column by Bret Stephens, their new hire from the Wall Street Journal. The bizarre text was mostly a sequence of claims that did not follow from the arguments presented.

The column also contained one fact, which was wrong and later corrected. Stephens claimed:

Anyone who has read the 2014 report of the Intergovernmental Panel on Climate Change knows that, while the modest (0.85 degrees Celsius, or about 1.5 degrees Fahrenheit) warming of the Northern Hemisphere since 1880 is indisputable, as is the human influence on that warming, much else that passes as accepted fact is really a matter of probabilities.

As a dutiful watcher of Potholer54, which any real skeptic should be, you know that it is a good idea to check the source and Stephens helpfully provided a link to the Summary for Policymakers of the 5th assessment synthesis report of the Intergovernmental Panel on Climate Change (IPCC). This summary mentions the number "0.85" in the sentence:

The globally averaged combined land and ocean surface temperature data as calculated by a linear trend show a warming of 0.85 [0.65 to 1.06] °C over the period 1880 to 2012, when multiple independently produced datasets exist (Figure SPM.1a). {1.1.1, Figure 1.1}

*Figure SPM.1a. Annually and globally averaged combined land and ocean surface temperature anomalies relative to the average over the period 1986 to 2005. Colours indicate different data sets.*

Thus Stephens confused the global temperature with the temperature of the Northern Hemisphere. Not a biggy, but indicative of the quality of Stephens' writing.

A related weird claim is that the "*warming of the earth since 1880 is indisputable, as is the human influence on that warming, much else that passes as accepted fact is really a matter of probabilities.*"

As quoted above the warming since 1880 is not exactly known, but probably between 0.65 and 1.06 °C. That it was warming is so certain that a journalist may call it "indisputable". There is thus no conflict between probabilities and certainty. In fact they go hand in hand. When scientists talk about uncertainties, they are quantifying how certain we are.

However, I did not want to attack the soft target Bret Stephens. The hard target IPCC is much more interesting. They put some thought into their writing.

More precisely I have problems when they write in the summary for policy makers: "*temperature data as calculated by a linear trend show a warming of 0.85*". That means that they fitted a linear function to the data (using least squares regression) and used this trend and the length of the period to estimate the total warming over this period.

This is a problem because calculating the total amount of warming using a linear trend underestimates global warming.* I show this below for two global temperature datasets by comparing the linear warming estimate with a nonlinear (LOESS) warming estimate. The linear estimate is smaller: For NASA's GISTEMP it is 0.05 °C smaller and for Berkeley Earth it is 0.1 °C smaller.
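The effect can be illustrated with a minimal sketch. This is not the R code behind the post's figures: the series is synthetic (a cubic rise of exactly 1 °C plus a small wiggle standing in for year-to-year variability), and `local_linear_value` is a simplified tricube-weighted local linear fit used as a stand-in for LOESS. The function names and the `frac=0.4` span are assumptions for illustration only.

```python
import math


def ols_line(t, y, w=None):
    """Weighted least-squares fit y = a + b*t; returns (a, b)."""
    if w is None:
        w = [1.0] * len(t)
    sw = sum(w)
    mt = sum(wi * ti for wi, ti in zip(w, t)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (ti - mt) ** 2 for wi, ti in zip(w, t))
    sxy = sum(wi * (ti - mt) * (yi - my) for wi, ti, yi in zip(w, t, y))
    b = sxy / sxx
    return my - b * mt, b


def linear_warming(t, y):
    """Total warming as linear trend times the length of the period."""
    _, slope = ols_line(t, y)
    return slope * (t[-1] - t[0])


def local_linear_value(t, y, t0, frac=0.4):
    """LOESS-style smooth at t0: tricube-weighted line through the nearest points."""
    k = max(3, int(round(frac * len(t))))
    h = sorted(abs(ti - t0) for ti in t)[k - 1]  # bandwidth: k-th nearest distance
    w = [(1 - (abs(ti - t0) / h) ** 3) ** 3 if abs(ti - t0) < h else 0.0 for ti in t]
    a, b = ols_line(t, y, w)
    return a + b * t0


def loess_warming(t, y, frac=0.4):
    """Total warming as smoothed end value minus smoothed start value."""
    return local_linear_value(t, y, t[-1], frac) - local_linear_value(t, y, t[0], frac)


# Synthetic data (assumption): accelerating warming of exactly 1 degree C.
years = list(range(1880, 2017))
temps = [((yr - 1880) / 136) ** 3 + 0.05 * math.sin(yr) for yr in years]

print("linear estimate:   ", round(linear_warming(years, temps), 2))
print("nonlinear estimate:", round(loess_warming(years, temps), 2))
```

On this synthetic curve the linear estimate comes out near 0.9 °C, while the endpoint difference of the smooth is closer to the true 1 °C. The exact numbers depend on the assumed curve and span.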

Such linear estimates are well suited for comparing different datasets because it is well defined how to compute a linear trend and the bias will be similar in the different datasets. That is why linear estimates are used a lot in the scientific literature and scientists reading this know that a linear estimate can be biased when the curve itself is not linear.

But this was a warming estimate for the *summary for policy makers*. Policy makers and the public in general should get an unbiased estimate of the climatic changes we have seen and are before us.

Tending to underplay the problem is quite typical. There is even an article on climate scientists "Erring on the Side of Least Drama", and The Copenhagen Diagnosis also gives several examples, such as the very low predictions for the decline in sea ice or the increase in sea level.

When it comes to the warming found in station data, we did study the main possible warming bias (urbanization) in depth, but hardly did any work on cooling biases that may lead us to underestimate the amount of warming.

In a recent study to define what "pre-industrial" means when it comes to the 2 °C warming limit, the authors suggest a comparison period with relatively few volcanoes, which is thus relatively warm. This underestimates the warming since "pre-industrial". The authors wanted to be "conservative". I think we should be unbiased.

I understand that scientists want to be careful before crying wolf, whether we have a problem or not. However, when it comes to the size of the wolf, we should give our best estimate and not continually err on the side of a Chihuahua.

## Related reading

Climate Scientists Erring on the Side of Least Drama

Why raw temperatures show too little global warming

The NY Times promised to fact check their new climate denier columnist — they lied

* The linear estimate is typically smaller, sometimes a lot, whether the actual underlying function is convex or concave. I had expected this estimate to be always smaller, but noticed while writing this post that for polynomial functions, f(t) = t^{p}, it can also be a few percent higher for p between 1 and 2. Below you can see 4 example curves, where the time runs between zero and one and thus also f(t) goes from zero to one. The total "warming" in all cases is exactly one. The linear estimates are generally less than one, except for the f(t) = t^{1.5} example.

The bottom graph shows these linear estimates as a function of exponent p, where you can see that for an exponent between 1 (linear) and 2 (quadratic) the estimates can be a little higher than one, while they are generally lower. Probably Carl Friedrich Gauss or at least Paul Lévy already wrote an article about this, but it was a small surprise to me.
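The behaviour described in this footnote can be checked numerically. The sketch below is not the R code linked from the post. It fits an ordinary least-squares line to f(t) = t^p sampled evenly on [0, 1] and multiplies the slope by the period length. For comparison it also evaluates the continuous-limit expression 6p / ((p + 1)(p + 2)), which is derived here from Cov(t, t^p) / Var(t) and is not taken from the post. The function names and grid size are assumptions.

```python
def linear_estimate(p, n=2001):
    """OLS slope of f(t) = t**p on an even grid over [0, 1], times the period length."""
    t = [i / (n - 1) for i in range(n)]
    y = [ti ** p for ti in t]
    mt = sum(t) / n
    my = sum(y) / n
    sxx = sum((ti - mt) ** 2 for ti in t)
    sxy = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y))
    return sxy / sxx * (t[-1] - t[0])


def closed_form(p):
    """Continuous-limit value of the same estimate (derived for this sketch)."""
    return 6 * p / ((p + 1) * (p + 2))


# The true "warming" f(1) - f(0) is exactly 1 for every exponent.
for p in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(p, round(linear_estimate(p), 3), round(closed_form(p), 3))
```

The closed-form expression equals 1 at p = 1 and p = 2, exceeds 1 in between (about 1.03 at p = 1.5) and falls below 1 elsewhere, for example 0.8 at p = 0.5 and 0.9 at p = 3.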

** Top photo of a Chihuahua is licensed under the Creative Commons Attribution 3.0 Unported license.

Bottom Chihuahua photo by Inanishi is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

*** The R code to generate the plots with the linear and nonlinear warming from two global temperature datasets is on GitHub. The R code to study the influence of the polynomial exponent on the linear estimate is also on GitHub.

## 16 comments:

Victor - as usual, very helpful article. On SLR, recently there was a second paper using a different method that agrees with earlier Hay-Mitrovica finding that SLR from 1900 to 1990 averaged ~1.2mm/yr... and since 1990 it's ~3.2mm/yr. The last ten years is in excess of 4mm/yr. As far as I can find, nobody has graphed it in the same way you have done for the GMST in your post. I suspect the results would be as striking... maybe more striking.

Interesting to consider the shape of that curve if the suggested WW2 sea temperature bias proves supportable.

The linear trend is not the largest issue.

Spatial coverage biases, sea ice, blending SST with SAT data and such things are contributing to underestimating the warming trend.

One can test it with the CMIP5 output.

Using the data from here

http://www.climate-lab-book.ac.uk/2015/an-apples-to-apples-comparison-of-global-temperatures/

I got a 24% to 37% higher linear trend if I use the (in the models globally available) near-surface air temperatures (SATs) than when applying the HadCRUT4 method with blended SST/SAT temperature anomalies.

If the difference in the real world is similar to that seen in the model output, the warming trend shown in the 'surface temperature' data sets substantially underestimates the truly global near-surface air temperature trend.

Victor: Interesting post, which has increased my understanding of how warming can be quantified. 1.00 K does appear to be a better estimate of warming for 1880-2016 than 0.95 K. Unfortunately, these values don't convey any information about the uncertainty and significance of this change. A linear AR1 model has the advantage of also estimating the uncertainty in the warming TREND. That confidence interval multiplied by the period provides a confidence interval for the AMOUNT of warming. Unless one can obtain a confidence interval using LOESS, LOESS may provide the best single value for warming - but not the type of central estimate plus confidence interval that appears to be more appropriate for an IPCC report.

Respectfully, Frank (We exchanged some long comments at ATTP)

Victor wrote: "the IPCC underestimates global warming". This is only true if you believe that GW has a single well-constrained value whose significance is obvious. The importance of uncertainty and "statistical significance" in quantifying warming is illustrated by the fact that the recent El Nino raised GISTEMP GMST by 0.5 K (monthly data) to 0.3 K (your graph of annual averages?) over the average for the 2000-2013, a period of somewhat stable temperature and modest warming. 1 K of overall warming with occasional deviations of 0.5 K suggests that reporting a single value for warming is inappropriate.

The simplest way to quantify warming is by the difference in temperature between two dates, say 1/1/1900 and 1/1/2000. OK, that won't work. Monthly averages: 1/1900 and 1/2000? Annual averages: 1900 and 2000? How about 1895-1905 vs 1995-2005? Or 1890-1910 vs 1990-2010? With these longer averages, we obtain a difference between two means PLUS a confidence interval around that difference (derived from the standard deviation of the periods averaged). Much more useful. How long a period should we average over? One answer is to use all of the data - which is what we accomplish by doing an AR1 linear fit to the data and multiplying the trend (with confidence interval) by the period. Unfortunately, there is no reason to believe that warming should be linear over the last century, but it could be a reasonable assumption for the last half-century.

Returning to the difference between two means with a confidence interval to assess the significance of the difference, what determines whether averaging over 10 or 20 years is long enough? In theory, we need a period long enough to sample all of the noise, unforced variability (and possibly naturally-forced variability) in the data. Two decades (but not one decade) might be barely long enough to get a representative sample of the noise from ENSO, but isn't appropriate for sampling phenomena like the AMO, PDO, LIA, MWP etc. So, this strategy for understanding the significance of a difference in temperature has some problems. My confidence in much of the statistical significance of 20th-century warming diminishes when I start thinking about variability over the past few millennia.

Doug Keenan raised the possibility of using statistical models for warming besides linear AR1, but random-walk statistical models are totally inappropriate for a planet with a relatively stable temperature for the last billion years. And science rarely progresses by fitting ARBITRARY mathematical or statistical functions to noisy data. (I've personally made this mistake.) We make progress in science by creating hypotheses about the fundamental physics (or other science) controlling the behavior of things and testing those hypotheses. That is what AOGCMs do; they try to apply fundamental physics to AGW. This is how the Met Office sensibly answered Keenan - we know it is warming from a combination of observations and models. Unfortunately, those models must use large grid cells and parameters describing heat transfer processes that occur on sub-grid scale. Fortunately, no parameterization can make model climate sensitivity approach zero, but the ability of models to reproduce unforced variability remains an issue. Which brings us back to the original problem, the significance of the warming we have observed.

Lorenz discusses this subject in a prescient paper "Chaos, Spontaneous Climatic Variability, and Detection of the Greenhouse Effect" (1991). http://www.iaea.org/inis/collection/NCLCollectionStore/_Public/24/049/24049764.pdf?r=1 page 445.

From my perspective, the IPCC over-estimates the SIGNIFICANCE of GW - but I'd be glad to hear why I'm wrong.

Frank, yes the LOESS estimate would naturally also need an uncertainty estimate. I did not do this for this blog post, but I am sure it is possible. The uncertainty would likely be larger than the linear estimate because effectively only about 40% of the data is used, 20% at the beginning and 20% at the end. (There often is a trade off between bias and uncertainty.)

For the main point of the blog post, that linear estimates tend to be biased to give too little warming, the uncertainty is not important.

> From my perspective, the IPCC over-estimates the SIGNIFICANCE of GW - but I'd be glad to hear why I'm wrong.

You are not even wrong because the significance of global warming is not defined. See: Falsifiable and falsification in science

The details of AOGCMs are not important. No realistic model, however complex or simple, of the climate over the last billions of years will be a random walk model.

For the rest of your comment I would recommend learning statistics in a non-climate setting, where your priors about climate change do not inhibit learning the basic ideas and skills.

FWIW, IMO the QM that describes the interactions between radiation and GHG, plus absorption cross-sections measured in the laboratory, plus the decrease in temperature with altitude, plus conservation of energy logically demands that rising GHGs cause the Earth to warm. It is idiotic to attempt to falsify one of these four using observations of temperature change on a planet where CHAOTIC changes in heat flux between the surface and the deep ocean (ENSO, for example) are potentially as large as the AGW signal one is trying to detect.

For me, the issue is climate sensitivity, not falsification. Furthermore, there are sound physical reasons (I won't go into) for believing that climate sensitivity can't be much below 1 K/doubling. For climate sensitivity, one needs to understand the statistical significance of warming in a chaotic system, where cause and effect are problematic. Statistics are how we abstract "meaning from data".

Your post shows a better way of abstracting meaning from historical temperature data than the IPCC's AR1 linear regression, but you haven't carried the approach far enough to produce a confidence interval. For that, you need a confidence interval around the Loess regression values for temperature in 1880 and 2016. Then you presumably could apply the usual method used for confidence interval for the difference between two means. Given autocorrelation in monthly data and apparent long-term persistence in annual data (1920-1945 warming, 1945-1975 hiatus, 1975-2000 warming, 2000-2013 hiatus), obtaining a rigorous confidence interval for your Loess Regression could be a non-trivial problem. My cruder suggestion was to average data about beginning and ending time points. Loess is better than simple averaging. Autocorrelation and LTP also complicate averaging, which is why I raised questions about the period over which one should average.

Yes, my education in statistics has significant weaknesses. So I'll quote from the above Lorenz paper: "Certainly no observations have told us that decadal-mean temperatures are nearly constant under constant external influences". Lorenz discusses how AOGCMs might address this problem.

Frank

I had not realised both comments were yours. The first one made a lot more sense. :-)

I would ignore the data around the WWII, it is not reliable.

We understand the reasons for the hiatus from 1945 to 1975, mostly increases in pollution and a bit the sun.

The 2000-2013 hiatus does not exist, it is a fairy tale of people who are bad at statistics, do not appreciate how large the uncertainties in short-term trends are and thus how bad cherry picking a special period is.

If you want to estimate how much warming we have due to global warming, those long-term persistent phenomena could be relevant. Here I had the more modest aim of estimating how much long-term warming we had seen, then those LTP elements would also be relevant and should not be seen as noise.

Thus if you want to estimate the error of my LOESS warming estimate you could simply generate noise, following the procedure of my post on short-term trends, and compute the "warming" distribution of the white noise, which we know does not have average warming to compute the uncertainty. Thus any warming we would see from those noise series would be due to the estimation method and the short-term noise.

Victor wrote: "I would ignore the data around the WWII, it is not reliable.... We understand the reasons for the hiatus from 1945 to 1975, mostly increases in pollution and a bit the sun."

There is certainly reason to be suspicious of SST during WWII. Most of the 1920-1945 increase occurred outside the war years and even the war years showed continued warming over land. Therefore, as best I can tell, there was a period of strong warming from 1920-1945, but we can quibble about how strongly it continued during the war. The subsequent period of cooling can be explained if the negative forcing from aerosols is very strong, but experts are now favoring less cooling from tropospheric aerosols.

I completely agree that the confidence interval for the warming trend for what skeptics call the Pause in the 2000's is far too wide to claim warming stopped. However it isn't wide enough to reach the long-term warming trend of about 0.17 K/decade. Thus there is evidence for a slowdown (or hiatus) in warming (which consensus climate scientists have discussed). In theory, I should be able to cherry-pick 2.5% of periods with warming trends below the 95% ci, but eye-balling Nick Stokes' trend viewer (1979-present) suggests that more than 2.5% of the periods have unusually slow warming.

So I think the deviations in your LOESS fit are more likely to be examples of LTP rather than noise. Which is a good thing in some respects, since the purpose of the LOESS regression is to remove some of the noise from the data. However, the LOESS parameters you chose didn't detect a lower trend in the 2000's.

Your confidence intervals for white noise are interesting; though I will remember it somewhat differently: halve the period, 3X bigger ci. I tend to disdain periods shorter than about the traditional 30 years. 40 years has a ci almost 10X smaller than 10 years.

A brief search suggests that there are statistical methods (available in R) that produce ci's for a variety of non-linear regressions:

https://seriousstats.wordpress.com/tag/loess/

Frank

Yes, there was warming before WWII. Some climate "sceptics" seem to think, or pretend to think, that warming before WWII would be purely natural. Thus let me add that we already had greenhouse forcings before the war. IPCC report: Figure 8.18.

I only wanted to suggest that the variability from decade to decade may be a bit less strong than people think because of the clear WWII peak. Part of the peak is also likely real, we had an El Nino period in the beginning of WWII.

The post on cherry picking I linked to above makes clear that the confidence intervals for a random period are much smaller than the confidence intervals for a cherry picked period. There was no "hiatus", just continuous warming and bad statistics.

> the LOESS parameters you chose didn't detect a lower trend in the 2000's.

Because there was none. Had I chosen a smaller smoothing length, it would have emphasised a fluctuation that is not a change in the long-term trend and you make such a LOESS fit to highlight the long-term trend, not the noise.

> halve the period, 3X bigger ci

That is quite similar to 10% the period 30 times the uncertainty in the trend. Just apply your rule 3x: (1/2)^3=1/8 of the period is 3x3x3=27 times the uncertainty.

Do you remember where you got that rule from? I did not search much because I wanted to go step by step for my blog post anyway, but I did not find any literature on the uncertainty of trends as a function of their length.
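Both rules of thumb are consistent with the textbook standard error of a least-squares trend fitted to evenly spaced white noise, which scales roughly as the period length to the power -3/2. A minimal sketch, assuming uncorrelated annual noise (real temperature series are autocorrelated, which widens the intervals); the function name is illustrative and not taken from the post:

```python
import math


def trend_standard_error(n_years, sigma=1.0):
    """Standard error of an OLS trend fitted to n_years of annual white noise."""
    # Sum of squared deviations of the time axis 0, 1, ..., n-1 from its mean.
    sxx = n_years * (n_years ** 2 - 1) / 12.0
    return sigma / math.sqrt(sxx)


def uncertainty_ratio(n_short, n_long):
    """How many times larger the trend uncertainty is for the shorter period."""
    return trend_standard_error(n_short) / trend_standard_error(n_long)


for n_short, n_long in ((20, 40), (10, 40), (10, 100)):
    ratio = uncertainty_ratio(n_short, n_long)
    print(f"{n_short} vs {n_long} years: trend uncertainty {ratio:.1f} times larger")
```

Halving the period gives a factor of about 2.8, a quarter of the period about 8, and a tenth of the period about 32, under the white-noise assumption.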

Victor wrote: "Yes, there was warming before WWII. Some climate "sceptics" seem to think, or pretend to think, that warming before WWII would be purely natural. Thus let me add that we already had greenhouse forcings before the war. IPCC report: Figure 8.18."

Yes, but the fingerprinting papers the IPCC used to assign most warming AFTER 1950 to humans also found that most warming before 1950 could not be assigned to humans or nature and therefore was mostly unforced variability. Don't remember if these were the AR4 or AR5 references, but I did look up what they said about pre-1950 warming - since the IPCC wasn't candid enough to tell me. Sorry, don't have a reference right now.

However, I do agree with you that we are talking about uncomfortably small changes in temperature when we discuss deviations from the warming one expects from forcing alone. If it weren't for the massive unforced changes in temperature produced by ENSO, I wouldn't pay as much attention to the decadal deviations that also appear to be unforced variability.

"halve the period, 3X bigger ci" was simply an approximate mathematical restatement of your rule, that I found more useful than your version.

IMO, we know it is warming today because the linear trend for the last 40 years is about 0.17 +/- 0.025 K/decade (95% ci). If you move to periods half, one third, or one quarter this long, the confidence interval on Nick Stokes' trend viewer widens dramatically. The confidence interval for these shorter periods is OFTEN so wide that no one should be claiming a SIGNIFICANT decrease or increase compared with 40-year warming rate. (This isn't the case for 2001-2013, when the central estimate was negative, however.) So, with exception of brief periods (which ended with the recent El Nino and aren't coming back without dramatic cooling over next five years), 0.15-0.20 K/decade is the best answer we have for the current rate of warming. Your LOESS Regression shows about the same rate. With the exception of the hiatus, I refuse to consider any shorter periods with absurdly wide confidence intervals.

For example, the last 20 years (1/97 to 1/17) for HADCRUT is 0.135 +/-0.07 K/decade, with roughly 3X the confidence interval and no evidence of a significant difference from 0.17 +/- 0.025 K/decade. The same thing is true for the last ten years (1/2007 to 1/2017): 0.32 +/- 0.18 K/decade with a ci roughly 10X bigger. FWIW, I didn't cherry-pick these periods.

Why do I think the hiatus (2001-2013) isn't just a cherry-picked period with slower warming? Look at the trend triangle for 1997-now with the Upper CI Trend radio button selected. That big green area surrounded by gray has confidence intervals well below 0.15 K/decade. Click anywhere in that big green area surrounded by gray and check the confidence interval. That area is far more than the 2.5% of the trends one expects to find outside the confidence interval by chance. Now look at the same area for 1979 to now. Still looks like more than 2.5% to me.

Nothing much changes if you go back to 1970 or even 1960: the rate drops modestly because of negative aerosol forcing and the fact that CO2 was rising only 1 ppm/yr. So I like to use the trend over the last 30-40 years as the best answer we have for today's trend.

Frank

The situation before WWII is more complicated, but that does not mean that the warming is all natural, but a part may well be. It is also more complicated because part of attribution studies is how well the models fit to the observations and before WWII the observations are not that good. The network density was low, which makes it hard to remove inhomogeneities.

It is unavoidable to cherry pick a period. We know the data. This is not the same as saying someone is evil. If I would pick a certain short period, I would also be cherry picking. You can give the data to some econometrist and claim it is Australian grain harvest data. If you do (such a study was actually done by Stephan Lewandowsky), they will tell you there was no "hiatus".

For longer periods cherry picking is less of a problem, as my above linked post on cherry picking shows. Other solutions are not to cherry pick, but let statistics decide whether there is a change in trend in an unknown year. The test will find no trend change.

Other solutions are to remove the El Nino noise (and the volcanoes and the solar cycle), if you do, the time series becomes much less noisy and cherry picking thus less of a problem. If you do, the "hiatus" disappears. Tamino has posts showing this.

I just published a new post on the warming explosion since 2011.

With the "methods" of the "hiatus" fans there is one, but there is naturally no explosion. Only bad statistics.

@Frank:

You've tried to compare shorter term temperature trends to a long term trend (over f.e. 40-years).

There is an issue with the long term trend because of volcanic aerosol forcing.

There is a not so small contribution to the overall long term trend due to the recovery from the volcanic cooling after some large volcanic eruptions from 1963 to 1991. For example for a 40-year period of 1977 to 2016, I expect a positive contribution of the recovery from the volcanic cooling to the long term temperature trend of 0.024 K/decade from applying the volcanic forcing to an EBM. So the temperature trend without volcanic recovery is lower, about 0.17-0.024 = 0.146 K/decade. For shorter periods the effect is worse.

So the long term temperature trend may be overestimated even for 40 year periods. I tried some periods to minimize the effect. So far, for calculating the long term temperature trend (from observed data, without correction of volcanic contribution, ENSO ...) I would recommend to start before 1960 to minimize the bias.

I got 0.12 to 0.17 K/decade (no ci!) depending on assumptions and start year for the long term temperature trend to 2016 using the BESTlo data.

After removing both ENSO and volcanic variability, I got a much smoother temperature curve.

There is a strong acceleration in warming around 1976-1979, so trends from 1960 on or earlier, as I suggested in my last comment, would underestimate the warming trend in the last 40 years.

For 1977-2016 I got 0.18 K/decade for the measured BESTlo data and 0.16 K/decade for the data corrected from the ENSO and volcanic contribution.
