Monday, 30 September 2013

Reviews of the IPCC review

The first IPCC report (Working Group One), "Climate Change 2013, the physical science basis", has just been released.

One way to judge the reliability of a source is to see what it states about a topic you are knowledgeable about. I work on homogenization of station climate data and was thus interested in how well the IPCC report presents the scientific state of the art on the uncertainties in trend estimates due to historical changes in climate monitoring practices.

Furthermore, I have asked some fellow climate science bloggers to review the IPCC report on their areas of expertise. You will find these reviews of the IPCC report at the end of the post as they come in. I found most of these colleagues via Doug McNeall's beautiful list of climate science bloggers.

Large-Scale Records and their Uncertainties

The IPCC report is nicely structured. The part that deals with the quality of the land surface temperature observations is in Chapter 2 Observations: Atmosphere and Surface, Section 2.4 Changes in Temperature, Subsection 2.4.1 Land-Surface Air Temperature, Subsubsection Large-Scale Records and their Uncertainties.

The relevant paragraph reads (my paragraph breaks for easier reading):
Particular controversy since AR4 [the fourth and previous IPCC report, vv] has surrounded the LSAT [land surface air temperature, vv] record over the United States, focussed upon siting quality of stations in the US Historical Climatology Network (USHCN) and implications for long-term trends. Most sites exhibit poor current siting as assessed against official WMO [World Meteorological Organization, vv] siting guidance, and may be expected to suffer potentially large siting-induced absolute biases (Fall et al., 2011).

However, overall biases for the network since the 1980s are likely dominated by instrument type (since replacement of Stevenson screens with maximum minimum temperature systems (MMTS) in the 1980s at the majority of sites), rather than siting biases (Menne et al., 2010; Williams et al., 2012).

A new automated homogeneity assessment approach (also used in GHCNv3, Menne and Williams, 2009) was developed that has been shown to perform as well or better than other contemporary approaches (Venema et al., 2012). This homogenization procedure likely removes much of the bias related to the network-wide changes in the 1980s (Menne et al., 2010; Fall et al., 2011; Williams et al., 2012).

Williams et al. (2012) produced an ensemble of dataset realisations using perturbed settings of this procedure and concluded through assessment against plausible test cases that there existed a propensity to under-estimate adjustments. This propensity is critically dependent upon the (unknown) nature of the inhomogeneities in the raw data records.

Their homogenization increases both minimum temperature and maximum temperature centennial-timescale United States average LSAT trends. Since 1979 these adjusted data agree with a range of reanalysis products whereas the raw records do not (Fall et al., 2010; Vose et al., 2012a).

I would argue that this is a fair summary of the state of the scientific literature. That naturally does not mean that all statements are true, just that the paragraph fits the current scientific understanding of the quality of the temperature observations over land. People claiming that there are large trend biases in the temperature observations will need to explain what is wrong with Venema et al. (an article of mine from 2012) and especially Williams et al. (2012). Williams et al. (2012) provides strong evidence that if there is a bias in the raw observational data, homogenization can improve the trend estimate, but it will normally not remove the bias fully.

Personally, I would be very surprised if someone found substantial trend biases in the homogenized US temperature observations. Due to the high station density, this dataset can be investigated and homogenized very well.
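
To make the idea of relative homogenization concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not the NOAA pairwise algorithm or any method from the papers cited above: the synthetic data, the single artificial break, and the helper names find_break, homogenize and trend are all hypothetical. A candidate station and a neighbouring reference station share the same regional climate signal, so their difference series mostly contains noise plus any non-climatic jump.

```python
import random
from statistics import mean


def find_break(diff, margin=5):
    """Return the index that best splits `diff` into two segments with different means."""
    n = len(diff)
    best_k, best_score = None, 0.0
    for k in range(margin, n - margin):
        left, right = diff[:k], diff[k:]
        # Between-segment sum of squares: large when the two segment means clearly differ.
        score = (k * (n - k) / n) * (mean(left) - mean(right)) ** 2
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def homogenize(candidate, reference):
    """Detect one break in candidate minus reference and adjust the earlier segment."""
    diff = [c - r for c, r in zip(candidate, reference)]
    k = find_break(diff)
    jump = mean(diff[k:]) - mean(diff[:k])
    # Convention assumed here: shift the older segment to match the most recent one.
    adjusted = [c + jump if i < k else c for i, c in enumerate(candidate)]
    return k, jump, adjusted


def trend(series):
    """Ordinary least squares slope per time step."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Synthetic example: all numbers below are made up for illustration.
rng = random.Random(42)
n_years, true_trend = 100, 0.008      # deg C per year, i.e. 0.8 deg C per century
true_break, true_jump = 60, -0.5      # artificial cooling jump inserted in year 60

climate = [true_trend * t + rng.gauss(0, 0.3) for t in range(n_years)]
reference = [c + rng.gauss(0, 0.1) for c in climate]
raw = [c + rng.gauss(0, 0.1) + (true_jump if t >= true_break else 0.0)
       for t, c in enumerate(climate)]

k, jump, adjusted = homogenize(raw, reference)
print("detected break index:", k, "estimated jump:", round(jump, 2))
print("trend per century, raw:", round(100 * trend(raw), 2),
      "adjusted:", round(100 * trend(adjusted), 2),
      "climate signal:", round(100 * trend(climate), 2))
```

In this set-up the raw series understates the warming because of the artificial cooling jump, and the adjusted series should give a trend much closer to that of the underlying climate signal. Real networks have many breaks, unknown break dates and neighbours that are inhomogeneous themselves, which is why the actual algorithms are considerably more involved.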

Global mean temperature

What the report unfortunately does not discuss are the uncertainties in the global temperature record. A good reference for this is still Parker (1994). If there is a next IPCC report, it can hopefully report on the results of the global benchmarking of homogenization algorithms performed in the International Surface Temperature Initiative.

This IPCC report could have mentioned that NOAA now uses a much better homogenization method than they did before and found biases in the global mean temperature. In the GHCNv3 dataset, the trend in the global mean temperature is 0.6°C per century in the raw data and 0.8°C per century in the adjusted data (Lawrimore et al., 2011). The main reason for these biases is probably that past temperatures were too high due to larger radiation errors, especially in the time before Stevenson screens were used.


In conclusion, there is still a lot we can do to improve our understanding of uncertainties in trend estimates due to non-climatic changes. It would be desirable if the next IPCC report were able to make stronger statements about global uncertainties in the mean temperature trend (and also in changes in extreme weather). The current report is, however, an honest representation of the scientific literature. And that is what the IPCC is supposed to do.

More reviews

If you see further reviews of the IPCC report by experts, please let me know in the comments. If you are able to write such a review, but do not have a blog, please consider commenting below or writing a guest post.

Other reviews

Impressions of AR5 from an aerosol forcing point of view by Karsten in the comments below.
Karsten discusses the direct and indirect effects of aerosols on the radiative forcing. Indirect effects are the effects aerosols have by changing the clouds. "Bottomline: AR5 is a fair representation of the current literature and if at all, there is a tendency to err on the side of lesser drama this time (i.e. lower aerosol forcing)."
Sea level in the 5th IPCC report by Stefan Rahmstorf at RealClimate.
Stefan Rahmstorf acknowledges the improvements in the estimates of sea level rise, but still finds the IPCC estimates to be too conservative, too low. A more accessible report on the same issue can be found at IRIN News.
The IPCC AR5 attribution statement by Gavin Schmidt at RealClimate.
Gavin Schmidt ends his post: "Bottom line? These statements are both comprehensible and traceable back to the literature and the data. While they are conclusive, they are not a dramatic departure from basic conclusions of AR4 and subsequent literature."
Mayflies mandibles: as seen in the IPCC report by Richard Telford at Musings on Quantitative Palaeoecology.
As a warm-up exercise for his review of the IPCC report, Telford comments on a reference in the report to an article that "should never have been submitted, let alone published". He also states that: "The chapter represents the paper correctly, and its inclusion has no impact on any of the chapters conclusions."
Aslak Grinsted has written two posts: AR5 sea level rise uncertainty communication failure and Optimistic & over-confident ice sheet projections in AR5 at
Grinsted is disappointed in how the sea level rise projection uncertainties are presented in the IPCC AR5. The best estimate and especially the worst case scenario estimate of sea level rise are much too low in his opinion. The reason seems to be that the IPCC does not take the collapse of the Antarctic ice sheet into account because of large uncertainties.
AR5: cursory review of chapter 4 (cryosphere) mass balance of Antarctica by William M. Connolley at Stoat.
Connolley used to work at the British Antarctic Survey and comments on the mass balance of Antarctica. As he has left academia, he cannot compare the report to the literature, but he can still comment intelligently. Parental advisory. Explicit IPCC critique.
Near-term global surface temperature projections in IPCC AR5 by Ed Hawkins at Climate Lab Book.
This post discusses the new chapter and assessment on "near-term" climate change, which is relevant for adaptation decision making. Hawkins gives three reasons for small differences between his and the IPCC assessment.
What does the 2013 IPCC Summary Say About Water? by Peter Gleick at Significant Figures.
This post lists the important points about water in the Summary for Policymakers. Gleick wrote by mail that the IPCC points and his (more limited) reading of the literature are in agreement.

Other reactions to the IPCC report

Professor Piers Forster and his 18 tweet summary of the IPCC SPM by Mark Brandon at Mallemaroking.
A series of tweets from IPCC drafting author Professor Piers Forster.
What Does the New IPCC Report Say About Climate Change? by Steve Easterbrook at Serendipity.
Steve Easterbrook summarises the 8 highlights of the report, each illustrated with a figure. Not directly related, but his recent series on the climate as a system is a great tutorial on systems thinking.
Michael Mann: Climate-Change Deniers Must Stop Distorting the Evidence (Op-Ed) by Michael Mann at Live Science.
A first reaction to the IPCC report by Michael Mann, but mainly a rant about the weird debate with climate change deniers.
What Leading Scientists Want You to Know About Today's Frightening Climate Report by Richard Schiffman at The Atlantic.
Forget the end of the title and you get a good report with first reactions from a range of scientists.


Lawrimore, J.H., M.J. Menne, B.E. Gleason, C.N. Williams, D.B. Wuertz, R.S. Vose, and J. Rennie, 2011: An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3. J. Geophys. Res., 116, no. D19121, doi: 10.1029/2011JD016187.

Parker, D.E., 1994: Effects of changing exposure of thermometers at land stations. Int. J. Climatol., 14, pp. 1–31, doi: 10.1002/joc.3370140102.

Venema, V., O. Mestre, E. Aguilar, I. Auer, J.A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C.N. Williams, M.J. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, Ch. Gruber, M. Prohom Duran, T. Likso, P. Esteban, and Th. Brandsma, 2012: Benchmarking homogenization algorithms for monthly data. Climate of the Past, 8, pp. 89–115, doi: 10.5194/cp-8-89-2012.

Williams, C.N. jr., M.J. Menne, and P.W. Thorne, 2012: Benchmarking the performance of pairwise homogenization of surface temperatures in the United States. J. Geophys. Res., 117, art. no. D05116, doi: 10.1029/2011JD016761.


  1. One paragraph? The IPCC report is 2000 pages. So you know about less than a thousandth of CAGW?

  2. From Aslak Grinsted:

    A second article is linked therein. More to come, it would appear.

    My impression is that the AR5 conclusions are driven by ice sheet models which have proven to be wrong (albeit that some of the proof is in post-AR5 papers).

    Also, at a quick read, I'm deeply unhappy with the AR5 treatment of the implications of the inability of the models to handle the transition from current to mid-Pliocene-like conditions. If they're missing something major in that regard, how can we rely on them for 2100? This issue has monumental policy implications.

    The same failed ice sheet models were used to low-ball mid-Pliocene SLR.

    For the record, IANAS, although I do try to pay attention.

  3. Steve, thank you for finding that link. I have added Grinsted's posts to the list.

    Unfortunately, he does not explicitly write whether the IPCC estimates of sea level rise are also too low relative to estimates from the scientific literature. I fully understand that the IPCC estimates are not based on a survey among glaciologists; the IPCC needs scientific evidence, not just opinion. Plus, the survey was from 2013, and thus after the deadline.

    It does sound as if the IPCC should have stated in clearer language in the Summary for Policy Makers that their estimate is biased due to not including a collapse of the Antarctic ice sheet.

  4. Anonymous, I guess that is what science is. Quality not quantity.

    Previously I worked on radiative transfer through clouds. In the long run this is also interesting for climate (and weather prediction and remote sensing), but not directly. Thus none of this work has ever ended up in an IPCC report. That does not make it less important.

  5. It seems you're essentially saying that the IPCC report fairly represents the literature, but that there could still be issues with homogenization that might suggest that the actual trend is greater than the current data suggests. If so, is there any chance that the trend could be lower, or is it that the homogenization techniques have tended to err on the conservative side?

  6. Wotts, there is no evidence that homogenization would make the trends more biased and a lot of evidence that homogenization makes trend estimates better.

    Given that homogenization made the trend clearly stronger in GHCNv3, I would see a lower trend as very unlikely. Still, it is just one paper, with one method. More independent research is necessary before one would call it robust.

    The only chance I see for lower true trends would be a strong increase in the urban heat island effect. Many people have worked on this, but it is a part of the literature I personally do not know well yet. My colleagues tell me this contribution is small. The above post suggests that it is better to trust scientists with such an assessment than WUWT and Co.

    Regionally urbanization can be important for some periods. For example currently in China.

  7. Wotts, maybe I should add that it is also expected that the early temperature measurements are biased (too warm). Parallel measurements with historical and modern set-ups show such biases, mainly attributed to radiation errors in early measurements.

  8. Thanks Victor. I had heard that early measurements were biased and likely too warm, but never heard a good explanation why.

  9. Given that 70% of surface stations surveyed by Watts show severe siting issues likely to cause substantial temperature bias, surely it is more likely that any homogenisation process is discarding "anomalous" good data, in favour of biased data.

    Eric Worrall

  10. Dear Eric Worrall, I could give you an even more worrying estimate: almost all long climate series are inhomogeneous and typically there is a non-climatic break every 15 to 20 years.

    The introduction of automatic weather stations (AWS), which in the US often resulted in poorer siting because of the cables needed and because the technician only had one day to install them, represents just a small fraction of all inhomogeneities. That is why homogenization is so important and that is also why I find it bizarre that climate ostriches are against homogenization. Surely one would like to remove any biases resulting from deteriorating siting. For completeness, I should probably mention that the change in the instrumentation turned out to be more important and the transition to AWS in the USA actually resulted in a cooling in the raw data.

    I know the surface station project. I would love to see such a project worldwide. It is beautiful to see what you can do in the internet age with enthusiastic volunteers. I hope this project keeps on running; then it will become an important resource for future US climatologists.

    The siting of the temperature stations is an important issue. You want to measure air temperature and not some mixture of air temperature with contributions from wind, insolation and so on. This is especially a problem for studies on changes in extreme weather. However, I would say all the evidence shows that after homogenization such siting issues should not be important for the trends in the global annual mean temperature.

  11. Hi Victor,

    How do you know the climate didn't actually cool?

    I draw your attention to this post on WUWT, which studied the effect of homogenisation on a specific localised area, which shows good station data being discarded in favour of poor station data - or more concisely, poor station data is being used to incorrectly adjust good station data.

    When you see examples like that, and adjustments like the following NOAA adjustment, a massive hockey stick shaped adjustment to an otherwise much flatter trend, there is only one word for it - it stinks (note the original NOAA link is currently not working due to the government shutdown).

    I completely understand and appreciate the need for discarding outliers and attempting to compensate for breaks caused by undocumented station changes - I'm just concerned there is evidence that substantial systemic biases in the underlying data are slipping through the current regime.

    Kind Regards,

  12. Dear Eric,

    Why I am sure it is not cooling? :-)

    Because there are so many independent lines of research that point to a warming of the global mean surface temperature during the last century. You do not only see this in the surface station network, but also in the sea surface temperature and radiosondes. You see it in nature, in the behaviour of animals, wild plants and agricultural plants.

    More details can be found here:
    Are surface temperature records reliable? and
    The human fingerprint in global warming

    That new WUWT post is interesting. It is on my reading list. A cursory look revealed some blatant errors, but I would like to study it a bit more, as some points sounded interesting. Unfortunately, my past experience with WUWT tells me that when you investigate claims more closely, the arguments turn out to be unfounded.

    Anyway, I am sure that the warming trend is not due to homogenization (the topic of the WUWT post) for a number of reasons. Firstly, you also see a warming trend in the raw data, without homogenization. Secondly, there is nothing in the way the algorithms work that has a preference for the direction of the trend. Thirdly, I have tested many homogenization algorithms, including NOAA's, in a blind set-up. Only I knew where the inhomogeneities were; the algorithms found them quite well, and applying them improved the trend estimates. For more information see my old post on this validation study.

    I hope that clarifies your question.

  13. There is other evidence that surface stations are running hot, for example the following WFT comparison between RSS and Hadcrut4 shows half a degree of extra warming in the Hadcrut4 series.

    Obviously 0.05c is not that large a difference - but it gets even more interesting if you compare RSS vs Hadcrut4 since 1997 - around 0.1c difference in reported trend over that period.

    Obviously RSS and surface stations are measuring very different things, but it is interesting to me at least that surface stations seem to be running hotter than satellite measurements of the troposphere - to me it adds weight to the suggestion that surface stations are affected by uncorrected UHI effects.


  14. Dear Eric,

    We are getting closer to agreeing. Good that you gave up the fundamentalist position that there is a long term cooling trend.

    Still, your choice of the terms "running hot" and "half a degree of extra warming" in your first paragraph does not really fit the evidence you present. 0.05°C over 30 years is less than 0.5°C, unless you were thinking of a 300-year instrumental global time series. I am not aware that one exists.

    May I ask you where your preference for the RSS dataset comes from? Had you compared it to the UAH dataset, you would have had to write that the urbanization leads to a cooling relative to satellite data.

    Not that I would have made such a claim. When it comes to long-term trends, I hold satellite data to be less reliable than the station network. The satellite dataset has its own problems with inhomogeneities: the number and type of satellites have changed over the decades, and during the lifetime of a satellite its orbit changes. These non-climatic changes are all very difficult to remove, because the number of satellites at any one moment is small. The station network with its high number of stations is much more redundant and thus much easier to homogenize.

    If the station network and the satellites disagree, it is very interesting to investigate the reasons for these differences, but I see no reason whatsoever to immediately assume that the station network, and especially urbanization, is to blame. Do you have any arguments for that?

  15. Dear Victor,

    At what point did I give up my position that there is a possible long term cooling trend? If the raw data is accepted at face value, the 1930s were as hot as, if not hotter than, current temperatures. Warming since the 1930s is, as Roy Spencer observed, an artefact of adjustments. So the validity of the adjustments is of vital importance when making statements about long term trends.

    My point RE RSS is that it runs consistently cooler than surface stations, which is what you would expect if the surface stations were affected by uncorrected UHI. Naturally there are other possible causes - but as you point out, it is important such discrepancies are thoroughly investigated.


  16. Eric, why did you suddenly switch to the US data set alone? Why did you not acknowledge that RSS was a cherry pick, and that with UAH your conclusion would have been opposite?


  17. Eric, I am sure you are sufficiently experienced to be able to keep throwing up new issues for a long time. I am also reasonably confident that you already know the standard answers to the problems you pose.

    Instead of evading discussions and starting new topics before old ones are resolved, let's try an old trick from science. Let's try to go into detail, quantify uncertainties and see if we can come to a common understanding on this one specific topic, or at least understand why we disagree and what additional studies would be necessary. Once we understand the topic, we can move to the next.

    Let's stick to your interesting claim that it is possible that the global mean temperature is decreasing. If true that would have large consequences.

    Let's first try to make this statement more exact. (1) Would you state that there is a more than 50% chance that the global mean land surface temperature over the last 150 years is decreasing? If not, what would be a similarly well-defined statement about the decrease of the temperature that does fit your opinion?

    I thought you had abandoned the position that temperature may be decreasing:

    (2) Because you did not respond to my argument that there is much independent evidence next to the surface station network.

    (3) Because the bias you mention is only 0.05 °C per 30 years. That is about 0.2 °C per century. Whereas the temperature trend is estimated to be 0.8°C per century over the last 150 years. Thus if your bias estimate is right, there would still be a temperature trend of at least 0.6°C per century.

    (4) Because you did not give any arguments to prefer RSS over UAH, while UAH would suggest that there is no temperature trend bias, at least following your own line of reasoning. I thought the UAH dataset was well trusted in your circles. I thought that Dr. Roy Spencer, the scientist behind UAH was well respected among climate ostriches.

    To understand why we disagree, it would be nice if you could (1) make your position more exact, (2) comment on why you do not see the independent evidence as supportive, (3) point out where I made a computational error or misunderstood your argument on biases seen by comparing with satellite data, and (4) explain what your reasons are to prefer RSS over UAH.
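
The arithmetic in point (3) can be checked with a few lines of Python. This is only a sketch of the unit conversion; the function name per_century is made up for illustration, and the input numbers are simply the ones quoted in this exchange.

```python
def per_century(delta_c, years):
    """Scale a temperature change over `years` to a rate in deg C per century."""
    return delta_c / years * 100.0


bias = per_century(0.05, 30)          # about 0.17, rounded above to 0.2 deg C per century
observed_trend = 0.8                  # deg C per century, as quoted for the adjusted data
remaining = observed_trend - round(bias, 1)

print(f"bias: {bias:.2f} deg C per century")
print(f"trend left after subtracting the rounded bias: {remaining:.1f} deg C per century")
```

Rounding the bias up to 0.2°C per century is the pessimistic choice here, which is why the remaining trend of 0.6°C per century is stated as a lower bound.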

    Looking forward to your reply. My apologies that my response is a little slow at the moment.

  18. Victor, re the possible warm bias of early temperature measurements, any idea how that would tend to affect the shape of the mid-century "dip"? Also, could you give me a pointer to the parallel measurements research?

    Nice job w/ Eric, BTW. I'm going to point to that exchange as an object lesson in how to do it right.

  19. Steve, thanks. The real post on parallel measurements still has to be written. Hopefully soon. I am working on a review paper, which will include some of this information.

    Some of the problems are mentioned, with references, in my main homogenization post. And you can find a lot of references to parallel measurements at this wiki, which is, however, still in its early stages. There are likely many more papers and datasets. Someone "just" has to lift the treasure.

    It is really work in progress; it may also turn out that nothing comes of it. Let's see what the data says. Speculating on details such as special periods is really too early.

  20. Victor, with some delay, let me briefly summarize my impressions of AR5 from an aerosol forcing point of view. It is mainly dealt with in the Clouds and Aerosol chapter (Ch 7), feeding into the Radiative Forcing chapter (Ch 8) with particular regard to the temporal evolution of the aerosol forcing until now.

    In general, the current state of the science appears to be well represented, something which shouldn't come as a surprise. Most of the choices made by the authors in charge are well justified and mostly explained in sufficient detail. However, there are a few fine details where they could have elaborated a bit more. While most of the numbers for the different aerosol radiative forcing estimates are taken straight from the literature, the task becomes trickier when these numbers have to be converted according to the latest radiative forcing convention in a consistent fashion. Most published estimates are TOA radiative forcing without allowing for atmospheric radiative flux adjustments. The newly introduced "Effective Radiative Forcing" (ERF) aims at including these effects. In the case of atmospheric aerosols, it includes rapid cloud adjustments due to changes in the vertical temperature profile caused by the presence of aerosols.
    That's where the trouble starts, as only a few studies are available which provide the basis for a proper conversion. In my view, some of the assumptions are simply too optimistic, i.e. the uncertainties tend to be slightly underestimated given the quality and the amount of the available literature (this mainly concerns the indirect aerosol effects). Since one of the assumptions is particularly odd, arguably the total aerosol forcing might well be an underestimate, although only by a small margin (my personal best guess would have been -1W/m2 in 2011, while AR5 makes it -0.9W/m2). I'm talking about the LW indirect aerosol effect, which is nowhere in the literature as positive as they assumed it to be (+0.2W/m2). I'm still in the process of figuring out why they decided on these numbers (as I am struggling to reproduce them), but getting to the right people who can answer this question takes some time. Together with a tendency to give quite some credit to a few equally uncertain black carbon studies (which suggest a stronger positive BC forcing than previously thought), I wouldn't be surprised to see slightly stronger negative forcing estimates in the future again. On the other hand, they nicely discuss potential buffer effects which can't be properly constrained in the current generation of GCMs, an aspect which might play an important role as well.

    In this context, it might be worth mentioning a recent paper by Wilcox et al. 2013 (submitted after the AR5 deadline), who seem to have found a neat clue to set some upper and lower bounds on the aerosol forcing, thus considerably reducing the uncertainty range. Though they didn't provide global forcing estimates, they showed that in order to explain the observed interhemispheric temperature evolution some indirect forcing is indispensable in the GCMs. In contrast, GCMs with too strong an indirect effect are incompatible with the observations. I reckon that the best fit for the forcing would be in the ballpark of the AR5 figures.

    Bottomline: AR5 is a fair representation of the current literature and if at all, there is a tendency to err on the side of lesser drama this time (i.e. lower aerosol forcing). Given that it was the other way around in AR4, I can't see any systematic bias in that direction. While some details remain unsatisfactorily explained, it is an honest representation of the literature as far as I can tell. This is particularly true as they included an expert judgement as one of the pillars to pin the total anthropogenic aerosol forcing down.

  21. Karsten, thanks a lot. I have added your review to the list in the post. It is becoming a nice list.

  22. I know this is a very, very late comment, but I only just discovered this post. I agree, your reply to Eric Worrall is a great example of how best to do it.

