
Tuesday, 6 December 2016

Scott Adams: The Non-Expert Problem and Climate Change Science



Scott Adams, the creator of Dilbert, wrote today about how difficult it is for a non-expert to judge science and especially climate science. He argues that it is normally a good idea for a non-expert to follow the majority of scientists. I agree. Even as a scientist I do this for topics where I am not an expert and do not have the time to go into detail. You cannot live without placing trust and you should place your trust wisely.

While it is clear to Scott Adams that a majority of scientists agree on the basics of climate change, he worries that they could still all be wrong. He lists six signals, given below, that this could be the case and sees them in climate science. If you get your framing from the mitigation sceptical movement and only read the replies to their nonsense, you may easily get his impression. So I thought it would be good to reply. It would be better to first understand the scientific basis before venturing into the wild.

The terms Global Warming and Climate Change have both been used for decades

Scott Adams' assertion: It seems to me that a majority of experts could be wrong whenever you have a pattern that looks like this:

1. A theory has been “adjusted” in the past to maintain the conclusion even though the data has changed. For example, “Global warming” evolved to “climate change” because the models didn’t show universal warming.


This is a meme spread by the mitigation sceptics that is not based on reality. From the beginning both terms were used. One hint is the name of the Intergovernmental Panel on Climate Change, a global body of scientists that synthesises the state of climate research and was created in 1988.

The irony of this strange meme is that it was the PR gurus of the US Republicans who told their politicians to use the term "climate change" rather than "global warming", because "global warming" sounded scarier. The video below shows the historical use of both terms.



Global warming was called global warming because the global average temperature is increasing. Especially in the beginning there were still many regions where warming was not yet observed, while it was clear that the global average temperature was increasing. I use the term "global warming" if I want to emphasise the temperature change and the term "climate change" when I want to include all the other changes in the water cycle and circulation. These colleagues do the same and provide more history.

Talking about "adjusted": mitigation sceptics like to claim that temperature observations have been adjusted to show more warming. The truth is that the adjustments reduce global warming.

Climate models are not essential for basic understanding

Scott Adams' assertion: 2. Prediction models are complicated. When things are complicated you have more room for error. Climate science models are complicated.

Yes, climate models are complicated. They synthesise a large part of our understanding of the climate system and thus play a large role in the synthesis of the IPCC. They are also the weakest part of climate science and thus a focus of the propaganda of the mitigation sceptical movement.

However, when it comes to the basics, climate models are not important. We have known about the greenhouse effect for well over a century, long before we had any numerical climate models. That increasing the carbon dioxide concentration of the atmosphere leads to warming is clear. That this warming is amplified because warm air can contain more water, which is also a greenhouse gas, is also clear without any complicated climate model. This is very simple physics, already used by Svante Arrhenius in the 19th century.
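
For the curious, here is a back-of-the-envelope version of this simple physics. It is only a sketch: it uses the standard logarithmic approximation for the CO2 forcing and assumes an equilibrium sensitivity of 3°C per CO2 doubling; the function name and numbers are illustrative, not taken from Arrhenius.

```python
# Back-of-envelope warming estimate from a CO2 increase, no climate model needed.
# Assumptions: the common logarithmic forcing approximation (forcing proportional
# to ln(C/C0)) and an equilibrium sensitivity of 3 degC per CO2 doubling.
import math

def equilibrium_warming(co2_ppm, co2_preindustrial_ppm=280.0,
                        sensitivity_per_doubling=3.0):
    """Equilibrium warming (degC) for a CO2 increase, ignoring all other forcings."""
    doublings = math.log(co2_ppm / co2_preindustrial_ppm) / math.log(2.0)
    return sensitivity_per_doubling * doublings

print(equilibrium_warming(400.0))   # roughly 1.5 degC at equilibrium for 400 ppm
print(equilibrium_warming(560.0))   # 3.0 degC for a doubling, by construction
```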

The warming effect of carbon dioxide can also be observed in the deep past. There are many reasons why the climate changes, but without carbon dioxide we cannot, for example, understand the temperature swings of the past ice ages or why the Earth was able to escape from being completely frozen (Snowball Earth) at a time when the sun was much dimmer.

The main role of climate models is to find reasons why the climate may respond differently this time than in the past, or whether there are mechanisms beyond the simple physics that are important. The average climate sensitivity from climate models is about the same as for all the other lines of evidence. Furthermore, climate models add regional detail, especially when it comes to precipitation, evaporation and storms. These are helpful to better plan adaptation and to estimate the impacts and costs, but they are not central to the main claim that there is a problem.

Model tuning is not important for basic understanding

Scott Adams' assertion: 3. The models require human judgement to decide how variables should be treated. This allows humans to “tune” the output to a desired end. This is the case with climate science models.

Yes, models are tuned. Mostly not for the climatic changes, but to get the state of the atmosphere right: the global maps of clouds and precipitation, for example. In the light of my answer to point 2, this is not important for the question of whether climate change is real.

The consensus is a result of the evidence

Scott Adams' assertion: 4. There is a severe social or economic penalty for having the “wrong” opinion in the field. As I already said, I agree with the consensus of climate scientists because saying otherwise in public would be social and career suicide for me even as a cartoonist. Imagine how much worse the pressure would be if science was my career.

It is clearly not career suicide for a cartoonist. If you claim that you only accept the evidence because of social pressure, you are saying you do not really accept the evidence.

Scott Adams sounds as if he would like scientists to first freely pick a position and only then look for evidence. In science it should go the other way around.

This seems to be his main argument and it shows that Scott Adams knows more about office workers than about the scientific community. If science were your career and you peddled the typical nonsense that comes from the mitigation sceptical movement, that would indeed be bad for your career. In science you have to back up your claims with evidence. Cherry picking and making rookie errors to get the result you would like to get are not helpful.

However, if you present credible evidence that something is different, that is wonderful; that is why you become a scientist. I have been very critical of the quality of climate data and of our methods to remove data problems. Contrary to Adams' expectation, this has helped my career. Thus I cannot complain about how climatology treats real sceptics. On the contrary, a lot of people supported me.

Another climate scientist, Eric Steig, strongly criticized the IPCC. He wrote about his experience:
I was highly critical of IPCC AR4 Chapter 6, so much so that the [mitigation skeptical] Heartland Institute repeatedly quotes me as evidence that the IPCC is flawed. Indeed, I have been unable to find any other review as critical as mine. I know (because they told me) that my reviews annoyed many of my colleagues, including some of my [RealClimate] colleagues, but I have felt no pressure or backlash whatsoever from it. Indeed, one of the Chapter 6 lead authors said "Eric, your criticism was really harsh, but helpful. Thank you!"
If you have the evidence, there is nothing better than challenging the consensus. It is also the reason to become a scientist. As a scientist wrote on Slashdot:
Look, I'm a scientist. I know scientists. I know scientists at NOAA, NCAR, NIST, the Labs, in academia, in industry, at biotechs, at agri-science companies, at space exploration companies, and at oil and gas companies. I know conservative scientists, liberal scientists, agnostic scientists, religious scientists, and hedonistic scientists.

You know what motivates scientists? Science. And to a lesser extent, their ego. If someone doesn't love science, there's no way they can cut it as a scientist. There are no political or monetary rewards available to scientists in the same way they're available to lawyers and lobbyists.

Scientists consider and weigh all the evidence

Scott Adams' assertion: 5. There are so many variables that can be measured – and so many that can be ignored – that you can produce any result you want by choosing what to measure and what to ignore. Our measurement sensors do not cover all locations on earth, from the upper atmosphere to the bottom of the ocean, so we have the option to use the measurements that fit our predictions while discounting the rest.

No, a scientist cannot produce any result they "want", and an average scientist wants to do good science, not to get a certain result. The scientific mainstream is based on all the evidence we have. It is the mitigation sceptical movement that behaves in the way Scott Adams expects: it likes to cherry-pick and mistreat data to get the results it wants.

Arguments from the other side only look credible

Scott Adams' assertion: 6. The argument from the other side looks disturbingly credible.

I do not know which arguments Adams is talking about, but the typical nonsense on WUWT, Breitbart, Daily Mail & Co. is made to look credible on the surface. But put on your thinking cap and it crumbles. At least check the sources. That reveals most of the problems very quickly.



For a scientist it is generally clear which arguments are valid, but it is indeed a real problem that to the public even the most utter nonsense may look "disturbingly credible". Several groups are active in helping the public assess the credibility of claims and sources.

Most of the zombie myths are debunked on RealClimate or Skeptical Science. If it is a recent WUWT post and you do not mind some snark you can often find a rebuttal the next day on HotWhopper. Media articles are regularly reviewed by Climate Feedback, a group of climate scientists, including me. They can only review a small portion of the articles, but it should be enough to determine which of the "sides" is "credible". If you claim you are sceptical, do use these resources and look at all sides of the argument and put in a little work to go in depth. If you do not do your due diligence to decide where to place your trust, you will get conned.



While political nonsense can be made to look credible, the truth is often complicated and sometimes difficult to convey. There is a big difference between qualified critique and uninformed nonsense. Weighing the strength of the evidence is part of the scientific culture. My critique of the quality of climate data has credible evidence behind it. There are also real scientific problems in understanding changes in clouds, as well as in the land and vegetation. These are important for how strongly the Earth will respond, although in the long run the largest source of uncertainty is how much we will do to stop the problem.

There are real scientific problems when it comes to assessing the impacts of climate change. That often requires local or regional information, which is a lot more difficult than the global average. Many impacts will come from changes in severe weather, which is by definition rare and thus hard to study. For many impacts we need to know several changes at the same time. For droughts, for example, precipitation, temperature, the humidity of the air and of the soil, and insolation are all important. Getting them all right is hard.

How humans and societies will respond to the challenges posed by climate change is an even more difficult problem and beyond the realm of natural science. Not only the benefits, but also the costs of reducing greenhouse gas emissions are hard to predict. That would require predicting future technological, economic and social development.

When it comes to how big climate change itself and its impacts will be, I am sure we will see surprises. What I do not understand is why some argue that this uncertainty is a reason to wait and see. The surprises will not only be nice, they will also be bad, and overall they increase the risks of climate change and make the case for solving this solvable problem stronger.




Related reading

Older post by a Dutch colleague on Adams' main problem: Who to believe?

How climatology treats sceptics

What's in a Name? Global Warming vs. Climate Change

Fans of Judith Curry: the uncertainty monster is not your friend

Video medal lecture Richard B. Alley at AGU: The biggest control knob: Carbon Dioxide in Earth's climate history

Just the facts, homogenization adjustments reduce global warming

Climate model ensembles of opportunity and tuning

Journalist Potholer makes excellent videos on climate change and true scepticism: Climate change explained, and the myths debunked


* Photo Arctic Sea Ice by NASA Goddard Space Flight Center used under a Creative Commons Attribution 2.0 Generic (CC BY 2.0) license.
* Cloud photo by Bill Dickinson used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

Monday, 15 August 2016

Downscaling temperature fields with genetic programming

Sierpinski fractal

This blog is not called Variable Variability for nothing. Variability is the most fascinating aspect of the climate system. Like in a fractal, you can zoom in and out of a temperature signal and keep on finding interesting patterns. The same goes for wind, humidity, precipitation and clouds. This beauty was one of the reasons why I changed from physics to the atmospheric sciences, not being aware at the time that physicists had also started studying complexity.

There is variability on all spatial scales, from clusters of cloud droplets to showers, fronts and depressions. There is variability on all temporal scales. With a fast thermometer you can see temperature fluctuations within a second and the effect of clouds passing by. Temperature has a daily cycle, day to day fluctuations, seasonal fluctuations and year to year fluctuations and so on.

Also the fluctuations fluctuate. Cumulus fields may contain young growing clouds with a lot of variability, older smoother collapsing clouds and a smooth haze in between. Temperature fluctuations are different during the night when the atmosphere is stable, after sunrise when the sun heats the atmosphere from below, and on a summer afternoon when thermals develop and become larger and larger. Precipitation can come down as a shower or as drizzle.

This makes measuring the atmosphere very challenging. If your instrument is good at measuring details, such as a temperature or cloud water probe on an aircraft, you will have to move it to get a larger spatial overview. The measurement will have to be fast because the atmosphere is changing continually. You can also select an instrument that measures large volumes or areas, such as a satellite, but then you miss out on much of the detail. A satellite looking down on a mountain may measure the brightness of some mixture of the white snow-capped mountains, dark rocks, forests, lush green valleys with agriculture and rushing brooks.



The same problem happens when you model the atmosphere. A typical global atmospheric oceanic climate model has a resolution of about 50 km. Those beautiful snow-capped mountains outside are smoothed to fit into the model and may have no snow any more. If you want to study how mountain glaciers and snow cover feed the rivers you can thus not use the simulation of such a global climate model directly. You need a method to generate a high resolution field from the low resolution climate model fields. This is called downscaling, a beautiful topic for fans of variability.

Deterministic and stochastic downscaling

For the above mountain snow problem, a simple downscaling method would take a high-resolution height dataset of the mountain and make the higher parts colder and the lower parts warmer. How much exactly, you can estimate from a large number of temperature measurements with weather balloons. However, it is not always colder at the top. On cloud-free nights, the surface rapidly cools and in turn cools the air above. This cold air flows down the mountain and fills the valleys with cold air. Thus the next step is to make such a downscaling method weather dependent.
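
To make the idea concrete, here is a minimal sketch of such a height-based rule. The function name, the constant lapse rate and the numbers are purely illustrative; this is not the method from our papers.

```python
# Minimal sketch of a deterministic, height-based temperature downscaling.
# The lapse rate of ~6.5 K/km is a typical mean value one could estimate
# from a large number of weather-balloon profiles.
import numpy as np

def downscale_temperature(t_coarse, h_fine, lapse_rate=0.0065):
    """Deterministic downscaling: higher parts colder, lower parts warmer.

    t_coarse   : coarse-grid temperature (K) of one grid box
    h_fine     : array of high-resolution heights (m) within that box
    lapse_rate : temperature decrease with height (K per m)
    """
    h_anomaly = h_fine - h_fine.mean()        # height relative to the box mean
    return t_coarse - lapse_rate * h_anomaly  # higher -> colder, lower -> warmer

t_fine = downscale_temperature(283.0, np.array([[200., 800.], [1500., 500.]]))
```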

Such direct relationships between height and temperature are not always enough. This is best seen for precipitation. When the climate model computes that it will rain 1 mm per hour, it makes a huge difference whether this is drizzle everywhere or a shower in a small part of the 50 by 50 km box. The drizzle will be intercepted by the trees and a large part will quickly evaporate again. The drizzle that lands on the ground is taken up and can feed the vegetation. Only a small part of a heavy shower will be intercepted by trees; most of it lands on the ground, which can only absorb a small part fast enough, and the rest runs over the land towards brooks and rivers. Much of the vegetation in this box did not get any water and the rivers swell much faster.

In the precipitation example, it is not enough to give certain regions more and others less precipitation; the downscaling needs to add random variability. How much variability needs to be added depends on the weather. On a dreary winter's day the rain will be quite uniform, while on a sultry summer evening the rain more likely comes down as a strong shower.
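
Here is a toy sketch of this stochastic step, with the amount of added variability depending on whether the weather is convective. All names and numbers are made up for illustration.

```python
# Sketch of stochastic downscaling: distribute a coarse-box mean precipitation
# over sub-pixels with weather-dependent random variability (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def downscale_precip(p_coarse, grid_shape, convective=True):
    """Distribute a coarse-box mean precipitation (mm/h) over sub-pixels."""
    sigma = 1.5 if convective else 0.1   # shower: large spread, drizzle: small
    noise = rng.gamma(shape=1.0 / sigma**2, scale=sigma**2, size=grid_shape)
    return p_coarse * noise / noise.mean()   # preserve the coarse-box mean

shower = downscale_precip(1.0, (7, 7), convective=True)    # a few very wet pixels
drizzle = downscale_precip(1.0, (7, 7), convective=False)  # nearly uniform field
```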

Genetic Programming

There are many downscaling methods. This is because the aims of the downscaling depend on the application. Sometimes making accurate predictions is important; sometimes it is important to get the long-term statistics right; sometimes the bias in the mean is important; sometimes the extremes. For some applications it is enough to have data that is locally realistic; sometimes the spatial patterns are also important. Even if the aim is the same, downscaling precipitation is very different in the moderate European climate than it is in the tropical simmering pot.

With all these different aims and climates, it is a lot of work to develop and test downscaling methods. We hope that we can automate a large part of this work using machine learning: Ideally we only set the aims and the computer develops the downscaling method.

We do this with a method called "Genetic Programming", which uses a computational approach that is inspired by the evolution of species (Poli and colleagues, 2016). Every downscaling rule is a small computer program represented by a tree structure.

The main difference from most other optimization approaches is that GP uses a population. Every downscaling rule is a member of this population. The best members of the population have the highest chance to reproduce. When they cross-breed, two branches of the tree are exchanged. When they mutate, an old branch is substituted by a new random branch. It is a cartoonish version of evolution, but it works.
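
A toy version of such tree-based Genetic Programming is sketched below, just to show the mechanics of crossover and mutation. The function set, the terminals and all rates are illustrative and have nothing to do with the actual code used in the paper.

```python
# Toy sketch of tree-based Genetic Programming: random expression trees,
# crossover (swap branches) and mutation (replace a branch).
import random, operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}
TERMINALS = ['dT_dz', 'h_anom', 1.0, 0.0065]   # example inputs a rule might use

def random_tree(depth=3):
    """Grow a random expression tree: ('op', left, right) or a terminal."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, env):
    """Evaluate a tree for one data point given as the dict env."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env.get(tree, tree)   # variable name or numeric constant

def subtrees(tree, path=()):
    """Yield all (path, subtree) pairs; a path is a sequence of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the branch at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(parts[path[0]], path[1:], new)
    return tuple(parts)

def crossover(a, b):
    """Swap a random branch of a with a random branch of b."""
    pa, sa = random.choice(list(subtrees(a)))
    pb, sb = random.choice(list(subtrees(b)))
    return replace(a, pa, sb), replace(b, pb, sa)

def mutate(tree):
    """Replace a random branch by a fresh random branch."""
    path, _ = random.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(depth=2))

rule = random_tree()
print(rule, evaluate(rule, {'dT_dz': -0.01, 'h_anom': 120.0}))
```

In the real application every tree is, of course, evaluated on the training fields, and the Pareto ranking described below decides which members are likely to reproduce.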

We have multiple aims: we would like the solution to be accurate, we would like the variability to be realistic, and we would like the downscaling rule to be small. You can try to combine all these aims into one number and then optimize that number. This is not easy because the aims can conflict.
1. A more accurate solution is often a larger solution.
2. Typically only a part of the small-scale variability can be predicted. A method that only adds this predictable part of the variability would add too little variability. If you add noise to such a solution, its accuracy goes down again.

Instead of combining all aims into one number we have used the so-called "Pareto approach". A Pareto optimal solution is best explained visually with two aims; see the graphic below. The square boxes are the Pareto optimal solutions. The dots are not Pareto optimal because there are solutions that are better for both aims. The solutions that are not optimal are not excluded: we work with two populations, a population of Pareto optimal solutions and a population of non-optimal solutions. The non-optimal solutions are naturally less likely to reproduce.


Example of a Pareto optimization with two aims. The squares are the Pareto optimal solutions, the circles the non-optimal solutions. Figure after Zitzler and Thiele (1999).
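
For two aims that are both minimised, say the error of a rule and its size, finding the Pareto optimal members of a population can be sketched as follows. This is illustrative code, not the implementation used in the study.

```python
# Minimal sketch of splitting a population into Pareto-optimal and non-optimal
# solutions for two objectives that are both minimised.
def dominates(a, b):
    """a dominates b if a is at least as good in both aims and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_split(population):
    """population: list of (error, size) tuples, both to be minimised."""
    optimal = [p for p in population
               if not any(dominates(q, p) for q in population if q is not p)]
    non_optimal = [p for p in population if p not in optimal]
    return optimal, non_optimal

pop = [(0.9, 3), (0.5, 8), (0.7, 4), (0.6, 9), (0.4, 20)]
front, rest = pareto_split(pop)   # (0.6, 9) is dominated by (0.5, 8); the rest are optimal
```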

Coupling atmospheric and surface models

We have the impression that this Pareto approach has made it possible to solve a quite complicated problem. Our problem was to downscale the fields near the surface of an atmospheric model before they are passed to a model for the surface (Zerenner and colleagues, 2016; Schomburg and colleagues, 2010). These were, for instance, fields of temperature and wind speed.

The atmospheric model we used is the weather prediction model of the German weather service. It has a horizontal resolution of 2.8 km and computes the state of the atmosphere every few seconds. We run the surface model TERRA at 400 m resolution. Below every atmospheric column of 2.8x2.8 km, there are 7x7 surface pixels.

The spatial variability of the land surface can be huge; there can be large differences in height, vegetation, soil type and humidity. It is also easier to run a surface model at a higher spatial resolution because it does not need to be computed so often; the variations in time are smaller.

To be able to make downscaling rules, we needed to know how much variability the 400x400 m atmospheric fields should have. We studied this using a so-called training dataset, which was made by running the atmospheric model at 400 m resolution for a smaller than usual area for a number of days. This would be too much computer power for a daily weather prediction for all of Germany, but a few days on a smaller region are okay. An additional number of 400 m model runs was made to validate how well the downscaling rules work on an independent dataset.

The figure below shows an example for temperature during the day. The panel on the left shows the coarse temperature field after smoothing it with a spline, which preserves the coarse-scale mean. The panel in the middle shows the temperature field after downscaling with an example downscaling rule. This can be compared to the 400 m atmospheric field the coarse field was originally computed from, shown on the right. During the day, the downscaling of temperature works very well.



The figure below shows the temperature field during a clear-sky night. This is a difficult case. On cloud-free nights the air close to the ground cools and gathers in the valleys. These flows are quite close to the ground, but a good rule was to take the temperature gradient in the lower model layers and multiply it by the height anomalies (the height differences from the spline-smoothed coarse field).
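
A sketch of what such a nocturnal rule looks like is given below, assuming illustrative names and numbers; the real rules are found by the Genetic Programming, not written by hand.

```python
# Sketch of the kind of rule described above for clear nights: the vertical
# temperature gradient of the lower model layers times the sub-grid height anomaly.
import numpy as np

def downscale_night_temperature(t_spline, h_fine, h_spline, dT_dz_lower):
    """
    t_spline   : spline-smoothed coarse temperature on the 400 m grid (K)
    h_fine     : true 400 m orography (m)
    h_spline   : spline-smoothed coarse orography on the 400 m grid (m)
    dT_dz_lower: temperature gradient in the lowest model layers (K/m);
                 positive in a nocturnal inversion (warmer aloft)
    """
    h_anom = h_fine - h_spline                 # valleys negative, peaks positive
    return t_spline + dT_dz_lower * h_anom     # inversion -> cold air in the valleys

# With an inversion of +5 K/km, a valley 100 m below the smoothed surface
# becomes 0.5 K colder than the smoothed field.
t = downscale_night_temperature(278.0, np.array([400., 500., 650.]),
                                np.array([500., 500., 500.]), 0.005)
```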



Having a population of Pareto optimal solutions is one advantage of our approach. There is normally a trade-off between the size of the solution and its performance, and having multiple solutions means that you can study this and then choose a reasonable compromise.

Contrary to working with artificial neural networks as the machine learning method, the GP solution is a piece of code which you can understand. You can thus select a solution that makes sense physically and is therefore more likely to also work in situations that are not in the training dataset. You can study the solutions that seem strange, try to understand why they work, and gain insight into your problem.

This statistical downscaling as an interface between two physical models is a beautiful synergy of statistics and physics. Physics and statistics are often presented as antagonists, but they actually strengthen each other. Physics should inform your statistical analysis, and the above is an example where statistics makes a physical model more realistic (not performing a downscaling is also a statistical assumption, just a less visible and less physical one).

I would even argue that the most interesting current research in the atmospheric sciences merges statistics and physics: ensemble weather prediction and decadal climate prediction, bias corrections of such ensembles, model output statistics, climate model emulators, particle assimilation methods, downscaling global climate models using regional climate models and statistical downscaling, statistically selecting representative weather conditions for downscaling with regional climate models and multivariate interpolation. My work on adaptive parameterisation combining the strengths of more statistical parameterisations with more physical parameterisations is also an example.


Related reading

On cloud structure

An idea to combat bloat in genetic programming

References

Poli, R., W.B. Langdon and N. F. McPhee, 2016: A field guide to genetic programming. Published via Lulu.com (With contributions by J. R. Koza).

Schomburg, A., V.K.C. Venema, R. Lindau, F. Ament and C. Simmer, 2010: A downscaling scheme for atmospheric variables to drive soil-vegetation-atmosphere transfer models. Tellus B, doi: 10.1111/j.1600-0889.2010.00466.x, 62, no. 4, pp. 242-258.

Zerenner, Tanja, Victor Venema, Petra Friederichs and Clemens Simmer, 2016: Downscaling near-surface atmospheric fields with multi-objective Genetic Programming. Environmental Modelling & Software, in press.

Zitzler, Eckart and Lothar Thiele, 1999: Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE transactions on Evolutionary Computation 3.4, pp. 257-271, 10.1109/4235.797969.


* Sierpinski fractal at the top was generated by Nol Aders and is used under a GNU Free Documentation License.

* Photo of mountain with clouds all around it (Cloud shroud) by Zoltán Vörös and is used under a Creative Commons Attribution 2.0 Generic (CC BY 2.0) license.

Wednesday, 3 August 2016

Climate model ensembles of opportunity and tuning



Listen to grumpy old men.

As a young cloud researcher at a large conference, enthusiastic about almost any topic, I went to a town-hall meeting on using a large number of climate model runs to study how well we know what we know. Or as scientists call this: using a climate model ensemble to study confidence/uncertainty intervals.

Using ensembles was still quite new. Climate Prediction dot Net had just started asking citizens to run climate models on their Personal Computers (old big iPads) to get the computer power to create large ensembles. Studies using just one climate model run were still very common. The weather predictions on the evening television news were still based on one weather prediction model run; they still showed highs, lows and fronts on static "weather maps".

During the questions, a grumpy old man spoke up. He was far from enthusiastic about this new stuff. I still see a Statler or Waldorf angrily swinging his wooden walking stick in the air. He urged everyone, everyone, to be very careful and not to equate the ensemble with a sample from a probability distribution. The experts dutifully swore they were fully aware of this.

They likely were and still are. But now everyone uses ensembles. Often using them as if they sample the probability distribution.

Earlier I wrote about the problems that confusing model spread and uncertainty caused in the now mostly dead "hiatus" debate. That debate remains important: after the hiatus debate is before the hiatus debate. The new hiatus is already 4 months old.* And there are so many datasets to select a "hiatus" from.


Fyfe et al. (2013) compared the temperature trend from the CMIP ensemble (grey histogram) to observations (red something), implicitly assuming that the model spread is the uncertainty. While the estimated trend lies near the edge of the model spread, it is well within the uncertainty. The right panel is for a 20-year period, 1993–2012. The left panel starts in the cherry-picked large El Nino year 1998: 1998–2012.

This time I would like to explain better why the ensemble model spread is typically smaller than the confidence interval. The reasons also point to other places where we need to pay attention: the spread matters for comparing long-term historical model runs with observations and could affect some climate change impact studies. For long-term projections and decadal climate prediction it is likely less relevant.

Reasons why model spread is not uncertainty

One climate model run is just one realisation. Reality has the same problem. But you can run a model multiple times. If you change the model fields you begin with just a little bit, a second run will show a different realisation due to the chaotic nature of atmospheric and oceanic flows. The highs, lows and fronts will move differently, the ocean surface is consequently warmed and cooled at different times and places, and internal modes such as El Nino will appear at different times. This chaotic behaviour is mainly found at the short time scales and is one reason for the spread of an ensemble. It is also one reason to expect that model spread is not uncertainty: models focus on getting the long-term trend right and differ strongly when it comes to the internal variability.
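
A toy illustration of this first reason is given below, using the Lorenz-63 system as a stand-in for the chaotic flow; a real climate model is of course vastly more complex.

```python
# Toy initial-condition ensemble: tiny changes in the starting fields give a
# different realisation once the chaotic system has had time to diverge.
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit Euler step of the Lorenz-63 system."""
    x, y, z = state
    return state + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def run(x0, n_steps=2000):
    state = np.array(x0, dtype=float)
    out = []
    for _ in range(n_steps):
        state = lorenz_step(state)
        out.append(state[0])
    return np.array(out)

# Perturb the initial state by a tiny amount for every ensemble member.
ensemble = [run([1.0 + 1e-6 * i, 1.0, 1.0]) for i in range(5)]
spread = np.std(ensemble, axis=0)   # tiny at first, large once the runs diverge
```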

But that is just reason one. The modules of a climate model that simulate specific physical processes have parameters that are based on measurements or on more detailed models. We only know these parameters within some confidence interval. A normal climate model takes the best estimate of these parameters, but they could be anywhere within the confidence interval. To study how important these parameters are, special "perturbed physics" ensembles are created in which every model run has parameters that vary within the confidence interval.

Creating such an ensemble is difficult. Depending on the reason for the uncertainty in a parameter, it could make sense to keep its value constant or to continually change it within its confidence interval, and anything in between. It could make sense to keep the value constant over the entire Earth or to change it spatially, and again anything in between. The parameter, or how much it can fluctuate, may depend on the local weather or climate. It could be that when parameter X is high, parameter Y is also high (or low); these dependencies should also be taken into account. Finally, the distributions of the parameters also need to be realistic. Doing all of this for the large number of parameters in a climate model is a lot of work; typically only the most important ones are perturbed.
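
A minimal sketch of how one might draw parameter sets for such a perturbed-physics ensemble, including a correlation between two parameters, is given below; all values are invented for illustration.

```python
# Sketch: sample correlated parameter sets for a perturbed-physics ensemble.
import numpy as np

rng = np.random.default_rng(1)

# Best estimates and uncertainties (standard deviations) of two model parameters.
mean = np.array([1.0, 0.5])    # e.g. an entrainment rate and a fall-speed factor
std = np.array([0.2, 0.1])
corr = 0.7                     # "when parameter X is high, Y tends to be high"

cov = np.array([[std[0]**2,              corr * std[0] * std[1]],
                [corr * std[0] * std[1], std[1]**2             ]])

n_members = 20
parameter_sets = rng.multivariate_normal(mean, cov, size=n_members)
# Each row is one ensemble member; in a real perturbed-physics ensemble one
# model run would be started for every row.
```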

You can generate an ensemble that has too much spread by perturbing the parameters too strongly (and by making the perturbations too persistent). Even if you do it optimally, the ensemble would still show too little spread, because not all physical processes are modelled; some are thought not to be important enough to justify the work and the computational resources. Part of this spread can be studied by making ensembles using many different models (multi-model ensembles), which are developed by different groups with different research questions and different ideas about what is important.

That is where the title comes in: ensembles of opportunity. These are ensembles of existing model runs that were not created to be an ensemble. The most important example is the ensemble of the Coupled Model Intercomparison Project (CMIP). This project coordinates the creation of a set of climate model runs for similar scenarios, so that the results of these models can be compared with each other. This ensemble will automatically sample the chaotic flows and it is a multi-model ensemble, but it is not a perturbed-physics ensemble; these model runs all aim at the best possible reproduction of what happened. For this reason alone the spread of the CMIP ensemble is expected to be too low.

The term "ensembles of opportunity" is another example the tendency of natural scientists to select neutral or generous terms to describe the work of colleagues. The term "makeshift ensemble" may be clearer.

Climate model tuning

The CMIP ensemble also has too little spread when it comes to the global mean temperature because the models are partially tuned to it. There is an interesting, readable article on climate model tuning just out in BAMS**, which is intended for a general audience. Tuning has a large number of objectives, from getting the mean temperature right to the relationship between humidity and precipitation. There is also a section on tuning to the magnitude of the warming over the last century. It states about the historical runs:
The amplitude of the 20th century warming depends primarily on the magnitude of the radiative forcing, the climate sensitivity, as well as the efficiency of ocean heat uptake. ...

Some modeling groups claim not to tune their models against 20th century warming, however, even for model developers it is difficult to ensure that this is absolutely true in practice because of the complexity and historical dimension of model development. ...

There is a broad spectrum of methods to improve model match to 20th century warming, ranging from simply choosing to no longer modify the value of a sensitive parameter when a match is already good for a given model, or selecting physical parameterizations that improve the match, to explicitly tuning either forcing or feedback both of which are uncertain and depend critically on tunable parameters (Murphy et al. 2004; Golaz et al. 2013). Model selection could, for instance, consist of choosing to include or leave out new processes, such as aerosol cloud interactions, to help the model better match the historical warming, or choosing to work on or replace a parameterization that is suspected of causing a perceived unrealistically low or high forcing or climate sensitivity.
Due to tuning, models that have a low climate sensitivity tend to have stronger forcings over the last century and models with a high climate sensitivity a weaker forcing. The forcing due to greenhouse gasses does not vary much; that part is easy. The forcings due to small particles in the air (aerosols), which like CO2 stem from the burning of fossil fuels, are quite uncertain, and Kiehl (2007) showed that high-sensitivity models tend to have more cooling due to aerosols. For a more nuanced, updated story see Knutti (2008) and Forster et al. (2013).
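
A crude zero-dimensional illustration of why such compensating choices give nearly the same historical warming is sketched below; the numbers are made up, and real models also differ in ocean heat uptake.

```python
# Crude estimate: historical warming ~ realised fraction * sensitivity * forcing / forcing_2x.
F_2X = 3.7   # W/m^2, approximate forcing for a CO2 doubling

def transient_warming(sensitivity_per_doubling, total_forcing, realised_fraction=0.6):
    """Very rough 20th-century warming estimate (degC), illustrative only."""
    return realised_fraction * sensitivity_per_doubling * total_forcing / F_2X

# A low-sensitivity model with weak aerosol cooling (strong net forcing) and a
# high-sensitivity model with strong aerosol cooling (weak net forcing) can give
# nearly the same historical warming:
print(transient_warming(2.0, 2.2))   # ~0.7 degC
print(transient_warming(4.5, 1.0))   # ~0.7 degC
```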


Kiehl (2007) found an inverse correlation between forcing and climate sensitivity. The main reason for the differences in forcing was the cooling by aerosols.
This "tuning" initially was not an explicit tuning of model parameters, but mostly because modellers keep working until the results look good. Look good compared to observations. Bjorn Stevens talks about this in an otherwise also recommendable Forecast episode.

Nowadays the tuning is often performed more formally and is an important part of studying the climate models and understanding their uncertainties. The BAMS article proposes to collect information on tuning for the upcoming CMIP. In principle a good idea, but I do not think that is enough. In a simple example with only climate sensitivity and aerosol forcing, the groups with low sensitivity and strong forcing and the ones with high sensitivity and weak forcing are happy with their temperature trend and will report not to have tuned. But that choice also leads to too little ensemble spread, just like for the groups that did need to tune. Tuning makes it complicated to interpret the ensemble; it is no problem for a specific model run.

Given that we know the temperature increase, it is impossible not to get a tuned result. Furthermore, I mentioned above several additional reasons why the model spread is not the uncertainty, and they complicate the interpretation of the ensemble in the same way. A solution could be to follow the work in ensemble weather prediction with perturbed-physics ensembles and to tune all models, but to tune them to cover the full range of uncertainties that we estimate from the observations. This should at least cover the climate sensitivity and ocean heat uptake, but preferably also other climate characteristics that are important for climate impact and climate variability studies. Large modelling centres may be able to create such large ensembles by themselves; the others could coordinate their work in CMIP to make sure the full uncertainty range is covered.

Historical climate runs

Because the physics is not perturbed and especially due to the tuning, you would expect the CMIP ensemble spread to be too low for the global mean temperature increase. That the CMIP ensemble average fits well to the observed temperature increase shows that with reasonable physical choices we can understand why the temperature increased. It shows that known processes are sufficient to explain it. That it fits so accurately does not say much. I liked the title of an article by Reto Knutti (2008): "Why are climate models reproducing the observed global surface warming so well?", which implies it all.

Spatial patterns and other observations are much more interesting for studying how good the models are. New datasets are greeted with much enthusiasm by modellers because they allow for the best comparison and are more likely to show new problems that need fixing and lead to a better understanding. Model results for the deep past are also important tests, which the models are not tuned for.


That the CMIP ensemble mean fits to the observations is no reason to expect that the observations are reliable


When the observations peek out of this too narrow CMIP ensemble spread, that is to be expected. If you want to make a case that our understanding does not fit the observations, you have to take the uncertainties into account, not the spread.

Similarly, that the CMIP ensemble mean fits the observations is no reason to expect that the observations are reliable. Because of this overconfidence in the data quality, many scientists also took the recent minimal deviations from the trend line too seriously. This finally stimulated more research into the accuracy of temperature trends, into inhomogeneities in the ERSST sea surface temperatures, and into the effect of coverage and how we blend sea, land and ice temperatures together. There are some more improvements under way.

Compared to the global warming of about 1°C up to now, these recent and upcoming corrections are large. Many of the problems could have been found long ago. It is 2016. It is about time to study this. If funding is an issue, we could maybe sacrifice some climate change impact studies for wine. Or for truffles. Or caviar. The quality of our data is the foundation of our science.

That the comparison of the CMIP ensemble average with the instrumental observations is so central to the public climate "debate" is rather ironic. Please take a walk in the forest. Look at all the different changes. The ones that go slower as well as the many that go faster than expected.

Maybe it is good to emphasise that for the attribution of climate change to human activities, the size of the historical temperature increase is not used. The attribution is made via correlations between the 3-dimensional spatial patterns of the observations and the models. By using correlations (rather than root mean square errors), the magnitude of the change in either the models or the observations is no longer important. Ribes (2016) is working on using the magnitude of the changes as well. This is difficult because of the inevitable tuning, which makes specifying the uncertainties very difficult.
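
Below is a small illustration with synthetic fields of why a correlation-based comparison does not care about the magnitude of the change, while a root-mean-square error does. It is purely illustrative, not the actual detection and attribution code.

```python
# Scaling a fingerprint pattern leaves the pattern correlation unchanged,
# while the RMSE changes with the amplitude (synthetic fields, not real data).
import numpy as np

rng = np.random.default_rng(2)
fingerprint = rng.normal(size=(36, 72))                       # model response pattern
observed = 0.5 * fingerprint + 0.1 * rng.normal(size=(36, 72))  # same pattern, other amplitude

def pattern_correlation(a, b):
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

for scale in (0.5, 1.0, 2.0):   # deliberately wrong amplitudes of the model pattern
    print(scale, pattern_correlation(scale * fingerprint, observed),
          rmse(scale * fingerprint, observed))
# The correlation is the same for every scale; the RMSE is not.
```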

Climate change impact studies

Studying the impacts of climate change is hard. Whether dikes break depends not only on sea level rise, but also on the changes in storms. The maintenance of the dikes and the tides are important. It matters whether you have a functioning government that also takes care of problems that only become apparent when the catastrophe happens. I would not sleep well if I lived in an area where civil servants are not allowed to talk about climate change. Because of the additional unnecessary climate dangers, but especially because that is a clear sign of a dysfunctional government that does not prioritise protecting its people.

The too narrow CMIP ensemble spread can lead to underestimates of the climate change impacts because typically the higher damages from stronger than expected changes are larger than the reduced damages from smaller changes. The uncertainty monster is not our friend. Admittedly, the effect of the uncertainties is rather modest. This is only important for those impacts we understand reasonably well already. The lack of variability can be partially solved in the statistical post-processing (bias correction and downscaling). This is not common yet, but Grenier et al. (2015) proposed a statistical method to make the natural variability more realistic.

This problem will hopefully soon be solved when the research programs on decadal climate prediction mature. The changes over a decade due to greenhouse warming are modest; for decadal prediction we thus especially need to accurately predict the natural variability of the climate system. An important part of these studies is assessing whether and which changes can be predicted. As a consequence there is a strong focus on situation-specific uncertainties and on statistical post-processing to correct biases of the model ensemble in the means and in the uncertainties.

In the tropics decadal climate prediction works reasonably well and helps farmers and governments in their planning.



In the mid-latitudes, where most of the researchers live, it is frustratingly difficult to make decadal predictions. Still, even in that case we would have an ensemble that can be used as a sample of the probability distribution. That is important progress.

When a lack of ensemble spread is a problem for historical runs, you might expect it to be a problem for projections for the rest of the century as well. This is probably not the case. The problem of tuning would be much reduced because the influence of aerosols will be much smaller as the signal of greenhouse gasses becomes much more dominant. For long-term projections the main factor is that the climate sensitivity of the models needs to fit our understanding of the climate sensitivity from all lines of evidence. This fit is reasonable for the best estimate of the climate sensitivity, which we expect to be 3°C for a doubling of the CO2 concentration. I do not know how good the fit is for the spread in the climate sensitivity.

However, for long-term projections even the climate sensitivity is not that important. For the magnitude of the climatic changes in 2100 and for the impact of climate change in 2100, the main source of uncertainty is what we will do. As you can see in the figure below, the difference between a business-as-usual scenario and strong climate policies is about 3 °C (6 °F). The uncertainties within these scenarios are relatively small. Thus the main question is whether and how aggressively we will act to combat climate change.





Related information

Is it time to freak out about the climate sensitivity estimates from energy budget models?

Fans of Judith Curry: the uncertainty monster is not your friend

Are climate models running hot or observations running cold?

Forecast: Gavin Schmidt on the evolution, testing and discussion of climate models

Forecast: Bjorn Stevens on the philosophy of climate modeling

The Guardian: In a blind test, economists reject the notion of a global warming pause

References

Forster, P.M., T. Andrews, P. Good, J.M. Gregory, L.S. Jackson, and M. Zelinka, 2013: Evaluating adjusted forcing and model spread for historical and future scenarios in the CMIP5 generation of climate models. Journal of Geophysical Research, 118, 1139–1150, doi: 10.1002/jgrd.50174.

Fyfe, John C., Nathan P. Gillett and Francis W. Zwiers, 2013: Overestimated global warming over the past 20 years. Nature Climate Change, 3, pp. 767–769, doi: 10.1038/nclimate1972.

Golaz, J.-C., L.W. Horowitz, and H. Levy, 2013: Cloud tuning in a coupled climate model: Impact on 20th century warming. Geophysical Research Letters, 40, pp. 2246–2251, doi: 10.1002/grl.50232.

Grenier, Patrick, Diane Chaumont and Ramón de Elía, 2015: Statistical adjustment of simulated inter-annual variability in an investigation of short-term temperature trend distributions over Canada. EGU general meeting, Vienna, Austria.

Hourdin, Frederic, Thorsten Mauritsen, Andrew Gettelman, Jean-Christophe Golaz, Venkatramani Balaji, Qingyun Duan, Doris Folini, Duoying Ji, Daniel Klocke, Yun Qian, Florian Rauser, Cathrine Rio, Lorenzo Tomassini, Masahiro Watanabe, and Daniel Williamson, 2016: The art and science of climate model tuning. Bulletin of the American Meteorological Society, published online, doi: 10.1175/BAMS-D-15-00135.1.

Kiehl, J.T., 2007: Twentieth century climate model response and climate sensitivity. Geophysical Research Letters, 34, L22710, doi: 10.1029/2007GL031383.

Knutti, R., 2008: Why are climate models reproducing the observed global surface warming so well? Geophysical Research Letters, 35, L18704, doi: 10.1029/2008GL034932.

Murphy, J.M., D.M.H. Sexton, D.N. Barnett, G.S. Jones, M.J. Webb, M. Collins and D.A. Stainforth, 2004: Quantification of modelling uncertainties in a large ensemble of climate change simulations. Nature, 430, pp. 768–772, doi: 10.1038/nature02771.

Ribes, A., 2016: Multi-model detection and attribution without linear regression. 13th International Meeting on Statistical Climatology, Canmore, Canada. Abstract below.

Rowlands, Daniel J., David J. Frame, Duncan Ackerley, Tolu Aina, Ben B. B. Booth, Carl Christensen, Matthew Collins, Nicholas Faull, Chris E. Forest, Benjamin S. Grandey, Edward Gryspeerdt, Eleanor J. Highwood, William J. Ingram, Sylvia Knight, Ana Lopez, Neil Massey, Frances McNamara, Nicolai Meinshausen, Claudio Piani, Suzanne M. Rosier, Benjamin M. Sanderson, Leonard A. Smith, Dáithí A. Stone, Milo Thurston, Kuniko Yamazaki, Y. Hiro Yamazaki & Myles R. Allen, 2012: Broad range of 2050 warming from an observationally constrained large climate model ensemble. Nature Geoscience, 5, pp. 256–260, doi: 10.1038/ngeo1430 (manuscript).


MULTI-MODEL DETECTION AND ATTRIBUTION WITHOUT LINEAR REGRESSION
Aurélien Ribes
Abstract. Conventional D&A statistical methods involve linear regression models where the observations are regressed onto expected response patterns to different external forcings. These methods do not use physical information provided by climate models regarding the expected response magnitudes to constrain the estimated responses to the forcings. As an alternative to this approach, we propose a new statistical model for detection and attribution based only on the additivity assumption. We introduce estimation and testing procedures based on likelihood maximization. As the possibility of misrepresented response magnitudes is removed in this revised statistical framework, it is important to take the climate modelling uncertainty into account. In this way, modelling uncertainty in the response magnitude and the response pattern is treated consistently. We show that climate modelling uncertainty can be accounted for easily in our approach. We then provide some discussion on how to practically estimate this source of uncertainty, and on the future challenges related to multi-model D&A in the framework of CMIP6/DAMIP.


* Because this is the internet, let me say that "The new hiatus is already 4 months old." is a joke.

** The BAMS article calls any way to estimate a parameter "tuning". I would personally only call it tuning if you optimize for emerging properties of the climate model. If you estimate a parameter based on observations or on a specialized model, I would not call this tuning, but simply parameter estimation or parameterization development. Radiative transfer schemes use the assumption that adjacent layers of clouds are maximally overlapped and that, if there is a clear layer between two cloud layers, they are randomly overlapped. You could introduce two parameters that vary between maximum and random for these two cases, but that is not done. You could call that an implicit parameter, which shows that distinguishing between parameter estimation and parameterization development is hard.

*** Photo at the top: Grumpy Tortoise Face by Eric Kilby, used under a Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0) license.

Climate model ensembles of opportunity and tuning



Listen to grumpy old men.

As a young cloud researcher at a large conference, enthusiastic about almost any topic, I went to a town-hall meeting on using a large number of climate model runs to study how well we know what we know. Or as scientists call this: using a climate model ensemble to study confidence/uncertainty intervals.

Using ensembles was still quite new. Climate Prediction dot Net had just started asking citizens to run climate models on their Personal Computers (old big iPads) to get the computer power to create large ensembles. Studies using just one climate model run were still very common. The weather predictions on the evening television news were still based on one weather prediction model run; they still showed highs, lows and fronts on static "weather maps".

During the questions, a grumpy old men spoke up. He was far from enthusiastic about his new stuff. I see a Statler or Waldorf angrily swing his wooden walking stick in the air. He urged everyone, everyone to be very careful and not to equate the ensemble with a sample from a probability distribution. The experts dutifully swore they were fully aware of this.

They likely were and still are. But now everyone uses ensembles. Often using them as if they sample the probability distribution.

Before I wrote about the problems confusing model spread and uncertainty made in the now mostly dead "hiatus" debate. That debate remains important: after the hiatus debate is before the hiatus debate. The new hiatus is already 4 month old.* And there are so many datasets to select a "hiatus" from.


Fyfe et al. (2013) compared the temperature trend from the CMIP ensemble (grey histogram) to observations (red something) implicitly assuming that the model spread is the uncertainty. While the estimated trend is near the model spread, it is well within the uncertainty. The right panel is for a 20 year period: 1993–2012. The left panel starts in the cherry picked large El Nino year: 1998–2012.

This time I would like to explain better why the ensemble model spread is typically smaller than the confidence interval. These reasons suggest other questions where we need to pay attention: It is also important for comparing long-term historical model runs with observations and could affect some climate change impact studies. For long-term projections and decadal climate prediction it is likely less relevant.

Reasons why model spread is not uncertainty

One climate model run is just one realisation. Reality has the same problem. But you can run a model multiple times. If you change the model fields you begin with just a little bit, due to the chaotic nature of atmospheric and oceanic flows a second run will show a different realisation. The highs, lows and fronts will move differently, the ocean surface is consequently warmed and cooled at different times and places, internal modes such as El Nino will appear at different times. This chaotic behaviour is mainly found at the short time scales and is one reason for the spread of an ensemble. And it is one reason to expect that model spread is not uncertainty because models focus on getting the long term trend right and differ strongly when it comes to the internal variability.

But that is just reason one. The modules of a climate model that simulate specific physical processes have parameters that are based on measurements or more detailed models. We only know these parameters within some confidence interval. A normal climate model takes the best estimate of these parameters, but they could be anywhere within the confidence interval. To study how important these parameters are special "perturbed physics" ensembles are created where every model run has parameters that vary within the confidence interval.

Creating a such an ensemble is difficult. Depending on the reason for the uncertainty in the parameter, it could make sense to keep its value constant or to continually change it within its confidence interval and anything in between. It could make sense to keep the value constant over the entire Earth or to change it spatially and again anything in between. The parameter or how much it can fluctuate may dependent on the local weather or climate. It could be that parameter X is high also parameter Y is high (or low); these dependencies should also be taken into account. Finally, also the distributions of the parameters needs to be realistic. Doing all of this for the large number of parameters in a climate model is a lot of work, typically only the most important ones are perturbed.

You can generate an ensemble that has too much spread by perturbing the parameters too strongly (and by making the perturbations too persistent). If you do it optimally, the ensemble would still show too little spread because not all physical processes are modelled because they are thought not to be important enough to justify the work and the computational resources. Part of this spread can be studied by making ensembles using many different models (multi-model ensemble), which are developed by different groups with different research questions and different ideas what is important.

That is where the title comes in: ensembles of opportunity. These are ensembles of existing model runs that were not created to be an ensemble. The most important example is the ensemble of the Coupled Models Intercomparison Project (CMIP). This group coordinates the creating of a set of climate model runs for similar scenarios, so that the results of these models can be compared with each other. This ensemble will automatically sample the chaotic flows and it is a multi-model ensemble, but it is not a perturbed physics ensemble; these model runs are model aiming at the best possible reproduction of what happened. For this reason alone the spread of the CMIP ensemble is expected to be too low.

The term "ensembles of opportunity" is another example the tendency of natural scientists to select neutral or generous terms to describe the work of colleagues. The term "makeshift ensemble" may be clearer.

Climate model tuning

The CMIP ensemble also has too little spread when it comes to the global mean temperature because the model are partially tuned to it. There is just an interesting readable article out on climate model tuning in BAMS**, which is intended for a general audience. Tuning has a large number of objectives, from getting the mean temperature right to the relationship between humidity and precipitation. There is also a section on tuning to the magnitude of warming the last century. It states about the historical runs:
The amplitude of the 20th century warming depends primarily on the magnitude of the radiative forcing, the climate sensitivity, as well as the efficiency of ocean heat uptake. ...

Some modeling groups claim not to tune their models against 20th century warming, however, even for model developers it is difficult to ensure that this is absolutely true in practice because of the complexity and historical dimension of model development. ...

There is a broad spectrum of methods to improve model match to 20th century warming, ranging from simply choosing to no longer modify the value of a sensitive parameter when a match is already good for a given model, or selecting physical parameterizations that improve the match, to explicitly tuning either forcing or feedback both of which are uncertain and depend critically on tunable parameters (Murphy et al. 2004; Golaz et al. 2013). Model selection could, for instance, consist of choosing to include or leave out new processes, such as aerosol cloud interactions, to help the model better match the historical warming, or choosing to work on or replace a parameterization that is suspected of causing a perceived unrealistically low or high forcing or climate sensitivity.
Due to tuning models that have a low climate sensitivity tend to have stronger forcings over the last century and model with a high climate sensitivity a weaker forcing. The forcing due to greenhouse gasses does not vary much, that part is easy. The forcings due to small particles in the air (aerosols) that like CO2 stem from the burning of fossil fuels and are quite uncertain and Kiehl (2007) showed that high sensitivity models tend to have more cooling due to aerosols. For a more nuanced updated story see Knutti et al. (2008) and Forster et al. (2013).


Kiehl (2007) found an inverse correlation between forcing and climate sensitivity. The main reason for the differences in forcing was the cooling by aerosols.
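A back-of-the-envelope sketch of this compensation between sensitivity and forcing; the numbers below are purely illustrative and do not come from Kiehl (2007) or from any particular model.

```python
# Illustrative only: two hypothetical models with different climate
# sensitivity S (warming per CO2 doubling) and different net historical
# forcing F, both reproducing roughly the same observed warming.
F2X = 3.7  # W/m2 forcing for a doubling of CO2 (standard value)

def crude_transient_warming(sensitivity, net_forcing, realized_fraction=0.6):
    """Very crude transient warming: equilibrium response scaled by the
    fraction already realized (ocean heat uptake delays the rest)."""
    return realized_fraction * sensitivity * net_forcing / F2X

# Low sensitivity combined with weak aerosol cooling (strong net forcing) ...
low_s = crude_transient_warming(sensitivity=2.0, net_forcing=2.3)
# ... or high sensitivity combined with strong aerosol cooling (weak net forcing).
high_s = crude_transient_warming(sensitivity=4.0, net_forcing=1.15)

print(f"low-sensitivity model:  {low_s:.2f} K")   # ~0.75 K
print(f"high-sensitivity model: {high_s:.2f} K")  # ~0.75 K
# Both match the historical warming about equally well, so a good fit to
# the observed warming does not constrain the climate sensitivity much.
```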
This "tuning" initially was not an explicit tuning of model parameters, but mostly because modellers keep working until the results look good. Look good compared to observations. Bjorn Stevens talks about this in an otherwise also recommendable Forecast episode.

Nowadays the tuning is often performed more formally and is an important part of studying the climate models and understanding their uncertainties. The BAMS article proposes to collect information on tuning for the upcoming CMIP. In principle a good idea, but I do not think that it is enough. In the simple example of climate sensitivity and aerosol forcing, the groups with low sensitivity and weak aerosol forcing and the ones with high sensitivity and strong aerosol forcing are happy with their temperature trend and will report that they did not tune. But that choice also leads to too little ensemble spread, just like for the groups that did need to tune. Tuning makes it complicated to interpret the ensemble; it is not a problem for a specific model run.

Given that we know the temperature increase, it is impossible not to get a tuned result. Furthermore, I mentioned above several additional reasons why the model spread is not the uncertainty, which complicate the interpretation of the ensemble in the same way. A solution could be to follow the work in ensemble weather prediction with perturbed-physics ensembles and tune all models, but tune them to cover the full range of uncertainties that we estimate from the observations. This should at least cover the climate sensitivity and the ocean heat uptake, but preferably also other climate characteristics that are important for climate impact and climate variability studies. Large modelling centres may be able to create such large ensembles by themselves; the others could coordinate their work in CMIP to make sure the full uncertainty range is covered.

Historical climate runs

Because the physics is not perturbed, and especially due to the tuning, you would expect the CMIP ensemble spread to be too low for the global mean temperature increase. That the CMIP ensemble average fits the observed temperature increase well shows that with reasonable physical choices we can understand why the temperature increased. It shows that known processes are sufficient to explain it. That it fits so accurately does not say much. I liked the title of an article by Reto Knutti (2008): "Why are climate models reproducing the observed global surface warming so well?" That title says it all.

Spatial patterns and other observations are much more interesting for studying how good the models are. New datasets are greeted with much enthusiasm by modellers because they allow for the best comparisons and are more likely to show new problems that need fixing and thus lead to a better understanding. Model results for the deep past, which the models are not tuned for, are also important tests.


That the CMIP ensemble mean fits to the observations is no reason to expect that the observations are reliable


When the observations peek out of this too narrow CMIP ensemble spread, that is to be expected. If you want to make a case that our understanding does not fit the observations, you have to take the uncertainties into account, not the spread.

Similarly, that the CMIP ensemble mean fits the observations is no reason to expect that the observations are reliable. Because of this overconfidence in the data quality, many scientists also took the recent minimal deviations from the trend line too seriously. This finally stimulated more research into the accuracy of temperature trends: into inhomogeneities in the ERSST sea surface temperatures, into the effect of coverage, and into how we blend sea, land and ice temperatures together. Some more improvements are under way.

Compared to the global warming of about 1°C up to now, these recent and upcoming corrections are large. Many of the problems could have been found long ago. It is 2016. It is about time to study this. If funding is an issue, we could maybe sacrifice some climate change impact studies on wine. Or on truffles. Or on caviar. The quality of our data is the foundation of our science.

That the comparison of the CMIP ensemble average with the instrumental observations is so central to the public climate "debate" is rather ironic. Please take a walk in the forest. Look at all the different changes: the ones that go slower as well as the many that go faster than expected.

Maybe it is good to emphasise that for the attribution of climate change to human activities, the size of the historical temperature increase is not used. The attribution is made via correlations between the 3-dimensional spatial patterns in the observations and in the models. Because correlations are used (rather than root mean square errors), the magnitude of the change in either the models or the observations is not important. Ribes (2016) is working on using the magnitude of the changes as well. This is difficult because of the inevitable tuning, which makes specifying the uncertainties very hard.
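A minimal sketch of why a correlation-based comparison does not depend on the magnitude of the change; the "fingerprint" and all numbers below are made up for illustration, and real attribution methods are far more elaborate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fingerprint: the spatial pattern of change a model expects.
model_pattern = rng.normal(0.0, 1.0, size=500)

# Pretend the observations show the same pattern, but 30% stronger and noisy.
observations = 1.3 * model_pattern + rng.normal(0.0, 0.3, size=500)

corr = np.corrcoef(model_pattern, observations)[0, 1]
rmse = np.sqrt(np.mean((model_pattern - observations) ** 2))

print(f"pattern correlation: {corr:.2f}")  # high, despite the 1.3 scaling
print(f"RMSE:                {rmse:.2f}")  # penalises the amplitude difference

# Rescaling the observations changes the RMSE but barely changes the
# correlation, so pattern-based attribution does not hinge on getting
# the magnitude of the warming exactly right.
```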

Climate change impact studies

Studying the impacts of climate change is hard. Whether dikes break depends not only on sea level rise, but also on the changes in storms. The maintenance of the dikes and the tides are important. It matters whether you have a functioning government that also takes care of problems that only become apparent when the catastrophe happens. I would not sleep well if I lived in an area where civil servants are not allowed to talk about climate change. Because of the additional unnecessary climate dangers, but especially because that is a clear sign of a dysfunctional government that does not prioritise protecting its people.

The too narrow CMIP ensemble spread can lead to underestimates of the climate change impacts, because typically the additional damages from stronger-than-expected changes are larger than the avoided damages from weaker-than-expected changes. The uncertainty monster is not our friend. Admittedly, the effect of the uncertainties is rather modest, and this is only important for those impacts we already understand reasonably well. The lack of variability can be partially remedied in the statistical post-processing (bias correction and downscaling). This is not common yet, but Grenier et al. (2015) proposed a statistical method to make the natural variability more realistic.
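As a sketch of what such post-processing could look like, here is a generic variance-scaling adjustment in Python; this is not the actual method of Grenier et al. (2015), the data are synthetic, and the function name is made up. The idea is simply to rescale the simulated interannual anomalies so that their variability matches a target (observed) value while keeping the long-term trend.

```python
import numpy as np

def inflate_variability(simulated, target_std):
    """Rescale interannual anomalies around a linear trend so that their
    standard deviation matches target_std; the trend itself is kept.
    A generic sketch, not any specific published bias-correction method."""
    years = np.arange(simulated.size)
    trend = np.polyval(np.polyfit(years, simulated, 1), years)
    anomalies = simulated - trend
    return trend + (target_std / anomalies.std()) * anomalies

# Synthetic example: a warming trend with too little year-to-year variability.
rng = np.random.default_rng(1)
years = np.arange(50)
simulated = 0.02 * years + rng.normal(0.0, 0.05, size=50)
adjusted = inflate_variability(simulated, target_std=0.12)

detrend = lambda x: x - np.polyval(np.polyfit(years, x, 1), years)
print(f"interannual std before: {detrend(simulated).std():.3f}")  # ~0.05
print(f"interannual std after:  {detrend(adjusted).std():.3f}")   # ~0.12
```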

This problem will hopefully soon be solved when the research programs on decadal climate prediction mature. The changes over a decade due to greenhouse warming are modest; for decadal prediction we thus especially need to accurately predict the natural variability of the climate system. An important part of these studies is assessing whether and which changes can be predicted. As a consequence there is a strong focus on situation-specific uncertainties and on statistical post-processing to correct biases of the model ensemble in the means and in the uncertainties.

In the tropics decadal climate prediction works reasonably well and helps farmers and governments in their planning.



In the mid-latitudes, where most of the researchers live, it is frustratingly difficult to make decadal predictions. Still, even in that case we would have an ensemble that can be used as a sample of the probability distribution. That is important progress.

Since a lack of ensemble spread is a problem for the historical runs, you might expect it to be a problem for projections for the rest of the century as well. This is probably not the case. The problem of tuning is much reduced because the influence of aerosols will be much smaller as the signal of greenhouse gases becomes more dominant. For long-term projections the main factor is that the climate sensitivity of the models needs to fit our understanding of the climate sensitivity from all lines of evidence. This fit is reasonable for the best estimate of the climate sensitivity, which we expect to be about 3°C for a doubling of the CO2 concentration. I do not know how good the fit is for the spread in the climate sensitivity.

However, for long-term projections even the climate sensitivity is not that important. For the magnitude of the climatic changes in 2100, and for the impact of climate change in 2100, the main source of uncertainty is what we will do. As you can see in the figure below, the difference between a business-as-usual scenario and strong climate policies is about 3°C (6°F). The uncertainty within these scenarios is relatively small. Thus the main question is whether and how aggressively we will act to combat climate change.





Related information

New article (September 2017): Gavin Schmidt et al.: Practice and philosophy of climate model tuning across six US modeling centers.

Discussion paper suggesting a path to solving the difference between model spread and uncertainty by James Annan and Julia Hargreaves: On the meaning of independence in climate science.

Is it time to freak out about the climate sensitivity estimates from energy budget models?

Fans of Judith Curry: the uncertainty monster is not your friend.

Are climate models running hot or observations running cold?

Forecast: Gavin Schmidt on the evolution, testing and discussion of climate models.

Forecast: Bjorn Stevens on the philosophy of climate modeling.

The Guardian: In a blind test, economists reject the notion of a global warming pause.

References

Forster, P.M., T. Andrews, P. Good, J.M. Gregory, L.S. Jackson, and M. Zelinka, 2013: Evaluating adjusted forcing and model spread for historical and future scenarios in the CMIP5 generation of climate models. Journal of Geophysical Research, 118, 1139–1150, doi: 10.1002/jgrd.50174.

Fyfe, John C., Nathan P. Gillett and Francis W. Zwiers, 2013: Overestimated global warming over the past 20 years. Nature Climate Change, 3, pp. 767–769, doi: 10.1038/nclimate1972.

Golaz, J.-C., L.W. Horowitz, and H. Levy II, 2013: Cloud tuning in a coupled climate model: Impact on 20th century warming. Geophysical Research Letters, 40, pp. 2246–2251, doi: 10.1002/grl.50232.

Grenier, Patrick, Diane Chaumont and Ramón de Elía, 2015: Statistical adjustment of simulated inter-annual variability in an investigation of short-term temperature trend distributions over Canada. EGU general meeting, Vienna, Austria.

Hourdin, Frederic, Thorsten Mauritsen, Andrew Gettelman, Jean-Christophe Golaz, Venkatramani Balaji, Qingyun Duan, Doris Folini, Duoying Ji, Daniel Klocke, Yun Qian, Florian Rauser, Cathrine Rio, Lorenzo Tomassini, Masahiro Watanabe, and Daniel Williamson, 2016: The art and science of climate model tuning. Bulletin of the American Meteorological Society, published online, doi: 10.1175/BAMS-D-15-00135.1.

Kiehl, J.T., 2007: Twentieth century climate model response and climate sensitivity. Geophysical Research Letters, 34, L22710, doi: 10.1029/2007GL031383.

Knutti, R., 2008: Why are climate models reproducing the observed global surface warming so well? Geophysical Research Letters, 35, L18704, doi: 10.1029/2008GL034932.

Murphy, J.M., D.M.H. Sexton, D.N. Barnett, G.S. Jones, M.J. Webb, M. Collins and D.A. Stainforth, 2004: Quantification of modelling uncertainties in a large ensemble of climate change simulations. Nature, 430, pp. 768–772, doi: 10.1038/nature02771.

Ribes, A., 2016: Multi-model detection and attribution without linear regression. 13th International Meeting on Statistical Climatology, Canmore, Canada. Abstract below.

Rowlands, Daniel J., David J. Frame, Duncan Ackerley, Tolu Aina, Ben B. B. Booth, Carl Christensen, Matthew Collins, Nicholas Faull, Chris E. Forest, Benjamin S. Grandey, Edward Gryspeerdt, Eleanor J. Highwood, William J. Ingram, Sylvia Knight, Ana Lopez, Neil Massey, Frances McNamara, Nicolai Meinshausen, Claudio Piani, Suzanne M. Rosier, Benjamin M. Sanderson, Leonard A. Smith, Dáithí A. Stone, Milo Thurston, Kuniko Yamazaki, Y. Hiro Yamazaki & Myles R. Allen, 2012: Broad range of 2050 warming from an observationally constrained large climate model ensemble. Nature Geoscience, 5, pp. 256–260, doi: 10.1038/ngeo1430 (manuscript).


MULTI-MODEL DETECTION AND ATTRIBUTION WITHOUT LINEAR REGRESSION
Aurélien Ribes
Abstract. Conventional D&A statistical methods involve linear regression models where the observations are regressed onto expected response patterns to different external forcings. These methods do not use physical information provided by climate models regarding the expected response magnitudes to constrain the estimated responses to the forcings. As an alternative to this approach, we propose a new statistical model for detection and attribution based only on the additivity assumption. We introduce estimation and testing procedures based on likelihood maximization. As the possibility of misrepresented response magnitudes is removed in this revised statistical framework, it is important to take the climate modelling uncertainty into account. In this way, modelling uncertainty in the response magnitude and the response pattern is treated consistently. We show that climate modelling uncertainty can be accounted for easily in our approach. We then provide some discussion on how to practically estimate this source of uncertainty, and on the future challenges related to multi-model D&A in the framework of CMIP6/DAMIP.


* Because this is the internet, let me say that "The new hiatus is already 4 month old." is a joke.

** The BAMS article calls any way to estimate a parameter "tuning". I would personally only call it tuning if you optimize for emergent properties of the climate model. If you estimate a parameter based on observations or a specialized model, I would not call this tuning, but simply parameter estimation or parameterization development. Radiative transfer schemes use the assumption that adjacent layers of clouds are maximally overlapped and that, if there is a clear layer between two cloud layers, they are randomly overlapped. You could introduce two parameters that vary between maximum and random overlap for these two cases, but that is not done. You could call that an implicit parameter, which shows that distinguishing between parameter estimation and parameterization development is hard.

*** Photo at the top: Grumpy Tortoise Face by Eric Kilby, used under an Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0) license.

Thursday, 7 July 2016

Is it time to freak out about the climate sensitivity estimates from energy budget models?

Estimates of climate sensitivity using simple energy budget models tended to produce lower values than many other methods. Consequently they were loved by the mitigation sceptical movement, who seemed to regard these as the most robust of all methods. Part of their argument is the claim that these are "empirical" estimates, conveniently forgetting the simple statistical model the method uses, that it still requires information from physical global climate models for the forcings, and that the output of global climate models also fits the "empirical" temperature change (and many other observed changes).
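The "simple statistical model" behind such estimates is essentially a global energy budget, as used for instance by Otto et al. (2013). Here is a back-of-the-envelope version in Python; the formula is the standard energy-budget estimator, but the numbers plugged in below are purely illustrative.

```python
def ecs_energy_budget(dT, dF, dQ, f2x=3.7):
    """Energy-budget estimate of equilibrium climate sensitivity.
    dT:  observed change in global mean temperature (K)
    dF:  change in radiative forcing (W/m2), largely model-derived
    dQ:  change in the Earth's energy imbalance, mostly ocean heat uptake (W/m2)
    f2x: radiative forcing for a doubling of CO2 (W/m2)"""
    return f2x * dT / (dF - dQ)

# Purely illustrative numbers of roughly the right magnitude:
print(f"{ecs_energy_budget(dT=0.85, dF=2.0, dQ=0.65):.1f} K per CO2 doubling")  # ~2.3 K
```

Note that the forcing and the energy imbalance in this formula come largely from models, which is one reason why calling the result purely "empirical" is misleading.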

Before the last IPCC report the estimate for equilibrium climate sensitivity was between 2°C and 4.5°C with a best estimate of 3°C. I do not know of any explicit statement, but I have the feeling that the new studies with low estimates from energy budget models were the reason why the last IPCC report reduced the lower bound to 1.5°C. Since the reasons for the discrepancies were not understood the last IPCC report no longer gave a best estimate for equilibrium climate sensitivity.

The equilibrium climate sensitivity is defined as the equilibrium change in global mean near-surface air temperature after doubling the atmospheric concentration of carbon dioxide.

A Nature News and Views by Kyle Armour (2016) showed this week that three assumptions made in the simple energy budget models lead to strong biases.

1. This week Mark Richardson and colleagues (2016) showed that the temperature change is underestimated because we have few measurements in regions where the change is large, especially the Arctic. This masking problem creates a bias of 15%.

Furthermore, over the ocean, empirical estimates do not use the air temperature, but the sea surface temperature instead; the water temperature is a much smoother field and can thus be estimated using many fewer samples, which is good because observations over the oceans are sparse. Above sea ice the air temperature is used, which means that the decrease in the ice cover needs to be taken into account. The temperature trend of the air over the ocean is also higher than the trend of the sea surface temperature. Both effects make the "observed" trend 9% smaller.*

2. Climate change is mainly due to increases in carbon dioxide concentrations, but there is also warming due to increases in methane concentrations, cooling due to increases in aerosols (small airborne particles) and changes due to land use. Half a year ago Kate Marvel and colleagues (2015) showed that these forcings do not have the same global effect as carbon dioxide and that, as a consequence, the energy balance models are biased low. Marvel and colleagues estimate that this makes the estimates of the energy balance models 30% too low.

3. Earlier work by Kyle Armour and colleagues (2013) showed that in the early warming phase the climate sensitivity appears smaller than the true value you would get if you waited until the system has returned to equilibrium. This leads to an underestimate of 25%.

Taking all three biases into account, the best estimate from the energy balance models rises from around 2°C to 4.6°C**; see Figure 1b of Armour (2016), reproduced below.


Climate sensitivity estimated from observations (black), and its revision following Richardson et al. (blue), then following Marvel et al. (green), and in red the revision for the time dependence (Armour). The grey histogram shows climate model values.

The equilibrium climate sensitivity from global climate models is about 3.5°C***, which is close to the best estimate from all lines of evidence of about 3°C. The "empirical" estimate of 4.6°C is thus now clearly larger than that of the global climate models.

Is that a reason to freak out? Have we severely underestimated the severity of the problem?

Probably not. There are many different lines of evidence that support an equilibrium climate sensitivity around 3°C, with a likely range from around 2°C to about 4.5°C. That the simple energy balance models might now suggest a best estimate of around 4.6°C does not really influence this overall assessment. It is just one line of evidence.

That the energy balance climate sensitivity is marginally above the upper bound does not change this. These energy balance models have not been studied much, and the biases are so large that the corrections need to be very accurate, while they are currently mostly based on single studies. It is quite likely that this value will still change in the coming years. If the value still holds after a dozen more studies, you may want to consider freaking out a little. How uncertain this bias-corrected climate sensitivity is, is illustrated by its wide distribution in the above graph, with a 95% uncertainty range of 2.5°C to 12.8°C.

[UPDATE. Gavin Schmidt mentions on Twitter that it should also be studied whether these three factors are fully independent. While they seem to relate to different aspects, there could be a link, because spatial patterns and forcing efficacy are strongly related. Thus it would be valuable to make a study that considers all three biases in combination.]

The promotion of the cherry-picked climate sensitivity of 2°C, or lower, was disingenuous. A similar promotion of a value of 4.6°C would be no better. (Someone promoting a climate sensitivity of 12.8°C deserves a place in statistical Purgatory.)

There are many other lines of evidence for an equilibrium climate sensitivity around 3°C: from basic physics, to global climate models, to various climatic changes in the deep past and the climate response to volcanoes. Before accepting values far away from 3°C we would need to understand the physics of the feedbacks that produce such deviations.


Figure 1. Ranges and best estimates of ECS based on different lines of evidence. Bars show 5-95% uncertainty ranges with the best estimates marked by dots. Dashed lines give alternative estimates within one study. The grey shaded range marks the likely 1.5°C to 4.5°C range as reported in AR5, the grey solid line the extremely unlikely values below 1°C, and the grey dashed line the very unlikely values above 6°C. Figure taken from figure 1 of Box 12.2 in the IPCC 5th assessment report (AR5). Unlabeled ranges refer to studies cited in AR4. The figure in the review article by Knutti and Hegerl (2008), presented by Skeptical Science, is also a very insightful overview.

The likely range of possible climate sensitivity values has been between 1.5°C and 4.5°C since 1979. That does not sound like much progress. However, we now have many more lines of evidence and those lines have been much better vetted. Thus we can be more sure nowadays that this range is about right. A large part of the uncertainty comes from cloud and vegetation feedbacks. Having worked on clouds myself, I know that these are very difficult problems. Thus I am not hopeful that the uncertainty range will strongly decrease in the coming decade or maybe even decades.

We will have to make decisions in the face of this uncertainty. Like any decision in a complex world.


Notes

* The temperature trend of the air temperature over the ocean is 9% higher than the trend of the sea surface temperature in the CMIP5 models. For most models the top ocean layer is 10 m deep. For those models with a higher vertical resolution the trend is only 8% higher. The difference is small and not statistically significant, but the effective resolution of numerical models is normally coarser than the nominal resolution, thus I would not be surprised if studies with dedicated high-resolution models lead to estimates that are a few percentage points lower.

** If we simply combine all these biases, 1.24 (Richardson) × 1.30 (Marvel) × 1.25 (Armour), we get that the simple energy balance models are biased low by as much as a factor of 2. Taking this into account could suggest increasing the best estimate from the energy balance models from around 2°C to around 4°C. Because of the uncertainty around the estimates and the thick tails, the estimate becomes 4.6°C. See Figure 1b of Armour (2016).
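The same arithmetic as a quick check (the published 4.6°C additionally accounts for the skewed uncertainty distribution, which this naive multiplication ignores):

```python
bias_factors = {
    "Richardson et al. (masking and blending)": 1.24,
    "Marvel et al. (forcing efficacy)": 1.30,
    "Armour et al. (time-varying feedbacks)": 1.25,
}

combined = 1.0
for factor in bias_factors.values():
    combined *= factor

print(f"combined bias factor:    {combined:.2f}")             # ~2.0
print(f"corrected best estimate: {2.0 * combined:.1f} degC")  # ~4.0 degC
```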

*** The ensemble of global climate models of the CMIP5 project have an average climate sensitivity of 3.5°C with a 95% uncertainty range of 2.0-5.6°C (Geoffroy, et al. 2013).

**** Many thanks to Kyle Armour and And Then There’s Physics for many helpful hints and comments. Any errors are naturally mine.



Related reading

Nature Geoscience: Impact of decadal cloud variations on the Earth’s energy budget. A physical explanation of why climate sensitivities estimated from recently observed trends are probably biased low.

An oldie from Science in 2004: Three Degrees of Consensus explains the various ways to estimate climate sensitivity and why it may have been more luck than wisdom that the first estimate of the range of the climate sensitivity still holds.

Skeptical Science: How sensitive is our climate?

Climate dialogue: Climate Sensitivity and Transient Climate Response

Fans of Judith Curry: the uncertainty monster is not your friend

Tough, but interesting for scientists: Andrew Dessler talk at Ringberg15 on why the equilibrium climate sensitivity exceeds 2°C.

References

Armour, Kyle C., 2016: Projection and prediction: Climate sensitivity on the rise. Nature Climate Change, News and Views, doi: 10.1038/nclimate3079.

Armour, Kyle C., Cecilia M. Bitz and Gerard H. Roe, 2013: Time-Varying Climate Sensitivity from Regional Feedbacks. Journal of Climate, doi: 10.1175/JCLI-D-12-00544.1

Geoffroy, O., D. Saint-Martin, G. Bellon, A. Voldoire, D.J.L. Olivié and S. Tytéca, 2013: Transient Climate Response in a Two-Layer Energy-Balance Model. Part II: Representation of the Efficacy of Deep-Ocean Heat Uptake and Validation for CMIP5 AOGCMs. Journal of Climate, 26, pp. 1859–1876, doi: 10.1175/JCLI-D-12-00196.1.

Marvel, K., G.A. Schmidt, R.L. Miller and L.S. Nazarenko, 2015: Implications for climate sensitivity from the response to individual forcings. Nature Climate Change, 6, pp. 386–389, doi: 10.1038/nclimate2888.

Richardson, Mark, Kevin Cowtan, Ed Hawkins and Martin B. Stolpe, 2016: Reconciled climate response estimates from climate models and the energy budget of Earth. Nature Climate Change, doi: 10.1038/nclimate3066. If you cannot read this article at Nature, you can go there via The Guardian, which has a special link that allows everyone to read (not download) the article. See also the News and Views on this article by Kyle Armour.

Otto, A., F.E.L. Otto, O. Boucher, J. Church, G. Hegerl, P.M. Forster, N.P. Gillett, J. Gregory, G.C. Johnson, R. Knutti, N. Lewis, U. Lohmann, J. Marotzke, G. Myhre, D. Shindell, B. Stevens, and M.R. Allen, 2013: Energy budget constraints on climate response. Nature Geoscience, 6, pp. 415–416, doi: 10.1038/ngeo1836.