Monday 27 April 2020

Break detection is deceptive when the noise is larger than the break signal

I am disappointed in science. It is impossible that it took this long for us to discover that break detection has serious problems when the signal to noise ratio is low. However, as far as we can judge this was new science and it certainly was not common knowledge, which it should have been because it has large consequences.

This post describes a paper by Ralf Lindau and me about how break detection depends on the signal to noise ratio (Lindau and Venema, 2018). The signal in this case consists of the breaks we would like to detect. These breaks could be from a change in instrument or location of the station. We detect breaks by comparing a candidate station to a reference. This reference can be one other neighbouring station or an average of neighbouring stations. The candidate and reference should be sufficiently close so that they have the same regional climate signal, which is then removed by subtracting the reference from the candidate. The difference time series that is left contains breaks and noise because of measurement uncertainties and differences in local weather. The noise thus depends on the quality of the measurements, on the density of the measurement network and on how variable the weather is spatially.
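To make the idea of a difference time series concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the function name `station`, the break year, the break size and the noise levels are assumptions, not values from the paper.

```python
import random

random.seed(1)
n_years = 100

# Shared regional climate signal (here simply white year-to-year variability).
regional = [random.gauss(0.0, 1.0) for _ in range(n_years)]

def station(break_year, break_size, noise_sd):
    """Regional signal + one step change + station-specific noise."""
    return [
        regional[t]
        + (break_size if t >= break_year else 0.0)
        + random.gauss(0.0, noise_sd)
        for t in range(n_years)
    ]

candidate = station(break_year=60, break_size=1.0, noise_sd=0.3)
reference = station(break_year=n_years, break_size=0.0, noise_sd=0.3)  # no break

# Subtracting the reference removes the shared regional signal;
# what is left is the break plus the noise of both stations.
difference = [c - r for c, r in zip(candidate, reference)]

mean_before = sum(difference[:60]) / 60
mean_after = sum(difference[60:]) / 40
print(round(mean_after - mean_before, 2))  # should be near the inserted break of 1.0
```

Note that the noise of the difference series is larger than that of either station, because the noise of both stations ends up in it.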

The signal to noise ratio (SNR) is simply defined as the standard deviation of the time series containing only the breaks divided by the standard deviation of the time series containing only the noise. For short I will denote these as the break signal and the noise signal, which have a break variance and a noise variance. When generating data to test homogenization algorithms, you know exactly how strong the break signal and the noise signal are. In case of real data, you can estimate them, for example with the methods I described in a previous blog post. In that study, we found a signal to noise ratio for annual temperature averages observed in Germany of 3 to 4 and in America of about 5.
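For simulated data this definition is a one-liner. The sketch below uses a hypothetical step function with three breaks and hypothetical noise; the levels, positions and noise strength are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(2)
n = 100

# Hypothetical break signal: a step function with three breaks.
levels = [0.0, 0.8, -0.5, 0.3]
break_positions = [25, 50, 80]

break_signal = []
segment = 0
for t in range(n):
    if segment < len(break_positions) and t == break_positions[segment]:
        segment += 1
    break_signal.append(levels[segment])

# Hypothetical noise signal: white noise with standard deviation 0.5.
noise_signal = [random.gauss(0.0, 0.5) for _ in range(n)]

# SNR: standard deviation of the break signal over that of the noise signal.
snr = statistics.pstdev(break_signal) / statistics.pstdev(noise_signal)
print(round(snr, 2))
```

With these numbers the break signal and the noise are about equally strong, so the SNR comes out near one.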

Temperature is studied a lot and much of the work on homogenization takes place in Europe and America. Here this signal to noise ratio is high enough. That may be one reason why climatologists did not find this problem sooner. Many other sciences use similar methods, and we are all supported by a considerable statistical literature. I have no idea what their excuses are.



Why a low SNR is a problem

As scientific papers go, the discussion is quite mathematical, but the basic problem is relatively easy to explain in words. In statistical homogenization we do not know in advance where the break or breaks will be. So we basically try many break positions and search for the break positions that result in the largest breaks (or, for the algorithm we studied, that explain the most variance).

If you do this for a time series that contains only noise, this will also produce (small) breaks. For example, in case you are looking for one break, due to pure chance there will be a difference between the averages of the first and the last segment. This difference is larger than it would be for a predetermined break position, as we try all possible break positions and then select the one with the largest difference. To determine whether the breaks we found are real, we require that they are so large that it is unlikely that they are due to chance, while there are actually no breaks in the series. So we study how large breaks are in series that only contain noise to determine how large such random breaks are. Statisticians would talk about the breaks being statistically significant with white noise as the null hypothesis.
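The effect of searching over all break positions can be seen in a small simulation. This is a simplified sketch, not the test statistic of any actual homogenization method: it uses the raw difference in segment means, and the names `largest_break` and `min_len` are made up for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def largest_break(series, min_len=5):
    """Try every break position; return (largest absolute jump, its position)."""
    best = (0.0, None)
    for k in range(min_len, len(series) - min_len + 1):
        jump = abs(mean(series[k:]) - mean(series[:k]))
        if jump > best[0]:
            best = (jump, k)
    return best

random.seed(3)
n, n_series = 100, 500
searched, fixed = [], []
for _ in range(n_series):
    noise = [random.gauss(0.0, 1.0) for _ in range(n)]  # no break at all
    searched.append(largest_break(noise)[0])
    # Jump for a predetermined break position in the middle of the series.
    fixed.append(abs(mean(noise[n // 2:]) - mean(noise[: n // 2])))

print("fixed position:", round(mean(fixed), 2))
print("best position: ", round(mean(searched), 2))
```

The jump at the best position can never be smaller than the jump at the fixed position, since the fixed position is one of the candidates. That is why the significance threshold has to be derived from searched noise series and not from a fixed break position.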

When the breaks are really large compared to the noise one can see by eye where the positions of the breaks are and this method is nice to make this computation automatically for many stations. When the breaks are “just” large, it is a great method to objectively determine the number of breaks and the optimal break positions.

The problem comes when the noise is larger than the break signal. Not that it is fundamentally impossible to detect such breaks. If you have a 100-year time series with a break in the middle, you would be averaging over 50 noise values on either side and the difference in their averages would be much smaller than the noise itself. Even if noise and signal are about the same size the noise effect is thus expected to be smaller than the size of such a break. To put it in another way, the noise is not correlated in time, while the break signal is the same for many years; that fundamental difference is what the break detection exploits.
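The arithmetic behind this averaging argument is short. The sketch assumes uncorrelated noise with the same standard deviation in every year; the variable names are only for illustration.

```python
import math

noise_sd = 1.0      # standard deviation of the yearly noise
n_per_side = 50     # years on each side of a break in the middle of 100 years

# Standard error of one segment mean, and of the difference of two independent means.
se_mean = noise_sd / math.sqrt(n_per_side)
se_difference = math.sqrt(2.0) * se_mean
print(round(se_difference, 2))  # 0.2
```

So for a break with the same size as the noise, the chance difference between the two 50-year averages is typically only a fifth of the break size.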

However, to come to the fundamental problem, it becomes hard to determine the positions of the breaks. Imagine the theoretical case where the break positions are fully determined by the noise, not by the breaks. From the perspective of the break signal, these break positions are random. The problem is, also random breaks explain a part of the break signal. So one would have a combination with a maximum contribution of the noise plus a part of the break signal. Because of this additional contribution by the break signal, this combination may have larger breaks than expected in a pure noise signal. In other words, the result can be statistically significant, while we have no idea where the positions of the breaks are.
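That random break positions explain part of a break signal can be checked with a toy simulation. The sketch below is not a reproduction of the computations in the paper: the set-up (5 true breaks, 5 random test breaks, 100 years, normally distributed levels) and the helper names are assumptions, and the exact fraction you get depends on that set-up.

```python
import random

def step_function(n, positions, levels):
    """Piecewise-constant series of length n with breaks at the sorted positions."""
    series, segment = [], 0
    for t in range(n):
        while segment < len(positions) and t >= positions[segment]:
            segment += 1
        series.append(levels[segment])
    return series

def fit_segment_means(series, positions):
    """Replace each segment between the given breaks by its mean."""
    edges = [0] + list(positions) + [len(series)]
    fitted = []
    for start, end in zip(edges[:-1], edges[1:]):
        seg_mean = sum(series[start:end]) / (end - start)
        fitted.extend([seg_mean] * (end - start))
    return fitted

random.seed(5)
n, n_breaks, n_series = 100, 5, 2000
unexplained = total = 0.0
for _ in range(n_series):
    # A pure break signal, without any noise.
    true_pos = sorted(random.sample(range(1, n), n_breaks))
    levels = [random.gauss(0.0, 1.0) for _ in range(n_breaks + 1)]
    signal = step_function(n, true_pos, levels)
    series_mean = sum(signal) / n
    # Break positions that have nothing to do with the true ones.
    random_pos = sorted(random.sample(range(1, n), n_breaks))
    fitted = fit_segment_means(signal, random_pos)
    unexplained += sum((s - f) ** 2 for s, f in zip(signal, fitted))
    total += sum((s - series_mean) ** 2 for s in signal)

fraction_unexplained = unexplained / total
print(round(fraction_unexplained, 2))
```

The fraction that remains unexplained is clearly below one, even though the test breaks were placed without looking at the data.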

In a real case the breaks look even more statistically significant because the positions of the breaks are determined by both the noise and the break signal.

That is the fundamental problem: the test for the homogeneity of the series rightly detects that the series contains inhomogeneities, but if the signal to noise ratio is low we should not jump to conclusions and expect that the set of break positions that gives us the largest breaks has much to do with the break positions in the data. Only if the signal to noise ratio is high is this relationship close enough.

Some numbers

This is a general problem, which I expect all statistical homogenization algorithms to have, but to put some numbers on this, we need to specify an algorithm. We have chosen to study the multiple breakpoint method that is implemented in PRODIGE (Caussinus and Mestre, 2004), HOMER (Mestre et al., 2013) and ACMANT (Domonkos and Coll, 2017). These are among the best, if not the best, methods we currently have. We applied it by comparing pairs of stations, like PRODIGE and HOMER do.

For a certain number of breaks this method effectively computes the combination of breaks that has the highest break variance. If you add more breaks, you will increase the break variance those breaks explain, even if it were purely due to noise, so there is additionally a penalty function that depends on the number of breaks. The algorithm selects that option where the break variance minus such a penalty is highest. A statistician would call this a model selection problem and the job of the penalty is to keep the statistical model (the step function explaining the breaks) reasonably simple.
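The structure of such a method can be sketched with dynamic programming. This is a simplified illustration and not the code of PRODIGE, HOMER or ACMANT: the linear penalty `penalty_per_break` and its value are my own placeholder, the real methods use their own penalty function, and all function names are made up.

```python
import random

def segment_cost_table(x):
    """Prefix sums so the within-segment sum of squares is O(1) per segment."""
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)
    def cost(i, j):  # sum of squared deviations of x[i:j] from its own mean
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)
    return cost

def best_segmentations(x, max_breaks):
    """For k = 0..max_breaks return (residual sum of squares, break positions)."""
    n = len(x)
    cost = segment_cost_table(x)
    # best[k][j]: minimal cost of splitting x[:j] into k + 1 segments
    best = [[float("inf")] * (n + 1) for _ in range(max_breaks + 1)]
    last = [[0] * (n + 1) for _ in range(max_breaks + 1)]
    for j in range(1, n + 1):
        best[0][j] = cost(0, j)
    for k in range(1, max_breaks + 1):
        for j in range(k + 1, n + 1):
            for i in range(k, j):  # i is the start of the last segment
                c = best[k - 1][i] + cost(i, j)
                if c < best[k][j]:
                    best[k][j], last[k][j] = c, i
    results = []
    for k in range(max_breaks + 1):
        positions, j = [], n
        for kk in range(k, 0, -1):  # walk back through the stored split points
            j = last[kk][j]
            positions.append(j)
        results.append((best[k][n], sorted(positions)))
    return results

def select_breaks(x, max_breaks=5, penalty_per_break=0.08):
    """Pick the k that maximises explained variance fraction minus a linear penalty."""
    results = best_segmentations(x, max_breaks)
    total = results[0][0]
    scores = [(1.0 - rss / total) - penalty_per_break * k
              for k, (rss, _) in enumerate(results)]
    k_best = max(range(len(scores)), key=scores.__getitem__)
    return results[k_best][1]

random.seed(4)
# One break of 3 noise standard deviations at position 40: a high SNR case.
series = [(3.0 if t >= 40 else 0.0) + random.gauss(0.0, 1.0) for t in range(100)]
print(select_breaks(series))
```

For a fixed number of breaks, minimising the residual sum of squares is the same as maximising the variance explained by the step function, so `best_segmentations` gives the optimal break combination for every number of breaks and `select_breaks` then does the model selection.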

In the end, if the signal to noise ratio is one half, the break positions that produce the largest breaks are just as “good” at explaining the actual break signal in the data as breaks at random positions.

With this detection model, we derived the plot below. Let me talk you through it. On the x-axis is the SNR; on the right the break signal is twice as strong as the noise signal. On the y-axis is how well the step function belonging to the detected breaks fits to the step function of the breaks we actually inserted. The lower curve, with the plus symbols, is the detection algorithm as I described above. You can see that for a high SNR it finds a solution that closely matches what we put in and the difference is almost zero. The upper curve, with the ellipse symbols, is for the solution you find if you put in random breaks. You can see that for a high SNR the random breaks have a difference of 0.5. As the variance of the break signal is one, this means that half the variance of the break signal is explained by random breaks.


Figure 13b from Lindau and Venema (2018).

When the SNR is about 0.5, the random breaks are about as good as the breaks proposed by the algorithm described above.

One may be tempted to think that if the data is too noisy, the detection algorithm should detect fewer breaks, that is, the penalty function should be bigger. However, the problem is not the detection of whether there are breaks in the data, but where the breaks are. A larger penalty thus does not solve the problem and even makes the results slightly worse. Not in the paper, but later I wondered whether setting more breaks is such a bad thing, so we also tried lowering the threshold; this again made the results worse.

So what?

The next question is naturally: is this bad? One reason to investigate correction methods in more detail, as described in my last blog post, was the hope that maybe accurate break positions are not that important. It could have been that the correction method still produces good results even with random break positions. This is unfortunately not the case: already quite small errors in break positions deteriorate the outcome considerably. This will be the topic of the next post.

Not homogenizing the data is also not a solution. As I described in a previous blog post, the breaks in Germany are small and infrequent, but they still have a considerable influence on the trends of stations. The figure below shows the trend differences between many pairs of nearby stations in Germany. Their differences in trends will be mostly due to inhomogeneities. The standard deviation of 0.628 °C per century for the pairs translates to an average error in the trends of individual stations of 0.4 °C per century.
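The step from pairs to single stations is a one-line computation if you assume, as I do in this sketch, that the trend errors of the two stations in a pair are independent and equally large. The variance of a difference is then twice the variance of one station.

```python
import math

sd_pair_trend_difference = 0.628  # °C per century, from the pairs of stations

# Independent, equally large errors: var(pair) = 2 * var(station).
sd_station_trend_error = sd_pair_trend_difference / math.sqrt(2.0)
print(round(sd_station_trend_error, 2))  # 0.44
```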


The trend differences (y-axis) of pairs of stations (x-axis) in the German temperature network. The trends were computed from 316 nearby pairs over 1950 to 2000. Figure 2 from Lindau and Venema (2018).

This finding makes it more important to work on methods to estimate the signal to noise ratio of a dataset before we try to homogenize it. This is easier said than done. The method introduced in Lindau and Venema (2018) gives results for every pair of stations, but needs some human checks to ensure the fits are good. Furthermore, it assumes the break levels behave like noise, while in Lindau and Venema (2019) we found that the break signal in the USA behaves like a random walk. This 2019 method needs a lot of data; even the results for Germany are already quite noisy, and if you apply it to data sparse regions you have to select entire continents. Doing so, however, biases the results to those subregions where there are many stations and would thus give too high SNR estimates. So computing SNR worldwide is not just a blog post, but requires a careful study and likely the development of a new method to estimate the break and noise variance.

Both methods compute the SNR for one difference time series, but in a real case multiple difference time series are used. We will need to study how to do this in an elegant way. How many difference series are used depends on the homogenization method, so this would also make the SNR method-dependent. I would also like to have an estimation method that is more universal and can be used to compare networks with each other.

This estimation method should then be applied to global datasets and for various periods to study which regions and periods have a problem. Temperature and pressure are variables that are well correlated from station to station. Much more problematic variables, which should thus be studied as well, are precipitation, wind and humidity. In case of precipitation, there tend to be more stations. This will compensate some, but for the other variables there may even be fewer stations.

We have some ideas how to overcome this problem, from ways to increase the SNR to completely different ways to estimate the influence of inhomogeneities on the data. But they are too preliminary to already blog about. Do subscribe to the blog with any of the options below the tag cloud near the end of the page. ;-)

When we digitize climate data that is currently only available on paper, we tend to prioritize data from regions and periods where we do not have much information yet. However, if after that digitization the SNR would still be low, it may be more worthwhile to digitize data from regions/periods where we already have more data and get that region/period to a SNR above one.

The next post will be about how this low SNR problem changes our estimates of how much the Earth has been warming. Spoiler: the climate “sceptics” will not like that post.


Other posts in this series

Part 5: Statistical homogenization under-corrects any station network-wide trend biases

Part 4: Break detection is deceptive when the noise is larger than the break signal

Part 3: Correcting inhomogeneities when all breaks are perfectly known

Part 2: Trend errors in raw temperature station data due to inhomogeneities

Part 1: Estimating the statistical properties of inhomogeneities without homogenization

References

Caussinus, Henri and Olivier Mestre, 2004: Detection and correction of artificial shifts in climate series. The Journal of the Royal Statistical Society, Series C (Applied Statistics), 53, pp. 405-425. https://doi.org/10.1111/j.1467-9876.2004.05155.x

Domonkos, Peter and John Coll, 2017: Homogenisation of temperature and precipitation time series with ACMANT3: method description and efficiency tests. International Journal of Climatology, 37, pp. 1910-1921. https://doi.org/10.1002/joc.4822

Lindau, Ralf and Victor Venema, 2018: The joint influence of break and noise variance on the break detection capability in time series homogenization. Advances in Statistical Climatology, Meteorology and Oceanography, 4, pp. 1–18. https://doi.org/10.5194/ascmo-4-1-2018

Lindau, Ralf and Victor Venema, 2019: A new method to study inhomogeneities in climate records: Brownian motion or random deviations? International Journal of Climatology, 39, pp. 4769–4783. Manuscript: https://eartharxiv.org/vjnbd/ Article: https://doi.org/10.1002/joc.6105

Mestre, Olivier, Peter Domonkos, Franck Picard, Ingeborg Auer, Stephane Robin, Émilie Lebarbier, Reinhard Boehm, Enric Aguilar, Jose Guijarro, Gregor Vertachnik, Matija Klancar, Brigitte Dubuisson, Petr Stepanek, 2013: HOMER: a homogenization software - methods and applications. IDOJARAS, Quarterly Journal of the Hungarian Meteorological Society, 117, no. 1, pp. 47–67.

Thursday 23 April 2020

Correcting inhomogeneities when all breaks are perfectly known

Much of the scientific literature on the statistical homogenization of climate data is about the detection of breaks, especially the literature before 2012. Much of the more recent literature studies complete homogenization algorithms. That leaves a gap for the study of correction methods.

Spoiler: if we know all the breaks perfectly, the correction method removes trend biases from a climate network perfectly. The outcome I found most surprising is that in this case the size of the breaks is irrelevant for how well the correction method works; what matters is the noise.

This post is about a study filling this gap by Ralf Lindau and me. The post assumes you are familiar with statistical homogenization, if not you can find more information here. For correction you naturally need information on the breaks. To study correction in isolation as much as possible, we have assumed that all breaks are known. That is naturally quite theoretical, but it makes it possible to study the correction method in detail.

The correction method we have studied is a so-called joint correction method, that means that the corrections for all stations in a network are computed in one go. The somewhat unfortunate name ANOVA is typically used for this correction method. The equations are the same as those of the ANOVA test, but the application is quite different, so I find this name confusing.

This correction method makes three assumptions. 1) That all stations have the same regional climate signal. 2) That every station has its own break signal, which is a step function with the positions of the steps given by the known breaks. 3) That every station also has its own measurement and weather noise. The algorithm computes the values of the regional climate signal and the levels of the step functions by minimizing this noise. So in principle the method is a simple least squares regression, but with many more coefficients than when you use it to compute a linear trend.
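The three assumptions translate directly into a regression problem. Below is a minimal sketch with NumPy, under the assumption that all break positions are known. It is my own illustration and not the code used in the paper; the function name `joint_correction` and the simulated network (10 stations, 100 years, 5 breaks per station, break and noise variance of one) are choices made for this example.

```python
import numpy as np

def joint_correction(data, breaks):
    """data: (n_stations, n_years) array; breaks: sorted break positions per station.

    Returns the estimated step function of every station (same shape as data).
    """
    n_stations, n_years = data.shape
    # Give every homogeneous segment of every station its own column index.
    segment_id = np.zeros((n_stations, n_years), dtype=int)
    n_segments = 0
    for s, positions in enumerate(breaks):
        edges = [0, *positions, n_years]
        for start, end in zip(edges[:-1], edges[1:]):
            segment_id[s, start:end] = n_segments
            n_segments += 1
    # One row per observation: 1 for its year's climate value, 1 for its segment level.
    rows = n_stations * n_years
    design = np.zeros((rows, n_years + n_segments))
    obs = np.arange(rows)
    years = np.tile(np.arange(n_years), n_stations)
    design[obs, years] = 1.0
    design[obs, n_years + segment_id.ravel()] = 1.0
    # The system is rank deficient (a constant can move between climate and levels);
    # lstsq returns the minimum-norm least-squares solution.
    solution, *_ = np.linalg.lstsq(design, data.ravel(), rcond=None)
    levels = solution[n_years:]
    return levels[segment_id]

rng = np.random.default_rng(6)
n_stations, n_years = 10, 100
climate = rng.normal(0.0, 1.0, n_years)
breaks = [sorted(rng.choice(np.arange(1, n_years), 5, replace=False))
          for _ in range(n_stations)]
inserted = np.zeros((n_stations, n_years))
for s, positions in enumerate(breaks):
    edges = [0, *positions, n_years]
    for start, end in zip(edges[:-1], edges[1:]):
        inserted[s, start:end] = rng.normal(0.0, 1.0)  # level of this segment
data = climate + inserted + rng.normal(0.0, 1.0, (n_stations, n_years))

estimated = joint_correction(data, breaks)
r = np.corrcoef(inserted.ravel(), estimated.ravel())[0, 1]
print(round(r, 2))
```

The step functions are only determined up to one constant for the whole network, which does not matter for trends. Subtracting `estimated` from `data` would give the corrected series.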

Three steps

In this study we compute the errors after correction in three ways, one after another. To illustrate this let’s start simple and simulate 1000 networks of 10 stations with 100 years/values. In the first examples below these stations have exactly five breaks, whose sizes are drawn from a normal distribution with variance one. The noise, simulating measurement uncertainties and differences in local weather, also has a variance of one. This is quite noisy for European annual temperature averages, but such noise levels do occur earlier in the climate record and in other regions. Also to keep it simple there is no net trend bias yet.

The figure to the right is a scatterplot with theoretically 1000*10*100=1 million yearly temperature averages as they were simulated (on the x-axis) and after correction (y-axis).

Within the plots we show some statistics, on the top left these are 1) the mean of x, i.e. the mean of the inserted inhomogeneities. 2) Then the variance of the inserted inhomogeneities x. 3) Then the mean of the computed corrections y. 4) Finally the variance of the corrections.

In the lower right, 1) the correlation (r) is shown and 2) the number of values (n). For technical reasons, we only show a sample of the 1 million points in the scatterplot, but these statistics are based on all values.

The results look encouraging, they show a high correlation: 0.96. And the points nicely scatter around the x=y line.

The second step is to look at the trends of the stations, there is one trend per station, so we have 1000*10=10,000 of them. See figure to the right. The trend is computed in the standard way using least squares linear regression. Trends would normally have the unit °C per year or century. Here we multiplied the trend with the period, so the values are the total change due to the trend and have unit °C.
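As a small illustration of this trend measure, the sketch below computes the least squares trend and multiplies it by the length of the series. Taking the number of values as the period is my assumption here, and the function name is made up.

```python
def total_change_from_trend(values):
    """Least-squares linear trend multiplied by the length of the period."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    slope = (
        sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
        / sum((t - t_mean) ** 2 for t in range(n))
    )
    return slope * n

# A single break of 1 °C in the middle of a 100-year series.
series = [0.0] * 50 + [1.0] * 50
print(round(total_change_from_trend(series), 2))  # 1.5
```

A break in the middle of the series thus produces a linear trend change that is 1.5 times the size of the break, which is why also small breaks matter for trends.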

The values again scatter beautifully around x=y and the correlation is as high as before: 0.95.

The final step is to compute the 1000 network trends. The result is shown below. The averaging over 10 stations reduces the noise in the scatterplot and the values beautifully scatter around the x=y line, while the correlation is now smaller, it is still decent: 0.81. Remember we started with quite noisy data where the noise was as large as the break signal.



The remaining error

In the next step, rather than plotting the network trend after correction on the y-axis, we plotted the difference between this trend and the inserted network mean trend, which is the trend error remaining after correction. This is plotted in the left panel below. For this case the uncertainty after correction is half of the uncertainty before correction in terms of the printed variances. It is typical to express uncertainties as standard deviations, then the remaining trend error is 71%. Furthermore, their averages are basically zero, so no bias is introduced.
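The conversion from variance to standard deviation is just a square root; the variable names in this snippet are only for illustration.

```python
import math

variance_ratio = 0.5  # trend error variance after correction / before correction
sd_ratio = math.sqrt(variance_ratio)
print(round(100 * sd_ratio))  # 71
```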

With a signal having as much break variance as noise variance from the measurement and weather differences between the stations, the correction algorithm naturally cannot reconstruct the original inhomogeneities perfectly, but it does so decently and its errors have nice statistical properties.

Now if we increase the variance of the break signal by a factor of two we get the result shown in the right panel. Comparing the two panels, it is striking that the trend error after correction is the same: it does not depend on the break signal, only the noise determines how accurate the trends are. In case of large break signals this is nice, but if the break signal is small, this will also mean that the correction can increase the random trend error. That could be the case in regions where the networks are sparse and the difference time series between two neighboring stations consequently quite noisy.



Large-scale trend biases

This was all quite theoretical, as the networks did not have a bias in their trends. They did have a random trend error due to the inserted inhomogeneities, but averaging over many such networks of 10 stations the trend error would tend to zero. If that were the case in reality, not many people would work on statistical homogenization. The main aim is to reduce the uncertainties in large-scale trends due to (possible) large-scale biases in the trends.

Such large-scale trend biases can be caused by changes in the thermometer screens used, the transition from manual to automatic observations, urbanization around the stations or relocations of stations to better sites.

If we add a trend bias to the inserted inhomogeneities and correct the data with the joint correction method, we find the result to the right. We inserted a large trend bias to all networks of 0.9 °C and after correction it was completely removed. This again does not depend on the size of the bias or the variance of the break signal.

However, all this is only true if all breaks are known. Before I write a post about the more realistic case where the breaks are not perfectly known, I will first have to write a post about how well we can detect breaks. That will be the next homogenization post.

Some equations

Next to these beautiful scatterplots, the article has equations for each of the three steps mentioned above: 1) from the inserted breaks and noise to what this means for the station data, 2) how this affects the station trend errors, and 3) how this results in network trends.

With equations for the influence of the size of the break signal (the standard deviation of the breaks) and the noise of the difference time series (the standard deviation of the noise) one can then compute how the trend errors before and after correction depend on the signal to noise ratio (SNR), which is the standard deviation of the breaks divided by the standard deviation of the noise. There is also a clear dependence on the number of breaks.

Whether the network trend errors increase or decrease due to the correction method is determined by a quite simple expression: 6 times the SNR divided by the number of breaks. So if the SNR is one, as in the initial example of this post, and the number of breaks is 6 or smaller, the correction would improve the trend error, while if there are more than 7 breaks the correction would add a random trend error. This simple equation ignores a weak dependence of the results on the number of stations in the networks.
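As a rule-of-thumb calculator this could look like the sketch below. It implements the expression exactly as stated in this post; that the value has to be compared with one is my reading of the example with SNR one, and the function name is made up. For the precise relationship, see the article.

```python
def correction_ratio(snr, n_breaks):
    """6 times the SNR divided by the number of breaks, as stated in the text."""
    return 6.0 * snr / n_breaks

# Assumed interpretation: above one the correction reduces the trend error,
# below one it adds random trend error.
for n_breaks in (3, 5, 8, 12):
    ratio = correction_ratio(snr=1.0, n_breaks=n_breaks)
    verdict = "helps" if ratio > 1.0 else "adds error"
    print(n_breaks, round(ratio, 2), verdict)
```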

Further research

I started by saying that correction methods were a research gap, but homogenization algorithms have many more steps beyond detection and correction, which should also be studied in isolation if possible to gain a better understanding.

For example, the computation of a composite reference. The selection of reference stations. The combination of statistical homogenization with metadata on documented changes in the measurement setup. And so on. The last chapter of the draft guidance on homogenization describes research needs, including research on homogenization methods. There are still a lot of interesting and important questions.



References

Lindau, Ralf and Victor Venema, 2018: On the reduction of trend errors by the ANOVA joint correction scheme used in homogenization of climate station records. International Journal of Climatology, 38, pp. 5255–5271. Manuscript: https://eartharxiv.org/r57vf/ Article: https://doi.org/10.1002/joc.5728

Monday 20 April 2020

Corona Virus Update: the German situation improved, but if we relax measures so much that the virus comes back it will not be just local outbreaks (part 32)

The state of the epidemic has improved in Germany and we now have about half the number of people getting ill compared to the peak we had a month ago. This has resulted in calls to relax the social distancing measures. The German states have decided that mostly nothing will happen the next two weeks, but small shops will open again (and some other shops) and in some states some school classes may open again (although I am curious whether that will actually happen, German Twitter is not amused).


An estimate of the number of new cases by the date these people got ill. Similar graphs tend to show the new cases for the date they were known with the health departments. By looking at the date people became ill, which is often much earlier, you can see a faster response to changes in social distancing. In dark blue you see the cases where the date someone got ill is known. In grey the cases where it was estimated, because only the case is known, but not when someone became ill. In light blue is an estimate for how many cases will still come in.

So in the last episode of the Corona Virus Update science journalist Korinna Hennig tried to get the opinion of Christian Drosten on these political measures. He does not like giving political advice, but he did venture that some politicians seem to wrongly think measures can be relaxed without the virus coming back. The two weeks that the lockdown continues should be used to prepare other measures that can replace the lockdown-type measures, such as a track and trace CoronaApp and the public wearing everyday masks.

Another reason it may be possible to relax measures somewhat would be that the virus may spread less efficiently in summer. It is not expected to go away, but the number of people who are infected by one infected person may go down a bit.

When the virus comes back, either because we relaxed social distancing too much too early or because of the winter, it will look different from this first wave. This first wave was characterized by local outbreaks. A second wave would be everywhere as the virus (and its various mutations) is spreading evenly geographically.

Korinna Hennig asks Drosten to explain why it is easier for him to call COVID-19 a pandemic than for the World Health Organization. This question was inspired by Trump complaining that the WHO called the pandemic too late. Drosten notes that it has political consequences when the WHO calls the situation a pandemic, but that does not influence the situation in your country and what Trump could have done.

Really interesting was the part at the end on some possible (not guaranteed) positive surprises.


Prof. Dr. Christian Drosten, expert for emerging viruses and developer of the WHO SARS-CoV-2 virus test, which was used in 150 countries.

The situation and measures in Germany

Korinna Hennig:
What's your assessment, how long would [the reproductive rate] have to stay below one for it to have a really long-term effect, so that we're not going to say at some point that we have to close all the schools again?
Christian Drosten:
I believe there is talk of months [in a report by the Helmholtz Association]. I can well believe that this is the case. However, this is not the path that has been chosen in essence [by the German government], but rather - I believe - the idea has arisen that the intention is to keep it within the current range, perhaps by taking additional measures to reduce the pressure a little more.

That is an important point of view, and one that needs to be understood. It is not primarily a question of saying that we have now achieved a great deal, that the measures have already had a considerable impact. And now we are simply letting them go a tad, because we no longer want to. Then at some point we will have to take a look and then we will have to consider how to proceed, that is one view.

The other is that everything will work out fine. Sometimes you can hear that between the lines. I have the feeling, particularly among the general public, that many people, even in politics, are speculating that it will not come back at all, that it will not pick up any momentum. Unfortunately, that is not what the epidemiological modelers are saying, but it is generally assumed that, if nothing is offered as a counter-offer to this relaxation of measures, it will really get out of hand.

And the idea is, of course - and this is a very real idea in Germany - that people say that they are now relaxing these measures to a small extent, but to a really small extent. It is rather the case that corrections are being made in places where we think we can perhaps get away with it without the efficient reduction of transmission suffering in the first place. And now, in the time that has been gained by the decision, it is preparing to allow other measures to come into force. And this of course includes the great promise of automated case tracking.

The cell phone tracking ... doesn't have to do the job completely, but you can combine it. You could say that there is a human manual case tracking system, but it gets help from such electronic measures, while you introduce these electronic measures. After all, this is not something that is introduced overnight; there must be some transition. I believe that the few weeks of time that have now been gained once again can be used to introduce such measures, and that is where a great deal of faith comes from at the moment.

Of course, there are other things to hope for as additional effects, such as, for example, a recommendation on the wearing of masks by the public. That could have an additional effect. Of course there will also be a small additional effect on seasonality. We have already discussed this, and there are studies which say that, unfortunately, there is probably not a large effect on seasonality, but there is a small effect on seasonality.

That is where things are coming together, so that we hope that the speed of propagation will perhaps slow down again overall and that we will at least be able to enter a region over the summer and into the autumn, where we will unfortunately see the effect of winter coming again, a possible winter wave, but where we will then have the first pharmaceutical interventions. Perhaps a first drug, with which certain risk patients could be treated in an early stage. Maybe first use studies, so efficacy studies of first vaccines. This is the overall concept, which one hopes will work.

Currently one infected person infects 0.7 or 0.8 other persons (R0, the reproduction number). That is behind the decline in the number of new cases. Theoretically you could thus allow for 25% more contacts while still being in a stable situation. I would be surprised if the small relaxations decided for the next two weeks would do that. I do worry that these relaxations make people take the problem less seriously and that can quickly lead to 25% more contacts.
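The 25% follows from a simple calculation, under the assumption that the reproduction number is proportional to the number of contacts. That is a simplification of my own for this sketch, not an epidemiological model.

```python
# How many more contacts would bring the reproduction number back up to one?
for r_now in (0.8, 0.7):
    allowed_factor = 1.0 / r_now
    extra_contacts_percent = (allowed_factor - 1.0) * 100.0
    print(r_now, round(extra_contacts_percent))
# 0.8 gives 25 and 0.7 gives 43
```

So the 25% corresponds to the less favourable value of 0.8; with 0.7 there would be a bit more room.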

I would personally prefer this decline to continue until we get to a level where containment by manually tracking infected people and their contacts becomes an effective way to fight the epidemic; Mailab explains it well in German.

If we get the tracking of infected people with a CoronaApp working, it would matter much less at which level of contagion we start, but I do not expect that the CoronaApp will be able to do all the work; it will likely need to be complemented by manual tracking. With the current plans, according to rumours in the media, placing less emphasis on privacy of the users, I worry that too few will participate to make any kind of dent. An app where we can only hope and need to trust that the government keeps its side of the bargain and does not abuse the data would also be less useful in large parts of the world where you can definitely not trust the government.

That some states are already starting with opening up some classes is in principle a good thing. But it goes too fast, the schools are not prepared yet and I see quite some backlash coming. If done well, by opening a few school classes we could have learned how to do this before we do more and we could study how much this contributes to a higher reproduction number R0. If we are lucky maybe hardly; see the last section on possible positive surprises.

Summertime

The flu normally goes away in summer, this is not expected for SARS-2, but the reproduction number could be 0.5 lower, that is that one infected person would infect half a person less. Without measures it is expected to be between 2 and 3 and we have to keep this reproduction number below 1 to avoid that the situation gets out of hand again. The summer may thus help a bit, which could mean less stringent restrictions.

It is not well understood what exactly makes the summer harder for the flu and even less for SARS-2. One aspect is likely that people are outside more and ventilate buildings more, which dilutes and dries the virus. Also when it comes to schools, it may be an option to do the classes outside, where the distancing rules could be less strict than indoors.

Museums could create large sculpture gardens outside for the summer. As the conference centres are empty and unused they could be used as social distancing museums. The empty hotels could be used to quarantine people who might otherwise infect other people in their households. We have to support the hotels anyway to survive until the pandemic is over.

I have often dreamed of conferences while walking outside in nature. You could transmit the voice of the speaker with a headset. The PowerPoint slides with Comic Sans would be missing. This may be the year to start this as an alternative to video conferences. (Although there would still be transport.)

World Health Organization and Trump

Korinna Hennig:
Could you briefly explain again what the difference is when you say here in the podcast for example: Yes, we have a pandemic in an early phase. And the WHO is still hesitating for a very long time. What is the crucial difference when the WHO makes such an assessment?
Christian Drosten:
So I am only an individual and can give my opinion, which you can follow or not. You can take me for someone who knows what he's doing. Or you can say: He's just a fool and he says things here.

Of course, this has different consequences with the WHO. In the case of a UN organisation, this has certain consequences, not only when it comes to saying that this is a pandemic, but also, and especially, when it comes to saying that this is a PHEIC, i.e. a Public Health Emergency of International Concern. That is a term used in the context of international health regulations. This then also has consequences for intergovernmental organisations. This scope has certainly also led to delays in all these decisions by the WHO.

Of course there are advisory bodies. After all, the WHO is not a person, but an opinion-forming and opinion-collecting organisation. Experts are called together, committees that have to vote at some point and where there is sometimes disagreement. And then they say that we will meet again next week and until then we will observe the situation again. This then leads to decisions that are perceived as a delay by some countries. This is an ex post evaluation of the WHO's behaviour.

At the moment this is again all about politics. And it is about a decision by Donald Trump, who has now said that he is suspending the WHO payments, the contributions, because the WHO did not say certain things early on.

It was, of course, known relatively early on from individual case reports that cases had already been introduced in the USA. And now to say that it is a pandemic that is taking place in all other countries ... So the statement that this is a pandemic is to acknowledge the situation, that this is widespread. This has nothing to do with the assessment for your own country. Since you know it is in your own country, you have to ask yourself: Do I act or not?
Korinna Hennig:
And there are of course financial liabilities between countries that are linked to the WHO.

Local outbreaks in wave 1, everywhere in wave 2

If there is a second wave, it will not look like this first wave.
Christian Drosten:
What happened in the case of the Spanish flu was this: We also had a first wave there in some major US cities - that is very, very well documented - that caught our attention. However, it did not occur in all places, but was distributed extremely unevenly locally. It was conspicuous here and there, and elsewhere people did not even notice that this disease existed at all.

Even there, even at that time, people were already working with curfews and similar things. This was also happening in spring, by the way. Then it went into the summer and apparently there was a strong seasonal effect. And you didn't even notice the disease anymore. And under the cover of this seasonal effect - we can perhaps now envisage this as, under the cover of the social distancing measures that are currently in force - this illness has, however, unnoticed, spread much more evenly geographically.

And then, when the Spanish flu hit a winter wave, the situation was suddenly quite different. Then chains of infection started at the same time in all places because the virus had spread unnoticed everywhere and no one had paid any attention to it. This is of course an effect that will also occur in Germany, because we do not have a complete ban on leaving and travelling here, and of course we do not have zero transmission either, but we have an R, i.e. a reproduction number that is around or sometimes perhaps even slightly below one. But that does not mean that no more is being transmitted.
So you can look at our homepage, for example, at the Institute of Virology at the Charité - we have now published a whole set of [virus] sequences from Germany. You can see that the viruses in Germany are already very much intermixed, that the local clustering is slowly disintegrating and that all viruses can be found in all places. So let me put it very simply. It is slowly but surely becoming very intermixed. ...

We'll be in a different situation when winter sets in. ... Suddenly you'd be surprised that the virus starts everywhere at once. Of course it is a completely different impact that such a wave of infection would have.
What I find interesting to see is that there is nearly no difference in virus activity between cities and rural regions in Germany anymore. If anything, just looking at the map below, I have the impression that rural regions have more virus activity. On the other hand, in the beginning, I feel there was more activity in the cities.


Yesterday's map of the RKI, the German CDC, of the number of new cases over the last week per 100,000 inhabitants. The larger cities are denoted by a small red dot, the location of the smaller cities can sometimes be seen as a smaller region in a different colour. The darkest region is an outbreak, which was likely due to a strong beer feast.

Positive surprises

Christian Drosten:
It is also quite possible that there will be positive surprises. For example, we still know nothing about children. It is even the case that in studies that are very systematically designed, this effect is often still left out. We know from other coronavirus diseases, especially MERS, that not only are children hardly affected, but they are hardly ever infected. Now the question is, of course, whether this is also the case with this disease, that not only they do not get any symptoms and are therefore not so conspicuous in the statistics, but that they are somehow resistant in a certain way and that they do not even have to be counted in the population to be infected. So what is 70% of the population? Is it possible to consider the 20 percent of children as finished, because they do not get infected at all? In reality, only 50 percent of the population need to be infected? This is a big gap, which can also be interpreted as a great hope.
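
The numbers in this paragraph can be written out as a small calculation. It is a deliberately crude sketch: it assumes (my simplification) that resistant children count fully towards the immune share of the population and ignores all mixing effects; the function name is hypothetical.

```python
def share_needing_infection(threshold: float, resistant_share: float) -> float:
    """Share of the population that still has to be infected to reach the threshold.

    Assumption: people who are resistant without infection count fully
    towards the immune share of the population.
    """
    return max(0.0, threshold - resistant_share)


# 70% threshold and 20% children, as in the quote above.
needed = share_needing_infection(threshold=0.70, resistant_share=0.20)
print(f"{needed:.0%} of the population would still need to be infected")
```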

And there is something else - we are anticipating that, epidemiological modellers are doing that, and they are taking that into account: That there may be an unnoticed background immunity from the common cold corona viruses, because they are already related in some way to the SARS-2 virus. It could happen, however, that certain people, because they have had a cold from such a corona virus in the last year or two, are protected in a previously unnoticed way.

All I want to say is that we are currently observing more and more - and a major study has just come out of China in the preprint realm - that in well-observed household situations, the secondary attack rate, that is to say the rate of infected persons who become infected when there is an index case in the household, an infected person, is quite low. It is in the range of 12, 13, 14 percent. Depending on the correction, you can also say that it is perhaps 15, 16, 17 percent. But it does not lie at 50 or 60 percent or higher, where you would then say that these are probably just random effects. The one who didn't get infected wasn't at home during the infectious period or something.

How is it possible that so many people who were supposed to be in the household are not infected? Is there some sort of background immunity involved?

And there are these residual uncertainties. But at this stage, even if you include all these residual uncertainties in these models, you still get the picture that the medical system and the intensive care unit capacity would be overloaded. That is why it is certainly right at the moment to have taken these measures. We must now carry out intensive research work as quickly as possible, as we clarify issues such as: What is really wrong with the children? Do they not get seriously ill, but are they in fact infected and are giving off the virus and carrying it into the family? Or are they resistant in some way? The other question that we absolutely must also answer is: why do relatively few, perhaps even cautiously put, unexpectedly few get infected in the household? This is a realisation that is now maturing so slowly.

As I said, a new preprint has just appeared from China, and a few other studies suggest that this is the case. The Munich case tracking study, for example, has already hinted at this a bit. You have to take a closer look at that. Is there perhaps a hitherto unnoticed background immunity, even if only partial immunity?

That wouldn't mean that we were wrong at this point in time, and what we have done now was wrong. At the moment, even if you factor in these effects, you get the impression that it's right to stop this, that we're not getting into such a rampage that we can no longer control. But for the estimation of how long the whole thing will last, new information could arise from this. It could then be - and I would like to say this now, perhaps as a message of hope - that in a few weeks or months, new information will come out of science that says that the infection activity will probably stop earlier than we thought because of this special effect.

But I don't want to say that I can announce something now. These are not hints from me, or data that have been available for a long time, but that I wouldn't want to say in public or anything. Rather, they are simply fundamental considerations that we simply know too little about this disease at the moment. And that the knowledge, which is actually growing from week to week, will also influence the current projections.


Other podcasts

Part 31: Corona Virus Update: Don't take stories about reinfected cured patients too seriously.

Part 28: Corona Virus Update: exit strategy, masks, aerosols, loss of smell and taste.

Part 27: Corona Virus Update: tracking infections by App and do go outside

Part 23: Corona Virus Update: need for speed in funding and publication, virus arrival, from pandemic to endemic

Part 22: Corona Virus Update: scientific studies on cures for COVID-19.

Part 21: Corona Virus Update: tests, tests, tests and how they work.

Part 20: Corona Virus Update: Case-tracking teams, slowdown in Germany, infectiousness.

Part 19: Corona Virus Update with Christian Drosten: going outside, face masks, children and media troubles.

Part 18: Leading German virologist Prof. Dr. Christian Drosten goes viral, topics: Air pollution, data quality, sequencing, immunity, seasonality & curfews.

Related reading

This Corona Virus Update podcast and its German transcript. Part 32.

All podcasts and German transcripts of the Corona Virus Update.

Thursday 16 April 2020

Corona Virus Update: Don't take stories about reinfected cured patients too seriously (part 31)


Prof. Dr. Christian Drosten
The last Corona Virus Update Podcast with specialist for emerging viruses Prof. Dr. Christian Drosten had two main topics. The internationally most important one is about press reports that cured patients would be reinfected or even that people may not become immune after recovering from the disease. ThEN WHat AbOuT hErD iMmUNiTy?

I have seen people who are normally careful and well informed talk about these "reinfections". However, it is very likely just a problem with measurement accuracy when in the final stages of the disease the amount of virus becomes very low and hard to detect, especially in samples taken from the throat.

The other half of the podcast was about a study on the spread of SARS-CoV-2 in the German municipality Heinsberg, a region not too far from Bonn where there was a big early outbreak after a Carnival party. At a press conference some preliminary results were presented without any detail on the methods, on how these results were computed. The numbers suggested fewer people may die and more may be infected without knowing it.

There was first a wave of publicity praising the results and discussing the political implications. Then after consulting scientists there was a wave of publicity claiming the study was rubbish, while all the scientists had said was that they did not have information on the methods and thus could not comment. Sometimes they explained the kind of information they would need to have and that was spun into the study doing this wrong, which was not claimed. On social media people started attacking the Heinsberg scientists or those asking for more information, which can only be based on whether they liked the numbers (politically) because they knew about the methods even less. For a day Germany looked like the US culture war. Social media has a mob problem that needs addressing.

It was not a glorious hour for science reporting by (probably mostly) political journalists. Anyway, because this is much ado about nothing until we have a manuscript describing the methods, and a purely German affair, I have skipped this part. I was nodding a lot: yes, those are the kinds of problems you have interpreting measurements; yes, you really need to know the measurement process well to assess the results. There are so many similarities between sciences.

It may still be fun for the real virology science nerd to learn the kind of details that matter to interpret a study. They can read the German transcript.

The basic problem determining whether someone is ill

Korinna Hennig:
Over the weekend there have been several reports from China and South Korea about patients who were considered to have recovered or were discharged from hospital and have now tested positive again. So this is not about antibodies, but about the actual virus detection in the throat swab, for example, or from the lungs. Is it conceivable that the virus is reactivated? You also examined the course of the PCR tests on the Munich patients.
Christian Drosten:
This phenomenon can be described as follows: A patient is discharged from the hospital, verified as corona negative and as cured. And a moment later - it could be days, three or four days, or even up to seven or eight days - the patient is tested again. And suddenly he is positive for the virus in the PCR. It is said that the patient may have become newly infected, or in reality he was not immune at all, although he survived the disease. Or the virus has come back again, and you know certain infectious diseases, herpes viruses are the prime example, which can always come back.

One asks the question: is this perhaps the case with this new virus? Unfortunately, there are still very few precise descriptions in the scientific literature of how the virus is excreted in patients in different types of samples, for example in swabs taken from the throat or in lung secretion, also known as sputum, or in stool samples - these are all the types of samples we know that the virus is detectable. Only a few studies have so far described how this behaves over time in relation to excretion.

We have made and published one of them. We have made an overview picture of this excretion over time in nine patients from Munich. ... This shows the detection limit of the polymerase chain reaction. And you can see clearly, especially towards the end of the disease process, when the patients recover, that there is still virus present. It is sometimes detectable, sometimes for a few days in a row, then again for a few days in a row it is not detectable. This always jumps above and below the detection limit.

These are simply statistical phenomena that occur. A PCR can only test a certain sample, a certain sample volume for virus. There are statistical distribution phenomena which mean that the virus has in principle been there the whole time, but the test cannot always detect it. You have to picture it like this, I often explain it to students like this: you have a swimming pool full of water and goldfish are swimming in it. And there is no doubt that they are there. But now you take a sample from this paddling pool with a bucket, blindfolded. And then you may have a goldfish in your bucket and sometimes not. Still, one would not deny that there are goldfish in the swimming pool. ...
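
The goldfish picture can be turned into a small calculation. The sketch below is my own illustration, not something from the podcast or the paper: it assumes that the number of virus copies ending up in a sample follows a Poisson distribution and that an idealised PCR detects even a single copy. The function names are made up.

```python
import math


def detection_probability(mean_copies_per_sample: float) -> float:
    """Chance that a sample contains at least one virus copy.

    Assumption: the number of copies in a sample is Poisson distributed
    and an idealised PCR detects a single copy.
    """
    return 1.0 - math.exp(-mean_copies_per_sample)


def discharged_then_positive(mean_copies_per_sample: float) -> float:
    """Chance of two negative tests in a row followed by a positive one,
    while the true amount of virus stays the same the whole time."""
    p = detection_probability(mean_copies_per_sample)
    return (1.0 - p) ** 2 * p


for mean in (0.1, 0.5, 1.0, 3.0):
    print(
        f"mean copies per sample {mean}: "
        f"detected {detection_probability(mean):.0%} of the time, "
        f"negative-negative-positive {discharged_then_positive(mean):.0%} of the time"
    )
```

Under these assumptions a patient near the detection limit can test negative twice and positive afterwards by chance alone, without any change in the amount of virus.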

Reporting of the results

And now the question is simply how to deal with it. I can tell you that here in Germany something like this would not happen, because we have a culture here, where results like this are questioned relatively quickly and rules are always seen with the possibility of an exception. In other words, a German health authority would practically say: well, okay, that's obvious, that's what happened now.

But in the Asian culture of public health there is a much greater strictness in dealing with such rules. That is not so bad. I don't want to criticize it now. It is simply a cultural difference that when such a rule is established, it is adhered to. And when it is then said that we now agree that a patient who has been PCR negative twice in a row, we define him as cured and discharge him. ...

It is a thoroughness to say: No, this rule will not be questioned now, this is no exception, but we just enter it into the table. The patient was tested negative twice and now he is positive again. And now we test a few hundred of such discharge courses and enter all this in the table and discuss it only after we have the table completely. Then we write this together and write a scientific publication about it. This is exactly what happened, several times.

These scientific publications are now in a public resource and readable, but now this discussion process is starting. So, now it's starting with people reading such publications, who perhaps do not know the details and say: What is this? It looks like a reinfection. What is going on with this virus? And it's being spread again through even more discussion channels. This creates excitement and uncertainty.
As a scientist, I would prefer the "Asian" process, that is the cleaner data, where you know exactly what happened. You have to understand the measurement process, but the scientific literature is for scientists.

I like the movement to open science, which makes it easier for people to participate in science and also for scientists to do science, but the scientific literature is not written for normal people and it will lead to problems when people with half-knowledge start reading the scientific literature. In this case it was probably innocent, in many cases bad actors abuse this to mislead the people.

Study one

How the samples were taken for one of the studies was not fully clear, as can happen with preprints.
So it may well be that at one point when the patient was discharged, they simply took swabs from the throat, and at another time they may have looked in the lung secretion that someone coughed up. Such things can happen, these are two different types of samples.

And we know well, that the lung secretion stays positive much longer after discharge. And we also believe that it is not infectious for others. Using cell culture virus isolation studies, which we also did in our publication we tried this. We already believe it's no longer infectious. We've never been able to isolate an infectious virus. ...

Study two

In the other study it is actually more interesting, it is a bit more explicit. They examined 172 patients beyond the point of discharge. In 25 of them, the test was positive again, on average after 5.23 days after discharge. There it is also clearly stated, the discharge criterion was two negative throat swabs in a row.

So: The patient had to have a negative throat-swab twice, then he was discharged as cured. But we know exactly that the throat-swab is the sample that becomes negative earliest in patients. So in the second week of illness, many patients no longer have a positive throat-swab on most of the days that one tests, while stool and sputum are still reliably almost always positive.

And then it is said that of these 25 patients, 24 patients had severe histories. For me, this indicates that if someone has a severe history, he will of course be discharged later. Then he will be treated in hospital for a longer time. And especially with these patients we know that the virus in their throat is almost always completely gone. So the virus in the throat has had time to be eliminated. So in severe cases, the throat swab is no longer positive after this long time.
Let me set a break here to let this sink in. If it were really a problem of people being re-infected because they did not acquire immunity, it would be the patients who got most ill, who did not acquire immunity. If it really were a matter of immunity, the opposite would be more logical.
Then it is said that 25 patients have been diagnosed as positive. But in 14 of them, the laboratory test after discharge was positive again in the stool, i.e. not in the throat swab, and this tells me that we have exactly this mix-up here. For we know that the stool samples in particular remain positive for the virus for a long time, and I have to say that here too, by the way, we have not found any infectious virus in them. This is probably again only dead, excreted virus.

And with others it was throat swabs, which then tested positive again. But then we have to say again, a throat swab can also contain naturally coughed up lung mucus. You cough up the stuff and it sticks to the back of your throat.

You can see from the way in which it was done methodically and from the samples in which it was found, and also from the type of patients, that people say that these are patients who have been seriously ill for a long time, that there is a risk of falling into this trap, into this confusion. I would even suspect that the authors themselves simply know that this "mistake" could be present here. ...


Other podcasts

Part 28: Corona Virus Update: exit strategy, masks, aerosols, loss of smell and taste.

Part 27: Corona Virus Update: tracking infections by App and do go outside

Part 23: Corona Virus Update: need for speed in funding and publication, virus arrival, from pandemic to endemic

Part 22: Corona Virus Update: scientific studies on cures for COVID-19.

Part 21: Corona Virus Update: tests, tests, tests and how they work.

Part 20: Corona Virus Update: Case-tracking teams, slowdown in Germany, infectiousness.

Part 19: Corona Virus Update with Christian Drosten: going outside, face masks, children and media troubles.

Part 18: Leading German virologist Prof. Dr. Christian Drosten goes viral, topics: Air pollution, data quality, sequencing, immunity, seasonality & curfews.

Related reading

This Corona Virus Update podcast and its German transcript. Part 31.

All podcasts and German transcripts of the Corona Virus Update.

Roman Wölfel, Victor M. Corman, Wolfgang Guggemos, Michael Seilmaier, Sabine Zange, Marcel A. Müller, Daniela Niemeyer, Terry C. Jones, Patrick Vollmar, Camilla Rothe, Michael Hoelscher, Tobias Bleicker, Sebastian Brünink, Julia Schneider, Rosina Ehmann, Katrin Zwirglmaier, Christian Drosten & Clemens Wendtner, 2020: Virological assessment of hospitalized patients with COVID-2019. Nature. https://doi.org/10.1038/s41586-020-2196-x

Ye, G., Pan, Z., Pan, Y., Deng, Q., Chen, L., Li, J., Li, Y., & Wang, X., 2020: Clinical characteristics of severe acute respiratory syndrome coronavirus 2 reactivation. The Journal of infection, 80(5), e14–e17. Advance online publication. https://doi.org/10.1016/j.jinf.2020.03.001

Jing Yuan, MD, Shanglong Kou, PhD, Yanhua Liang, MS, JianFeng Zeng, MS, Yanchao Pan, PhD, Lei Liu, MD, 2020: PCR Assays Turned Positive in 25 Discharged COVID-19 Patients. Clinical Infectious Diseases, ciaa398. https://doi.org/10.1093/cid/ciaa398

Tuesday 14 April 2020

Opening up Germany in a Randomized Controlled Trial

It is now clear that for now Germany has managed to avoid spiralling into a situation where the new Coronavirus overburdens the healthcare system. In fact, I think we can say that the number of cases is declining.

So in Germany the discussion has started about slowly opening up society again. On Wednesday the 15th of April the government wants to decide what to do next week. While the number of confirmed infections is going down, I feel it would be good to basically keep the current measures in place for two more weeks. This would lower the numbers to where the tracking and tracing of infected people becomes an effective way to keep infections down, which means less restrictions long-term.

But we could use these two weeks for an experiment, which will help us make better decisions. The best experiments are [[randomized controlled trials]], where you have two conditions and randomly assign one of them. This is typically how new medicines are tested. Here one would randomly assign a pill or a placebo to patients.

In case of COVID-19 measures the two conditions could be a relaxation of measures or not. Because this is about the spread of a virus in a community, you cannot randomly select people, you will have to randomly select regions. As Germany is a federal state, a logical selection would be randomly assigning states, but you could also do it for municipalities. That would be better scientifically, but harder to implement.
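
To make the idea of randomly assigning regions concrete, here is a minimal sketch in Python. It only illustrates the random assignment of the 16 German states to two equally large groups; the arm names, the function name and the seed are made up for illustration, and a real trial design would need much more (for example balancing the groups by population or current case numbers).

```python
import random

STATES = [
    "Baden-Württemberg", "Bayern", "Berlin", "Brandenburg",
    "Bremen", "Hamburg", "Hessen", "Mecklenburg-Vorpommern",
    "Niedersachsen", "Nordrhein-Westfalen", "Rheinland-Pfalz", "Saarland",
    "Sachsen", "Sachsen-Anhalt", "Schleswig-Holstein", "Thüringen",
]


def assign_arms(units: list[str], seed: int) -> dict[str, list[str]]:
    """Randomly split the units into two equally large trial arms.

    The seed is fixed so that the assignment can be reproduced and checked.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "relax measures": sorted(shuffled[:half]),
        "keep measures": sorted(shuffled[half:]),
    }


# Hypothetical seed; in practice it could be drawn publicly.
arms = assign_arms(STATES, seed=20200414)
for arm, members in arms.items():
    print(f"{arm}: {', '.join(members)}")
```

The same function would work for municipalities; only the list of units changes.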

Without mitigation measures one infected person infects 2 or 3 others. We have to bring this number below 1 to stop the epidemic. About half of the infections are transmitted by people with symptoms and half by people before they have symptoms. Some are transmitted by people who will never get symptoms and some via the environment, without direct contact. So quarantining people (with symptoms) is important, but not enough, we also need to reduce the number of physical contacts between people without symptoms, that means basically all of us. But it does not have to go to zero, which is why essential people are still working and supermarkets are open.

So we have to decide which physical contacts to allow until we have a cure or a vaccine and which ones we do not. This is a compromise between how important the contact is and how dangerous it is. Keeping supermarkets open is clearly important, people have to eat. Most dangerous are close contacts, with many people, over a longer time, inside buildings. Parties with thousands of people are clearly dangerous and, while nice, less important.

Those two decisions are easy, supermarkets open, parties closed.

The most difficult decision I see is about whether to open or close schools.

On the one hand, this would be important. We cannot have our kids locked in at home, children need to move. We cannot have them miss school for one and a half years, the more so as this sacrifice does not help them as school children do not get ill. Children not going to school also prevents many parents doing essential work from going to work or working from home efficiently.

On the other hand, going to school would be dangerous. With many children, this means an enormous number of contacts. And it will be hard to change the behaviour of kids at school to reduce the contacts. (Within one class I am not even sure whether we should try.)

What makes the decision even harder is the uncertainty in how infectious children are. We know they can be infected, but as they do not have many symptoms, they may be less efficient in spreading it than adults.

So studying the influence of opening schools would be a good use of a randomized controlled trial. You could do this carefully by only having one or two years go back to school. Rather than switching from compulsory schooling, to closing schools, back to compulsory schooling, we could also make it voluntary. Parents who are in a health risk group could then opt to keep their children at home, while parents who most urgently need to work could opt to send their children to school.

Whatever we decide, I think it would be a good use of our time to use it for an experiment that helps us make better decisions about a disease we do not know much about yet.

Related reading

The German National Academy of Sciences, Leopoldina, released their recommendation this Monday, they recommend opening the schools stepwise: Dritte Ad-hoc-Stellungnahme: Coronavirus-Pandemie – Die Krise nachhaltig überwinden

A privacy respecting track and trace app to fight Corona is possible and effective

German public radio channel NDR Info makes a daily podcast with virologist Christian Drosten, on my blog you can find translations of parts of these interviews.

The German CDC, the RKI makes wonderful informative daily situation reports, in German and English.

Monday 13 April 2020

A privacy respecting track and trace app to fight Corona is possible and effective

An app to track and trace infections seems to be a promising way out of the lockdowns. Tracking the contacts of infected people is a main strategy to fight this epidemic as long as we do not have a cure or vaccine. It is the main strategy used in South Korea and they are able to keep the number of new infections below 100 per day with it.

When the virus spreads more widely, as in many countries that did not take the virus seriously enough soon enough, it becomes difficult for the health departments to track and trace so many people. In addition, a part of the contacts will not be known to the infected person and will thus not be tracked; for example, someone sitting next to you on public transport or in a restaurant.

How the app works

For the last case South Korea uses GPS information from mobile phones. I am not comfortable with the state having all that location data, but fortunately there is a better alternative. This is a great cartoon explaining how contact tracing can be done fully respecting privacy. The short version of this is below.



The Chaos Computer Club (CCC), Germany's most reliable technology activists, explain the conditions to make this work and have promised to warn about bad apps. I am happy to use such an app and will listen to the CCC for advice. The CCC is comparable to America's [[Electronic Frontier Foundation]].

Let's hope data brokers Google and Apple getting involved does not mess this up. At least one group of scientists who started this approach are hopeful Google and Apple will help. We need many people participating, so we need something everyone can embrace.

How effective it is

A study recently published in Science claims that such fast contact tracing could be as effective as a lockdown if 60% of people participate and 60% heed its warnings.

To compute this they first estimate how the virus spreads; this paragraph can be skipped if you are not interested in the scientific basis. They estimated how long the incubation time is (5.5 days). On average it takes 5.0 days between one infected person and the next to show symptoms. So the moment someone gets ill, the people they have infected have started infecting other people. They estimate that on average 1 person infects 2 others. (This is a low value; other studies tend to find between 2 and 3.) The direct transmission from a symptomatic individual to someone else ("symptomatic transmission") explains 0.8 of those 2 infections. So even if we could theoretically remove this fully, the number of infections would still grow exponentially. Infected people infect 0.9 people before they show symptoms ("pre-symptomatic transmission"). People without symptoms infect 0.1 people ("asymptomatic transmission"), while "environmental transmission", infections where people did not meet, accounts for 0.2 infections.
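The arithmetic in this paragraph can be checked with a few lines of Python. This is only bookkeeping with the numbers quoted above, not the model from the Science paper; the variable names are invented for illustration.

```python
# Average number of people infected by one case, split by route,
# using the estimates quoted in the text above.
routes = {
    "symptomatic": 0.8,
    "pre-symptomatic": 0.9,
    "asymptomatic": 0.1,
    "environmental": 0.2,
}

r_total = sum(routes.values())

# Hypothetical scenario: symptomatic transmission is removed completely.
r_without_symptomatic = r_total - routes["symptomatic"]

print(f"all routes: R = {r_total:.1f}")
print(f"without symptomatic transmission: R = {r_without_symptomatic:.1f}")
print("still growing" if r_without_symptomatic > 1 else "shrinking")
```

The four routes add up to the 2 infections per case, and the remaining routes alone still give a value above 1, which is why isolating people only once they show symptoms is not enough.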

So it is important to be fast. This is the advantage of the app over health departments trying to reach people by phone and email. Still, it is worthwhile to do both manual and app tracing. A person from the health department calling you, telling you your friend or colleague is ill and explaining how quarantine works, is likely more effective than a notification by your phone. For this manual work to be effective we need to get the number of new infections down.

The speed of the testing is an important part of this strategy. It will thus work better in countries like Germany with a strong testing program than in America, where much less testing is done, which in the short term makes the numbers look better, but does not make the situation better. The paper also studies how effective it would be if people with symptoms could warn others before being tested. This is naturally faster, but hostile actors could abuse the system by triggering false warnings. To avoid this one can tie the app to test results, where the health care providers give the app user a code in case of a positive test.

When the app warns someone that they have been in contact with an infected person, this person will have to go into quarantine. This will work better in a country with paid sick leave and when the government gives the warning of the app the same status as "sick certificates" from the doctor.

The proximity detection by Bluetooth is far from perfect, so there will be false positives, but I would argue that that is still better than assuming everyone had contact with an infected person and putting all of society on lockdown.

Enough people will have to participate. Fortunately it does not have to be everyone; apparently 60% is already enough. The privacy-invading app of Singapore only has an uptake of 10 to 15%. I would personally not use such an app; I'd rather take a small risk of dying than give a government really dangerous powers, while I would be happy to use the one described above. So I would expect the adoption of a decent app to be higher.

One would install the app to help others, so this may work less well in countries where the ruling class has pitted groups against each other to solidify their power. Ross Anderson from the UK is pessimistic about the adoption of such an app. I am quite optimistic. But we will have to do the experiment. Do note when reading Anderson's second opinion that I feel he does not accurately describe how the app would work; part of my text above addresses such misunderstandings, which others may have as well.

Prof. Dr. Christian Drosten, one of the main virologists in Germany who specializes in emerging viruses, thinks the app could work to reduce infections. In a recent podcast of the public radio channel NDR Info he talked about the app:
This is a study from the group of Christophe Fraser, certainly one of the best epidemiological modelers. It's a very interesting study, I think. It's published in Science. ...

The main outcome of the study is that you are too late with a simple [manual] identification of cases and contact tracing, because the whole thing depends on identifying symptomatic patients. So it really comes down to the last day. ...

And you can say in a nutshell, if the epidemics ran at the same speed as in the beginning in Wuhan ... then you could already lower R0 below one. This is amazing.

There are a few caveats on that. It is then said that in reality the speed of propagation in Europe is already faster than it was at the beginning in Wuhan. There are certainly several reasons for this. Population density, behaviour of the populations, but also how far the infection has already progressed. This of course makes it even more difficult again, so that a higher degree of cooperation among the population is actually needed. ...

You could combine such an App, for example by other general factors that reduce the transmission of the infection, such as wearing masks. ...

[The study models a situation where] there is no general lockdown. Companies can work, schools can teach, everything can work, but not for everyone at all times. There will come a time when you have this message on your mobile phone: "Please go into home quarantine." If you could then show this and your employer would say: Well, that's how it is, home quarantine this week. Then I find, that is at least a very interesting model one should not refuse thinking about.

Drosten can naturally only judge the effectiveness. The Chaos Computer Club (CCC), Germany's most reliable technology activists, supports the technical concept. It naturally depends on implementation details and while they will not recommend an app, they have promised to warn about bad apps. I will listen to the advice of the CCC.

The Electronic Frontier Foundation makes clear that a track and trace app can only be part of a package of measures and rightly emphasises the importance of consent.
Informed, voluntary, and opt-in consent is the fundamental requirement for any application that tracks a user’s interactions with others in the physical world. Moreover, people who choose to use the app and then learn they are ill must also have the choice of whether to share a log of their contacts. Governments must not require the use of any proximity application. Nor should there be informal pressure to use the app in exchange for access to government services. Similarly, private parties must not require the app’s use in order to access physical spaces or obtain other benefits.

Individuals should also have the opportunity to turn off the proximity tracing app. Users who consent to some proximity tracking might not consent to other proximity tracking, for example, when they engage in particularly sensitive activities like visiting a medical provider, or engaging in political organizing.
A German conservative politician wanted to force people to use the app. He did not have a good day on social media. Well deserved. That is the most effective way to destroy trust and in times of Corona we need high compliance and thus solutions that have broad support.

The German National Academy of Sciences, Leopoldina, recommends three measures to replace the lockdowns. 1) Such an app. 2) Massive testing. 3) Everyone wearing simple masks in public. (In German.)

In the Netherlands, Arjen Lubach asks many questions on how such an app would be used. (video in Dutch.) Would your boss be allowed to force you to use such an app? Would this be a condition to use public transport? Would a restaurant be allowed to require customers to use an app? Would you be forced to share your random numbers when you find out that you are infected? Could you turn off the app? Could you ignore the warning of the app?

I had not considered many of these questions because I considered it natural to opt each time for the most free option, and I expect that this leads to many more people participating and thus to the largest effect. Any force to use the app would only make sense on a societal level. A boss or a restaurant has no advantage from such a measure, just like the users themselves only help society, not themselves.

My impression is that a main reason Germany has so far got off relatively lightly in this pandemic is that the population was well informed, understood the danger, knew what to do and was very cooperative. Working from home is relatively easy in science, but I saw a large share of people doing so well before there were any rules requiring it. Meetings were cancelled well before the limits for the maximum number of participants went down to that level.

The alternative to so much compliance would be quite draconian rules and a lot of surveillance and enforcement, leading to many more violations of freedom and much more economic damage. Thus I would expect that the best way to make the app a success is to respect the privacy of the citizens and respect their autonomy to make the right decisions. In countries where this is not possible, I am sceptical of the app helping much, except if they go full China, and most countries do not have the enormous repressive system that would necessitate.

Where I do agree with Arjen Lubach is that we have to have this discussion now. It is not a matter of using the app or not, but of how we want to use it. We should have that discussion before we introduce it. Just like we should discuss all other measures and whether and when they can be relaxed. Even if we do not know exactly when yet, we can already discuss what has priority: opening schools, shops, restaurants or car factories?


Disclaimer. I am just a simple climate scientist, not a virologist, nor an epidemiologist or encryption specialist. I had wanted to stay out of this topic and not pretend to be an instant Corona specialist, but the dumb people do not show such restraint and only few actual experts speak up. Those that do, report that they find it unpleasant. As a climate scientist I am unfortunately used to the well-funded hate mobs trying to bully others into silence and will not let myself be intimidated. Plus a large part of this post is about societal issues, where everyone should participate, not just experts.

Related reading

The position paper of the EFF is long, but worthwhile: The Challenge of Proximity Apps For COVID-19 Contact Tracing.

If anyone would like to get involved, there is a list with COVID-19 contact tracing projects around the world.

German public radio channel NDR Info makes a daily podcast with virologist Christian Drosten; on my blog you can find translations of parts of these interviews.

The German CDC, the RKI, makes wonderfully informative daily situation reports, in German and English.

Friday 10 April 2020

Corona Virus Update: exit strategy, masks, aerosols, loss of smell and taste (part 28)


Prof. Dr. Christian Drosten
Today's podcast had a wide range of topics, from the proposal for an exit from the lockdown by the German National Science Academy, to face masks (which is one of their proposals), to transfer of the SARS-CoV-2 virus by droplets and by tiny airborne particles (aerosols), how long a patient is contagious and a new study on the loss of smell and taste as a symptom of COVID-19.

The Corona Virus Update Podcast is an initiative of the German public radio channel NDR Info. Today science journalist Anja Martini does the interview with Prof. Dr. Christian Drosten. He is an expert for emerging viruses at the research hospital Charité. Fittingly the hospital was founded outside the city walls of Berlin three centuries ago to help fight an outbreak of the bubonic plague, which had already depopulated large parts of East Prussia.

An exit strategy

In the previous podcast Drosten talked about a study, which suggested that a mobile phone app, which can help trace back contacts of infected people, would be quite effective in reducing the spread of the virus. About as powerful as a lockdown.

Three days later the German National Academy of Sciences, Leopoldina, recommended three measures, which could become an alternative to a lockdown. 1) This app, 2) more testing, 3) wearing simple masks in public.

[EDIT: It is going viral in America that Apple and Google will somehow help with such apps. That most of the work has already been done by governments does not get much coverage, while it is not that clear to me what Apple and Google will contribute. They say first an API. Maybe that helps to make different apps interoperable? In a second phase they want to integrate it in the OS. If that means that the data (also) goes to Apple and Google, that would be an efficient way to kill the project.]

Leopoldina presents a model, which suggests this would be enough to keep new infections per day close to zero in May, although they also show data from South Korea, which has a similar strategy, where there is still a decent number of new infections going on. So the model does not capture reality fully.

Anja Martini:
The Leopoldina, the National Academy of Sciences, issued a second statement from its working group on the virus at the end of last week. You are also part of this working group. ... It recommends - over and above the measures that we have already taken so far, in other words keeping our distance - hygiene and quarantine in the event of suspicion, isolation: Consistent wearing of masks, including in local public transport and at school, more tests, including random tests, and the use of cell phone data, which we have already discussed here. If this is done, the number of people infected by an infected person could, according to the calculations, be reduced to less than one by the middle or end of May. Even if, after Easter, more public life were to be gradually allowed again. That is cause for optimism for the time being, isn't it? Please explain this prognosis to us!
Christian Drosten:
Of course, one looks for ways to get out of the current measures. And an organisation like the Leopoldina, which is made up of scientists, also looks at the latest scientific data. Just last week we discussed a study published in "Science" about the effects that can be expected from such mobile apps, i.e. mobile phone apps that allow much more detailed and faster case tracking.

We can simply track a certain number of infected people at the local health departments. At some point, the capacity runs out. You can't make an infinite number of phone calls and contact an infinite number of contacts and tell them to stay home and so on. It just runs out at some point. A mobile app is not exhausted that quickly and it also gets behind it much faster. That's the one provision there.
If the modelling study on the impact of this app is right, this should do most of the work.

So one can wonder why masks are additionally proposed by Leopoldina. My impression from previous podcasts is that Drosten is quite sceptical of masks. While there is evidence that they reduce the amount of viral material an infected person produces, there is not much evidence on how much they would contain the spread of the virus.

Maybe that lack of strong evidence is why Drosten wonders whether people would be persuaded to wear the masks. I do not see much of a problem, but maybe I am too optimistic. Wearing a mask is a much smaller limitation than staying at home. And I recently came across this beautiful photo of California during the Spanish flu, where people are wearing masks at an outdoor barber shop. Another culture not used to masks that was willing to wear them when needed.
You can achieve considerable increases if you add some general effects [additional measures] to this very special tracking via mobile apps. A general effect can be the wearing of masks if everyone does it. In our society, we certainly do not have the best starting conditions to let everyone wear masks. There will quickly be people who say they don't want to, they don't see the point or they can't do it.
We have currently, of course, an additional argument in public, namely: you cannot buy any masks at all, because there are none. That is why it is of course not very promising at first to consider what would happen if a general obligation to wear masks were to be imposed ad hoc?

This is a relatively complicated phenomenon, ... to impose such a thing in a society where the whole thing is not culturally anchored and not trained. That is the one difficulty. It is of course taken into consideration in a forum like the Leopoldina, where social scientists, psychologists and so on are also represented. This is precisely why the totality of the expertise is represented, not only life scientists are in it, but also sociologists.

Types of masks

Christian Drosten:
We have hardly any scientific evidence that says that self-protection through simple masks works. Of course, there are much more complicated, elaborate masks for special wearers, i.e. for certain occupational groups, which also provide self-protection.

But these masks have actually never been available in large numbers. They are not so easy to produce so fast, as far as I know. By the way, they are also not easy to wear for everyone. You have to imagine that here in medicine there are preliminary occupational medical examinations for employees who have to wear these very safe self-protection masks in their professional life. Not everyone is able to do this, for example, if there is any doubt, the medical profession must carry out lung function tests. And something like that cannot be recommended for the normal population.
I am not sure I understand his claim on the simple masks:
With these [simple] masks it is the case, there is no scientific evidence of a benefit for self-protection. There is, however, starting evidence, which has not been very virus-specific so far, for the protection of others. But this of course presupposes that really everyone, everyone, everyone in society, in public life, must wear these masks.
I would expect that when half of all people do it you get half of the effect. But maybe Drosten means that for this to help for your own protection everyone would have to do it. Also, if only half would do it to help the others, one could expect that the participation drops. That is a kind of public goods game. It could also be that he does not expect much of an effect and that half would thus really not be worth it.

Droplets and aerosols

A large part of the podcast was about the difference between droplets with virus and aerosols. Droplets would be defined as being large enough to drop to the ground by gravity within a minute, while aerosols can stay in the air for hours. It was a long and nuanced discussion about evidence on how these particles are produced and removed, how infectious they are and how important they are.

People are worried about the aerosols, about "airborne virus" because it means that you could be infected without having noticed someone coughing. But in the end the droplets are most important: "we are pretty sure that the vast majority of the viruses that are released in these diseases of the upper respiratory tract ... are these larger droplets - and they fall to the ground". So to focus on what is important, I only translated a small part:

Christian Drosten:
These large droplets over five microns (and they can be much bigger, they can also be 100 micrometers, i.e. a tenth of a millimeter, so that you can really see them with the naked eye) - these are the droplets that we are talking about in a droplet infection. In other words, what you give off - which is part of a moist speech [when people spatter when they talk], for example, but also comes out when you cough or sneeze - and which falls to the ground within a radius of one and a half to two meters.

In this research into the common cold, we are pretty sure that the vast majority of the viruses that are released in these diseases of the upper respiratory tract (i.e. the diseases that mainly occur in the throat and nose) are these larger droplets - and they fall to the ground. Much of our precautions and infection prevention considerations are based on this insight.

Then there is something else, namely aerosols, whose particle size is less than five micrometers. For the experts, it must be said that this is of course not a sharply defined size, and in an aerosol that really floats in the air and stays in the air longer, the actual droplets are even much smaller, they are less than one micrometer in size. ...

If I release such a droplet and it floats in the air in front of me, then it starts to dry and then it becomes smaller. The smaller it gets, the more likely it is that it will remain in the air for a long time. But at the same time there is another effect, namely when this droplet gets smaller and smaller, it will eventually be too small for the virus, and the virus will dry out and will no longer be infectious.

So on the one hand aerosols are potentially more problematic by staying in the air longer, on the other hand they are likely less contagious, while many studies only analyse whether virus is present, not whether the material is infectious. Reading such studies one should pay attention to this difference.

How long is someone contagious

An interesting preprint studied how much virus could be found in hospital rooms of COVID-19 patients.

Christian Drosten:
Wipe samples were taken - in 30 different hospital rooms, from 30 different patients, all of whom had the disease, in a hospital in Singapore, from all kinds of surfaces, and tested them for virus again.

By the way, I have to add here, in all these studies, especially the last study that we discussed first, and this one too, it is always only a viral detection of RNA and not of infectivity in the cell culture.
Anja Martini:
In other words, a virus that can be detected, but which possibly no longer infects anyone.
Christian Drosten:
Right, exactly. A desiccated virus, it still has the same amount of RNA and you can still detect it. None of this means anything directly about infectivity right now, it just means that virus has got there.

And here it is the case that a lot of deposited RNA has already been found in these samples. In the floor samples, for example, more than half of the wipe samples were virus-positive, i.e. viral RNA could be detected - which suggests that the virus is deposited to a considerable extent, which favours the concept of coarser drops.

But then, something else and very important, I think: In these 30 patients the wipe samples were only ever positive in the first week of symptoms. In the second week, when the patients were still definitely sick, the wipe samples were no longer positive. So no more virus settled on the surfaces, and accordingly there was also no longer a significant virus concentration in the room air.
Note, this is just one study. Decisions should be based on all available evidence and an uncertainty estimate.

Infection via surfaces

If I understand it right, when someone coughs in their hands and then shakes hands, that is seen as droplet transmission and not as transmission via a surface. This route is important and a reason for the advice not to cough in your hands and to wash them regularly.

Anja Martini:
The insight we have from this is that we are infected via the air we breathe, via coughing, via aerosols. But, as listeners here ask a lot, what is actually the case with infection via surfaces?
Christian Drosten:
Infection via surfaces themselves has been modelled, for example, in the study by Christophe Fraser that we discussed last week. He comes to the conclusion that perhaps ten percent of all transmissions could function via surfaces at all.

Many people I talk to don't really believe in the relevance of surface transmission. ...

We do not currently assume that this virus is significantly transmitted via surfaces. The current measures to prevent transmission are aimed at preventing both droplet and airborne transmission, especially - to say it again - droplet transmission. And the studies that have now been discussed here, that have now been published, do not suggest - even if small-droplet aerosols have been detected - that this mechanism would be the main focus.
Anja Martini:
This means, once again asking from the consumer's point of view, ... can we actually neglect surface disinfectants in our private lives?
Christian Drosten:
I am almost sure that it is not worthwhile to pay a lot of attention in the household to treat all kinds of surfaces with disinfectant. In a hospital, of course, this may be different. ...
My impression is that this was a scientific "may", which mostly means "will". We sometimes talk in a somewhat weird way.
Images from television, for example, in China, where tanker lorries are driving through the streets with disinfectants, I think that has more of a psychological effect on the population than a real effect in curbing the transmission of infection.
I love those videos of teams with disinfectant sprayers walking through the streets as if they could be eye to eye with a terrorist any second.

Loss of smell and taste

Anja Martini:
What does a possible disease or even an infection with the virus actually do to the sense of taste and smell? This was already something of an observation in the press. There was also a Belgian study. Now there is one from Iran based on an online questionnaire.
Christian Drosten:
Yes, I think it's a very interesting study. There are already clear indications. In the Munich patient observation we have already seen a loss of the sense of taste and smell in almost half of the cases. So this has already been published.

There is now even a functional study that has just been published - and it says that it is a very specific type of cells in the olfactory system, in the nose, in the olfactory bulb, that is actually infected and affected by this virus.

But that is not what we want to discuss here. Interestingly, it is a study from Iran. I think it is simply great to see that this kind of useful research also comes from a country that is highly affected and where we all know that the data situation is unclear. The science there has to work in a difficult system and also has difficulties, for example, getting certain reagents. But here comes a very interesting study, from the preprint realm, to the public.

Iranian scientists conducted a survey - also supported by apps and the Internet - and reached 15,000 people with this survey. Of these, 10,000 actually had a loss or impairment of the sense of smell. In fact, 76 percent of these 10,000 patients - an impressively large number - had a sudden loss.

You can tell the difference between saying that suddenly I couldn't smell anything anymore. Or whether you say, well, I just had a cold. And 75 percent, a similarly high rate, actually had influenza-like symptoms. So now that not only a runny nose is part of it, but also a noticeable fever and so on. This was clarified by questionnaires.

83 percent also had a loss of taste, which was also described, also in the Munich patients, so that a loss of taste is also involved. They could no longer taste or smell anything. ...

And if I suddenly couldn't smell anything anymore in my everyday life, I would stay at home and try to clarify what is going on with me at the moment, in the current situation.


Other podcasts

Part 27: Corona Virus Update: tracking infections by App and do go outside

Part 23: Corona Virus Update: need for speed in funding and publication, virus arrival, from pandemic to endemic

Part 22: Corona Virus Update: scientific studies on cures for COVID-19.

Part 21: Corona Virus Update: tests, tests, tests and how they work.

Part 20: Corona Virus Update: Case-tracking teams, slowdown in Germany, infectiousness.

Part 19: Corona Virus Update with Christian Drosten: going outside, face masks, children and media troubles.

Part 18: Leading German virologist Prof. Dr. Christian Drosten goes viral, topics: Air pollution, data quality, sequencing, immunity, seasonality & curfews.

Related reading

This Corona Virus Update podcast and its German transcript. Part 28.

All podcasts and German transcripts of the Corona Virus Update.

Respiratory virus shedding in exhaled breath and efficacy of face masks

Detection of Air and Surface Contamination by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) in Hospital Rooms of Infected Patients

Coincidence of COVID-19 epidemic and olfactory dysfunction outbreak

Aerosol and Surface Stability of SARS-CoV-2 as Compared with SARS-CoV-1

A paper from 2004 that shows that even while normally breathing out some people produce tiny droplets: Inhaling to mitigate exhaled bioaerosols

A letter from the American Academy of Sciences on droplets and aerosols.

News article in Science Magazine on the relevance of small droplets: You may be able to spread coronavirus just by breathing, new report finds