Monday 9 November 2020

Science Feedback on Steroids

Climate Feedback is a group of climate scientists who review press articles on climate change. By networking this valuable work with science-interested citizens, we could put this initiative on steroids.

Disclosure: I am a member of Climate Feedback.

How Climate Feedback works

Climate Feedback works as follows. A science journalist monitors which stories on climate change are widely shared on social media and invites publishing climate scientists with relevant expertise to review the factual claims being made. The scientists write detailed reviews of specific claims, ideally as web annotations (see example below), sometimes by email.

[Image: example of a web annotation on a reviewed article]

They also write a short summary of the article and grade its scientific credibility. These comments, summaries and grades are then condensed into a graphic and an article written by the science journalist.

Climate Feedback takes care of spreading the reviews to the public and to the publication that was reviewed. It is also part of a network of fact-checking organizations, which gives the reviews more credibility, and it adds metadata to the review pages that social media platforms and search engines can show their users.


For scientists this is a very efficient fact-checking operation. The participants only have to respond to the claims they have expertise on. If there are many claims outside my expertise, I can wait until my colleagues have added their web annotations before I write my summary and determine my grade. Especially compared to writing a blog post, reviewing for Climate Feedback is very effective.

The initiative recently branched out to reviewing health claims with a new Health Feedback group. The umbrella is now called Science Feedback.

The impact

But there is only so much a group of scientists can do, and by the time the reviews are in and summarized the article is mostly old news. Only a small fraction of readers would see any notifications social media systems could put on posts spreading the article.

The reviews are still important information for people who closely follow the topic: they show how such reviews are done and help readers assess which publications and which groups are credible.

The reviews may be most important for the journalists and the publications involved. Journalists doing high-quality work can now demonstrate this to editors, who will mostly not be able to assess it themselves. Some journalists have even asked for reviews of important pieces to showcase the quality of their work. Conversely, editors can seek out good journalists and cut ties with journalists who regularly hurt the publication's reputation. The latter naturally only helps publications that care about quality.

The Steroids

With a larger group we could review more articles and have results while people are still reading them. There are simply not enough (climate) scientists to do this.

For Climate Feedback I only review articles on topics where I have expertise, but I think I would still do a decent job outside of it. It is hard to determine how good a good article is; the ones that are clearly bad, however, are easy to identify, and this does not require much expertise. At least in the climate branch of the US culture war, the same tropes are used and the same "thinking" errors are made over and over again.

Many people who are interested in climate change and in scientific detail, but who are not scientists, would probably do a good job identifying these bad articles. Maybe even a better one. They say that magicians were better at debunking paranormal claims than scientists were. We scientists live in a bubble where most people argue in good faith; science-interested citizens may well have a better BS detector.

However, how do we know who is good at this? Clearly not everyone is, otherwise such a service would not be needed. We would have the data from Climate Feedback and Health Feedback to determine which citizen scientists' assessments predict the assessments of the scientists well. We could also ask people to classify the topic of the article: I would be best at observational climatology, decent in physical climatology, and likely only average when it comes to many climate change impacts and economic questions. We could also ask people how confident they are in their assessments.
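To make this concrete, here is a minimal sketch in Python of how such a skill score could be computed. The data layout, names and grades are invented; the -2 to +2 range is roughly the credibility scale Climate Feedback uses.

# Minimal sketch: score volunteers by how well their credibility grades
# predict the later official grade. Each record is hypothetical:
# (volunteer, article, topic, volunteer_grade, official_grade),
# with grades on a -2 (very low) to +2 (very high) credibility scale.
from collections import defaultdict
from statistics import mean

records = [
    ("anna", "article1", "impacts",  -1.5, -2.0),
    ("anna", "article2", "physical",  1.0,  1.5),
    ("bert", "article1", "impacts",   0.5, -2.0),
    ("bert", "article3", "physical",  2.0,  1.5),
]

def skill_per_volunteer(records):
    """Mean absolute difference between a volunteer's grade and the official
    grade. Lower is better; 0 means perfect agreement."""
    errors = defaultdict(list)
    for volunteer, _article, _topic, guess, official in records:
        errors[volunteer].append(abs(guess - official))
    return {volunteer: mean(errs) for volunteer, errs in errors.items()}

print(skill_per_volunteer(records))
# {'anna': 0.5, 'bert': 1.5} -> Anna's grades track the official ones better.

The same bookkeeping could easily be split by topic or weighted by the stated confidence of the volunteer.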

In the end it would be great to ingest ratings in a user-friendly way, with 1) a browser add-on on the article page itself, and 2) replies to posts mentioning the article on social media, much like replying to a tweet and adding the handle of the PubPeerBot automatically submits the tweet to PubPeer.
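As an illustration of the second route, here is a sketch of the bot logic; the endpoint, payload and names are entirely made up (and say nothing about how PubPeerBot actually works).

# Hypothetical sketch of option 2: a bot is mentioned in a reply and forwards
# the article link from the parent post to a (made-up) rating server.
import re
import json
from urllib.request import Request, urlopen

URL_PATTERN = re.compile(r"https?://\S+")
RATING_SERVER = "https://example.org/api/submissions"  # placeholder endpoint

def handle_mention(parent_post_text: str, reviewer: str, grade: float) -> None:
    """Extract the first article URL from the post the bot was asked about
    and submit the reviewer's grade to the rating server."""
    match = URL_PATTERN.search(parent_post_text)
    if not match:
        return  # nothing to review in this post
    payload = json.dumps({
        "article_url": match.group(0),
        "reviewer": reviewer,
        "grade": grade,  # e.g. -2 (very low credibility) to +2 (very high)
    }).encode()
    request = Request(RATING_SERVER, data=payload,
                      headers={"Content-Type": "application/json"})
    urlopen(request)  # fire-and-forget submission

# handle_mention("Great read: https://news.example.com/arctic-melt", "anna", -1.5)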

A server would compute the ratings and, as soon as there is enough data, create a review homepage with the ratings as metadata to be used by search engines and social media sites. We will have to see whether they are willing to use such a statistical product. An application programming interface (API) and ActivityPub could also be used to spread the information to interested parties.
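For the metadata, fact checkers today commonly embed schema.org's ClaimReview vocabulary, which search engines already understand; whether a crowd-sourced rating would be accepted in that form is an open question. A sketch of what such a review page could emit, with all URLs, names and values invented:

# Sketch of machine-readable review metadata the server could embed in a
# review page, loosely following schema.org's ClaimReview vocabulary.
# All URLs, names and rating values below are invented for illustration.
import json

review_metadata = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example.org/reviews/12345",          # placeholder review page
    "datePublished": "2020-11-09",
    "claimReviewed": "Central claim made in the reviewed article",
    "itemReviewed": {
        "@type": "CreativeWork",
        "url": "https://news.example.com/some-article",  # the article under review
    },
    "author": {
        "@type": "Organization",
        "name": "Citizen review network",                # hypothetical name
    },
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": -1.5,    # aggregated citizen grade
        "worstRating": -2,
        "bestRating": 2,
        "alternateName": "Low scientific credibility",
    },
}

print(json.dumps(review_metadata, indent=2))  # embed as JSON-LD in the page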

I would be happy to use this information on the micro-blogging system for scientists that Frank Sonntag and I have set up. I presume more Open Social Media communities would be grateful for the information to make their place more reality-friendly. A browser add-on could also display the feedback on the article's page itself and on posts linking to it.

How to start?

Before creating such a huge system I would propose a much smaller feasibility study. Here people would be informed about the articles Climate Feedback or Health Feedback are working on, and they could return their assessments until the official review is published. This could be a simple email distribution list to announce the articles and a cloud-based spreadsheet or web form to collect the results.

This system should be enough to study whether citizens can distinguish fact from fiction well enough (I expect so, but knowing for sure is valuable) and to develop statistical methods to estimate how well people are doing, how to compute an overall score, and how many reviews are needed to do so.
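As a toy example of the last question, a small simulation can show how the uncertainty of a volunteer's skill estimate shrinks with the number of reviews; the assumed error spread of one grade point is an arbitrary choice for illustration.

# Toy simulation: how fast does the uncertainty of a volunteer's skill
# estimate shrink with the number of reviewed articles? The spread of
# grading errors (sigma = 1 grade point) is an arbitrary assumption.
import random
from statistics import mean, stdev

random.seed(42)

def estimated_skill(n_reviews: int, sigma: float = 1.0, n_trials: int = 2000):
    """Mean and spread (over simulated trials) of the estimated mean absolute
    error when a volunteer reviews n_reviews articles."""
    estimates = []
    for _ in range(n_trials):
        errors = [abs(random.gauss(0.0, sigma)) for _ in range(n_reviews)]
        estimates.append(mean(errors))
    return mean(estimates), stdev(estimates)

for n in (3, 10, 30, 100):
    skill, spread = estimated_skill(n)
    print(f"{n:3d} reviews: estimated skill {skill:.2f} +/- {spread:.2f}")
# The spread roughly halves every time the number of reviews quadruples.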

This set-up points to two complications the full system would have. Firstly, only citizens' assessments made before the official feedback can be used. This should not be too much of a problem, as most readers will read the article before the official feedback is published.

Secondly, as the number of official feedbacks will be small, many volunteers will likely not review any of these articles themselves, or only a few. Thus how accurate the assessments of person A for articles X, Y and Z are may have to be estimated by comparing them with the assessments of persons B, C and D, who reviewed X, Y or Z as well as at least one of the articles Climate Feedback reviewed. This makes the computation more complicated and uncertain, but if B, C and D are good enough, it should be doable. Alternatively, we would have to keep informing our volunteers of the articles being reviewed by the scientists themselves.
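A much simplified sketch of this indirect comparison, with all names, grades and the weighting scheme chosen purely for illustration: first calibrate the volunteers who overlap with an official review, then judge person A by agreement with those calibrated peers.

# Simplified sketch of the indirect comparison: B, C and D reviewed at least
# one officially reviewed article, so their skill is known directly; person A
# is scored by agreement with these calibrated peers on shared articles.
# All names, grades and the weighting scheme are illustrative.

# Directly estimated skill of the calibrated peers
# (mean absolute error vs. official grades; lower error = larger weight).
peer_skill = {"B": 0.3, "C": 0.5, "D": 1.0}

# Grades on articles X, Y, Z that A shares with the peers (no official review).
shared_grades = {
    "X": {"A": -1.0, "B": -1.5, "C": -1.0},
    "Y": {"A":  1.5, "B":  1.0, "D":  0.5},
    "Z": {"A": -2.0, "C": -1.5, "D": -2.0},
}

def indirect_skill(person: str) -> float:
    """Weighted mean disagreement between `person` and the calibrated peers,
    where more skilled peers (smaller error) get a larger weight."""
    weighted_diffs, weights = [], []
    for grades in shared_grades.values():
        if person not in grades:
            continue
        for peer, grade in grades.items():
            if peer == person or peer not in peer_skill:
                continue
            weight = 1.0 / (peer_skill[peer] + 0.1)  # avoid division by zero
            weighted_diffs.append(weight * abs(grades[person] - grade))
            weights.append(weight)
    return sum(weighted_diffs) / sum(weights)

print(f"Indirect skill estimate for A: {indirect_skill('A'):.2f}")

More principled statistical models for crowd-sourced judgements exist, but even this simple weighting shows the idea.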

This new system could be part of Science Feedback or an independent initiative. I feel it would at least be good to have a separate homepage, as the two systems are quite different and the public should not mix them up. A reason to keep it organizationally separate is that this system could also be used in combination with other fact checkers, but we could also make that change when it comes to that.

Another organizational question is whether we would like Google and Facebook to have access to this information or prefer a license that excludes them. Short-term it is naturally best when they also use it, to inform as many people as possible. Long-term it would also be valuable to break the monopolies of Google and Facebook; having alternative services that can deliver better quality thanks to our assessments could contribute to that. They have money, we have people.

I asked on Twitter and Mastodon whether people would be interested in contributing to such a system. Fitting my prejudices, people on Twitter were more willing to review (I do more science on Twitter) and people on Mastodon were more willing to build software (Mastodon started with many coders).

What do you think? Could such a system work? Would enough people be willing to contribute? Is it technologically and statistically feasible? Any ideas to make the system or the feasibility study better?

Related reading

Climate Feedback explainer from 2016: Climate scientists are now grading climate journalism

Discussion of a controversial Climate Feedback review and the grading system used: Is nitpicking a climate doomsday warning allowed?
