Wednesday, 27 December 2017

Open science and science communication at #EGU18, the European Geosciences Union General Assembly

The EGU General Assembly 2018 will bring 14,500 geoscientists from all over the world to Vienna. Those are not just climate scientists plotting to take over the world; climatology is just one of 22 disciplines present. In my last post I already pointed to the scientific meetings and sessions most relevant for climate data scientists.

In this post I want to point to more general meetings (sessions, debates and short courses) that may be of interest to climate scientists: on Open Science, science communication, scientific publishing and climate change.

Such big conferences have downsides: if you meet someone, you had better get their contact information immediately, because the chance of meeting twice by accident is small. An advantage, however, is the attention for topics that affect many sciences but have no place at more focussed workshops and smaller meetings. There are, for example, sessions on nonlinear physics, which poses methodological problems almost all geoscientists have to deal with; I attended many of those as a young scientist.

There was also a time when I worked on many different topics (clouds, downscaling, homogenisation) and EGU was great for meeting all these communities in the same week. Lately I have been mostly focussed on homogenisation and have not visited EGU for some time. If you are mostly interested in one session, it is a long trip and an expensive conference for just a few talks and a poster session.

Maybe I paid less attention to this in the past, but it looks like EGU nowadays has a very wide range of meetings on Open Data, Open Code, Open Science, Publishing, Citizen Science and Science Communication. As I am thinking of destroying a multi-billion dollar scientific publishing industry, those are topics very much on my mind. So I think I will visit EGU this year and have curated a list below of meetings that I think climate scientists could be interested in. (Descriptions are often shortened.)

Bottom-up scientific conferences

Old rabbits, by which the Germans do not only mean our professorial chemistry bunny, can skip ahead to the list, but for young scientists and outsiders I thought it would be interesting to explain how such a huge conference is organised. With 14,500 geoscientists writing more than 20,000 abstracts on the work they would like to present, it is impossible for the conference organisers to determine what will happen; the content of the conference is very much a bottom-up affair.

The conference is split into 22 disciplinary divisions and 13 divisions of general interest. One of these divisions is "Climate: Past, Present, Future". It could have been called "climate". Within these divisions you have dozens of so-called "sessions", which are meetings on a specific topic.

Everyone can propose a session. For this EGU there was a call-for-sessions with a deadline in September. As far as I know the only condition is that one of the organisers of the session needs to have a PhD.

The next step is the call-for-abstracts, which for EGU2018 ends on the 10th of January. Everyone can submit an abstract to a session of their liking, describing what they would like to talk about. Again bottom-up.

Normally abstracts are accepted. When I was an organiser of the downscaling session, I could see the numbers: about 1% of the abstracts were rejected, and those were mostly duplicates or empty ones where something had gone wrong during submission. If the organiser thinks there is something wrong with your work, the abstract is normally still accepted, but you will likely get a poster.

Space is limited and the organiser can only select one third of the abstracts as talks; the others become posters. One time block with talks is one and a half hours, in which six presentations can be given. The minimum size of a session is thus 18 abstracts. If your session gets fewer, the division leaders will merge your session with another one on a similar topic. Getting to 18 abstracts is the main barrier to organising your own session.

Talks are best for broadcasting a new result: you reach more people, but there is only time for a few questions and thus little feedback. Posters are much better for feedback. As a convener, an important criterion for making an abstract a talk or a poster was thus the stage the study seemed to be in: early on, feedback is important; if the work is finished, broadcasting is important. In addition, talks are typically reserved for studies of more general interest, and if it is known how well someone presents, that is also an important consideration. Thus if you want a talk, make sure to mention some results to make clear you are in the final stages, and make sure the abstract is clear and contains no typos; these are proxies for being able to present your work clearly.

The posters at EGU are typically well visited, especially the main evening poster session with free beer and wine to get people talking. Personally I spend most of my time at the posters. If a talk is not interesting, 15 precious minutes are gone; if a poster is not interesting, you just walk on.

Some sessions at EGU use a format between a talk and a poster, called a PICO session. Here people present their work in a 2-minute talk and afterwards every presenter stands next to a touch screen with the presentation for detailed discussions. The advantage of a poster over a PICO is that the poster is up all day.



Next to these sessions where people present their latest work, you can also reserve rooms for splinter meetings to talk with each other, or organise short courses. Many of the interesting meetings listed below are short courses.

Science

Great Debate 4 Low-risk geo-engineering: are techniques available now?

With the Paris agreement, a majority of the world’s countries have agreed to keep anthropogenic warming below 2 °C. According to the Intergovernmental Panel on Climate Change*, this target would require not only reducing all man-made greenhouse gas emissions to zero but also removal of large amounts of carbon dioxide from the atmosphere, or other type of geo-engineering techniques. The issue of geo-engineering has been heavily debated during the last years and we are therefore asking: Are the potential risks with geo-engineering sufficiently known? Are safe geo-engineering techniques available? Are they available now?

This debate will address these questions of crucial importance for today’s society. It will discuss the most recent discoveries of geo-engineering techniques, their potential to reduce global warming and their potential risks.
* I am not sure whether the IPCC states this. The scenarios that stay below 2 °C do have carbon dioxide removal. But scenarios are just that, scenarios, not predictions for the future.

I think we urgently need to talk about a geo-intervention. There is no reason to wait until the Earth has warmed 2 °C; climate change is doing unacceptable damage right now.

Great Debate 2 Hands on or hands off?

A great debate on whether scientists should get involved in policy.
In recent years there has been a growing distrust of experts in the public imagination which has been expressed in numerous debates from Brexit to the US presidential election. This gives rise to serious questions about the role of scientists in policy making and the political sphere. As geoscientists, our disciplines can have a real impact on the way humanity organises itself, so what should our role in that be? There are serious tensions here between the desire for our knowledge to have real impact and make a difference, the need for scientific detachment and objectivity, and respect for broader perspectives and for democracy itself.

The key questions for this debate are:
  • Should geoscientists restrict themselves to knowledge generation and stay out of the policy world?
  • Or should we be getting involved and making change happen?
  • Should our voices as experts be heard louder than others?
  • Or does evidence-based policy undermine democracy?
  • Should we be hands on or keep our hands off?
Conferences are busy, so let me answer the questions so you do not have to go.

Four of the five organisers are from the UK, but I hope that at least outside of Anglo-America it is uncontroversial for scientists to inform the public and policy makers of their findings. Scientists are humans and have human rights, including free speech. Germany and several other European countries have even set up climate service centres to facilitate the flow of information from science to groups that need to adapt to climatic changes.

When it goes further, to trying to convince people of certain solutions, please let go of your saviour complex; you will most likely not achieve much. The way scientists are trained to think and communicate works well for science, but it is not particularly convincing outside of it. The chance that you are good at convincing people is not much better than for some random dude or the grandma down the road.

When it comes to informing people of our findings our voice should naturally be louder than that of groups misinforming people. In countries with a functioning media that does not need to be particularly loud. The opposite of evidence-based policy is misinformation-based policy. It is clearly less democratic and an abuse of power to set up a misinformation campaign to get your way politically because the public would not support your policies if they knew the truth. That is a violation of the self-determination of people to control their lives.

Educational and Outreach Symposia Session Geoethics: ethical, social and cultural implications of geoscience knowledge, education, communication, research and practice

 

Educational and Outreach Symposia Session Vision for Earth Observations in 2040

Educational and Outreach Symposia is a surprising place for this session. I hope the right people find it.
As both Earth science and technology advance while the expectations for the extent, quality, and timeliness of environmental information to be provided to the world’s population increases, the opportunity exists to harness the increased knowledge and capability to improve those products and services.

The World Meteorological Organization has made important contributions in making the connection between knowledge and products in the areas of weather, water, and climate through its periodic visions, most recently the Vision for the Global Observing System (GOS) in 2025. The WMO is now in the process of doing an update for the 2040 time frame, taking into account both surface and space-based measurements.

In this session, presentations that look ahead to the 2040 time frame and address expected observational capability that can realistically be expected to be available in that time frame, the expected demand for products and services informed by observational data that may be required for public use in that time frame, and mechanisms for connecting the two are all sought. Presentations for this session can address the full range of products for Earth System Science and are not limited to those addressed by the WMO in its development of the 2040 vision.
I hope a global climate station reference network will be part of this vision for 2040.

Interdisciplinary Session Big data and machine learning in geosciences

This session aims to bring together researchers working with big data sets generated from monitoring networks, extensive observational campaigns and extremely detailed modeling efforts across various fields of geosciences. Topics of this session will include the identification and handling of specific problems arising from the need to analyze such large-scale data sets, together with methodological approaches towards automatically inferring relevant patterns in time and space aided by computer science-inspired techniques. Among others, this session shall address approaches from the following fields:
  • Dimensionality and complexity of big data sets
  • Data mining and machine learning
  • Deep learning in geo- and environmental sciences
  • Visualization and visual analytics of big data
  • Complex networks and graph analysis
  • Informatics and data science

Interdisciplinary PICO Session R’s deliberate role in Earth sciences

 

Interdisciplinary Session Citizen Science in the Era of Big Data

I wish they were a bit clearer on what kind of statistics specifically for Big Data they are thinking of. I guess that with lots of data it is easy to get a result that is statistically significant, but physically negligibly small and not interesting. And if you start analysing the data like a fishing expedition, you should be extra careful not to fall into the multiple-testing trap; a small sketch after this session's description illustrates the problem. It may be that that is what they mean by "Big Data approaches".
Citizen Science (the involvement of laypeople in scientific processes) is gaining momentum in one discipline after another, and more and more data on biodiversity, earthquakes, weather, climate and health issues, among others, are being collected at different scales. In many cases these datasets contain huge numbers of data points collected by various stakeholders. There is definitely power in numbers of data points; however, the full potential of these datasets is not realized yet. Traditional statistics often fail to utilize these prospects. Statistics for Big Data can unveil hidden patterns that would otherwise not be visible in the datasets. Since Big Data approaches and citizen science are still developing fields, most projects miss Big Data analyses.

In this session we are looking for successful approaches of working with Big Data in all fields of citizen science. We want to ask and find answers to the following questions:
  • Which Big Data approaches can be used in citizen science?
  • What are the biggest challenges and how to overcome them?
  • How to ensure data quality?
  • How to involve citizen scientists in Big Data Analyses, or is it possible?
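To make the multiple-testing point concrete, here is a minimal sketch (my own illustration, not something from the session description): a thousand stations of pure noise, a naive trend test per station, and a Benjamini-Hochberg correction. With enough tests, "significant" trends appear by chance alone; the correction removes nearly all of them.

```python
# Minimal sketch (illustration only): naive significance testing on many noise
# series produces spurious "discoveries"; a false discovery rate correction
# (Benjamini-Hochberg) keeps them in check.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_series, n_years = 1000, 30
data = rng.normal(size=(n_series, n_years))   # pure noise: no real trends at all
years = np.arange(n_years)

# One linear trend test per series
p_values = np.array([stats.linregress(years, y).pvalue for y in data])
print("Naively 'significant' trends (p < 0.05):", int((p_values < 0.05).sum()))

# Benjamini-Hochberg: find the largest k with p_(k) <= k/m * alpha
alpha = 0.05
ranked = np.sort(p_values)
m = ranked.size
passed = ranked <= alpha * np.arange(1, m + 1) / m
n_discoveries = int(np.nonzero(passed)[0].max() + 1) if passed.any() else 0
print("Still significant after the FDR correction:", n_discoveries)
```

With noise-only data the naive test flags around 50 series, the corrected one typically none; with real data the correction keeps the genuine signals while discarding most of the fishing-expedition catch.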

Scientific publishing

Great Debate 1 Who pays for Scientific Publishing?

This Great Debate will address the following questions: whether the profits generated by traditional publishers are justifiable and sustainable, to what extent scientists should contribute to the business, what are the current and future alternatives, and what role will preprint servers play?
See also my recent post on the new preprint servers for the Earth Sciences and the townhall meeting below on self-archiving and EarthArXiv.

Townhall Meeting EarthArXiv - a preprint server for the Earth Sciences

Preprints and preprint servers are set to revolutionise and disrupt the standard approaches to scholarly publishing in the Earth Sciences. Yet, despite being widely used and demonstrably successful in several other core science disciplines, the concept of preprints is new to many Earth scientists. As a result, education is needed, such that Earth scientists can benefit from the use of preprints and preprint servers. In this townhall we will introduce the general concepts of preprints and preprint servers, illustrating this with a demonstration of EarthArXiv, a community-led preprint server. We will also lead a general discussion of the use of preprints.

PICO Session Future of (hydrological) publishing

This session was one reason to write this blog post. It sounds really interesting and it is somewhat hidden by being in the Hydrology Division, where non-hydrologists may miss it. This could be a good place for my coming out with the idea of grassroots scientific publishing, where the scientific community takes back control of the quality assessment, beginning with making more informative open reviews of already published articles.
In recent years, the current and future system of scientific publishing has been heavily debated. Most of these discussions focused on criticizing aspects of the current system such as:
  • the scientific publishing industry being one of the most profitable branches (Guardian, 2017) in media, because the scientific community basically does all the work for free
  • the peer review system being corrupted, or at least not functioning perfectly
  • the limited access to scientific papers due to its current business model
  • the surging number of submitted papers in recent years, especially with strict publication requirements for PhD candidates. This is putting more pressure on editors, reviewers and readership, and will decrease the visibility and impact of each publication.
Times are changing, which can be seen in the increased demand and supply for open access publishing. However, we believe there might be plenty of other ideas and suggestions on how to improve scientific publishing. We invite and challenge everyone from the scientific community to propose ideas on how to do so in 5 minute presentations. Afterwards we will continue the discussion to answer questions such as: Who needs to pay for reading our work? Who should publish our work? How to cope with the excessive amount of submitted papers? Should we even be publishing?

Short course What are the key problems in Climate Science?

Climate science is a wide discipline that encompasses many of the EGU divisions, yet it is not always easy to know what the key problems are outside of your own specific area. ... During the short course, four climate experts from different divisions will introduce the “key problems” in their discipline, giving you an overview of what the current “hot topics” are. This course will provide you with enough background to venture into other divisions during the rest of the meeting. The floor will then be open for questions and discussion with our experts.
With so many disciplines together, EGU would theoretically be an important place to learn about problems in other fields and see how they fit with yours.

However, the talks at EGU are very short, just 12 minutes. They do not leave much time for an introduction and are thus hard to follow for outsiders. It may be nice to extend the idea of this short course with just four hot topics to many more topics. Make it into a science slam, where you do not talk about your own work, but introduce the field in a way an outsider can get it.

It looks like these four key problems and their speakers are selected by the conveners. A science slam could be open to all like the normal talks.

Open Science

Townhall Meeting OSGeo Townhall: Open Science demystified

OSGeo is hosting this Townhall event to support the collaborative development of open source geospatial software by promoting sustainable Open Science within EGU. The Open Source Geospatial Foundation, or OSGeo, is a not-for-profit umbrella organization for Free and Open Source geospatial tools, including QGIS, gvSIG, GRASS GIS, Geoserver and many others.
The paradigm of Open Science is based on the tiers Open Access, Open Data and Free Open Source Software (FOSS). However, the interconnections between the tiers remain to be improved. This is a critical factor to enable Open Science.
This Townhall meeting reaches out all across EGU, especially welcoming Early Career Scientists, to network and discuss the current challenges and opportunities of the FOSS tier, including:
  • the easy approach to choosing software licences
  • recognition for scientific software: how to write a software paper?
  • software life cycle: who will maintain your software after you've finished your PhD and found a decent job?
  • funding software development: evolving software begun for your own research needs into something larger, that serves others' needs, and boosts your scientific reputation.
  • software reviews: how to set up software development such that other developers get involved in an early stage?
  • how can OSGeo help you with all these questions?

Short course How to find and share data in geosciences?

This short course aims to present some tips and tricks to accelerate the process of finding, processing and sharing the Geosciences data. We will also discuss the importance of open science and the opportunities it provides.

Short course Writing reproducible geoscience papers using R Markdown, Docker, and GitLab

I think code reproducibility is overrated. It is much stronger to make an independent reproduction, and if the result depends on minute details being the same, it is most likely not a useful result. But most of the same methods can be useful to speed up scientific progress. Sharing data and code is wonderful and helps other scientists get going faster. The code should thus preferably also run somewhere else.
Reproducibility is unquestionably at the heart of science. Scientists face numerous challenges in this context, not least the lack of concepts, tools, and workflows for reproducible research in today's curricula.
This short course introduces established and powerful tools that enable reproducibility of computational geoscientific research, statistical analyses, and visualisation of results using R (http://www.r-project.org/) in two lessons:

1. Reproducible Research with R Markdown
Open Data, Open Source, Open Reviews and Open Science are important aspects of science today. In the first lesson, basic motivations and concepts for reproducible research touching on these topics are briefly introduced. During a hands-on session the course participants write R Markdown documents, which include text and code and can be compiled to static documents (e.g. HTML, PDF).
R Markdown is equally well suited for day-to-day digital notebooks as it is for scientific publications when using publisher templates.
To understand the rest of the description, I need to explain what Docker means:
Docker is a tool that can package an application and its dependencies in a virtual container that can run on any Linux server. This helps enable flexibility and portability on where the application can run, whether on premises, public cloud, private cloud, bare metal, etc.
GitLab is a collaborative coding platform based on the versioning system [[Git]], comparable to the probably better known [[GitHub]] and [[Bitbucket]].
2. GitLab and Docker
In the second lesson, the R Markdown files are published and enriched on an online collaboration platform. Participants learn how to save and version documents using GitLab (http://gitlab.com/) and compile them using [[Docker]] containers (https://docker.com/). These containers capture the full computational environment and can be transported, executed, examined, shared and archived. Furthermore, GitLab's collaboration features are explored as an environment for Open Science.
P.S. Those homepages really suck big time, except if their goal is to scare away anyone who is not a hard-core coder and does not already know the product. That is why I mostly linked to Wikipedia.

Short course Building and maintaining R packages

R is a free and open software that gained paramount relevance in data science, including fields of Earth sciences such as climatology, hydrology, geomorphology and remote sensing. R heavily relies on thousands of user-contributed collections of functions tailored to specific problems, called packages. Such packages are self-consistent, platform independent sets of documented functions, along with their documentations, examples and extensive tutorials/vignettes, which form the backbone of quantitative research across disciplines.

This short course focuses on consolidated R users that have already written their functions and wish to i) start appropriately organizing these in packages and ii) keep track of the evolution of the changes the package experiences. While there are already plenty of introductory courses to R we identified a considerable gap in the next evolutionary step: writing and maintaining packages.

Short course Improving statistical evaluations in the geosciences

I love the (long) description of topics. Looks like just what a geo-scientist needs.

During my studies I got lucky. Studying physics, the only statistics we got was some error propagation for lab work. Somehow I was not happy with that and I found a statistics course in the sociology department. There was not much mathematics; one student even asked what the dot between X and Y was. I did not even understand the question, but the teacher casually answered that it was the multiplication sign. Maybe out of necessity, the course focussed on the big ideas, on the main problems and the typical mistakes people make. It looks like this could be a similar course, but likely with more math.

Session Open Data, Reproducible Research, and Open Science

Open Data and Open Science not only address publications, but scientific research results in general, including figures, data, models, algorithms, software, tools, notebooks, laboratory designs, recipes, samples and much more.

Furthermore, they relate to the communication, review, and discussion of research results and consider changing needs regarding incentives, quality assessment, metrics, impact, reputation, grants and funding. Thus Open Data and Open Science encompass licensing, policy-making, infrastructures and scientific heritage, while safeguarding the dynamic nature of science and its evolving forms.
...
The speakers present success stories, failures, best practices, solutions and introduce networks. It is aimed to show how researchers, citizens, funding agencies, governments and other stakeholders can benefit from Open Data, Reproducible Research, and Open Science in various flavors, acknowledging the drawbacks and highlighting the opportunities available for geoscientists.

The session shall open a space to exchange experiences and to present either successful examples or failed efforts. Learning from others and understanding what to adopt and what to change should help your own undertakings and new initiatives become successes.

Educational and Outreach Symposia Session Promoting and supporting equality of opportunities in geosciences

Following the success of previous years, this session will be exploring reasons for the existence of underrepresentation of different groups (cultural, national and gender) by welcoming a debate with scientists, decision-makers and policy analysts related to geosciences.

The session will be focusing on both remaining obstacles that contribute to underrepresentation and on best practices and innovative ideas to tackle obstacles.

Science communication

Short course Help! I'm presenting at a scientific conference!

Sounds like this short course on giving a scientific presentation is tailored to newbies, although the seniors could also use some help. The seniors are hardest to change: they have learned that they have a highly motivated captive audience and that after a crappy talk everyone will pretend it was a good one.
Presenting at a scientific conference can be daunting for early career and established scientists alike. How can you optimally take advantage of those 12 minutes to communicate your research effectively? How do you cope with nervousness? What happens if someone asks a question that you don’t think you can answer? Is your talk tailored to the audience?

Giving a scientific talk is a really effective way to communicate your research to the wider community and it is something anyone can learn to do well! This short course provides the audience with hands-on tips and tricks in order to make your talk memorable and enjoyable for both speaker and audience.

Short course Once upon a time in Vienna

A short course on story telling, which is really important for readable prose. Although this blog post is probably not the best place to make this case.
This is an interactive workshop led by a professional communications facilitator and writer, and academics with a range of earth science outreach experience. Through a combination of expert talks, informal discussion, and practical activities, the session will guide you through the importance of storytelling, how to find exciting stories within your own research, and the tools to build a memorable narrative arc.

Short course Rhyme your research

After seeing the term "experienced science-poet" I was forced to include this short course. They missed the opportunity to write the description as a poem.
Poetry is one of the oldest forms of art, potentially even predating literacy. However, what on Earth does it have to do with science? One is usually subjective and emotive, whilst the other (for the most part) is objective and empirical. However, poetry can be a very effective tool in communicating science to a broader audience, and can even help to enhance the long-term retention of scientific content. During this session, we will discuss how poetry can be used to make (your) science more accessible to the world, including to your students, your professors, your (grand)parents, and the general public.

Writing a poem is not a particularly difficult task, but writing a good poem requires both dedication and technique; anyone can write poetry, but it takes practice and process to make it effective. In this session, experienced science-poets will discuss the basics of poetry, before encouraging all participants to grab a pen and start writing themselves. We aim to maximise empowerment and minimise intimidation. Participants will have the opportunity to work on poems that help to communicate their research, and will be provided with feedback and advice on how to make them more effective, engaging and empathetic. Those who wish to do so may also recite their creations during the “EGU Science Poetry Slam 2018”.

Educational and Outreach Symposia Session Scientists, artists and the Earth: co-operating for a better planet sustainability

As communicators and artists, we have a shared responsibility to raise awareness of the importance of planet sustainability. Educating people in this regard has normally been executed through traditional educational methods. But there is evidence that science-art collaborations play a vital role in contributing to this issue, through the emotional and human connection that the arts can provide. This session, already in its fourth edition, has presented interesting and progressive art-science collaborations across a number of disciplines focussed on representing Earth science content. We have witnessed that climate change, natural hazards, meteorology, palaeontology, earthquakes and volcanoes, and geology have been successfully presented through music, visual art, photography, theatre, literature and digital art, where the artists explored new practices and methods in their work with scientists. A fundamental part of all art is the presentation of the final work. We therefore provide a related 'performative session', to allow artists to perform excerpts of their work and fully reveal the impact of this work in communicating the bigger planet sustainability message. This related session is entitled “A pilot-platform for performing your Earth&Art work”.

Short course Visualization in Earth Science: best practices

This short course is co-organized by the ESSI division: Earth & Space Science Informatics.
With constantly growing data sizes, both an effective visualization, as well as an efficient data analysis are getting more and more important. Different tasks in visualization require different visualization strategies. Geoscience data presents particular challenges, being typically large, multivariate, multidimensional, time-varying and uncertain. This short course aims at the presentation and demonstration of commonly available visualization tools, that are especially well suited to analyze earth science data sets. We at DKRZ -- the German Climate Computing Centre -- have many years of experience in the visualization of earth science data sets, and the goal of this workshop is to pass this knowledge on to you. We will show, explain and demonstrate the tools live, with which we work in our daily routine, and show you how to create effective and meaningful visualizations using free software.

Short course How to cartoon science

 

 Short course Science for Policy: What is it and how can scientists become involved in policy processes?

Organised by the EGU policy expert Chloe Hill.
Part 1: will focus on basic science for policy and communication techniques that can be used to engage policymakers. It will be of particular interest to anyone who wants to make their research more policy relevant and learn more about science-policy.

Part 2: will include invited speakers who will outline specific EU processes and initiatives and explain how scientists can become involved with them.

Short course Communicating your research to teachers, schools and the public - interactively

If you are serious about communicating your research to teachers, schools and the public then you should know something about these audiences, be familiar with the most effective ways of engaging with each of them and be clear about what the ‘take away’ messages would be. ... Methods of engaging the public and families through open days and similar events are different again, and usually use a range of activities to engage and educate at the same time. We will discuss insights and strategies for these different audiences and ask you to have a go yourself.

Short course Debunking myths and fake news: how can geoscientists fight misinformation and false claims

Maybe you’ve had an argument on social media with a climate change denier who is convinced the Earth is not warming. Or maybe you’ve received an email from a scared relative forwarding you a piece from an unreliable website about how total solar eclipses produce harmful rays that can make you blind. How do you go about convincing them they are mistaken without them holding on even more to their false beliefs? In the age of Brexit and Trump, of fake news and of expert snubbing, geoscientists have a role to play in tactfully fighting misinformation related to the Earth, space and planetary sciences. This short course will explore ways in which researchers can promote evidence and facts, prevent fake news from spreading, and successfully debunk false claims.

Short course Connect2Communicate: communicating your message with charisma, clarity and conviction

Making use of established techniques from the world of theatre and improvisation, this session will enable participants to make genuine connection with their audience.

Short course Science writing: selling your research through press releases and articles

Our press office once organised a short workshop on writing press releases, which was given by a former journalist. He could explain well what a journalist wants from a press release, but did not understand that the interests of scientists are different. This course may be better; it is given by scientists.
The course will consist of: an introduction on how to identify a good science story; general tips on how to write with clarity and flair; an introduction on how to go about promoting your work via press releases and working with embargoes; tips on working with press officers and journalists; practical exercises on headline writing; and practical exercises about turning abstracts into press releases.

Short course Communicating geoscience to the media

The news media is a powerful tool to help scientists communicate their research to wider audiences. However, at times, messages in news reports do not properly reflect the real scientific facts and discoveries, resulting in misleading coverage and wary scientists. This is especially problematic in fields such as climate science, where climate skeptics can twist the research results to draw conclusions that are baseless. A way scientists have to prevent misleading or even inaccurate coverage is to improve the way they communicate and work with journalists. In this short course, co-organised with the CL and CR divisions, we will bring together science journalists and researchers with experience working with the media to provide tips and tricks on how scientists can better prepare for interviews with reporters. We will also provide pointers on how to ensure a smooth working relationship between researchers and journalists by addressing the needs and expectations of both parties. The focus will be on climate topics, but much of the advice would be applicable to other geoscience areas.

Educational and Outreach Symposia Session ECORD IODP Outreach: Past, Present and Future

The International Ocean Discovery Program is an international programme that works to explore the oceans and the rocks beneath them. ...
This session addresses the formats by which we disseminate scientific information and discoveries arising from ocean drilling – what have we done in the past, what are we doing now, and what ideas do we have for the future engagement of students with ocean research drilling. Experiences and examples of best practice, illustrated in poster or oral format, will be presented by school teachers, university lecturers and researchers describing their outreach efforts in the lab, field and geoscience classrooms to promote high-quality geoscience education at all levels.

Educational and Outreach Symposia Session Games for Geoscience

Games have the power to ignite imaginations and place you in someone else’s shoes or situation, often forcing you into making decisions from perspectives other than your own. This makes them potentially powerful tools for communication, through use in outreach, disseminating research, in education at all levels, and as a method to train the public, practitioners and decision makers in order to build environmental resilience. The session is a chance to share your experiences and best practice with using games to communicate geosciences, be they analogue, digital and/or serious games.

Educational and Outreach Symposia Session Communication and Education in Geoscience: Practice, Research and Reflection

Do you consider yourself a science communicator? Does your research group or institution participate in public engagement activities? Have you ever evaluated or published your education and outreach efforts?

Scientists communicate to non-peer audiences through numerous pathways including websites, blogs, public lectures, media interviews, and educational collaborations. A considerable amount of time and money is invested in this public engagement and these efforts are to a large extent responsible for the public perception of science. However, few incentives exist for researchers to optimize their communication practices to ensure effective outreach. This session encourages critical reflection on science communication practices and provides an opportunity for science communicators to share best practice and experiences with evaluation and research in this field.

Related reading

Slides of the talk: How to convene a session at the EGU General Assembly by Stephanie Zihms, Roelof Rietbroek and Helen Glaves.

EGU2018 and its call-for-abstracts.

The call-for-sessions of EMS2018 is currently open. Suggestions for improvements of the description of the "Climate monitoring: data rescue, management, quality and homogenization" session are welcome.

The fight for the future of science in Berlin. My report on this year's conference on scholarly communication, with lots of ideas and initiatives on Open Science and publishing.

Where is a climate data scientist to go in 2018?

Wednesday, 20 December 2017

Where is a climate data scientist to go in 2018?

Where is a climate data scientist to go in the next year? There are two oldies in Old Europe (EGU and EMS), a new conference and workshop on early instrumental meteorological series in Bern, and two new opportunities in the Southern Hemisphere (AMOS and the Data Management Workshop in Peru).

Early Instrumental Meteorological Series - Conference and Workshop

The Oeschger Centre for Climate Change Research (OCCR) organises a conference and workshop on Early Instrumental Meteorological Series. Hosts are Stefan Brönnimann (Institute of Geography) and Christian Rohr (Institute of History).

The first two days are organised like a conference, the last two days like a workshop. It will take place from 18 to 21 June 2018 at the University of Bern, Switzerland. Registration and abstract submission are due by 15 March 2018.
The goal of this conference and workshop is to discuss the state of knowledge on early instrumental meteorological series from the 18th and early 19th century. The first two days will be in conference-style and will encompass invited talks from different regions of the world (including participation by skype) on existing compilations and on individual records, but also on instruments and archives as well as on climate events and processes. Contributed presentations (most will be posters) are welcome.

The third and fourth days target a smaller audience and are in workshop-style. The goal is to compile a detailed inventory of all early instrumental records: What has been measured, where, when and by whom? Is the location of the original data known? Have they been imaged, digitised, homogenised, or are they already in existing archives? This work will help to focus future data rescue activities.

Data Management Workshop in Peru

New is the “Data Management for Climate Services” workshop taking place in Lima, Peru, from the 28th of May to the 1st of June 2018. I am trying to learn Spanish, but languages are clearly not my strong point. At a restaurant I would now be able to order cat, turtle and chicken. I think I will eat a lot of chicken. Fortunately the workshop will be carried out in two languages and will have a professional translation service from Spanish to English and English to Spanish.

The workshop is inspired by the series of EUMETNET Data Management Workshops held every two years in Europe. It would be great if similar initiatives were tried on other continents.

The abstract submission deadline is soon: the 15th of January 2018.
  • Session 1: METADATA
    • Methods for data rescue and cataloguing; data rescue projects.
    • Methods of metadata rescue for the past and the present; systems for metadata storage; applications and use of metadata.
    • Methods for quality control of different meteorological observations of different specifications; processes to establish operational quality control.
  • Session 2: DATA HOMOGENIZATION
    • Methods for the homogenization of monthly climate data; projects and results from homogenization projects; investigations on parallel climate observations; use of metadata for homogenization.
  • Session 3: GRIDDED DATA
    • Verification of gridded data based on observations; products based on gridded data; methods to produce gridded data; adjustments of gridded data in complex topographies such as the Andes.
  • Session 4: CLIMATE SERVICES
    • Products and climate information: methods and tools of climate data analysis; presentation of climate products and information; products on extreme events
    • Climate services in Ibero-America: projects on climate services in Ibero-America.
    • Interface with climate information users: approaches to building the interface with climate information users; experiences from exchanges with users; user requirements on climate services.

EMS Annual Meeting

This time the EMS Annual Meeting: European Conference for Applied Meteorology and Climatology will be in Budapest, Hungary, from 3 to 7 September 2018. The abstract submission deadline is still far away, but if you have ideas for sessions this is the moment to say something. The call for sessions is open until January 4th. New sessions can be proposed.

The session on "Climate monitoring: data rescue, management, quality and homogenization" is organised by Manola Brunet-India, Ingeborg Auer, Dan Hollis and me. If you have any suggestions for improvements of our session description, please tell me. New this year is that we have explicitly added marine data. The forgotten 70% will be forgotten no longer.
Robust and reliable climatic studies, particularly those assessments dealing with climate variability and change, greatly depend on availability and accessibility to high-quality/high-resolution and long-term instrumental climate data. At present, a restricted availability and accessibility to long-term and high-quality climate records and datasets is still limiting our ability to better understand, detect, predict and respond to climate variability and change at lower spatial scales than global. In addition, the need for providing reliable, opportune and timely climate services deeply relies on the availability and accessibility to high-quality and high-resolution climate data, which also requires further research and innovative applications in the areas of data rescue techniques and procedures, data management systems, climate monitoring, climate time-series quality control and homogenisation.
In this session, we welcome contributions (oral and poster) in the following major topics:
  • Climate monitoring, including early warning systems and improvements in the quality of the observational meteorological networks.
  • More efficient transfer of the data rescued into the digital format by means of improving the current state-of-the-art on image enhancement, image segmentation and post-correction techniques, innovating on adaptive Optical Character Recognition and Speech Recognition technologies and their application to transfer data, defining best practices about the operational context for digitisation, improving techniques for inventorying, organising, identifying and validating the data rescued, exploring crowd-sourcing approaches or engaging citizen scientist volunteers, conserving, imaging, inventorying and archiving historical documents containing weather records.
  • Climate data and metadata processing, including climate data flow management systems, from improved database models to better data extraction, development of relational metadata databases and data exchange platforms and networks interoperability.
  • Innovative, improved and extended climate data quality controls (QC), including both near real-time and time-series QCs: from gross-error and tolerance checks to temporal and spatial coherence tests, statistical derivation and machine learning of QC rules, and extending tailored QC application to monthly, daily and sub-daily data and to all essential climate variables. [A tiny illustrative check follows after this list.]
  • Improvements to the current state-of-the-art of climate data homogeneity and homogenisation methods, including methods intercomparison and evaluation, along with other topics such as climate time-series inhomogeneities detection and correction techniques/algorithms, using parallel measurements to study inhomogeneities and extending approaches to detect/adjust monthly and, especially, daily and sub-daily time-series and to homogenise all essential climate variables.
  • Fostering evaluation of the uncertainty budget in reconstructed time-series, including the influence of the various data processes steps, and analytical work and numerical estimates using realistic benchmarking datasets.
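To give an idea how low the entry barrier for the simplest of the QC checks mentioned above is, here is a minimal sketch (my own illustration, with assumed limits, not an operational QC system) of a gross-error / tolerance check on daily temperatures:

```python
# Minimal sketch (illustration only): a gross-error / tolerance check on daily
# mean temperatures. The limits are assumed values for a mid-latitude station.
import numpy as np

temps_c = np.array([12.3, 13.1, 11.8, 99.9, 12.6, -45.0, 13.4])  # daily means in °C
lower, upper = -40.0, 50.0          # assumed plausible climatological limits

suspect = (temps_c < lower) | (temps_c > upper)
print("Suspect values:", temps_c[suspect])        # -> [ 99.9 -45. ]
print("Fraction flagged:", round(float(suspect.mean()), 2))
```

Real QC systems naturally go much further, with temporal and spatial coherence tests, but the principle of flagging rather than deleting suspect values stays the same.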
The next step is to analyse the data to understand what happens with the climate system. For this there is the session: "Climate change detection, assessment of trends, variability and extremes".

AMOS-ICSHMO

It is too late to submit abstracts, but you can still visit the Joint 25th AMOS National Conference and 12th International Conference for Southern Hemisphere Meteorology and Oceanography, AMOS-ICSHMO 2018, to be held at UNSW Sydney from 5 to 9 February 2018.

New is a session on "Data homogenisation and other statistical challenges in climatology" organised by Blair Trewin and Sandy Burden.
This session is intended as a forum to present work addressing major statistical challenges in climatology, from the perspectives of both climatologists and statisticians. It is planned to have a particular focus on climate data homogenisation, including the potential for merging observations from multiple sources. However, papers on all aspects of statistics in climatology are welcome, including (but not limited to) spatial analysis and uncertainty, quality control, cross-validation, and extreme value and threshold analysis. Statistical analyses of temperature and rainfall will be of most interest, but studies using any meteorological data are welcome.
If I see it right, the session has four talks:
  • Testing for Collective Significance of Temperature Trends (Radan Huth)
  • Investigating Australian Temperature Distributions using Record Breaking Statistics and Quantile Regression (Elisa Jager)
  • A Fluctuation in Surface Temperature in Historical Context: Reassessment and Retrospective on the Evidence (James Risbey)
  • The Next-Generation ACORN-SAT Australian Temperature Data Set (Blair Trewin)
And there is a session on "Historical climatology in the Southern Hemisphere" organised by Linden Ashcroft, Joëlle Gergis, Stefan Grab, Ruth Morgan and David Nash.
Historical instrumental and documentary records contain valuable weather and climate data, as well as detailed records of societal responses to past climatic conditions. This information offers valuable insights into current and future climate research and climate change adaptation strategies. While the use of historical climate information is a well-developed field in the Northern Hemisphere, a vast amount of untapped resources exist in the southern latitudes. Recovering this material has the potential to dramatically improve our understanding of Southern Hemisphere climate variability and change. In this session we welcome interdisciplinary submissions on the rescue, interpretation and analysis of historical weather, climate, societal and environmental information across the Southern Hemisphere. This can include:
  • Instrumental data rescue (land and ocean) projects and practices
  • Comparison of documentary, instrumental and palaeoclimate reconstructions
  • Historical studies of extreme events
  • Past social engagement with weather, climate and the natural environment
  • Development of long-term climate records and chronologies.
It has five talks:
  • An Australian History of Anthropogenic Climate Change (Ruth Morgan)
  • Learning from the Present to Understand the Past: The Case of Precipitation Covariability between Tasmania and Patagonia (Martin Jacques-Coper)
  • Learning from Notorious Maritime Storms of the Late 1800’s (Stuart Browning)
  • Climate Data Rescue Activities at Meteo-France in the Southern Hemisphere (Alexandre Peltier)
  • Recovering Historic Southern Ocean Climate Data using Ships’ Logbooks and Citizen Science (Petra Pearce)

EGU General Assembly

EGU will be held in Vienna, Austria, from 8 to 13 April 2018. The abstract submission deadline is looming: the 10th of January.

The main session from my perspective is: "Climate Data Homogenization and Analysis of Climate Variability, Trends and Extremes", organised by Xiaolan Wang, Rob Roebeling, Petr Stepanek, Enric Aguilar and Cesar Azorin-Molina.
Accurate, homogeneous, and long-term climate data records are indispensable for many aspects of climate research and services. Realistic and reliable assessments of historical climate trends and climate variability are possible with accurate, homogeneous and long-term time series of climate data and their quantified uncertainties. Such climate data are also indispensable for assimilation in a reanalysis, as well as for the calculation of statistics that are needed to define the state of climate and to analyze climate extremes. Unfortunately, many kinds of changes (such as instrument and/or observer changes, changes in station location and/or environment, observing practices, and/or procedures) that took place during data collection period could cause non-climatic changes (artificial shifts) in the data time series. Such shifts could have huge impacts on the results of climate analysis, especially when it concerns climate trend analysis. Therefore, artificial shifts need to be eliminated, as much as possible, from long-term climate data records prior to their application.

The above described factors can influence different essential climate variables, including atmospheric (e.g., temperature, precipitation, wind speed), oceanic (e.g., sea surface temperature), and terrestrial (e.g., albedo, snow cover) variables from in-situ observing networks, satellite observing systems, and climate/earth-system model simulations. Our session calls for contributions that are related to:
  • Correction of biases, quality control, homogenization, and validation of essential climate variables data records.
  • Development of new datasets and their analysis (spatial and temporal characteristics, particularly of extremes), examining observed trends and variability, as well as studies that explore the applicability of techniques/algorithms to data of different temporal resolutions (annual, monthly, daily, sub-daily).
  • Rescue and analysis of centennial meteorological observations, with focus on wind data prior to the 1960s, as a unique source to fill in the gap of knowledge of wind variability over century time-scales and to better understand the observed slowdown (termed “stilling”) of near-surface winds in the last 30-50 years.
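Since homogenisation is my own field, let me add a minimal sketch of what "eliminating artificial shifts" means in practice. This is an illustration only, with made-up numbers, not any group's operational method: relative homogenisation compares a candidate station with a well-correlated neighbour, so that the common climate signal drops out of the difference series and the break becomes visible.

```python
# Minimal sketch (illustration only): detect a single artificial shift in a
# candidate station series by comparing it with a neighbouring reference series.
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(1950, 2021)

climate = 0.01 * (years - years[0]) + rng.normal(0, 0.5, years.size)  # shared regional signal
candidate = climate + rng.normal(0, 0.2, years.size)
candidate[years >= 1985] += 0.8        # artificial shift, e.g. a station relocation
reference = climate + rng.normal(0, 0.2, years.size)

diff = candidate - reference           # difference series removes the common climate signal

# Brute-force break search: maximise the difference in means before and after
# each candidate year (a much simplified relative of tests such as the SNHT).
def shift_statistic(d, k):
    return abs(d[:k].mean() - d[k:].mean())

candidates = list(range(5, diff.size - 5))      # keep a few years on each side
statistics = [shift_statistic(diff, k) for k in candidates]
k_best = candidates[int(np.argmax(statistics))]

print("Detected break before year:", years[k_best])
print("Estimated shift size (°C):", round(float(diff[k_best:].mean() - diff[:k_best].mean()), 2))
```

Operational methods detect multiple breaks, test their statistical significance and correct the series, but the core idea of working with difference series between neighbouring stations is the same.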
Also the session on "Atmospheric Remote Sensing with Space Geodetic Techniques" contains a fair bit of homogenisation. For most satellite datasets homogenisation is done very differently as they do not have as much redundant data, but the homogenisation of humidity datasets based on the geodetic data of the global navigation satellite system ([[GNSS]], consisting of GPS, GLONASS and Galileo) is very similar.
Today atmospheric remote sensing of the neutral atmosphere with space geodetic techniques is an established field of research and applications. This is largely due to the technological advances and development of models and algorithms, as well as the availability of regional and global ground-based networks, and satellite-based missions. Water vapour is undersampled in current operational meteorological and climate observing systems. Advancements in Numerical Weather Prediction Models (NWP) to improve forecasting of extreme precipitation require GNSS troposphere products with a higher resolution in space and shorter delivery times than are currently in use. Homogeneously reprocessed GNSS observations on a regional and global scale have high potential for monitoring water vapour climatic trends and variability, and for assimilation into climate models. Unfortunately, these time series suffer from inhomogeneities (for example instrumental changes, changes in the station environment), which can affect the analysis of the long-term variability. NWP data have recently been used for deriving a new generation of mapping functions, and in Real-Time GNSS processing these data can be employed to initialise Precise Point Positioning (PPP) processing algorithms, shortening convergence times and improving positioning. At the same time, GNSS-reflectometry is establishing itself as an alternative method for retrieving soil moisture.
We welcome, but not limit, contributions on the subjects below:
  • Physical modelling of the neutral atmosphere using ground-based and radio-occultation data.
  • Multi-GNSS and multi-instruments approaches to retrieve and inter-compare tropospheric parameters.
  • Real-Time and reprocessed tropospheric products for forecasting, now-casting and climate monitoring applications.
  • Assimilation of GNSS measurements in NWP and in climate models.
  • Methods for homogenization of long-term GNSS tropospheric products.
  • Studies on mitigating atmospheric effects in GNSS positioning and navigation, as well as observations at radio wavelengths.
  • Usage of NWP data in PPP processing algorithms.
  • Techniques on retrieval of soil moisture from GNSS observations and studies of ground-atmosphere boundary interactions.
Also for ecological data homogenisation is often needed. Thus the session "Digital environmental models for Ecosystem Services mapping" by Miquel Ninyerola, Xavier Pons and Lluis Pesquer may also be interesting.
The session aims to focus on understanding, modelling, analysing and improving each step of the process chain for producing digital environmental surface grids (terrain, climate, vegetation, etc.) able to be used in Ecosystem Services issues: from the sensors (in situ as well as Earth Observation data) to the map dissemination. In this context, topics as data acquisition/ingestion, data assimilation, data processing, data homogenization, uncertainty and quality controls, spatial interpolation methods, spatial analysis tools, derived metrics, downscaling techniques, box-tools, improvements on metadata and web map services are invited. Spatio-temporal analyses and model contribution of large series of environmental data and the corresponding auxiliary Earth Observation data are especially welcome as well as studies that combine cartography, GIS, remote sensing, spatial statistics and geocomputing. A rigorous geoinformatics and computational treatment is required in all topics.
EGU also has a nice number of Open Science, science communication and publishing sessions [you can find links in my new post]. I hope I will find the time to also write about them in a next post.

Other conferences

The Budapest homogenisation workshop was this year, so I do not expect another one in 2018. In case you missed it, the proceedings have now been published and contain many interesting extended abstracts.

Also the last EUMETNET Data Management Workshop was in 2017. If there are any interesting meetings that I missed, please tell us in the comments.

Tuesday, 21 November 2017

The fight for the future of science in Berlin



A group of scientists, scholars, data scientists, publishers and librarians gathered in Berlin to talk about the future of research communication. With the scientific literature being so central to science, one could also say the conference was about the future of science.

This future will be more open, transparent, findable, accessible, interoperable and reusable.


The open world of research from Mark Hooper on Vimeo.

Open and transparent sound nice, and most people seem to assume that more is better. But openness can also be oppressive and help the powerful, who have the resources to mine the information efficiently.

This is best known when it comes to government surveillance, which can be dangerous; states are powerful and responsible for the biggest atrocities in history. The right to vote in secret, the right to privacy, the right to organise, and protections against unreasonable searches are fundamental safeguards against the abuse of power.

Powerful lobbies and political activists abuse transparency laws to harass inconvenient science.

ResearchGate, Google Scholar profiles and your ORCID ID page contribute to squeezing scientists like lemons by prominently displaying the number of publications and citations. This continual pressure can lead to burnout, less creativity and less risk taking. It encourages scientists to pick low-hanging fruit rather than do the studies they think would bring science forward the most. Next to this bad influence on publications, many other activities that are just as important for science suffer from this pressure. Many well-meaning people try to solve this by also quantifying those activities, but in doing so they add more lemon presses.


That technology brings more surveillance and detrimental micro-management is not unique to science. The destruction of autonomy is a social trend that, for example, also affects truckers.

Science is a creative profession (even if many scientists do not seem to realise this). You have good ideas when you relax under the shower, lie in bed with a fever or go on a hike. The modern publish-or-perish system is detrimental to cognitive work. Work that requires cognitive skills is performed worse when people are pressured; it needs autonomy, mastery and purpose.

Scientists work on the edge of what is known and invariably make mistakes. If you are not making mistakes you are not pushing your limits. This needs some privacy because, unfortunately, making mistakes is not socially acceptable for adults.



Chinese calligraphy with water on a stone floor. More ephemeral communication can lead to more openness, improve the exchange of views and produce better feedback.
Later in the process the ephemeral nature of a scientific talk requires deep concentration from the listener and is a loss for people not present, but early in a study this ephemerality is a feature. Without the freedom to make mistakes there will be less exciting research and slower progress. Scientists are also human, and once an idea is fixed on "paper" it becomes harder to change, while the flexibility to update your ideas to the evidence is important and most needed in the early stages.

These technologies also have real benefits; for example, they make it easier to find related articles by the same author. A unique researcher identifier like ORCID especially helps when someone changes their name, or in countries like China where a billion people seem to share about 1000 unique names. But there is no need for ResearchGate to put the number of publications and citations in huge numbers on the main profile page. (The prominent number of followers on Twitter profile pages also makes it less sympathetic in my view and needlessly promotes competition and inequality. Twitter is not my work; artificial competition is even more out of place there.)

Open Review is a great option if you are confident about your work but fear that reviewers will be biased. Sometimes, however, it is hard to judge how good your work is, and it is nice to have someone discreetly point to problems with your manuscript. Especially in interdisciplinary work it is easy to miss something a peer reviewer would notice, while your network may not include someone from the other discipline whom you can ask to read the manuscript.

Once an article, code or dataset is published, it is fair game. That is the point where I support Open Science. For example, publishing Open Access is better than publishing behind a pay-wall. If there is a reasonable chance of re-use, publishing data and code helps science progress and should be rewarded.

Still, I would not make a fetish out of it. I made the data available for my article on benchmarking homogenisation algorithms. This is an ISI highly-cited article, but I only know of one person having used the data. For less important papers, publishing data can quickly become additional work without any benefit. I prefer nudging people towards Open Science over making it obligatory.

The main beneficiary of publishing data and code is your future self; no one is more likely to continue your work. This should be an important incentive. Another incentive are Open Science "badges": icons presented next to the article title indicating whether the study was preregistered and provides open data and open materials (code). The introduction of these badges in the journal "Psychological Science" quickly increased the percentage of articles with available data to almost 40%.

The conference was organised by FORCE11, a community interested in future research communication and e-scholarship. There are already a lot of tools for the open, findable and well-connected world of the future, but their adoption could go faster. So the theme of this year's conference was "changing the culture".

Open Access


Christopher Jackson; on the right. (I hope I am allowed to repeat his joke.)
A main address was given by Christopher Jackson. He has published over 150 scientific articles, but only became aware of how weird the scientific publishing system is when he joined ResearchGate, a social network for scientists, and was not allowed to put many of his articles on it because the publishers hold the copyright and do not allow this.

The frequent requests for copies of his articles on ResearchGate also made him aware of how many scientists have trouble accessing the scientific literature due to pay-walls.

Another keynote speaker, Diego Gómez, was threatened with up to eight years in jail for making scientific articles accessible. His university, the Universidad del Quindío in Colombia, spends more on licenses for scientific journals ($375,000) than on producing scientific knowledge itself ($253,000).



The lack of access to the scientific literature makes research in poorer countries a lot harder, but even I am regularly unable to download important articles and have to ask the authors for a copy or ask our library to order a photocopy from elsewhere, although the University of Bonn is not a particularly poor university.

Also non-scientists may benefit from being able to read scientific articles, although when it is important I would prefer to consult an expert over mistakenly thinking I got the gist of an article in another field. Sometimes a copy of the original manuscript can be found on one of the authors' homepages or in a repository. Google (Scholar) and the really handy browser add-on Unpaywall can help find those using the Open Access DOI database.
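For those who like to script this, the database behind Unpaywall can, as far as I know, also be queried directly through a small web API. A hedged sketch, with the endpoint and field names to the best of my knowledge and a placeholder DOI:

```python
# Sketch: ask the Unpaywall API for a free, legal full-text location of a DOI.
# The service asks for an email address as identification; use your own.
import requests

def free_copy(doi, email="you@example.org"):       # placeholder address
    url = f"https://api.unpaywall.org/v2/{doi}"
    record = requests.get(url, params={"email": email}, timeout=30).json()
    location = record.get("best_oa_location")      # None if no open copy is known
    return location["url"] if location else None

print(free_copy("10.1234/placeholder-doi"))        # replace with a real DOI
```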

Sharing passwords and Sci-Hub are also solutions, but illegal ones. The real solutions for making research more accessible are Open Access publishing and repositories for manuscripts. By now about half of the recently published articles are Open Access, and at this pace all articles would be Open Access by 2040. Interestingly, the largest fraction of the publicly available articles does not have an Open Access license; this is also called bronze Open Access. It means that the download possibility could be revoked again.
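That 2040 figure is a simple extrapolation. A back-of-the-envelope version with assumed round numbers (an open share of about half today, growing by an assumed two percentage points per year) lands in the same ballpark:

```python
# Toy extrapolation with assumed numbers, only to illustrate the ballpark.
share, year = 0.5, 2017   # assumed open share of newly published articles
rate = 0.02               # assumed yearly growth of that share
while share < 1.0:
    share += rate
    year += 1
print(year)               # about 2042 with these assumptions
```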

The US National Institutes of Health and the European Union mandate that the research they support be published Open Access.

A problem with Open Access journals can be that some are only interested in the publication fees and do not care about the quality. These predatory journals are bad for the reputation of real Open Access journals, especially in the eyes of the public.

I have a hard time believing that the authors do not know that these journals are predatory. Next to the sting operations that reveal that certain journals will publish anything, it would be nice to also have sting predatory journals that openly email authors that they will accept any trash, and see whether that scares the authors away.

Jeffrey Beall used to keep a list of predatory journals, but had to stop after legal pressure from these frauds. The publishing firm Cabell has now launched its own proprietary (pay-walled) blacklist, which already contains 6,000 journals and is growing fast.

Preprint repositories

Before a manuscript is submitted to a journal, the authors naturally still hold the copyright. They can thus upload the manuscript to a database, a so-called preprint or institutional repository. Unfortunately some publishers claim this constitutes publishing the manuscript and refuse to publish it because it is no longer new. However, most publishers accept the publication of the manuscript as it was before submission. A smaller number are also okay with the final version being published on the authors' homepages or in repositories.

Where a good Open Access journal exists, we should really try to use it. Where it is allowed, we should upload our manuscripts to repositories.

Good news for the readers of this blog is that a repository for the Earth sciences was opened last week: EarthArXiv. The AGU will also demonstrate its preprint repository at this year's Fall Meeting; for details see my previous post. EarthArXiv already has 15 climate-related preprints.

This November a new OSF-based archive also started: MarXiv, not for Marxists, but for the marine-conservation and marine-climate sciences.
    When we combine the repositories with peer review organised by the scientific community itself, we will no longer need pay-walling scientific publishers. This can be done in a much more informative way than currently, where the reader only knows that the paper was apparently good enough for the journal, but not why it is a good article, nor how it fits into the (later published) literature. With Grassroots scientific publishing we can do a much better job.

    One way the reviews at a Grassroots journal can be better is by openly assessing the quality of the work. Now all we know is that the study was sufficiently interesting for some journal at that time for whatever reason. What I did not realise before Berlin is that this also wastes a lot of reviewing time. Traditional journals waste resources on manuscripts that are valid but are rejected because they are seen as not important enough for the journal. For example, Frontiers estimated that journals review 2.4 million manuscripts and have to bounce about 1 million valid papers.

    On average scientists pay $5,000 per published article, and this while scientists do most of the work for free (writing, reviewing, editing) and the actual costs are a few hundred dollars. The money we save can be used for research. In the light of these numbers it is actually amazing that Elsevier only makes a profit of 35 to 50%. I guess their CEO's salary eats into the profits.

    Preprints would also have the advantage of making studies available faster. Open Access makes text and data mining easier, which helps in finding all articles on molecule M or receptor R. The first publishers are using text mining and artificial intelligence to suggest suitable peer reviewers to their editors. (I would prefer editors who know their field.) It would also help in detecting plagiarism and even statistical errors.

    (Before our machine overlords find out, let me admit that I did not always write the model description of the weather prediction model I used from scratch.)



    Impact factors

    Another issue Christopher Jackson highlighted is the madness of the Journal Impact Factor (JIF or IF). It measures how often an average article in a journal is cited in the first two or five years after publication. That is quite useful for librarians to get an overview of which journals to subscribe to. The problem begins when the impact factor is used to determine the quality of a journal or of the articles in it.
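For reference, the standard two-year variant is computed roughly as follows (my paraphrase of the usual definition):

$$\mathrm{JIF}_{2018} \approx \frac{\text{citations received in 2018 by items published in 2016 and 2017}}{\text{number of citable items published in 2016 and 2017}}$$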

    How common this is, is actually something I do not know. For my own field I think I have a reasonable feeling for the quality of the journals, and it is independent of the impact factor. More focussed journals tend to have smaller impact factors, but that does not mean they are less good. Boundary-Layer Meteorology is certainly not worse than the Journal of Geophysical Research; the former has an Impact Factor of 2.573, the latter of 3.454. If you did a boundary layer study it would be madness to publish it in a more general geophysical journal where the chance is smaller that relevant colleagues will read it. Climate journals have higher impact factors than meteorological journals because meteorologists mainly cite each other, while many sciences build on climatology. When the German meteorological journal MetZet was still a pay-walled journal it had a low impact factor because not many people outside of Germany had a subscription, but the quality of the peer review and the articles was excellent.

    I would hope that reviewers making funding and hiring decisions know the journals in their field, take these kinds of effects into account and read the articles themselves. The [[San Francisco Declaration on Research Assessment]] (DORA) rejects the use of the impact factor for such decisions. In Germany it is officially forbidden to judge individual scientists and small groups based on bibliographic measures such as the number of articles times the impact factor of the journals, although I am not sure whether everybody knows this. Imperial College recently adopted similar rules:
    “the College should be leading by example by signalling that it assesses research on the basis of inherent quality rather than by where it is published”
    “eliminate undue reliance on the use of journal-based metrics, such as JIFs, in funding, appointment, and promotion considerations”
    The relationship between the number of citations an article can expect and the impact factor is weak because there is enormous spread. Jackson showed this figure.



    This could well be a feature and not a bug. We would like to measure quality, not estimate the (future) number of citations of an article. For my own articles, I do not see much correlation between my subjective quality assessment and the number of citations. Which journal you can get into may well be a better quality measure than individual citation counts. (The best assessment is reading the articles.)

    The biggest problem is when journals, often commercial entities, start optimising for the number of citations rather than for quality. There are many ways to get more citations, and thus a higher impact factor, other than providing the best possible quality control. An article that reviews the state of a scientific field typically gets a lot of citations, especially if written by the main people in the field; nearly every article will mention it in the introduction. Review papers are useful, but we do not need a new one every year. Articles with many authors typically get more citations. Articles on topics many scientists work on get more citations. For Science and Nature it is important to get coverage in the mainstream press, which is also read by scientists and leads to more citations.

    Reading articles is naturally work. I would suggest reducing the number of reviews.

    Attribution, credit

    Traditionally one gets credit for scientific work by being an author of a scientific paper. However, with increased collaboration and interdisciplinary work, author lists have become longer and longer. The publish-or-perish system likely contributed as well: outsourcing part of the work is often more efficient than doing it yourself, while the person doing a small part of the analysis is happy to have another paper on their publish-or-perish list.

    What is missing from such a system is credit for a multitude of other important tasks. How does one value non-traditional outputs supplied by researchers: code, software, data, design, standards, models, MOOC lectures, newspaper articles, blog posts, community-engaged research and citizen science? Someone even mentioned musicals.

    A related question is who should be credited: technicians, proposal writers, data providers? As far as I know it would be against authorship rules to put people in such roles in the author list, but they do work that is important and needs to be done, and thus needs to be credited somehow. A work-around is to invite them to help edit the manuscript, but it would be good to have systems in which various roles are credited. Designing such a system is hard.

    One is tempted to make such a credit system very precise, but ambiguity also has its advantages in dealing with the messiness of reality. I once started a study with one colleague. Most of this study did not work out and the final article was only about a part of it. A second colleague helped with that part. For the total work the first colleague had done more, for the part that was published the second one had. Both justifiably felt that they should be second author. Do you get credit for the work or for the article?

    Later the colleague who had become third author of this paper wrote another study in which I helped. It was clear that I should have been the second author, but in retaliation he made me the third author. The second author wrote several emails saying that this was insane, not knowing what was going on, but to no avail. A too precise credit system would leave no room for such retaliation tactics to clear the air for future collaborations.

    In one session various systems of credit "badges" were shown and tried out. What seemed to work best was a short description of the work done by every author, similar to the detailed credits at the end of a movie.

    This year a colleague wrote on a blog that he did not agree with a sentence of an article he was an author of. I did not know that was possible; in my view authors are responsible for the entire article. Maybe we should also split the author list into authors who vouch with their name and reputation for the quality of the full article and honorary authors who only contributed a small part. This colleague could then be an honorary author.

    LinkedIn endorsements were criticised because they are not transparent and because they make it harder to change your focus, as the old endorsements and contacts stick.

    Pre-registration

    Some fields of study have trouble replicating published results. These are mostly empirical fields where single studies to a large extent stand on their own and are not woven together by a net of theories.

    One of the problems is that only interesting findings are published; if no effect is found, the study is aborted. In a field with strong theoretical expectations, finding no effect when one is expected is also interesting, but if no one expected a relationship between A and B, finding no relationship between A and B is not interesting.

    This becomes a problem when there is no relationship between A and B, but multiple experiments or trials are made and some find a fluke relationship by chance. If only those get published, that gives a wrong impression. This problem can be tackled by registering trials before they are made, which is becoming more common in medicine.
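The arithmetic behind this is simple: with a significance level $\alpha$ and $k$ independent trials of a non-existent effect, the chance that at least one trial looks significant is

$$P(\text{at least one fluke}) = 1 - (1 - \alpha)^{k},$$

which for $\alpha = 0.05$ and $k = 20$ trials is already about 64%.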

    A related problem is p-hacking and hypothesis generation after the results are known (HARKing). When a relationship would be statistically significant if only one outlier were not there, it is tempting to find a reason why the outlier is a measurement error and should be removed.

    Similarly, the data can be analysed in many different ways to study the same question, and one of those ways may be statistically significant by chance. This is also called "researcher degrees of freedom" or "the garden of forking paths". The Center for Open Science has made a tool where you can pre-register your analysis before the data is gathered or analysed, to reduce the freedom to falsely obtain significant results this way.
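As a small illustration of how such forking paths inflate false positives, here is a toy simulation; the analysis "paths" are invented for the example and not taken from any real study:

```python
# Toy simulation: analyse the same null data (no true relationship) along
# several plausible-looking "forking paths" and count how often at least one
# path gives p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 2000
false_positives = 0

for _ in range(n_experiments):
    x = rng.normal(size=30)
    y = rng.normal(size=30)                       # independent of x by construction
    keep = np.abs(x) < 2                          # path 3: drop "outliers" in x
    p_values = [
        stats.pearsonr(x, y)[1],                  # path 1: use all data
        stats.pearsonr(x[:20], y[:20])[1],        # path 2: "early" subset only
        stats.pearsonr(x[keep], y[keep])[1],      # path 3: outliers removed
        stats.pearsonr(x, np.log(y - y.min() + 1))[1],  # path 4: transform y
    ]
    if min(p_values) < 0.05:
        false_positives += 1

print(false_positives / n_experiments)            # well above the nominal 0.05
```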



    A beautiful example of the different answers one can get analysing the same data for the same question. I found this graph via a FiveThirtyEight article, which is also otherwise highly recommended: "Science Isn’t Broken. It’s just a hell of a lot harder than we give it credit for."

    These kinds of problems may be less severe in the natural sciences, but avoiding them can still make the science more solid. Before Berlin I was hesitant about pre-registering analyses because in my work every analysis is different, which makes it harder to know in detail in advance how the analysis should go; there are also valid outliers that need to be removed, selecting the best study region needs a look at the data, and so on.

    However, what I did not realise, although it is quite trivial, is that you can do the pre-registered analysis but also additional analyses, and simply mark them as such. So if you can do a better analysis after looking at the data, you can still do so. One of the problems of pre-registration is that quite often people do not do the analysis in the registered way and reviewers mostly do not check this.

    In the homogenisation benchmarking study of the ISTI we will describe the assessment measures in advance. This is mostly because the benchmarking participants have a right to know how their homogenisation algorithms will be judged, but it can also be seen as pre-registration of the analysis.

    To stimulate the adoption of pre-registration, the Center for Open Science has designed Open Science badges, which can be displayed with articles that meet the criteria. The pre-registration has to be done on an external site where the text cannot be changed afterwards. The pre-registration can be kept undisclosed for up to two years. To get things started they even award 1,000 prizes of $1,000 for pre-registered studies.

    The next step would be journals that review "registered reports", which are peer reviewed before the results are in. This should stimulate the publication of negative (no effect found) results. (There is still a final review when the results are in.)

    Quick hits

    Those were the main things I learned, now some quick hits.

    With the [[annotation system]] you can add comments to all web pages and PDF files. People may know annotation from Hypothes.is, which is used by ClimateFeedback to add comments to press articles on climate change. A similar initiative is PaperHive, which sells its system as collaborative reading and showed an example of students jointly reading a paper for class, annotating difficult terms and passages. It additionally provides channels for private collaboration, literature management and search, and it has already been used for the peer review (proofreading) of academic books. Both now have groups/channels that allow groups to make or read annotations, as well as private annotations, which can be used for your own paper archive. Web annotations aimed at the humanities are made by Pund.it.
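Hypothes.is also has a public web API, so annotations can be read by scripts. A hedged sketch, with the endpoint and field names as far as I know them and an arbitrary example URL:

```python
# Sketch: list public Hypothes.is annotations on a given web page via the
# public search API; check the API documentation before relying on this.
import requests

def public_annotations(page_url):
    r = requests.get("https://api.hypothes.is/api/search",
                     params={"uri": page_url, "limit": 20}, timeout=30)
    for row in r.json().get("rows", []):
        yield row["user"], row.get("text", "")

for user, text in public_annotations("https://climatefeedback.org/"):
    print(user, ":", text[:80])
```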

    Since February this year, web annotation is a World Wide Web Consortium (W3C) standard. This will hopefully mean that web browsers will start adding annotation in their default configuration and that it will become possible to comment on every homepage. This will likely lead to public annotation streams going down to the level of YouTube comments. Some moderation will also be needed for the public channel, for example to combat doxing. PaperHive is a German organisation and thus removes hate speech.

    Peer Community In (PCI) is a system to collaboratively peer review manuscripts that can later be sent to an official journal.

    The project OpenUp studied a large number of Open Peer Review systems and their pros and cons.

    Do It Yourself Science. Not sure it is science, but great when people are having fun with science. When the quality level is right, you could say it is citizen science led by the citizens themselves. (What happened to the gentlemen scientists?)

    Philica: Instant academic publishing with transparent peer-review.



    Unlocking references from the literature: The Initiative for Open Citations. See also their conference abstract.

    I never realised there was an organisation behind the Digital Object Identifiers (DOIs) for scientific articles: CrossRef. It is a collaboration of about eight thousand scientific publishers. For other digital sources there are other organisations, while the main system is run by the International DOI Foundation. The DOIs for data are handled, amongst others, by DataCite. CrossRef is working on a system where you can also see the webpages that cite scientific articles, which they call "event data". For example, this blog has cited 142 articles with a DOI. CrossRef will also take web annotations into account.
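CrossRef also runs a public REST API from which anyone can retrieve the metadata behind a DOI. A hedged sketch, with field names to the best of my knowledge and a placeholder DOI:

```python
# Sketch: fetch basic article metadata for a DOI from the CrossRef REST API.
import requests

def crossref_metadata(doi):
    r = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    msg = r.json()["message"]
    return {
        "title": msg.get("title", [""])[0],
        "journal": msg.get("container-title", [""])[0],
        "cited_by": msg.get("is-referenced-by-count"),   # citation count as recorded by CrossRef
    }

print(crossref_metadata("10.1234/placeholder-doi"))      # replace with a real DOI
```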

    Climate science was well represented at this conference. There were posters on open data for the Southern Ocean and on the data citation of the CMIP6 climate model ensemble. Shelley Stall of AGU talked about making FAIR and Open data the default for Earth and space science. (Et moi.)



    In the Life Sciences they are trying to establish "micro publications": the publication of a small result or dataset, several of which can later be combined with a narrative into a full article.

    A new Open Science Journal: Research Ideas and Outcomes (RIO), which publishes all outputs along the research cycle, from research ideas, proposals, to data, software and articles. They are interested in all areas of science, technology, humanities and the social sciences.

    Collaborative writing tools are coming of age, for example Overleaf for people using LaTeX. Google Docs and Microsoft Word Live also do the trick.

    Ironically, Elsevier was one of the sponsors. Their brochure suggests they are one of the nice guys, serving humanity with cutting-edge technology.

    The Web of Knowledge/Science (a more selective version of Google Scholar) moved from Thomson Reuters to Clarivate Analytics, together with the Journal Citation Reports that computes the Journal Impact Factors.

    Publons has set up a system where researchers can get public credit for their (anonymous) peer reviews. It is hoped that this stimulates scientists to do more reviews.

    As part of Wikimedia, best known for Wikipedia, people are building up a multilingual database of facts: Wikidata. As in Wikipedia, volunteers build up the database and sources need to be cited to make sure the facts are right. People are still working on software to make contributing easier for people who are not data scientists and do not dream of the semantic web every night.
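Wikidata can already be queried by machines through its public SPARQL endpoint. A small hedged sketch; the property ID P569 should be the date of birth, if I remember the identifiers correctly:

```python
# Sketch: ask the public Wikidata SPARQL endpoint a simple factual question.
import requests

query = """
SELECT ?personLabel ?birth WHERE {
  ?person rdfs:label "Alexander von Humboldt"@en .
  ?person wdt:P569 ?birth .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""
r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": query, "format": "json"}, timeout=60)
for row in r.json()["results"]["bindings"]:
    print(row["personLabel"]["value"], row["birth"]["value"])
```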

    Final thoughts

    For a conference about science, there was relatively little science. One could have made a randomised controlled trial to see the influence of publishing your manuscript on a preprint server. Instead, the estimate that articles also submitted to ArXiv gather more citations (18%) was based on observational data, and the difference could simply be that scientists put more work into spreading their best articles.

    The data manager at CERN argued that close collaboration with the scientists can help in designing interfaces that promote the use of Open Science tools. Sometimes small changes produce large increases in the adoption of a tool. More research into the needs of scientists could also help in creating tools in a way that makes them useful.

    Related reading, resources

    The easiest access to the talks of the FORCE2017 conference is via the "collaborative note taking" Google Doc

    Videos of last year's FORCE conference

    Peer review

    The Times Literary Supplement: Peer review: The end of an error? by ArXiving mathematician Timothy Gowers

    Peer review at the crossroads: overview over the various open review options, advantages and acceptance

    Jon Tennant and many colleagues: A multi-disciplinary perspective on emergent and future innovations in peer review

    My new idea: Grassroots scientific publishing

    Pre-prints

    The Earth sciences no longer need the publishers for publishing
     
    ArXivist. A machine learning app that suggests the most relevant new ArXiv manuscripts in a daily email

    The Stars Are Aligning for Preprints. 2017 may be considered the ‘year of the preprint’

    Open Science


    The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles

    Open Science MOOC (under development) already has an extensive resources page

    Metadata2020: Help us improve the quality of metadata for research. They are interested in metadata important for discoverability and reuse of data

    ‘Kudos’ promises to help scientists promote their papers to new audiences. For example with plain-language summaries and tools to measure which dissemination actions were effective

    John P. A. Ioannidis and colleagues: Bibliometrics: Is your most cited work your best? Survey finds that highly cited authors feel their best work is among their most cited articles. It is the same for me, although looking at all my articles the correlation is not strong

    Lorraine Hwang and colleagues in Earth and Space Science: Software and the scientist: Coding and citation practices in geodynamics, 2017

    Neuroskeptic: Is Reproducibility Really Central to Science?