Datathon for climate research

The challenges of climate change are huge, and that holds for research as well. At the HIDA Datathon for Grand Challenges on Climate Change, participants from all over the world demonstrated how data science can help answer climate research questions - with computing power, team spirit and in five different application areas.

Virtual event, but concrete questions: 94 participants searched for data-science-based solutions to climate research questions at the HIDA Datathon. Photo: Verena Brüning

It takes not only endurance but also speed to bring a marathon runner to the finish line. Both qualities were also in demand at the HIDA Datathon for Grand Challenges on Climate Change, which took place on November 5 and 6 as a virtual event - paired with a passion for data science methods and the willingness to apply this knowledge to current research questions. These are urgent questions about the challenges of climate change that can hardly be put off and on which scientists at various Helmholtz Centres are currently working. Participants had a total of five challenges to choose from:


- Map local city climate from satellite data automatically: Zhu Xiaoxiang and her team at DLR ask you to come up with machine learning models that create reliable Local Climate Zone maps for urban climatologists from satellite imagery.

- Show the flow of fish larvae in warming oceans in an interactive visualization: Help Willi Rath and the Ocean Dynamics Team at GEOMAR to investigate human-induced changes at the beginning of the food chain.

- Finding water with cosmic rays: Martin Schrön collects soil moisture data with a neutron detector mounted on a car; he and his team at UFZ challenge you to develop a self-improving computer model that reliably detects landscape features in a set of images they’ve collected on their rides through Germany.

- Spot the mistake in 50 million data points, cleverly: Help Lennart Schmidt from the team around Corinna Rebman at UFZ to find a clever method to automatically flag suspicious and bad data points in soil-moisture measurements in the German forest area Hohes Holz.

- Developing reliable forecasts for drought: Work with simulations spanning several thousand years and help Eduardo Zorita and his team from the Helmholtz Zentrum Geesthacht to develop a model that reliably predicts rain and snow for the following fall and winter season.


The challenge participants were supported by HIDA partners: as technology partners, the Jülich Supercomputing Center and the Steinbuch Center for Computing of KIT provided HPC systems and computing time, and each assigned one staff member for support. External partners of the event were Deloitte and NVIDIA. In addition to seven experienced data scientists from the centers, employees of the companies supported the participants as mentors.

A total of 94 participants worked actively on finding solutions - a considerable number for an event that was purely virtual due to coronavirus restrictions. Of the 13 teams that formed during the first morning, ten submitted solution videos within the required time. This required a great deal of communication and coordination: more than 5,000 messages were exchanged over the two days via the Slack channel set up by HIDA for this purpose. Data scientists from Helmholtz Centres were particularly well represented in the challenges: almost half of the participants came from twelve different Helmholtz institutions. International guests also enriched the teams: participants from a total of seven countries took part in the intensive work on the data sets.

“I was really impressed about the quality of the solutions that have been achieved, especially when we consider the short time the participants had available for the challenges”, says jury member Hanna Meyer, Professor for Remote Sensing and Spatial Modeling at the Westfälische Wilhelms-Universität Münster, about the submitted contributions. “I’m convinced that some of the proposals can be regarded as a step forward in using data science in the context of climate change research.”

In the end, five teams won over the jury with their presentations. Juror Marcel Dickow from the Federal Environment Agency (Umweltbundesamt) was particularly impressed by the participants' ability to reflect on the specifics of the challenges they faced: “I think that’s something what we need: That people are not only able to find solutions but also to understand the problem and to adapt their solutions to the problem.”

It is a declared goal of the Datathon that this ability to reflect should actually contribute to solving the respective research questions. Challenge giver Eduardo Zorita is confident about the results found in his challenge to predict droughts: “All three teams did an excellent work within the tight timeframe, exploring different options. One of these solutions based on Random Forest did show promising results. They will be now further pursued and very likely included in our ongoing Haicu pilot project.” Martin Schrön is also surprised by the good results: “It has given our idea a real push.” He plans to invite the two winning groups to discuss a possible automation of the script.

In the following video, the five challenge givers explain in which application areas their respective challenges lie and what specific problems had to be solved:

The solutions presented by the teams

Have a look at all the teams presenting their solutions in this video. The winning teams appear at the following timestamps:

- Team Heidelbären (30:34)

- Team Moisture Magic (36:50)

- Team Neutrons_net (41:40)

- Team dlr_lcz42_uncertainty (50:15)

- Team Weatherpeople (56:10)