An excellent interface
From earth system sciences and energy research to biotechnology and medicine: Doctoral researchers from a wide range of research fields gather at the Helmholtz School for Data Science in Life, Earth and Energy (HDS-LEE) in Jülich. How do the PhDs benefit from the diversity of the three disciplines?
Laura Helleckes works with astonishing microscopic organisms on a daily basis. These organisms can be used to make synthetic materials, give cleaning products their power, or produce substances such as glutamate for the food sector or insulin as an active pharmaceutical ingredient.
Tiny but powerful, the organisms Helleckes works with are often bacteria such as Escherichia coli and Corynebacterium glutamicum. She wants to find out: What temperature is ideal for the bacteria to thrive? Which nutrient solutions are good for them? And what are the optimal living conditions that allow the microorganisms to make these valuable substances from sugar or crop residues as quickly and effectively as possible?
As she works towards her doctorate at IBG-1, the biotechnology institute at Forschungszentrum Jülich, Helleckes evaluates millions of data points of relevance to these questions – as a data scientist. “It’s incredible what data analyses can tell you about them.”
Nature provides the templates
Genes are the blueprint for life and allow bacteria to create remarkable things – and this has fascinated Helleckes from a young age. “Nature provides the templates for many processes. When we humans need a specific substance, all we have to do is look very carefully – there is always some organism or other that already produces it or a very similar molecule.”
How great it would be if we could use nature’s tricks as a sustainable way of manufacturing products – such as bioplastics from agricultural waste. Helleckes is studying biotechnology, and she herself is now right on the front line of research into nature’s capabilities.
The international graduate school HDS-LEE provides an interdisciplinary environment for training the next generation of data scientists in close contact with domain-specific knowledge and research. It is part of the newly established JARA Center for Simulation and Data Sciences, the German competence center for computing and data infrastructures, user support, and methodological and disciplinary research in simulation, data analysis, and high-performance computing technologies.
JARA is a unique cooperation between Helmholtz Research Center Jülich and RWTH Aachen University with strong international visibility.
Photo: Forschungszentrum Jülich from above - Credit: Forschungszentrum Jülich
Knowledge in biotechnology and the data sciences
In practical terms, this means that her biotechnology colleagues in the lab start off the tests using microorganisms and try to identify the best way to fuel the metabolic process they use to convert the sugar into the required substance. Helleckes uses digital methods to track the process. “I build a digital twin of the test and conduct a simulation on the computer to look at what would change if we were to tweak things slightly. The researchers test the best variants in the lab.”
This means round after round of collecting data, evaluating it, running it through the simulation, testing, evaluating results, making suggestions for new trial variants, discussing with the colleagues in the lab, and then starting the whole process all over again. But none of this would work if Helleckes didn’t collaborate closely with the biotechnologists and process technicians in the lab. “I need to know an equal amount about biotechnology and the data sciences to do this work – and this has been an outstanding place to learn it.” “This place” is the Helmholtz School for Data Science in Life, Earth and Energy, or HDS-LEE.
One of the special features of HDS-LEE is that the research topics addressed by the school’s doctoral candidates are just as diverse as the disciplines covered. Candidates in the earth system sciences are exploring water cycles in Africa, those in the energy sciences are analyzing fluctuations in the power supply over longer periods of time, and graduate students in neurobiology are looking at brain waves in apes to arrive at a better understanding of thought processes.
Ideal conditions for the data sciences
But why would disciplines as different as earth, life, and energy sciences team up to create a joint graduate school? “Because data sciences are a methodological discipline that is universally applicable. The fact that these three fields cover different content isn’t a problem for data scientists, and the questions that arise in the life, earth, and energy sciences are very similar from a methodological point of view,” explains the school’s spokesperson, Wolfgang Wiechert.
All three areas equally involve large volumes of different types of data from various sources, known as heterogeneous data in the field. Laboratories and research processes are also becoming increasingly automated and digitalized in all these areas, creating ideal conditions for the data sciences.
The specialist areas benefit from one another
Biotechnologists, for example, are characterizing microorganisms using mass spectrometers, microscopic image data, or gene sequencing.
Earth scientists are studying the planetary system using various means, including satellites, measurements taken by balloons in the stratosphere, and sensor networks in mobile labs. Researchers in the field of energy obtain information on the load capacity of energy networks by first looking at measurements for each individual form of energy – from solar to wind to nuclear – from various sources. And because data and its measurement sources vary so much, all three disciplines frequently work with simulations so they can combine these diverse aspects to create an overall picture.
Known as data fusion, this process is particularly reliant on machine learning because it is highly complex. Dr. Wiechert explains, “The methods are very transferable as a result, so the specialist areas benefit from one another.” For example, the algorithms used to analyze image data on plants can also be transferred to microscopic images of microorganisms.
“We need these close connections between applied disciplines and data science. Because the volumes of data we have to analyze to make optimal advancements in our research have long been far too vast and diverse to use them without data science,” Dr. Wiechert continues.
Which is why, when the Helmholtz Association encouraged its members to set up graduate schools at the interface between conventional scientific disciplines and data science five years ago, Dr. Wiechert immediately rounded up his colleagues from a number of different Helmholtz Centers and universities and asked them: Isn’t this something we could do?
A link between experts and data pros
In addition to Forschungszentrum Jülich, the other institutions that make up HDS-LEE today are RWTH Aachen, the University of Cologne, the German Aerospace Center (DLR), the Max Planck Institute for Iron Research (MPIE), and as of 2022, University Hospital Aachen.
The school was set up as a joint endeavor between these institutions under the auspices of the Helmholtz Information & Data Science Academy (HIDA). It thus forms part of a network of six research schools that are training over 250 fully funded doctoral candidates throughout Germany at the interface between data science and the six Helmholtz research fields. Right now, there are 62 PhD students on the graduate program at HDS-LEE alone.
Each of the doctoral candidates is supported by two principal investigators (PIs) during their three-year research period. A special feature of HDS-LEE is that many PIs have expertise in both the data sciences and an applied discipline. “These scientists are the link between experts and data pros,” explains Dr. Wiechert.
A vibrant and inspiring community
New positions advertised at HDS-LEE attract high levels of interest. When the graduate school opened its doors in 2019, around 300 students applied for the 24 places in the first cohort, with the number increasing further still in 2021, when 400 competed for the places in the second cohort. Students apply directly for the specific research projects.
In addition to the 45 regular doctoral candidates, the school currently has 17 associates. These graduate students are already pursuing research at a partner institute using funding from other sources and are working on “topics that are a very good fit for the HDS-LEE profile,” Wiechert says. “The aim is for them to benefit in equal measure from the school’s versatile program accompanying their studies as well as the extensive research network.”
Laura Helleckes, who is a doctoral candidate at school spokesperson Dr. Wiechert’s institute, is one of the associates and speaker for the PhDs. She enjoys being part of this vibrant, inspiring community, which sometimes sees her meeting doctoral candidates from RWTH Aachen or students from Jülich or Cologne for lunch. Everyone comes together for an online coffee break every two weeks, and the HDS-LEE Women regularly meet up for an extra session.
All of the doctoral candidates go on a retreat together once a year, where the agenda includes team building activities and research discussions. The students present their projects to one another at poster sessions. “The parallels quickly become clear when we discuss our research,” says Helleckes. “Once you abstract the methods, there’s a lot that can be transferred over.”
Helleckes especially appreciates the support program that accompanies her studies at HDS-LEE. Lectures and courses offer a good insight into data science methods ranging from deep learning and time series analysis to probability-based modeling.
During hands-on sessions, doctoral candidates can take a playful approach to testing specific ways of using the methods. School coordinator Ramona Kloß also organizes a set of soft skill courses on topics including time management, interdisciplinary communication, and scientific writing. Kloß says, “We put together many workshops at the doctoral candidates’ request and support them in getting their own events up and running.”
Changing the world with research
And it goes without saying that HDS-LEE is continually growing and evolving. Collaborations with project partners are being expanded and external resources acquired – one example being the DataPLANT project, which is funded by the German Research Foundation as part of the National Research Data Infrastructure. Over the next five years, there are plans to develop a comprehensive data infrastructure for modern plant research with support from PhD projects at HDS-LEE.
The school has also worked closely with the industrial sector since its inception. Enterprises such as BASF, Bayer, and the energy departments at Siemens regularly inquire about collaborating. Graduate students at HDS-LEE have the opportunity to take a leave of absence for an internship even before completing their doctoral studies. “Three graduates have already completed their PhDs and found positions at universities in Oslo and Aachen and at an IT company in Aachen,” says Kloß.
And what about Laura Helleckes – would she prefer to work in industry or academia? She laughs. “Either one; I am a researcher through and through – at the university would be great, but maybe at a company, too.” She says she finds it inspiring for her research to contribute to boosting the efficiency of production processes for creating key substances and facilitating the transition from fossil to sustainable resources as quickly as possible.
“That’s the really fantastic thing about research – developing new approaches and ideas that move us humans forward.” And what is her goal for her career? “I’d like to maybe become a professor myself at some point – at the interface between biotechnology and data science.”