Collaborating internationally:

Israel & UK Exchange - Projects 2023

The following projects were submitted by our Israeli and British partners in 2023, with the aim of sparking your enthusiasm for one of these research approaches and advancing it with your expertise. Take a look at this year's proposals!

Our exchange projects


Click on the cards to learn more about each topic, the mentors, and the conditions for participation.

The projects from the United Kingdom

Caterina Doglioni - University of Manchester

Machine learning algorithms for data compression in high energy physics and beyond

We are a professor (the applicant, Caterina Doglioni), postdoctoral researchers and PhD students based at the University of Manchester, and this project is co-supervised by a PhD student at Lund University (Alex Ekman) who is part of the HELIOS graduate school. We are members of the ATLAS Collaboration at the LHC, and our research interests include searching for new physics phenomena that can be produced in proton-proton collisions, motivated by the presence of dark matter in our universe.
Within the SMARTHEP European Training Network that we coordinate, we work on real-time analysis, machine learning and heterogeneous computing infrastructures, and we are keen on FAIR, sustainable and green software.

www.smarthep.org

Caterina.doglioni@manchester.ac.uk

 

What is the data science project's research question? 

The Large Hadron Collider (LHC) hosts multiple large-scale particle physics experiments. These experiments look for answers to the big questions of our universe, such as the nature of the mysterious substance that makes up most of the matter in the universe (dark matter). The combined data output of the LHC experiments is roughly 1 Petabyte per day [1], and the upgraded High Luminosity LHC in 2029 will produce 10 times more data. Our research question is: How do we make sure that we can record all this data in our limited storage space?

In particular:

  • Would lossy compression enabled by machine learning be a viable solution, and would it be able to preserve sufficient accuracy in the data that we need to search for new dark matter particles? 
  • Would the same algorithm work for other kinds of data, since many other scientific collaborations are creating more and more data and need to optimize their storage as well? 

What data will be worked on? 

In the first instance, the student will work on Open Data recorded by experiments at the Large Hadron Collider, but we are also happy if the student comes with some data of their own because we are trying to see how this algorithm works for different disciplines.

What tasks will this project involve?

  • Run and document the performance of the existing model to compress data from LHC experiments
  • Improve the design and test the performance of different autoencoder models
  • Evaluate the performance compared to other lossy compression algorithms
  • If time allows and there is interest, try compressing other datasets outside of high-energy particle physics      
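As a hedged illustration of the evaluation task, the snippet below (not Baler itself, and synthetic data in place of LHC Open Data) compares a trivial lossy scheme (storing 64-bit floats as 32-bit) against a lossless zlib baseline, reporting the two metrics such a comparison needs: compression ratio and reconstruction error.

```python
import struct
import zlib

# Synthetic stand-in for event-level features; the real project would
# use LHC Open Data.
values = [0.1 * i + 0.000123 for i in range(1000)]

raw = struct.pack(f"{len(values)}d", *values)      # 64-bit floats
lossy = struct.pack(f"{len(values)}f", *values)    # 32-bit floats (lossy)
lossless = zlib.compress(raw)                      # lossless baseline

# Decode the lossy stream and measure the mean squared error it introduced.
decoded = struct.unpack(f"{len(lossy) // 4}f", lossy)
mse = sum((a - b) ** 2 for a, b in zip(values, decoded)) / len(values)

print(f"raw: {len(raw)} B, lossy: {len(lossy)} B, zlib: {len(lossless)} B")
print(f"lossy compression ratio: {len(raw) / len(lossy):.1f}, MSE: {mse:.3e}")
```

A learned compressor would aim to beat this ratio/error trade-off on data where simple bit-width reduction is too crude.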

What makes this project interesting to work on?

This project is tackling a problem that is very common in big science and industry: how to enable recording more data when we have limited resources for doing so. There are many algorithms that work for image or music compression, but there aren’t yet many that compress complex scientific data with many different features. The other interesting aspect is to understand how lossy compression modifies our data, and what the tolerance is for researchers doing the data analysis. 

What is the expected outcome?

Co-authorship of a research paper & contribution to software 

Is the project open source?

Yes

What infrastructure, programs and tools will be used?

The main software we will use and modify in this project is called “Baler”, an open source compression tool undergoing development at the Particle Physics divisions of Lund University and the University of Manchester. Baler uses autoencoder neural networks as a type of lossy machine learning-based compression to compress multi-dimensional data and evaluate the accuracy of the dataset after compression.
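Baler's actual models are autoencoder neural networks; as a minimal, hedged sketch of the same compress/decompress idea, the snippet below uses a linear "autoencoder" (truncated PCA via SVD) on synthetic tabular data, projecting 8 features down to a 3-dimensional latent space and measuring the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for tabular physics data: 500 events x 8 features,
# with most of the variance concentrated in 3 latent directions.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 8))
data = latent @ mixing + 0.01 * rng.normal(size=(500, 8))

mean = data.mean(axis=0)
centered = data - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
enc = vt[:3].T                      # 8 -> 3 "encoder" (top principal axes)

compressed = centered @ enc         # lossy latent representation
reconstructed = compressed @ enc.T + mean

ratio = data.shape[1] / compressed.shape[1]
rel_err = np.linalg.norm(data - reconstructed) / np.linalg.norm(data)
print(f"compression ratio {ratio:.1f}x, relative error {rel_err:.4f}")
```

A trained nonlinear autoencoder plays the same role as `enc`/`enc.T` here, but can exploit nonlinear structure that PCA cannot.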

Can the infrastructure, programs and tools be used remotely? 

This software can be used remotely, and we can offer computing resources that are accessible remotely for use in this project.

What skills are necessary for this project? 

Data analytics, statistics, Scientific computation, data mining, Machine learning, Deep learning, Software development, Python 

What level of experience is necessary for this project? 

Master, PhD or Postdoc

Who would be supervising the exchange participant?

Prof. Caterina Doglioni would be supervising the exchange participant together with Alexander Ekman (Lund University) and Pratik Jawahar (University of Manchester). The participant would be working in a team that also includes a Google Summer of Code student from the High Energy Physics Software Foundation.

Zahra Montazeri - University of Manchester

Modelling Realistic Appearance for Complex Materials 

 

I am a lecturer in the Department of Computer Science at the University of Manchester, specializing in Computer Graphics. Our lab focuses on physically-based rendering to generate realistic images of the virtual world given small samples of the real world. This can be used to improve realism in the metaverse, games and movies.

 

https://research.manchester.ac.uk/en/persons/zahra.montazeri

Zahra.montazeri@manchester.ac.uk

What is the data science project's research question?

How can we accurately reproduce complex materials in a virtual world, given real photographs captured from a real sample?

What data will be worked on?

Thousands of images taken from a small piece of material (e.g., cloth, metal, plastic) under different configurations of light and camera. The goal is to explore these images and map them to a continuous space.

What tasks will this project involve?

Using a sophisticated scanner, we scan a small piece of material and generate thousands of images to learn how the light interacts with the sample. The project involves studying the data and probably using learning techniques such as neural networks to define a continuous space to reproduce those materials in a virtual world. 

What makes this project interesting to work on? 

With the advancement of the metaverse and the need for virtual worlds, reproducing realistic materials is more crucial than ever. Appearance modelling is a field in Computer Graphics that studies techniques for bringing realism to the virtual world using physics, maths and programming. This project will improve all of these skill sets and, once completed, offer applications that are in high demand. 

What is the expected outcome? 

Co-authorship of a research paper

Is the data open source? 

Yes.

What infrastructure, programs and tools will be used? Can they be used remotely?

We only need a powerful computer and it can be done remotely.

What skills are necessary for this project?

Data analytics, statistics, Machine learning, Deep learning, Parallel/distributed programming with GPUs, Computer vision and image processing/analysis, Software engineering, Python, C/C++/C# 

What level of experience is necessary for this project? 

Master, PhD or Postdoc

Anirbit Mukherjee - University of Manchester

Solving Forward & Inverse PDEs via Neural Nets

Our group’s primary focus is provable neural training algorithms – which we have published at the top conferences and journals and multiple such works are under submission. Most recently we have also ventured into the mathematics of how PDEs can be solved by neural nets. We also have a number of ongoing experiments doing comparative tests between different neural methods of solving differential equations.           

 

https://sites.google.com/view/anirbit/home

anirbit.mukherjee@manchester.ac.uk

 

What is the data science project's research question? 

The exchange student will engage in developing theory at the interface of PDE solving and neural nets. In particular, we would try to understand

  1. how the size of the net affects the ability to solve PDEs and
  2. how parameters of a parametric PDE can be inferred by a neural net from the value of some solution of it at a few observation points. 

Although this would mostly be a mathematics project, the student could also choose to spend some of their time on experiments in these themes.

What data will be worked on? 

There is no externally sourced data that will be needed in the project. 

What tasks will this project involve?

The project will necessarily involve the student developing a rigorous understanding of some of the recent papers in which mathematical formalisms have been developed at this interface – such as the theory of Deep Operator Networks and Physics-Informed Neural Nets. Then the student will be set the task of proving the intended theorems for some simple PDEs.
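To make the "physics-informed" idea concrete, the hedged toy below minimizes the PDE residual of u''(x) = -sin(x) with u(0) = u(π) = 0 at collocation points, using a polynomial ansatz in place of a neural net (the papers mentioned above use networks trained by gradient descent; this is only the residual-minimization skeleton). The exact solution is sin(x).

```python
import numpy as np

# Collocation points in [0, pi].
xs = np.linspace(0.0, np.pi, 41)
degs = np.arange(1, 10)            # ansatz u(x) = sum_j c_j x^j  (u(0)=0 built in)

# Interior rows: enforce the PDE residual u''(x) + sin(x) = 0.
A_pde = np.stack([d * (d - 1) * xs[1:-1] ** (d - 2) for d in degs], axis=1)
b_pde = -np.sin(xs[1:-1])

# Boundary row: u(pi) = 0, weighted so it is enforced strongly.
A_bc = 100.0 * (np.pi ** degs)[None, :]
b_bc = np.array([0.0])

coeffs, *_ = np.linalg.lstsq(np.vstack([A_pde, A_bc]),
                             np.concatenate([b_pde, b_bc]), rcond=None)

# Evaluate the fitted ansatz and compare with the exact solution sin(x).
u = np.polyval(np.append(coeffs[::-1], 0.0), xs)
max_err = np.max(np.abs(u - np.sin(xs)))
print(f"max |u - sin| on [0, pi]: {max_err:.2e}")
```

The theory questions in this project concern exactly how the size of the ansatz class (here, the polynomial degree; in general, the network) controls the quality of such residual-minimizing solutions.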

What makes this project interesting to work on?

It's easy to infer the importance of this project from the fact that some of the major software companies are already investing heavily in developing code associated with the kind of questions proposed here.

This project is intended to be at the cutting-edge of applied mathematics and deep-learning.  To the best of my knowledge there are only a  handful of groups around the world which are looking into the theory of why and how neural nets can solve or invert PDEs. So, via this project, the student has a rare chance to get an entry point into this exciting and futuristic direction of research that is poised to grow very big in the near future. 

What is the expected outcome?

Co-authorship of a research paper.

Is the data open source?

Yes.

What infrastructure, programs and tools will be used?

In principle the project can be done remotely, since most of the research is intended to be mathematical. Any experiments we might want to run can be implemented by hiring GPU services on the cloud. However, in-person attendance is strongly preferred, since being able to have lots of discussions at the blackboard is extremely beneficial to mathematics research.

Can the infrastructure, programs and tools be used remotely? 

Yes, but in person is strongly preferred.

What skills are necessary for this project? 

Machine learning, Deep learning, Python

What level of experience is necessary for this project? 

Master, PhD or Postdoc

Felix Reidl - Birkbeck, University of London

Recognition of self-compatible shapes for efficient packing

Our group’s research focuses on difficult, real-world computational problems and the data related to them. We aim to conduct our projects end-to-end: from theoretical analysis to implementation, optimisation and finally experimental testing on realistic datasets. Currently we are working on counting problems in complex networks and packing problems of geometric data.

www.bbk.ac.uk/research/centres/birkbeck-institute-for-data-analytics

Supervisor's contact details: f.reidl@bbk.ac.uk

 

What is the data science project's research question? 

The research question arises in a geometric packing problem, where we want to pack as many shapes (polygons) as possible into a given area. Some shapes can be packed neatly into a strip-arrangement if one ‘interleaves’ them properly. As a simple example, imagine a U-shaped object. In order to pack these shapes as closely as possible, we probably want to use an arrangement that looks as follows:
 
The research task is to develop a model that recognises whether a given shape can be tiled in such a repeating manner. 

 

What data will be worked on?

The data consists of a large number of polygon shapes stored in flat text files.

 

What tasks will this project involve?

Together with members of the lab, the student will develop an algorithm to test the self-compatibility of a shape and use this implementation to annotate the dataset. They will then attempt to develop a machine-learning model to recognize self-compatible shapes efficiently.

 

What makes this project interesting to work on? 

The project is part of a collaboration with an industrial partner, so any progress on the questions outlined above will directly improve their operations. The problem itself is challenging and therefore has great learning potential; among other things, the student will work with tools from computational geometry, optimisation and machine learning.
The student will be involved as much as possible in the overarching project and, depending on their ability and interest, may extend beyond the tasks outlined here.

 

What is the expected outcome? 

Co-authorship of a research paper & Contribution to software 

 

Is the data open source? 

Other: Some of the data might be proprietary. If this is an issue we can limit ourselves to open-source data only.       

 

What infrastructure, programs and tools will be used? Can they be used remotely? 

We use Python for prototyping/scripting and Rust for computationally heavy algorithms. The tools could be used remotely, but this project will need close supervision, which is easier to provide if the student is present.

 

What skills are necessary for this project? 

Machine learning, Python, Other: The student should be interested in mathematics (linear algebra in particular) and algorithms

 

What level of experience is necessary for this project? 

Master, PhD or Postdoc

 

Who would be supervising the exchange participant?

Dr Felix Reidl, Senior Lecturer, Department of Computer Science and Information Systems
and Oded Lachish, Lecturer in the same department. 

Felix Reidl - Birkbeck, University of London

Visual representations of large-scale network structure 

Our group’s research focuses on difficult, real-world computational problems and the data related to them. We aim to conduct our projects end-to-end: from theoretical analysis to implementation, optimisation and finally experimental testing on realistic datasets. Currently we are working on counting problems in complex networks and packing problems of geometric data.

www.bbk.ac.uk/research/centres/birkbeck-institute-for-data-analytics

Supervisor's contact details: f.reidl@bbk.ac.uk

 

What is the data science project's research question?

Large complex networks are notoriously difficult to visualise which makes comparative analysis or classification by visual means very challenging. We would like to investigate whether simple, high-level visualisations of core network properties (degree distribution, density, number of high/medium/low degree vertices) can capture structurally interesting properties. In particular, we aim to classify networks according to the resulting visualisation and investigate whether this results in a useful classification method.

 

What data will be worked on? 

An existing corpus of complex networks stored in a uniform file format.

 

What tasks will this project involve?

The student will implement a program which takes a complex network as input and outputs a visualisation of high-level properties. We already have foundational ideas for the visualisation, but the student will have freedom to explore variations. The student will apply the final visualisation to the whole network corpus, group them by similarity (using clustering techniques) and analyse the resulting groups. Specifically, we are interested whether the visualisation captures the network’s origin domain.
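A hedged sketch of the first step (the thresholds and the feature set are illustrative, not the group's actual design): reduce a network, given as an edge list, to a small vector of high-level degree statistics that a later clustering step could consume.

```python
from collections import Counter

# Toy edge list standing in for one network from the corpus; the real
# data would be read from the uniform file format mentioned above.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]

# Compute the degree of every vertex.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

n = len(degree)
avg = 2 * len(edges) / n
# Bucket vertices into high/medium/low degree relative to the average.
high = sum(1 for d in degree.values() if d > 1.5 * avg)
low = sum(1 for d in degree.values() if d < 0.5 * avg)
medium = n - high - low

features = (n, len(edges), avg, high, medium, low)
print(features)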

 

What makes this project interesting to work on? 

The project exemplifies the challenges of working with high-dimensional, non-numerical data. The student will have an opportunity to learn about network properties and graph algorithms, and to sharpen their data visualisation skills.

 

What is the expected outcome? 

Co-authorship of a research paper & Contribution to software

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

We will use Python for the visualisation, specifically the matplotlib library. The tools could be used remotely.

 

What skills are necessary for this project?

Data analytics, statistics, Python, Other: The student should be familiar with matplotlib or a comparable visualisation library

 

Who would be supervising the exchange participant?

Dr. Felix Reidl, Senior Lecturer, Department of Computer Science and Information Systems, and Oded Lachish, Lecturer in the same department


The projects from Israel

Keren Agay-Shay - Bar Ilan University, Azrieli Faculty of Medicine

Exploring the links between outdoor exposures to green spaces and pregnancy outcomes

 

We are an international team of environment and health scientists studying the effects of external environmental exposures on human health and well-being. Our multidisciplinary research projects cover such scientific topics as environmental epidemiology, human biology and public health, landscape and biodiversity studies. We apply geo-spatial data analysis and advanced statistical methods in order to evaluate the links between both beneficial and harmful environmental exposures and health outcomes, with the main focus on adverse pregnancy outcomes (APO), maternal health during pregnancy and mental health (such as risks of stress and anxiety during pregnancy).   

 

http://research.md.biu.ac.il/labs/keren-agay-shay/

kagayshay@gmail.com

 

What is the data science project's research question? 

To develop novel exposure metrics that will be used to evaluate the associations between different types of residential surrounding greenness and pregnancy outcomes, and to create new open-source on-the-ground exposure measures using Google Street View imagery (SVI) and image semantic segmentation techniques. 

 

What data will be worked on? 

Main exposure data:
Participant(s) will work with geo-referenced (coordinate-based) Google Street View images (SVIs) representing an on-the-ground perspective of exposure to the external natural environment. The images, captured from 2014 to 2022 on the territory of Israel, will be retrieved and post-processed using a deep learning image segmentation model.

Additional exposure data: satellite derived Normalized difference vegetation index (NDVI) from MODIS - Moderate Resolution Imaging Spectroradiometer (in .csv and .sav formats).

Additionally: grid shape files (.shp, .shx, .dbf formats).

 

What tasks will this project involve?

The main objectives of the project are:

  1. to quantify residential greenspace coverage (% total greenspace) from Google SVIs (selected regions in Israel) by applying an image segmentation model to estimate the percentages of each environmental class (% trees, % grass, % flowers, and % plants combined) within a 360° view for each given location;
  2. to validate the accuracy of SVIs-based greenspace metrics using NDVI satellite data;


Depending on the participant(s)' skill sets, levels of expertise and preferences, the following technical and data analysis tasks can be performed: 

  • creating high-resolution regional grids using the shapefiles and/or the open street network;
  • extending current retrieving algorithms and adjusting Google API and street module scripts (https://github.com/robolyst/streetview) in order to obtain the locations of the images nearest to the residential addresses;
  • applying the pyramid scene parsing network (PSPNet) to derive greenspace metrics from the retrieved Google SVIs (pixel-level image segmentation). Each pixel within each image will be classified based on a pre-trained scene parsing dataset (ADE20K);
  • creating geospatial raster files from segmented SVIs, which should be linked to geocoded residential addresses from pregnancy cohort data.
  • validating the SVI-based metrics against the satellite-derived indicator of the quantity of vegetation on the ground (NDVI).
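Two of the computations above can be sketched in a few lines (toy arrays; the green class IDs are illustrative, the real ones would come from the ADE20K label map): the % greenspace of a segmented image, and the standard NDVI formula, NDVI = (NIR − Red) / (NIR + Red).

```python
import numpy as np

# Toy 4x4 "segmented SVI": pixel labels from a hypothetical ADE20K-style
# map (here 1 = tree, 2 = grass, 3 = plant, 0 = all other classes).
labels = np.array([[1, 1, 0, 0],
                   [2, 1, 0, 3],
                   [0, 0, 0, 0],
                   [2, 0, 0, 1]])
green_classes = [1, 2, 3]
# % greenspace = fraction of pixels belonging to a green class.
pct_green = 100.0 * np.isin(labels, green_classes).mean()

# NDVI from toy red and near-infrared reflectance bands; NDVI lies in [-1, 1].
red = np.array([0.1, 0.2, 0.3])
nir = np.array([0.5, 0.4, 0.3])
ndvi = (nir - red) / (nir + red)

print(f"greenspace: {pct_green:.2f}%  NDVI: {ndvi.round(3)}")
```

The validation step would then correlate SVI-derived % greenspace with NDVI values at the same locations.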

 

What makes this project interesting to work on? 

The project is rooted in an important topic: the public health benefits of different types of residential landscapes. It has been observed that a natural residential environment, including green spaces, may reduce the effects of harmful exposures to rapidly increasing air temperatures and pollution in the current phase of climate change. Since no attempts have yet been made in Israel to use machine learning techniques in greenness exposure studies, the participant(s) will contribute to the development of a novel database, data processing and machine learning techniques, and an overall multidisciplinary study design.

The project output will complement and improve existing exposure estimates from an on-the-ground, eye-level perspective. Data processing skills and machine learning techniques applied during the project can be transferred to other fields of research. The open-source exposure metrics that will be developed will be available for future environment and health studies.

 

What is the expected outcome? 

A database of retrieved Google Street View images for the specific study regions across Israel. Greenness exposure metrics retrieved from Google SVIs (for each given location, and a generalized map for the whole study region). Statistical analysis outcomes (associations of the created greenness exposure metrics with adverse pregnancy outcomes).

 

Is the data open source? 

Yes, the geospatial exposure data (NDVI), image segmentation neural networks and pre-trained image annotation dataset as well as Google API and python modules are open source.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python, Jupyter notebooks, GitLab/GitHub, Google application programming interface (API), Google Street View and the streetview module; pyramid scene parsing network (PSPNet); pre-trained scene parsing dataset (ADE20K); an advantage: ArcGIS, QGIS. They can be used remotely.

 

What skills are necessary for this project? 

Scientific computation; machine learning; computer vision and image processing/analysis; programming skills (Python and/or R); data visualization; experience with GitLab/GitHub, Jupyter notebooks; geographic information systems; an advantage: software and database development.

What level of experience is necessary for this project? 

Master student, Doctoral Researchers, Post-Docs

 

Tamir Bendory - Tel Aviv University, School of Electrical Engineering

Conformational variability analysis of small molecular structures using cryo-EM

 

We work on mathematical and computational problems in data science, focusing on structural biology applications.

 

https://www.tau.ac.il/~bendory/

bendory@tauex.tau.ac.il

 

What is the data science project's research question? 

We aim to develop methods to analyze the conformational variability of small proteins, a task which is out of reach of current technology. The project involves a variety of data science fields, including high-dimensional statistics, optimization, manifold learning, and processing of massively large data sets. 

 

What data will be worked on? 

Cryo-EM data sets available online in public repositories    

 

What tasks will this project involve?

Developing a new mathematical framework and implementing it. 

 

What makes this project interesting to work on? 

The project has tremendous potential to revolutionize structural biology. It also involves a wide range of data science skills.    

 

What is the expected outcome? 

A new method to analyze conformational variability of small molecular structures, including an efficient and documented code that can be disseminated. 

 

Is the data open source? 

Yes, available at public repositories. 

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Infrastructure: my servers. Programs: Python/Matlab and standard cryo-EM software. They can also be used remotely.

 

What skills are necessary for this project? 

Very high mathematical and coding skills. 

 

What level of experience is necessary for this project? 

Post-Doctoral

Barak Fishbain - Technion - Israel Institute of Technology, Civil and Environmental Engineering

Markov chain based analysis of air-pollution and climatologic systems

 

The Technion Enviromatics Lab (TechEL) focuses on Enviromatics, a new research field that aims at devising machine learning methods and mathematical models for better understanding built and natural complex environments. The goal is to harness machine learning, mathematical models, engineering principles, computing, and networked-sensing data analytics to enhance the efficiency, resiliency, and sustainability of infrastructure and natural systems. This includes topics related to hydro-informatics, atmospheric-informatics, traffic data, structural health, smart infrastructure systems and connected transportation.

 

https://fishbain.net.technion.ac.il/

fishbain@technion.ac.il

 

What is the data science project's research question? 

A data science approach for inferring which affects air pollution more: meteorology or anthropogenic activities?

 

What data will be worked on?

Climatologic and air-pollution data-sets 

 

What tasks will this project involve?

Developing a Markov-chain-based model to infer the temporal behavior of meteorological and air-pollution systems, and to infer the effect of the former on the latter. 
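A hedged, minimal version of the Markov-chain idea (toy data; the state binning is illustrative): estimate an empirical transition matrix from a discretized pollution-level time series by counting and row-normalizing transitions between consecutive time steps.

```python
from collections import defaultdict

# Toy discretized pollution-level series (0 = low, 1 = medium, 2 = high);
# the real input would be binned sensor time series.
states = [0, 0, 1, 1, 2, 1, 0, 0, 1, 2, 2, 1, 0]

# Count observed transitions between consecutive time steps.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(states, states[1:]):
    counts[a][b] += 1

# Row-normalize the counts into an empirical transition matrix P[a][b].
P = {a: {b: c / sum(row.values()) for b, c in row.items()}
     for a, row in counts.items()}

print(P[0])   # probabilities of moving out of the "low" state
```

Comparing such matrices estimated from meteorological versus air-pollution series (or conditioning one on the other) is one way to quantify the temporal coupling the project is after.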

 

What makes this project interesting to work on?

Developing mathematical models for physical phenomena and inferring their physical properties through math. 

 

What is the expected outcome?

A research paper describing the association between air-pollution, meteorology and human activities. 

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

The TechEL holds a virtual computation and storage platform (VMware based) as well as Azure (Windows based) processing resources. These will be used for data storage and processing with any software the student prefers (Python, Matlab, R, etc.). Given access over VPN to the Technion's intranet, all resources are available. Access can be granted through a formal procedure. 

 

What skills are necessary for this project?

Mathematical thinking, coding skills. 

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Barak Fishbain - Technion - Israel Institute of Technology, Civil and Environmental Engineering

Robust Optimization of Water Supply Systems

 

The Technion Enviromatics Lab (TechEL) focuses on Enviromatics, a new research field that aims at devising machine learning methods and mathematical models for better understanding built and natural complex environments. The goal is to harness machine learning, mathematical models, engineering principles, computing, and networked-sensing data analytics to enhance the efficiency, resiliency, and sustainability of infrastructure and natural systems. This includes topics related to hydro-informatics, atmospheric-informatics, traffic data, structural health, smart infrastructure systems and connected transportation.

 

https://fishbain.net.technion.ac.il
fishbain@technion.ac.il

 

What is the data science project's research question?

Robust optimization has been around for quite some time. So far, when it comes to water networks, robust optimization has been applied solely to the supply side and to the network design phase. Here we opt to apply robust optimization to the operational aspects of the network.  

 

What data will be worked on?

Simulated data, generated by EPA-NET. 

 

What tasks will this project involve?

Developing the optimization algorithms and coding them, working with water supply system simulation software (EPA-NET). 
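As a hedged toy illustration of operational robust optimization (not the project's actual model; EPA-NET plays no role here): pick the cheapest constant pumping rate that keeps a tank non-negative under every demand scenario in an uncertainty set, and compare it with the choice that only considers the nominal scenario.

```python
# Choose a constant pumping rate x (volume/hour, unit cost) so that the
# tank never runs dry under ANY demand scenario in the uncertainty set.
scenarios = [
    [30, 40, 50],   # nominal hourly demand
    [35, 45, 55],   # hot-day scenario
    [30, 60, 40],   # peak-shift scenario
]
tank0 = 20.0        # initial tank volume

def feasible(x, demand):
    """Simulate the tank hour by hour; infeasible if it ever goes negative."""
    vol = tank0
    for d in demand:
        vol += x - d
        if vol < 0:
            return False
    return True

# Robust choice: smallest rate feasible for EVERY scenario (grid search).
x_robust = min(x for x in range(0, 101) if all(feasible(x, s) for s in scenarios))
# Nominal choice: smallest rate feasible for the nominal scenario only.
x_nominal = min(x for x in range(0, 101) if feasible(x, scenarios[0]))

print(x_robust, x_nominal)
```

The gap between the two rates is the "price of robustness": the nominal schedule is cheaper but fails under the other scenarios, which is exactly the trade-off the project studies at network scale.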

 

What makes this project interesting to work on?

Robust optimization is a powerful tool that can be used in many applications, so learning it provides a significant skill set. 

 

What is the expected outcome?

A research paper describing, for the first time, operational robust optimization of water systems.

 

Is the data open source? 

No.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

The TechEL holds a virtual storage and computation system, which will provide the infrastructure for the project. Given access to the Technion's intranet, all resources are available. Access can be granted through a formal procedure. 

 

What skills are necessary for this project?

Knowledge (and preferably experience) in optimization, coding (Python or R). 

 

What level of experience is necessary for this project? 

PhD students, Post-Doctoral

Oren Forkosh - Hebrew University of Jerusalem

Animal behavior and animal personality

 

We use AI to study animal behavior, personalities, and emotional states in various species, including mice, cows, cats, and birds. 

 

https://www.forkoshlab.com/
oren.forkosh@mail.huji.ac.il

 

What is the data science project's research question?

New approaches to studying animal behavior such as identifying complex behaviors from video or position data, and incorporating ideas from natural language processing to decipher animal behaviors.

 

What data will be worked on?

Animal position tracking data (in mice or cows) or video data (in mice or birds)

 

What tasks will this project involve?

Developing algorithms that involve deep learning or statistical machine learning

 

What makes this project interesting to work on?

Unique data and problems (and the animals are cute)

 

What is the expected outcome?

New applicable algorithms to better understand animal behavior

 

Is the data open source? 

We made some of the data available online.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python or Matlab as well as databases. They can be used remotely.

 

What skills are necessary for this project?

Python or Matlab

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Roy Friedman - Technion, Computer Science

Combining ML and Sketches for Efficient Caching          

 

Working on stream processing and sketching algorithms, caching, and distributed systems.

 

https://roy.net.technion.ac.il/

roy@technion.ac.il

 

What is the data science project's research question?

Finding effective ways to combine sketches and neural networks (NNs) to reduce the learning time as well as the space and computational complexity of NNs alone.
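For context, a minimal example of the "sketch" side (a standard Count-Min Sketch, not the group's specific algorithm): it estimates item frequencies in a stream using small fixed-size tables, never undercounting and overcounting only by a bounded amount.

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min Sketch: sub-linear-space frequency estimates.
    Estimates never undercount; overcounts are bounded by hash collisions."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One hashed cell per row, derived from a keyed blake2b digest.
        for row in range(self.depth):
            h = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8)
            yield row, int.from_bytes(h.digest(), "big") % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # The minimum over rows gives the tightest upper bound on the count.
        return min(self.table[row][col] for row, col in self._cells(item))

sketch = CountMinSketch()
trace = ["a"] * 100 + ["b"] * 10 + ["c"]     # skewed toy access trace
for item in trace:
    sketch.add(item)

print(sketch.estimate("a"), sketch.estimate("b"))
```

A learned component could, for example, correct the sketch's overcount bias or decide which items deserve exact counters, which is the flavor of combination this project explores.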

 

What data will be worked on?

Real-world access traces from various storage systems and Internet services.

 

What tasks will this project involve?

Programming the various algorithms we come up with and running them on a cluster of GPUs to measure relative performance.

 

What makes this project interesting to work on?

The potential to obtain orders-of-magnitude improvements in space and computational complexity.

 

What is the expected outcome?

Showing that we can obtain orders-of-magnitude improvements and publishing the results as a research paper.

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Programming: Mostly Python/PyTorch. Hardware: A cluster of DGX-A100 machines. They can be used remotely.

 

What skills are necessary for this project?

Python programming, good background in ML and basic data science techniques.

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Avitay Geltman - Technion

Avitay Geltman - Technion - Israel Institute of Technology, Environmental Informatics Lab

Deep Learning Approach for Countrywide High-Resolution Air Pollution Modelling

 

Devising machine & deep learning methods and mathematical models for better understanding built and natural complex environments.

 

https://fishbain.net.technion.ac.il/

avitay@campus.technion.ac.il

 

What is the data science project's research question?

Predicting and modelling country-scale air pollution levels using advanced neural network architectures (Transformers/diffusion models/GANs, etc., yet to be decided).

 

What data will be worked on?

Time-series tabular data containing key variables such as monitored pollutant concentrations, transportation and industry emissions, meteorological variables, and predictions of a deployed physico-chemical model.

 

What tasks will this project involve?

Data pre-processing, implementation of feature selection methods, development of aforementioned neural network architectures, analyzing results and deriving conclusions.

 

What makes this project interesting to work on?

The project involves modern promising deep learning methods based on high-quality real-world data. A successful development would be eligible for deployment by the Israeli Ministry of Environmental Protection.

 

What is the expected outcome?

A publication + a potential deployment of the model in Israel

 

Is the data open source? 

Part of it.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python, PyTorch (or PyTorch Lightning), etc. They can be used remotely.

 

What skills are necessary for this project?

Background in machine & deep learning, with research experience in the field.

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Tzipi Horowitz Kraus - Technion

Tzipi Horowitz Kraus - Technion, Dept. of Education in Science and Technology

Using AI tools to determine reading difficulties  

 

We are using neuroimaging data to determine neuronal patterns associated with language and reading impairments in children. We use different data modalities, ranging from fMRI, DTI, anatomy, EEG, fNIRS to reveal specific patterns associated with developmental changes related to communication and AI tools for prediction or classifications related to different profiles and response to treatments. 

 

https://neuroimaging-center.technion.ac.il/

tzipi.kraus@technion.ac.il

 

What is the data science project's research question?

Can we detect specific brain patterns associated with a positive response to treatment?

 

What data will be worked on?

fMRI

 

What tasks will this project involve?

Data pre and post processing, computational models

 

What makes this project interesting to work on?

The transformative nature of this project, and the convergence of state-of-the-art neuroimaging data analysis tools with AI, make it an intriguing project for students interested in biomedical, computational, or data science work.

 

What is the expected outcome?

A scientific paper

 

Is the data open source? 

No.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python, Matlab. It cannot be done remotely. 

 

What skills are necessary for this project?

Scripting is an advantage 

 

What level of experience is necessary for this project? 

PhD students

Itzik Klein - University of Haifa

Itzik Klein - University of Haifa, the Autonomous Navigation and Sensor Fusion Lab

Boosting Quadrotor Navigation Using Data-Science in GNSS-Denied Environments

 

The purpose of navigation is to determine the position, velocity and attitude of platforms, humans and animals. Obtaining accurate navigation commonly requires fusion of several sensors. The Autonomous Navigation and Sensor Fusion Lab (ANSFL) vision is to augment and develop artificial intelligence (AI) algorithms in innovative, breakthrough research that creates meaningful knowledge for society through collaboration with fellow researchers and engineers. Our pioneering research addresses the intersection of AI with navigation and inertial sensing to create value and opportunities for ocean and environmental protection, identifying illnesses and well-being in humans and animals, and developing tools for autonomous-vehicle teamwork.
We have ongoing projects on AI navigation and sensor fusion, inertial sensing, pedestrian navigation, animal navigation and localization, drone sensor fusion, autonomous underwater vehicle navigation, mobile robot navigation, and much more.

 

http://marsci.haifa.ac.il/labs/ansfl 

kitzik@univ.haifa.ac.il

 

What is the data science project's research question?

How can data science help improve quadrotor navigation in GNSS-denied environments?

 

What data will be worked on?

Inertial sensors (accelerometers, gyroscopes, magnetometers, barometer) - time series

 

What tasks will this project involve?

Data exploration, network architecture derivation, performance evaluation, and a possibility to participate in field experiments.
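Deep models for inertial data typically consume fixed-length windows of the sensor streams; a minimal preprocessing sketch (the window and stride values are arbitrary assumptions, not the lab's settings):

```python
import numpy as np

def sliding_windows(imu, window=200, stride=100):
    """Segment an (n_samples, n_channels) IMU recording into overlapping
    windows of shape (n_windows, window, n_channels) -- the usual input
    format for deep inertial-navigation networks."""
    n = (imu.shape[0] - window) // stride + 1
    return np.stack([imu[i * stride : i * stride + window] for i in range(n)])
```

For example, a 1000-sample, 6-channel recording (3-axis accelerometer plus 3-axis gyroscope) yields 9 overlapping windows of 200 samples each.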

 

What makes this project interesting to work on?

You will work with a highly talented team on a challenging, interesting, and practical problem using state-of-the-art deep learning methods.

 

What is the expected outcome? 

Improving on the current state-of-the-art performance and enhancing algorithm robustness.

 

Is the data open source? 

Part of it, and soon the rest

 

What skills are necessary for this project? 

Motivation and experience with deep-learning methods

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

 

Asaf Levy - The Hebrew University of Jerusalem

Asaf Levy - The Hebrew University of Jerusalem

Detection of new insecticidal genes using ML

 

We are interested in deciphering the function of genes involved in interactions between bacteria and other bacteria, plants, and insects.

 

www.asaflevylab.com

alevy@mail.huji.ac.il

 

What is the data science project's research question?

Developing, together with our group members, a machine learning classifier that detects new insecticidal genes.

 

What data will be worked on?

Genes from bacterial genomes that are publicly available

 

What tasks will this project involve?

Large-scale genomic data analysis, feature extraction, and reading some scientific papers.

 

What makes this project interesting to work on?

It is important for sustainable agriculture. Plants engineered with insecticidal genes are protected from pests (read about Bt corn, for example).

 

What is the expected outcome?

A list of genes that we will validate in experiments in the lab.

 

Is the data open source? 

No.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Whatever you want; usually we use Python in the group. The tools can be used remotely.

 

What skills are necessary for this project?

Programming, data analysis, some experience in machine learning.

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Mor Nitzan - The Hebrew University of Jerusalem

Mor Nitzan - School of Computer Science and Engineering, The Hebrew University of Jerusalem

Computational disentanglement of single-cell data

 

Our research is at the interface of Computer Science, Physics, and Biology, focusing on the representation, inference and design of multicellular systems. We develop computational frameworks, based on ideas rooted in dynamical systems theory and machine learning to better understand how cells encode multiple layers of spatial and temporal information, and how to efficiently decode that information from single-cell data. We aim to uncover organization principles underlying information processing, division of labor, collective cellular function, and self-organization of multicellular structures.

 

https://nitzanlab.com/

mor.nitzan@mail.huji.ac.il

 

What is the data science project's research question?

Can we tease apart distinct biological processes underlying the state of cells based on single-cell data?

 

What data will be worked on?

Publicly available single-cell datasets.

 

What tasks will this project involve?

Depending on the student's/postdoc's background and interests, the project can involve a subset of the following: method development, modeling and simulations, single-cell data analysis, and biological interpretation.

 

What makes this project interesting to work on?

This is an exciting, emerging direction in single-cell analysis, rooted in the realization that cells simultaneously encode in their transcriptome multiple layers of information about their collective physical configuration in the tissue-of-origin, temporal processes such as the cell cycle and differentiation, and response to external stimuli. The possibility to disentangle and manipulate these different layers of information is expected to deepen our understanding of collective behavior, information encoding and transfer, and division of labor in diverse biological systems. 

 

What is the expected outcome?

Contribution to a research paper and to software development.

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Coding in Python, access to University servers. They can be used remotely.

 

What skills are necessary for this project?

A computational/mathematical background (CS/Physics/Applied Math/similar).

 

What level of experience is necessary for this project? 

PhD students, Post-Doctoral

Yaron Orenstein - Bar-Ilan University

Yaron Orenstein - Bar-Ilan University, Computer Science and Life Sciences

Deep learning in genomics

 

Our lab develops algorithms to infer predictive models of molecular interactions based on high-throughput biological data.

 

https://wwwee.ee.bgu.ac.il/~cb/index.html

yaron.orenstein@biu.ac.il

 

What is the data science project's research question?

We will take a molecular-level biological phenomenon that has been measured in high throughput, and develop, train, and test a deep neural network on the measurements. We aim for high prediction performance, and will interrogate the network for the molecular principles it learned.

 

What data will be worked on?

There are many publicly available high-throughput genomic datasets, which are ideal for machine learning and specifically deep learning

 

What tasks will this project involve?

Developing the network analytically, programming in python to implement the network, running the code on available data to train and test the network, and applying computational techniques to interrogate the network.
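A common first step in such pipelines is one-hot encoding of the sequences before they are fed to the network; a minimal sketch (illustrative only, not the lab's code):

```python
import numpy as np

def one_hot_dna(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix over (A, C, G, T)."""
    lut = {"A": 0, "C": 1, "G": 2, "T": 3}
    m = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in lut:            # ambiguous bases (e.g. N) stay all-zero
            m[i, lut[base]] = 1.0
    return m
```

The resulting matrix is the standard input format for convolutional networks over genomic sequences, where first-layer filters play the role of learned sequence motifs.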

 

What makes this project interesting to work on?

High-throughput genomic data is ideal for deep neural networks: there are many data points, and each data point is relatively small compared to vision tasks. There's also the scientific discovery in the step of interpreting the trained network.

 

What is the expected outcome?

A predictor with high prediction performance, and interpretation of the network. These may lead to a publication.

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Unix OS, Python, Keras and Tensorflow or Pytorch. They can be used remotely.

 

What skills are necessary for this project?

Mathematics background (linear algebra, calculus, statistics, probability), programming, algorithms, data structures, computer architecture

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Gideon Oron - Ben-Gurion University

Gideon Oron - Ben-Gurion University of the Negev, Blaustein Institutes for Desert Research

Desalination Brine Reuse

 

Two experts in water and agricultural systems

 

gidi@bgu.ac.il

 

What is the data science project's research question?

A constructive solution for the by-product generated during desalination

 

What data will be worked on?

Amounts and qualities of the brine produced and possible solution directions

 

What tasks will this project involve?

Collect data and work on modeling of reuse systems

 

What makes this project interesting to work on?

Desalination is a solution for the worldwide water shortage; however, brine disposal is still a problem to be taken seriously.

 

What is the expected outcome?

Recommended guidelines for brine/concentrate reuse for diverse purposes

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Experimental systems in Sde-Boker and computing capacities. They can't be used remotely.

 

What skills are necessary for this project?

Background in field work and management modeling (optimization)

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Rami Puzis - Ben-Gurion University

Rami Puzis - Software and Information Systems Engineering, Ben-Gurion University of the Negev

Socially acceptable and fair AI

 

At the Complex Networks Analysis Lab at Ben-Gurion University (CNALAB@BGU), we tackle research problems in diverse domains using a combination of methods from graph theory and machine learning.
Complex networks are found in cyber security, social networks, communication networks and the Internet, biological networks, financial networks, text analytics, and more. Scientific programmers working in the CNA Lab @ BGU develop generic software tools and libraries to analyze the structure of networks derived from the various problem domains. Graduate research students apply these tools to investigate specific problems in their domain of interest.

 

https://faramirp.wixsite.com/puzis

puzis@bgu.ac.il

 

What is the data science project's research question?

Many pre-trained language models exhibit bias with respect to protected classes such as gender, race, age, etc. We study the fine-grained decomposition of such bias into factors well known in psychology, and try to understand to what extent language models exhibit the various factors of sexism, racism, and so on.

 

What data will be worked on?

The study population is the variety of pre-trained language models available on HuggingFace. In addition, we use datasets that exhibit specific factors of the studied biases.

 

What tasks will this project involve?

Fine-tuning and domain adaptation of pre-trained language models. Formulation of queries that test for specific kinds of biases and their specific factors. Analysis of results. Report/paper writing. 
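Once per-query model scores are collected (for instance, log-probabilities of paired completions from a fill-mask query), aggregating them into per-factor bias scores can be as simple as the following sketch. The triple format and factor names here are hypothetical, not the project's actual benchmark:

```python
from collections import defaultdict

def factor_bias_scores(observations):
    """observations: iterable of (factor, logp_group_a, logp_group_b) triples,
    e.g. ('hostile_sexism', -2.1, -3.4), one per probe query.
    Returns the mean log-probability gap per factor
    (positive = the model favors group A for that factor)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for factor, logp_a, logp_b in observations:
        sums[factor] += logp_a - logp_b
        counts[factor] += 1
    return {factor: sums[factor] / counts[factor] for factor in sums}
```

Breaking the aggregate down by factor, rather than reporting one global bias number, is exactly what connects the NLP measurements back to the psychological constructs the project studies.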

 

What makes this project interesting to work on?

The intertwining between NLP and psychology.  

 

What is the expected outcome?

A publishable paper. A benchmark for evaluating latent constructs related to fairness and socially acceptable behavior.  

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python, PyTorch, Transformers, Colab, GPU cluster. They can be used remotely.

 

What skills are necessary for this project?

Python programming. Familiarity with machine learning. Basic familiarity with NLP. Applicants should have some interest toward studying/understanding psychology. 

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Yoav Ram - Tel Aviv University

Yoav Ram - School of Zoology, Faculty of Life Sciences, Tel Aviv University

Unsupervised learning of microbial growth curve data

 

Computational ecology and evolution

 

https://www.yoavram.com

yoavram@tauex.tau.ac.il

 

What is the data science project's research question?

Microbial growth is typically assayed by measuring the density of a tube of liquid using an optical reader. The resulting measurements, called growth curves, are used to assess and compare growth of different strains (e.g. mutants, clinical isolates, evolved strains) in various conditions (e.g. drug treatment, limited nutrients, low/high pH). These growth curves are usually studied using dynamical models, characterized by ordinary differential equations [Ram 2018]. However, these model-based approaches are limited both by model assumptions and numerical and computational challenges.

The purpose of this project will be to apply unsupervised learning methods to growth curve data in order to develop a methodology for clustering, denoising, reconstructing, and generating growth curves.

 

What data will be worked on?

Growth curves (OD vs time) from experiments with bacteria and fungi under different conditions

 

What tasks will this project involve?

Analysis of growth curve data using classical unsupervised learning methods (e.g. PCA, k-means, hierarchical clustering [VanDerMaaten 2009]) as well as more recent methods (e.g. UMAP [McInnes 2018], variational autoencoders [Kingma 2013], autoregressive flow [Papamakarios 2017]). The student will write a final report to describe the project results, which will be assessed for its academic quality.
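As a flavor of the classical end of this toolbox, the sketch below simulates two groups of noisy logistic growth curves and embeds them with PCA (via SVD). Everything here is an illustrative assumption: synthetic data, arbitrary parameter ranges, NumPy only.

```python
import numpy as np

def logistic(t, K, r, N0):
    """Standard logistic growth curve, used here only to simulate OD readings."""
    return K / (1 + (K / N0 - 1) * np.exp(-r * t))

rng = np.random.default_rng(0)
t = np.linspace(0, 24, 97)                       # 15-minute reads over 24 h
growth_rates = np.concatenate([rng.uniform(0.3, 0.4, 10),   # slow growers
                               rng.uniform(0.8, 1.0, 10)])  # fast growers
curves = np.array([
    logistic(t, K=rng.uniform(0.8, 1.2), r=r, N0=0.01)
    + rng.normal(0, 0.01, t.size)                # measurement noise
    for r in growth_rates
])

X = curves - curves.mean(axis=0)                 # center each time point
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pcs = U[:, :2] * S[:2]                           # 2-D PCA embedding of the curves
```

In this toy setting the slow- and fast-growing strains separate along the leading component; the project would replace this linear embedding with the nonlinear and generative methods listed above.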

 

What makes this project interesting to work on?

Growth curves are easy to produce and automate, and are therefore ubiquitous. However, they are difficult to analyze and in many cases these curves are analyzed using simple methods (linear regression, area-under-the-curve), leading to loss of most of the information in the data. We will test several classical and novel machine learning methods in an attempt to extract these "lost" data.

 

What is the expected outcome?

Completion of this project will provide the first step towards a new method for analysis of growth curves data; the participating data scientist will be invited to be involved in further development of this method and in the writing and publication of a manuscript that summarizes the new method.

 

Is the data open source? 

Some of the data is open source.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Unsupervised learning methods (clustering, generative models, dimension reduction), including machine and deep learning. Our lab has access to a large CPU (>1000 cores) and GPU (8xA100, 24xA6000) cluster. Other students and postdocs in the lab study a variety of questions in ecology and evolution, ranging from microbes to whales to humans, applying model-based and machine learning methods to data from the lab as well as archaeological and cultural data. All can be used remotely.

 

What skills are necessary for this project?

Background in statistics/machine learning/deep learning and experience in data processing, analysis and visualization. Knowledge of the specific methods is useful but can be covered by reading and discussions prior to the visit to our lab.

 

What level of experience is necessary for this project? 

PhD students, Post-Doctoral

Tirza Routtenberg - Ben-Gurion University

Tirza Routtenberg - Ben-Gurion University of the Negev

Geometry Design for DOA Estimation in Seismic Arrays

 

Our research group focuses on signal processing and optimization with applications in the smart grid. We study various topics in this field, including power system data analytics, state estimation and event detection, and cyber security in the Smart Grid.

In particular, we are interested in statistical signal processing and detection and estimation theory. We work on post-model-selection estimation, periodic and nonlinear estimation, performance bounds, and constrained estimation.

Additionally, we also study Graph Signal Processing (GSP). Our research in GSP includes Blind Source Separation, Performance bounds, and Estimation and detection with graph signals.

 

https://www.ee.bgu.ac.il/~tirzar/

tirzar@bgu.ac.il

 

What is the data science project's research question?

The research question of this data science project is: How can the direction of arrival (DOA) estimation performance of a seismic signal be optimized using a planar array sensor design, in terms of the minimum-mean-squared-periodic-error (MSPE) obtained by the commonly-used maximum a-posteriori (MAP) estimator of the DOA? This research is based on comparing the MSPE of the MAP estimator with two other criteria: the Cyclic Bayesian Cramér-Rao Bound (CBCRB) and the complete Expected Log-Likelihood (ELL).
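Because a DOA is an angle, estimation error must be measured periodically, i.e. on the wrapped difference in (-pi, pi]. A minimal sketch of one common empirical form of the MSPE criterion named above (illustrative only, not the group's implementation):

```python
import math

def periodic_squared_error(theta_est, theta_true):
    """Squared periodic (cyclic) error between two angles in radians:
    the ordinary squared error of the difference wrapped into [-pi, pi)."""
    d = (theta_est - theta_true + math.pi) % (2 * math.pi) - math.pi
    return d * d

def mspe(estimates, truths):
    """Empirical mean-squared-periodic-error over paired angle samples."""
    errors = [periodic_squared_error(e, t) for e, t in zip(estimates, truths)]
    return sum(errors) / len(errors)
```

Wrapping matters: an estimate of 359 degrees for a true DOA of 1 degree is a 2-degree error, not 358, and a design criterion that ignores this would mis-rank array geometries.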

 

What data will be worked on?

The data used in this project is seismic data that was recorded by the GEres array located in the Bavarian Forest, Germany. The GEres array is part of the CTBTO (Comprehensive Nuclear-Test-Ban Treaty Organization) international monitoring system. It is a well-maintained and calibrated station, which ensures the high quality of the data. The data from the GEres array is continuously streamed to the IDC (International Data Centre) of the CTBTO, where it is analyzed. The data that will be used in this project has been collected by the GEres array and will be obtained from the IDC of the CTBTO.

 

What tasks will this project involve?

Implementing existing algorithms in Python; testing and comparing different methods (potential publication, depending on the results).

 

What makes this project interesting to work on?

This project is interesting to work on for several reasons:

Seismic event detection and localization is an important field of study as it helps in understanding and predicting natural disasters such as earthquakes. Accurately determining the direction of arrival of seismic signals is crucial for accurate event localization.

The use of sensor array geometry in the design of the planar array for DOA estimation is an interesting and challenging problem, as it involves optimizing the performance of the estimation algorithm with respect to the sensor array geometry.

The project compares different design criteria for DOA estimation, such as the minimum-mean-squared-periodic-error (MSPE) obtained by the commonly-used maximum a-posteriori (MAP) estimator, the Cyclic Bayesian Cramér-Rao Bound (CBCRB), and the complete Expected Log-Likelihood (ELL) which allows for a deeper understanding of the design problem and enables to develop better design strategies.

The use of high-quality seismic data from the GEres array, which is part of the CTBTO international monitoring system, provides an opportunity to work with real-world data and validate the proposed methodologies in a realistic scenario.

The study of the impact of different array geometry on the performance of DOA estimation could have applications in a wide range of fields, including natural disaster monitoring, explosion detection, and target localization.

 

What is the expected outcome?

The expected outcome of this project is to design a planar array that optimizes the direction of arrival (DOA) estimation of a narrowband signal. Thus, the expected outcome is a running algorithm and simulation platform, with possible publication (depending on the results).

 

Is the data open source? 

Yes, under some obligations.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Participants should expect to work in Python. Can be used remotely.

 

What skills are necessary for this project?

Programming, machine learning, algorithms, signal processing (no need for all, we will design the final version of the project based on the abilities of the students)

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Nir Weinberger - Technion

Nir Weinberger - Technion Israel Institute of Technology, ECE

Multi-armed bandit algorithms for information-theoretic Channel Selection

 

We are studying theoretical problems at the intersection of machine-learning, high dimensional statistics and information theory.

 

https://sites.google.com/view/nir-weinberger/home

nirwein@technion.ac.il

 

What is the data science project's research question?

To propose and analyze exploration algorithms for finding communication channels which are information-theoretically optimal (e.g., they have maximal capacity).

 

What tasks will this project involve?

Proposing exploration algorithms for the channel selection problem, performing a theoretical analysis, and simulating their performance.

 

What makes this project interesting to work on?

It involves both algorithmic questions, probabilistic analysis problems, and simulation, as well as domain knowledge in sequential decision making and information theory.

 

What skills are necessary for this project?

Mathematical maturity and basic knowledge in concentration inequalities, machine learning and information theory 

 

What level of experience is necessary for this project? 

PhD students, Post-Doctoral

Yossi Yovel - Tel Aviv University

Yossi Yovel - Tel Aviv University

Neural encoding of natural stimuli - an fMRI study

 

Animal behavior and decision making.

 

www.yossiyovel.com

yossiyovel@gmail.com

 

What is the data science project's research question?

How does the brain encode natural stimuli?

 

What data will be worked on?

fMRI images

 

What tasks will this project involve?

Connecting the MRI signal to the movie observed by the scanned humans.

 

What makes this project interesting to work on?

It touches on one of the most fundamental questions in neuroscience.

 

What is the expected outcome?

A model connecting the natural stimulus and the brain

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Matlab or python; everything can be done remotely.

 

What skills are necessary for this project?

Programming, ML

 

What level of experience is necessary for this project? 

Master student

Igal Bilik - Ben-Gurion University

Igal Bilik - Ben-Gurion University of the Negev, School of Electrical and Computer Engineering

Three-dimensional scene reconstruction using echolocation echoes

 

Statistical and machine learning-based signal processing for sensors and sensor arrays enabling autonomous platforms operation. Smart Sensing Lab research focuses on signal processing for radar, visual and acoustic sensors data.

 

https://danielbilik2003.wixsite.com/igalbilik

bilik@bgu.ac.il

 

What is the data science project's research question? 

Can an artificial neural network-based approach perform 3D scene reconstruction using echolocation echoes?

 

What data will be worked on? 

A database of 623K echoes simulated and recorded using a bio-mimetic sonar system.

 

What tasks will this project involve?

Derive, implement and test a deep neural network-based approach for 3D scene reconstruction using the dataset.

 

What makes this project interesting to work on? 

An efficient approach for 3D scene reconstruction doesn't exist yet.

 

What is the expected outcome? 

A journal publication summarizing the derived DNN-based approach for 3D scene reconstruction

 

Is the data open source? 

Yes.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Computer with GPUs, Python with NN libraries. They can be used remotely.

 

What skills are necessary for this project? 

Deep Learning, Statistical Signal Processing, Acoustic signal processing

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Igal Bilik - Ben-Gurion University

Igal Bilik - Ben-Gurion University

Deep learning of Camera/Radar data for safe robot and human interaction

 

Statistical and machine learning-based signal processing for sensors and sensor arrays enabling autonomous platforms operation. Smart Sensing Lab research focuses on signal processing for radar, visual and acoustic sensors data.

 

https://danielbilik2003.wixsite.com/igalbilik

bilik@bgu.ac.il

 

What is the data science project's research question? 

Detection, localization and classification of a human operating in proximity to a robotic arm.

 

What data will be worked on? 

Vision and radar recordings of a human and a robotic arm operating in a laboratory environment.

 

What tasks will this project involve?

Development of a deep learning-based approach for detection, localization, and classification of a human and a robotic arm

 

What makes this project interesting to work on? 

The ability to localize and distinguish between a human and a robotic arm is critical to the introduction of consumer cobots into human living environments. Reliable algorithms are still missing, and this project can provide a basis for a new era of robot-human interaction. The project can have significant academic and industrial implications.

 

What is the expected outcome? 

A journal article summarizing the derived algorithms and presenting the results

 

Is the data open source? 

No.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Computers with GPUs, Python with deep learning libraries; they can be used remotely.

 

What skills are necessary for this project? 

Statistical Signal Processing, Computer vision, Deep learning

 

What level of experience is necessary for this project? 

Master student, PhD students, Post-Doctoral

Raja Giryes - Tel Aviv University

Raja Giryes - Electrical engineering, Tel Aviv University

Learning with small data, deep optics, image processing, visual language models

 

Doing research in deep learning and computational imaging 

 

https://www.giryes.sites.tau.ac.il/

raja@tauex.tau.ac.il

 

What is the data science project's research question? 

How can we learn with small data, and how can we use deep learning to improve imaging?

 

What data will be worked on? 

Up to the visitor 

 

What tasks will this project involve?

Will be determined with the visitor 

 

What is the expected outcome? 

Hopefully a research paper

 

What infrastructure, programs and tools will be used?

Python and GPU servers

 

What skills are necessary for this project? 

Knowledge of deep learning and Python

 

What level of experience is necessary for this project? 

PhD students, Post-Doctoral

Nir Shlezinger - Ben-Gurion University

Nir Shlezinger - Ben-Gurion University

Data-driven factor graphs for symbol detection in wireless communications

 

My research group studies aspects at the intersection of machine learning, signal processing, and communications. We have established expertise in the derivation of algorithms that combine statistical, model-based inference with data-driven deep learning techniques, with notable outcomes including hybrid implementations of celebrated algorithms such as the Kalman filter (in dynamic systems), the Viterbi algorithm (in digital communications), and subspace methods (in array signal processing).
The group includes 5 PhD students, and 14 MSc students.

 

https://sites.google.com/view/nirshl

nirshl@bgu.ac.il

 

What is the data science project's research question? 

A fundamental family of algorithms in signal processing and communications is based on factor graphs as a principled representation of structured distributions. We have recently developed a novel paradigm for integrating deep learning to abstract away and relieve the dependence of factor graph algorithms on accurate domain knowledge.
The data science project aims at investigating the ability of the proposed methodology to enable wireless communications over unknown channels.

As such, our main research question is: can learned factor graphs enable reliable asynchronous communications by harnessing inherent structures in communication channels, without requiring channel knowledge?

 

What data will be worked on? 

Our main evaluation benchmark will be the Sionna library provided by Nvidia for synthesizing wireless communication channels and real-world communication protocols.

 

What tasks will this project involve?

The main tasks will be:

  1. Learning of factor graph algorithms for symbol detection in wireless communications
  2. Implementation of data-driven factor graph inference for synthetic communication setups
  3. Experimental evaluation of the proposed framework for real-world wireless channels data and protocols

 

What makes this project interesting to work on? 

The project combines principled factor graph inference with emerging deep learning techniques. The participating student will gain expertise in hybrid algorithms that are both statistical-model-based and data-driven, combining classical message passing methods and deep learning in a principled manner.

 

What is the expected outcome? 

Experimental evaluation of an emerging algorithm combining factor graph inference with deep learning for symbol detection over unknown wireless communication channels.
The algorithm and its experimental study will serve as the basis for a journal paper summarizing our main results, intended for the IEEE Transactions on Signal Processing or the IEEE Transactions on Communications.

 

Is the data open source? 

Yes, it is based on the Sionna library (https://developer.nvidia.com/sionna).

 

What infrastructure, programs and tools will be used? Can they be used remotely?

The main coding is done in Python using available packages.
The infrastructure used is based on the computational resources provided by the lab and the School of Electrical and Computer Engineering in Ben-Gurion University. Everything can be used remotely.

 

What skills are necessary for this project? 

Hands-on experience with deep learning for time sequences; knowledge of factor graphs and message passing algorithms.

 

What level of experience is necessary for this project? 

Master's students, PhD students

Orr Spiegel - Tel Aviv University, School of Zoology

Animal movement and behavioral ecology

 

Animal movement is central to various ecological processes, such as disease spread, and to conservation. Modern devices allow high-resolution tracking with massive datasets (e.g. a fix every second, over months and across tens of individuals). Yet the development of adequate data-analysis methods is lagging behind, impairing progress and full utilization of the data's potential.
Social interactions are a major factor shaping movement and can be extracted from co-occurrence in time and space. We aim to develop a tool that automatically identifies and characterizes these interactions (strength, duration, consistency, context) from noisy movement data and allows us to locate hotspots of interactions. These hotspots can reveal resource patches in barn owls (feeding on agricultural pests), lapwings (providing insect removal services), and pigeons (spreading diseases), facilitating intervention in these contexts.

 

https://orrspiegel.wixsite.com/orrspiegel

orrspiegel@tauex.tau.ac.il

 

What is the data science project's research question? 

Can we automatically identify and characterize social interactions (strength, duration, consistency, context) from noisy, high-resolution animal movement data, and use them to locate hotspots of interaction in species such as barn owls, lapwings, and pigeons?

 

What data will be worked on? 

Multi-individual movement data at high resolution generated by an ATLAS system.

 

What tasks will this project involve?

Organize existing code to read the movement data; write code to identify interactions; develop data-science methods to characterize these interactions; and create visualizations of the emerging social networks. Interested students may also join fieldwork to see how the data are collected and assist with this aspect (optional, not required).
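As a rough sketch of the interaction-identification step, one could flag spatio-temporal co-occurrences between two individuals and group them into bouts. The random-walk fixes, distance threshold, and minimum bout duration below are illustrative assumptions standing in for the real ATLAS data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical ATLAS-style fixes: one row per individual per second.
t = pd.date_range("2023-05-01 20:00", periods=600, freq="s")
walk = lambda: np.cumsum(rng.normal(0, 1.5, len(t)))
a = pd.DataFrame({"time": t, "x": walk(), "y": walk()})
b = pd.DataFrame({"time": t, "x": walk() + 5, "y": walk() + 5})

def interaction_bouts(a, b, dist_thresh=10.0, min_duration=5):
    """Flag fixes where two individuals are within dist_thresh metres of
    each other, then group consecutive flagged fixes into bouts lasting
    at least min_duration fixes (here, seconds)."""
    m = a.merge(b, on="time", suffixes=("_a", "_b"))
    m["dist"] = np.hypot(m["x_a"] - m["x_b"], m["y_a"] - m["y_b"])
    m["close"] = m["dist"] <= dist_thresh
    # Label runs of consecutive 'close' fixes with a shared bout id.
    m["bout"] = (m["close"] != m["close"].shift()).cumsum()
    bouts = (m[m["close"]]
             .groupby("bout")
             .agg(start=("time", "min"), end=("time", "max"),
                  mean_dist=("dist", "mean"), n_fixes=("dist", "size")))
    return bouts[bouts["n_fixes"] >= min_duration].reset_index(drop=True)

bouts = interaction_bouts(a, b)
```

Characterizing bout strength, consistency, and context, and aggregating bouts into interaction hotspots, would build on this kind of table.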

 

What makes this project interesting to work on? 

The project works with data generated by real free-ranging animals. It also combines challenging data science questions with relevance to conservation and management.

 

What is the expected outcome? 

The outcome will be a tool that can facilitate future studies in our group, as well as applied tools such as a rodent-outbreak detector based on identified hotspots of barn owl interactions. Where relevant, the student can be a co-author on resulting publications.

 

Is the data open source? 

Our group's code is open source on GitHub (https://github.com/Orrslab). The data are not open source, since data on animal movements, nests, and whereabouts can inform poachers and jeopardize conservation.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

The project largely relies on our established ATLAS system, a cutting-edge tracking system that includes towers, receivers, and central processing units, in addition to the animal-borne transmitters that are deployed. The system is described in several publications.

Everything can be used remotely.

 

What skills are necessary for this project? 

First, curiosity about wildlife and its behavior. Second, strong coding and data-science abilities, and the ability to work in a team on independent projects. Familiarity with ecological data, and particularly with movement data, is an advantage.

 

What level of experience is necessary for this project? 

PhD students, postdoctoral researchers

Noam Goldberg - Bar-Ilan University, Department of Management

Medical procedure length of stay and how it affects schedule variability: robust optimization methods versus current practices

 

Data-driven optimization, robust optimization

 

https://management.biu.ac.il/en/Noam.Goldberg

noam.goldberg@biu.ac.il

 

What is the data science project's research question? 

Can utilization and schedule variability be improved using robust scheduling and bin packing techniques?
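One minimal way to make this concrete: plan each case with a high quantile of its procedure's empirical length-of-stay distribution (a simple robust counterpart of the uncertain duration) and pack cases into room-days with the classical first-fit-decreasing heuristic. The duration distributions, quantile level, and 8-hour capacity below are illustrative assumptions, not the hospital's data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical historical procedure durations (minutes), lognormal per type.
history = {"cardio": rng.lognormal(4.5, 0.4, 500),
           "ortho":  rng.lognormal(4.0, 0.3, 500),
           "endo":   rng.lognormal(3.5, 0.5, 500)}

def robust_duration(samples, q=0.9):
    """Robust counterpart of an uncertain duration: plan with the
    q-quantile of the empirical LOS distribution, not the mean."""
    return float(np.quantile(samples, q))

def first_fit_decreasing(durations, capacity):
    """Classical FFD bin-packing heuristic: sort cases longest-first and
    place each in the first room-day with enough remaining capacity."""
    bins = []
    for name, d in sorted(durations.items(), key=lambda kv: -kv[1]):
        for b in bins:
            if b["load"] + d <= capacity:
                b["load"] += d
                b["jobs"].append(name)
                break
        else:
            bins.append({"load": d, "jobs": [name]})
    return bins

# Four hypothetical cases of each procedure type, packed into 8-hour days.
cases = {f"{proc}_{i}": robust_duration(samples)
         for proc, samples in history.items() for i in range(4)}
schedule = first_fit_decreasing(cases, capacity=480)
```

Comparing the utilization and overrun frequency of such robust schedules against current scheduling practice is one way to operationalize the research question.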

 

What data will be worked on? 

Anonymized hospital visits and appointment data 

 

What tasks will this project involve?

Analysis of anonymized length-of-stay data for different hospital departments and medical procedures.

 

What makes this project interesting to work on? 

  • Working with data analysis and visualization tools to get insights into real life processes 
  • Interesting optimization algorithms that can lead to better decision making 
  • Working on important healthcare applications

 

What is the expected outcome? 

Implementation of robust scheduling and bin packing algorithms. Analysis and characterization of the length-of-stay (LOS) distribution.

 

Is the data open source? 

Some of the data is. 

 

What infrastructure, programs and tools will be used? Can they be used remotely?

R, Python, Julia. Most of them can be used remotely.

 

What skills are necessary for this project? 

Programming skills in Python and/or Julia (preferred)

 

What level of experience is necessary for this project? 

PhD students

Contact us

Teresa Weikert
Manager for Internal & External Partnerships


Subscribe to the newsletter