Collaborating internationally:

Israel Exchange - Projects 2022

Our Israeli partners submitted 29 projects in 2022, with the aim of getting you excited about one of these research approaches and advancing it with your expertise. Take a look at this year's proposals!

Our exchange projects


Click on the cards to learn more about the respective topics, the mentors, and the conditions of participation.

Amiram Moshaiov - Tel-Aviv University


DS and AI in Support of Balancing the Health and Economy Consequences of Non-Pharmaceutical Interventions During a Pandemic

 

Computational Intelligence with a focus on search and optimization techniques for solving multi-objective problems and supporting multi-criteria decision-making. We focus on the development of generic computational tools. We demonstrate their applicability on a large range of problem domains.

www.eng.tau.ac.il/~moshaiov

moshaiov@tauex.tau.ac.il

 

 

What is the data science project's research question?

How can available national and local socio-economic data be used to create a network model (graph) that supports quantitative evaluation of the effect of local lockdowns (regional, city-level, etc.) and similar local actions to fight a pandemic (local Non-Pharmaceutical Interventions) on the national economy?

 

What data will be worked on?

Regional Socio-economic Statistics of the Central Bureau of Statistics of Israel or any similar databases.

 

What tasks will this project involve?

Deciding on the relevant data; extracting the relevant data; analyzing the relevant data; creating a network model that uses the data to map local lockdowns to national economic indices; demonstrating and analyzing the model.

 

What makes this project interesting to work on?

The health-economy dilemma in pandemic policy making is significant to the entire world. I initiated research on this topic when the Covid-19 pandemic started and have worked on it together with colleagues. I can send relevant publications, with additional information on our achievements so far, to interested colleagues.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development, model development.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Google Colab. If needed, our department's GPU cluster will be available through VPN access.

 

Is the project open source?

Yes

 

What skills are necessary for this project?

Computational facilities of Tel-Aviv University. Can be used remotely. Data analytics / statistics, Scientific computation, Computational models, Computer simulations, Databases

 

Interested candidates should be at postdoc level. Amiram Moshaiov is looking for 1 visiting scientist to work on the project.

Armin Shmilovici - Ben Gurion University


From Movie to Story: Movies' Story Analytics

 

The Data and Text Mining Research Lab aims to develop and implement innovative methods of computational intelligence and data mining that will improve the understanding of human-made creations such as text and movies.

 

https://in.bgu.ac.il/en/engn/sise/dm/Pages/default.aspx

armin@bgu.ac.il
 

What is the data science project's research question?

Can a computer understand the story of a feature film, e.g., the characters, their motivations, and the plot?

 

What data will be worked on?

We have manually tagged 45 movies with over 6000 scenes for their story type and their story purpose.

 

What tasks will this project involve?

Feature generation and engineering from text, audio and video (e.g., cinematic cues), machine learning of story aspects in movies, analysis of results, involvement in paper writing

 

What is the expected outcome?

Contribution to research paper

 

Is the data open source?

Will soon be public after our paper is accepted

 

What infrastructure, programs and tools will be used? Can they be used remotely?

At the researcher's choice: standard computer servers and open-source software. Some may be used remotely.

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Data mining / Machine learning 

 

Interested candidates should be at PhD level. Armin Shmilovici is looking for 2 visiting scientists, working on the project individually. The second supervisor will be Mark Last.

Arnon Hershkovitz - Tel Aviv University


Log-Based Measuring of Creativity in Block Programming

 

We study learning behavior by analyzing log files drawn from online learning environments. This is mostly done in the context of mathematics education and computational thinking. This investigation raises some interesting data science issues, specifically related to the assessment of educational and psychological variables.

 

https://sites.google.com/view/arnon-hershkovitz

arnonhe@tauex.tau.ac.il

 

What is the data science project's research question?

How could CREATIVITY be defined and measured from log files of a Blockly-based learning environment?

 

What data will be worked on?

Code submissions of 6th- and 9th-grade students, drawn from an online learning environment for novice programming/computational thinking (Kodeut.org). Additionally, traditional creativity tests and some background information of participants, for triangulation and further investigation.

 

What tasks will this project involve?

Defining and implementing a CODE SIMILARITY algorithm that will allow for an automatic calculation of CREATIVITY.
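As a rough illustration of what such a task could look like, the sketch below scores each submission by how dissimilar it is from the others. Everything here is an assumption made for illustration, not the project's method: submissions are represented as flat sequences of block-type names, similarity is the Jaccard index over block bigrams, and the function names (`block_ngrams`, `jaccard`, `originality_scores`) are hypothetical.

```python
def block_ngrams(blocks, n=2):
    """Return the set of consecutive n-grams in a sequence of block names."""
    return {tuple(blocks[i:i + n]) for i in range(len(blocks) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def originality_scores(submissions, n=2):
    """Map each submission id to 1 - mean similarity to all other submissions."""
    grams = {sid: block_ngrams(blocks, n) for sid, blocks in submissions.items()}
    scores = {}
    for sid in grams:
        others = [jaccard(grams[sid], grams[o]) for o in grams if o != sid]
        scores[sid] = 1.0 - sum(others) / len(others) if others else 0.0
    return scores


# Hypothetical block sequences; real data would come from the CSV logs.
submissions = {
    "s1": ["repeat", "move", "turn", "move"],
    "s2": ["repeat", "move", "turn", "move"],
    "s3": ["if", "jump", "repeat", "jump"],
}
scores = originality_scores(submissions)
```

In this toy data, `s3` shares no bigram with the other two and receives the highest score. A real measure would likely need a structure-aware similarity (for example on the block tree) and validation against the traditional creativity tests mentioned above.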

 

What makes this project interesting to work on?

The project has high impact on education, as promoting creativity is a desired goal. It has some challenging data science aspects of processing log files and developing code similarity mechanisms.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development


Is the data open source?

No

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Any relevant software can be used; the data is in simple CSV format. Remote collaboration is definitely an option.

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Computational models, Visualization

 

Interested candidates should be at PhD level. Arnon Hershkovitz is looking for 1 visiting scientist, working on the project individually.

Asaf Levy - The Hebrew University of Jerusalem


Machine learning based discovery of genes involved in insecticidal activity

 

My lab is interested in discovering the microbial genes that are responsible for interactions between microbes and different organisms in the environment. We are particularly interested in toxins. We have in-house experimental systems to test our predictions.

 

www.asaflevylab.com

alevy@mail.huji.ac.il

 

What is the data science project's research question?

Can ML or deep learning be employed to identify novel genes that participate in the killing of insect pests?

 

What data will be worked on?

Large-scale data of microbial genomes (>100,000 genomes, >500,000,000 genes)


Is the data open source?

We would like to patent the results if they are interesting enough.

 

What tasks will this project involve?

Developing a classifier and, if interested, participating in lab experiments.

 

What makes this project interesting to work on?

It is challenging, interesting and exciting, and has a lot of impact on agriculture, food security, and human health (e.g. by controlling mosquitoes).

 

What is the expected outcome?

Contribution to research paper, Contribution to software development.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

We have access to a large computer server.

 

What skills are necessary for this project?

Data mining / Machine learning, Deep learning, Understanding and interest in biology 

 

Interested candidates should be at PhD level. Asaf Levy is looking for 2 visiting scientists, working on the project together with the team.

Supervisor's contact:

Asaf Levy's collaborator is Dr. William Andreopolus from California. Together they developed a deep learning tool: https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkab1115/6454267?login=true

Asaf Weinstein - Hebrew University


Learning optimal regularization for neural network training        

 

Broadly speaking, I work on developing theory and methodology for statistical problems involving simultaneity, including problems in multiple comparisons, selective inference and variable selection. I am particularly interested in the interplay between Bayesian and Frequentist methods.

 

https://openscholar.huji.ac.il/asaf.weinstein/home

asaf.weinstein@mail.huji.ac.il

 

What is the data science project's research question?

Incorporating regularization in training of neural networks is standard, but conventional methods usually employ regularizers whose form is fixed in advance, while the "learning" part consists mainly of choosing a set of tuning parameters based on the data. For this project I propose a more general approach in which the *entire* regularization function is learned non-parametrically from the observed (training) data, with the prospect of adapting to the optimal regularization scheme called for by the specific instance of the problem (say, the true, unknown weights of the underlying network).

 

What data will be worked on?

This work is first and foremost methodological, but standard ML datasets (e.g. MNIST) could be suitable for evaluating performance. The data will be open source.

 

What tasks will this project involve?

Developing new statistical methods from a theoretical perspective; implementing resulting methods in (efficient) code, and using the university computing cluster; designing and experimenting with simulations; writing.

 

What makes this project interesting to work on?

The basic question posed here is: "what is the regularization that, if used in training a neural net, yields the best prediction performance?". Considering the popularity of neural networks in machine learning tasks, the importance and potential impact are quite obvious.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

As a host, I will be happy to provide personal computing equipment (laptop, any relevant software). Of course, we also have access to the computing cluster of the university for more demanding computational tasks.

 

What skills are necessary for this project?

Data analytics / statistics, Computer simulations, Data mining / Machine learning, Deep learning, High-performance computing

 

Asaf Weinstein is looking for 1 visiting scientist, working on the project individually.

Avigdor Gal - Technion


Assessing the Plausibility of Data in Machine Learning Pipelines

 

Data integration under uncertainty, streaming data and process mining.

 

http://bigdataintegration.org

avigal@technion.ac.il

 

What is the data science project's research question?

End-to-end machine learning pipelines are widespread in large-scale exploratory data analyses. The complexity of both pipelines and data holds large potential for various kinds of errors. On top of this, a definition of correctness for data inputs and outputs is typically unavailable, rendering any analysis results inherently uncertain. This project aims to build trust in analysis results through automated plausibility checking of data. Given a catalogue of common types of data and plausibility constraints on them, a first research question is: (1) How can plausibility constraints be integrated into a pipeline and checked on real data? Here, challenges arise from the uncertainty of real data, e.g. due to sparsity or variability in the measurement resolution or quality. Given a violation of a constraint, a user is interested in analyzing the root cause of the constraint violation. Root causes may be due to unexpected or missing data, a problem in the analysis pipeline, or a problem with the plausibility constraint. It would be important to understand which types of root causes can be excluded automatically, before reporting a constraint violation to a user. More generally, such aspects lead to a second main research question, which is: (2) How to support a user in the root cause analysis of plausibility constraint violations?

 

What data will be worked on?

The work is based on a pipeline for detection of astrophysical transients in multiple wavelengths, with a collection of relevant data plausibility checks for said pipeline. The data is open source.

 

What tasks will this project involve?

A system for checking plausibility constraints will be designed and implemented. This involves checking constraints on real-world data and supporting a root cause analysis of constraint violations.
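To make the idea of checking constraints between pipeline stages concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the project's design: the `Constraint` class, the field names (`flux`, `wavelength_nm`), the value ranges and the two cause labels are all hypothetical. The only step towards root cause analysis shown is separating missing data from implausible values.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class Constraint:
    name: str
    field: str
    check: Callable[[float], bool]  # returns True when the value is plausible


def check_record(record, constraints):
    """Return a list of (constraint name, cause) pairs for one record."""
    violations = []
    for c in constraints:
        value = record.get(c.field)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            # Missing data is reported separately from implausible values.
            violations.append((c.name, "missing_data"))
        elif not c.check(value):
            violations.append((c.name, "implausible_value"))
    return violations


def check_stage(records, constraints):
    """Check every record leaving a pipeline stage; map index -> violations."""
    report = {}
    for i, record in enumerate(records):
        found = check_record(record, constraints)
        if found:
            report[i] = found
    return report


# Hypothetical constraints and records, for illustration only.
constraints = [
    Constraint("non_negative_flux", "flux", lambda v: v >= 0),
    Constraint("wavelength_in_range", "wavelength_nm", lambda v: 100 <= v <= 2500),
]
records = [
    {"flux": 1.2, "wavelength_nm": 550.0},
    {"flux": -0.3, "wavelength_nm": 550.0},
    {"flux": float("nan"), "wavelength_nm": 9000.0},
]
report = check_stage(records, constraints)
```

Here `report` lists only the records that violate at least one constraint, each with a coarse cause label that a user could start from when tracing the problem back through the pipeline.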

 

What makes this project interesting to work on?

This project is an interdisciplinary collaboration on data science pipelines for exploratory research in observational astrophysics. It has the potential to improve future data analysis projects in other application domains.

 

What is the expected outcome?

Contribution to research paper.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

A set of common data science tools will be used, including Python / pandas / scikit-learn / TensorFlow.

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Data mining / Machine learning

 

Interested candidates should be at PhD level. Avigdor Gal is looking for one visiting scientist, working on the project together with the team.

Barak Akabayov - Ben-Gurion University


Machine learning approaches for optimizing small molecule inhibitors  

 

We develop inhibitors targeted against components in central molecular biology domains in the bacterial cell, such as DNA replication and protein translation. We use machine learning to predict the binding of small molecules to their target by utilizing their chemical and geometrical features.

 

akabayov-lab.org

akabayov@bgu.ac.il

 

What is the data science project's research question?

What would be the properties of better inhibitors for a specific target?

 

What data will be worked on?

Original datasets of small-molecule structures with their binding values to the target of inhibition. The data are original and were collected in our experiments. Upon publication, the data will be publicly available.

 

What tasks will this project involve?

Implementing unsupervised and supervised machine/deep learning algorithms to design improved inhibitors for a specific target.

 

What makes this project interesting to work on?

We provide original experimental data collected in my lab as datasets.
We use machine learning to obtain accurate results in less time, searching a reduced region of chemical space.
We verify the accuracy of the computational models experimentally.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

In-lab GPU-equipped computers and access to (university and external) servers.

 

What skills are necessary for this project?

Computational models, Data mining / Machine learning, Deep learning, Visualization    

Any student who knows Python and machine learning and has a little interest in chemistry is welcome.

 

Barak Akabayov is looking for several visiting scientists, working on the project in a team.

Igal Bilik - Ben Gurion University


Data-driven radar processing    

 

Development of signal processing methods for sensing systems (radar, radar-communication, radar-vision). The developed methods address challenges in a wide spectrum of applications: autonomous vehicles and drones, swarms of moving drones, contactless health monitoring, cognitive spectrum allocation and interference mitigation in multi-sensor environments. We develop methods for cognitive sensing, bio-inspired sensing, sensing using conformal sensor arrays, deep-learning processing for radar sensors, and others.

 

https://danielbilik2003.wixsite.com/igalbilik/ssl-lab

bilik@bgu.ac.il

 

What is the data science project's research question?

How can data-driven methods overcome the challenges of conventional model-based processing?

 

What data will be worked on?

Radar measurements collected using various radar systems, or data generated using simulation tools. Part of the data will be simulated and part will be collected.

 

What tasks will this project involve?

Develop data-driven methods; select, design and implement deep neural networks; evaluate algorithm performance in real scenarios.

 

What makes this project interesting to work on?

Data-driven methods for radar-based sensing are still immature. The development of novel data-driven methods has breakthrough potential to enable multiple radar-based applications.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Radar systems are available in the lab, and the data can be made available remotely. Simulation tools can also be made available remotely.

 

What skills are necessary for this project?

Data mining / Machine learning, Deep learning

 

Interested candidates should be at PhD level or postdoc level. Igal Bilik is looking for 2 visiting scientists, working on the project together with the team.

Jason Friedman - Tel Aviv University


Can structural differences between the hemispheres in the brain explain differences in motor control between left and right hands?

 

Our group studies questions related to motor control - how the brain controls movements. In particular, we are interested in how complex movements are generated and learnt from their building blocks. We use a variety of tools to study these questions in typical and clinical populations.

 

www.movementscienceslab.com

jason@tauex.tau.ac.il

 

What is the data science project's research question?

"The brains of all bilaterally symmetric animals on Earth are bicameral - they are divided into left and right hemispheres. It’s a remarkably conserved feature across species, indicating its importance for intelligence. The anatomy and functionality of the hemispheres have a large degree of overlap, but they specialise to possess different attributes.

The left is known to specialise in narrow class definitions and familiarity/routine and the right in broader classes and novelty [Goldberg et al. 2013]. For motor control, there are also interesting lateral effects. For example, in most people, the right hand is better at trajectories, and the left hand is better at position / posture (dynamic dominance theory [Sainburg 2005]).

One likely explanation for the emergent specialisation is small differences in parameterisation of the substrate. For example, synaptic plasticity, layer sizes and connectivity patterns within and across layers - more specifically the fact that the left exhibits a greater degree of short range connections and the right longer range connections.

In a previous work, we explored these questions through a bicameral ANN, mimicking differential specialisation. We differentiated the hemispheres via hyperparameter variation (representing substrate parameterization): sparsity, resource allocation (pyramid vs inverted pyramid) and learning rate. We demonstrated effective broad-narrow class specialization.

We hypothesise that the same mechanisms underpin specialisation observed in motor control - constituting the main question of the study.

More specifically, we hypothesise that the disparity in short/long range connectivity is sufficient to account for the observed differential specialisation. The reasoning is that long range connectivity (small world networks) is better suited to sequence learning where each state relates to other states, supporting the control of trajectories i.e. shifting states. And short range connectivity would be better at structural learning with integration of information across a localised region, supporting the control of posture i.e. maintaining state.

[1] E. Goldberg et al., “Hemispheric asymmetries of cortical volume in the human brain”, Cortex, 2013

[2] E. Goldberg, “Creativity: The Human Brain in the Age of Innovation”, Oxford University Press, 2018

[3] E. Goldberg, “The New Executive Brain”, Oxford University Press, 2009

[4] E. Goldberg and L. Costa, “Hemisphere differences in the acquisition and use of descriptive systems”, Brain and Language, 1981, 14, pp. 144-173

[5] E. Goldberg et al., “Cognitive bias, functional cortical geometry, and the frontal lobes: laterality, sex, and handedness”, Journal of Cognitive Neuroscience, 1994, 6(3), pp. 276-296, doi: 10.1162/jocn.1994.6.3.276

[6] R. Sainburg, “Handedness: Differential Specializations for Control of Trajectory and Position”, Exercise and Sport Sciences Reviews, 2005, 4, pp. 206-213"

 

What data will be worked on?

3D kinematic data of arm reaching and postural tasks recorded in the lab (using a motion capture system) from human participants will be compared to the predictions of artificial neural networks developed to perform these tasks with the two hemispheres, using simple models of the biomechanics of arm movements. The data is open source.

 

What tasks will this project involve?

In summary:

Build a recurrent network with localised receptive fields in which short-range vs long-range connectivity can be controlled. The topology of recurrent connections between receptive fields determines whether the network has predominantly short-range or long-range (small-world network) connectivity, parameterised by distance, measured as the number of receptive fields.
Interpret the spatio-temporal scale of motor movements as a spectrum, and define very short trajectories as spatial control.
Choose a task that is a proxy for trajectory control.
Measure performance at the task as the network moves from small-world to large-world connectivity.
Analyse for specialisation: did the network become better at trajectory or posture control as this feature was varied?
Compare the network's predictions of kinematics to those collected in the lab.
Put the results in the context of left-right brain specialisation.
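The first step above, a recurrent connectivity pattern whose range can be dialled between short and long, could be prototyped along the following lines. This is a sketch under assumptions that are not in the project description: receptive fields are arranged on a ring, each unit starts with `k` neighbours on each side, and Watts-Strogatz-style random rewiring (controlled by the hypothetical parameter `p_long`) stands in for whatever topology the project actually adopts.

```python
import random


def ring_distance(i, j, n):
    """Distance between units i and j on a ring of n receptive fields."""
    d = abs(i - j)
    return min(d, n - d)


def connectivity_mask(n, k, p_long, seed=0):
    """Build an n x n 0/1 mask of recurrent connections (requires n > 2 * k).

    Each unit starts with connections to its k nearest neighbours on each
    side. Each of these is redirected, with probability p_long, to a randomly
    chosen unit it is not yet connected to (Watts-Strogatz-style rewiring).
    """
    rng = random.Random(seed)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        local = [(i + o) % n for o in range(1, k + 1)]
        local += [(i - o) % n for o in range(1, k + 1)]
        targets = set(local)
        for j in local:
            if rng.random() < p_long:
                free = [t for t in range(n) if t != i and t not in targets]
                if free:
                    targets.discard(j)
                    targets.add(rng.choice(free))
        for j in targets:
            mask[i][j] = 1
    return mask


def mean_connection_distance(mask):
    """Average ring distance over all connections present in the mask."""
    n = len(mask)
    dists = [ring_distance(i, j, n) for i in range(n) for j in range(n) if mask[i][j]]
    return sum(dists) / len(dists)


short_range = connectivity_mask(n=20, k=2, p_long=0.0)
long_range = connectivity_mask(n=20, k=2, p_long=1.0)
```

Such a mask could be multiplied element-wise with the recurrent weight matrix of a network so that only the permitted connections are trained; `mean_connection_distance` gives one number for how short-range or long-range a given configuration is.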

 

What makes this project interesting to work on?

The project has potential impact on both cognitive science and AI. As mentioned under the research question, hemispheric specialisation is a striking and ubiquitous feature of bilateral life forms, yet it is not well understood. A better understanding of hemispheric specialisation is an essential component for understanding brains and minds. In addition, hemispheric specialisation has not been explored or exploited in ML and AI. A better understanding could unlock novel architectures and potentially far superior capabilities for AI. It could also provide a mechanistic explanation for the dynamic dominance theory, which is currently lacking.

 

What is the expected outcome?

Contribution to research paper.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Developing software for modeling artificial neural networks and comparing to experimental data. The work could in theory be performed remotely although it would ideally be performed in person.

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Computational models, Data mining / Machine learning, Deep learning

 

Interested candidates should be at PhD level. Jason Friedman is looking for 1 visiting scientist, working on the project individually.

Keren Agay-Shay - Bar Ilan University


Environmental Exposures and Gestational Diabetes

 

Our highly multidisciplinary research focus is developing novel approaches for an in-depth understanding of the links between various exposures (ambient temperature and climate change, greenness, ambient air pollution and residence near refineries) and health outcomes, specifically adverse pregnancy outcomes, child health, cancer incidence, stress, and short-term psychological, physiological and cognitive responses. We use novel statistical tools and innovative epidemiological approaches that have not been used before in Israel or elsewhere.

 

http://research.md.biu.ac.il/labs/keren-agay-shay/research/
kagayshay@gmail.com  

 

What is the data science project's research question?

Analyzing associations between ambient exposures during pregnancy and gestational diabetes and other pregnancy outcomes

 

What data will be worked on?

Environmental exposure data and health data. The data is not open source.

 

What tasks will this project involve?

Using repeated measure models and multiple exposure models to investigate causality.

 

What makes this project interesting to work on?

I find that the interface between environmental exposure and humans and their mutual effects is fascinating.

 

What is the expected outcome?

Contribution to research paper

 

What infrastructure, programs and tools will be used? Can they be used remotely?

R, QGIS, Stata. The tools can partly be used remotely (exposure data: yes; outcome-exposure data: no).

 

What skills are necessary for this project?

Data analytics / statistics, Computational models, Geographic Information Systems, Databases

 

Interested candidates should be at PhD or postdoc level. Keren Agay-Shay is looking for 2 visiting scientists, working on the project together with the team.

Keren Agay-Shay - Bar Ilan University


Climate change, built environment and adverse pregnancy outcomes   

 

I am an epidemiologist with an extensive background in environment, biology, ecological microbiology, advanced statistical methods and public health. Over the last several years, our research group has developed its main, highly multidisciplinary research focus: developing novel approaches for an in-depth understanding of the links between external environmental exposures and health outcomes, specifically adverse pregnancy outcomes (APO), stress and cancer incidence. In our current research, we aim to evaluate both the beneficial and the harmful effects of exposure on health outcomes.

 

http://research.md.biu.ac.il/labs/keren-agay-shay/
kagayshay@gmail.com

 

 

What is the data science project's research question?

What is the most harmful exposure due to climate change that may increase the risk of preterm delivery or birth defects?

 

What data will be worked on?

Birth registry and environmental exposure data. The data is not open source.

 

What tasks will this project involve?

Building statistical models; spatial analysis.

 

What makes this project interesting to work on?

Adapting to climate change and studying its health effects are highly relevant for implementing solutions.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development, Contribution to implementation tool

 

What infrastructure, programs and tools will be used? Can they be used remotely?

R, Stata, SPSS, QGIS, ArcGIS. Yes, they can partly be used remotely.

 

What skills are necessary for this project?

Data analytics / statistics, Geographic Information Systems

 

Interested candidates should be at PhD or postdoc level. Keren Agay-Shay is looking for 2 visiting scientists, working on the project together with the team.

Masha Niv - Hebrew University of Jerusalem


Sensory nutrition – from food ingredients to consumer reviews and back            

 

The Niv lab studies the molecular recognition of taste. We have established a database of bitter molecules and developed ML tools for predicting bitterness and for matching molecules to taste receptors.
The Benjamini lab is an applied statistics lab developing inference methods for structured data and for prediction models. Methodological focus areas include methods for data summarization, feature importance and interpretability, and for inference after selection.

 

https://biochem-food-nutrition.agri.huji.ac.il/mashaniv

masha.niv@mail.huji.ac.il

 

What is the data science project's research question?

Taste is the key factor for food choice and consumption. Reviews of food products provide a wealth of information that can be harnessed for optimizing both the health and the taste of food. The goal of this project is to construct a dataset and develop methods to extract knowledge from food reviews.

Specifically: Characterize the food preferences of groups of consumers and predict additional products to their liking. Identify windows of nutrition opportunity, meaning products where taste can be improved by increasing healthiness: products that are too salty, too sweet or too fatty, so that making them healthier will also make them tastier.

 

What data will be worked on?

~300,000 reviews of Amazon food products as well as ~300,000 reviews from iHERB have already been scraped and parsed in the Niv lab. Additional data may be available or obtained during the project. Some data is from Kaggle; other data needs to be scraped. Proprietary data from sensory tests in the Niv lab is available as well.

 

What tasks will this project involve?

Data parsing, natural language processing and coding, clustering / bi-way clustering, data-analysis and formulation of hypotheses, machine-learning.

 

What makes this project interesting to work on?

The tension between “tasty” and “healthy” might be partially settled by better tailoring of foods to groups of consumers, eventually leading to a healthier diet. The food industry and foodtech markets are huge, taste is the key driver of consumption, thus insights into taste have dramatic commercial implications.

 

What is the expected outcome?

NLP tools for analyzing food product reviews. Sensory profiles of consumers and links to their food preferences. Potential contribution to a research paper.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Analysis in a data-science environment (Python or R based). Scripts and code can be used remotely.

 

What skills are necessary for this project?

Data analytics / statistics, Computational models, Data mining / Machine learning, Visualization

 

Interested candidates should be at advanced MSc, PhD or postdoc level. Masha Niv (in collaboration with Yuval Benjamini) is looking for 1 visiting scientist, working on the project with the team.

Nir Weinberger - Technion


Learning partitioning estimates for regression 

 

I work on various problems in statistical inference, machine learning and information theory. The research is theory-oriented and aims either to obtain theoretical guarantees or to develop theoretically motivated algorithms.

 

https://sites.google.com/view/nir-weinberger/home

nirwein@technion.ac.il

 

What is the data science project's research question?

We have recently proposed an algorithm called k-vectors (https://drive.google.com/file/d/1nfdQx2IhBaOLrDhW3bC_vX3zqG_3yRkx/view), which aims to learn an efficient partition of the feature space into cells for the task of regression. This algorithm can be thought of as a version of k-means for regression. There are multiple open questions regarding this algorithm: 1) In-depth experimentation on a variety of datasets. 2) Developing versions in which the k-vectors are sparse (to improve interpretability). 3) Developing finite-sample guarantees on its operation. 4) Developing spectral algorithms which parallel its operation.

 

What data will be worked on?

Regression-type datasets from open databases. 

 

What tasks will this project involve?

As a first step, coding and experimentation with the k-vectors algorithm, including exploring various initialization methods. As a second step, we will focus on one of the questions mentioned above.
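To make the idea of a partition-based regression estimate concrete, here is a minimal Python sketch. It is not the k-vectors algorithm from the linked paper: it assumes the cell centers are already given, assigns each sample to its nearest center (a Voronoi partition, as in k-means) and predicts the mean response of the training samples in that cell. The function names and the toy data are illustrative assumptions.

```python
from collections import defaultdict


def nearest_cell(x, centers):
    """Return the index of the center closest to x (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])),
    )


def fit_partition_estimate(X, y, centers):
    """Compute the mean response inside each Voronoi cell defined by `centers`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, target in zip(X, y):
        j = nearest_cell(x, centers)
        sums[j] += target
        counts[j] += 1
    # Cells that received no training sample are simply absent from the result.
    return {j: sums[j] / counts[j] for j in counts}


def predict(x, centers, cell_means, default=0.0):
    """Predict with the mean of the cell that x falls into."""
    return cell_means.get(nearest_cell(x, centers), default)


# Toy data (made up for illustration): two clusters with different response levels.
X = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (0.9, 1.0)]
y = [1.0, 3.0, 10.0, 12.0]
centers = [(0.0, 0.0), (1.0, 1.0)]

cell_means = fit_partition_estimate(X, y, centers)
```

Learning the partition itself, for example by alternating between re-assigning samples and updating the cells, is the part that the project investigates and is deliberately left out of this sketch.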

 

What makes this project interesting to work on?

It involves both theory and practice, and addresses a fundamental question in statistics and machine-learning, which is applicable to a multitude of practical scenarios.

 

What is the expected outcome?

Contribution to research paper.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Coding in Python or Matlab. Can be used remotely.

 

What skills are necessary for this project?

Data analytics / statistics, Data mining / Machine learning

 

Interested candidates should be at PhD level. Nir Weinberger is looking for 1 visiting scientist, working on the project individually.

Orly Lewis - Hebrew University of Jerusalem

ATLOMY - Anatomy in Ancient Greece and Rome: An Interactive Visual and Textual Atlas

 

We are an ERC-funded research project developing a groundbreaking 3-D visualisation platform of Greco-Roman anatomy. Our primary objective is the analysis and recreation of ancient anatomical texts using state-of-the-art 3-D techniques. ATLOMY expands the understanding and accessibility of early medical knowledge and practices by using advanced NLP and CG.

 

www.atlomy.org

orly.lewis@mail.huji.ac.il

 

What is the data science project's research question?

The backbone of ATLOMY’s Atlas is a complex analysis of the anatomical texts written 2000 years ago. The data science project will work towards achieving our goals of deciphering ancient anatomical descriptions, developing a model for machine-generated lexical entries and understanding how Ancient Greek and medieval Arabic technical language functions in practice.

 

What data will be worked on?

Our data is mostly unstructured. It is a corpus of Ancient Greek, Latin and Arabic texts in digitized form.

 

What tasks will this project involve?

- Text analysis using lemmatization, n-grams, PoS tagging and co-reference resolution.
- Identifying the best models for pre-modern text analysis (e.g., word2vec, LDA, BERT, GPT-3).
- Implementing the model(s) to produce a comprehensive data table from the unstructured text.
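As a small illustration of the n-gram step in the list above, the following Python sketch counts contiguous n-grams in a token list. The tokens are hypothetical and already lemmatized; a real pipeline would need a lemmatizer suited to Ancient Greek, Latin or Arabic, which is not shown here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical, already-lemmatized tokens (English stand-ins for readability).
tokens = ["the", "vein", "carry", "blood", "and", "the", "vein", "divide"]

bigram_counts = Counter(ngrams(tokens, 2))
most_common_bigram, count = bigram_counts.most_common(1)[0]
```

Frequent n-grams of this kind are one simple way to spot recurring technical phrases before moving on to heavier models.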

 

What makes this project interesting to work on?

You will be working with a unique interdisciplinary international team from academia and the Israeli hi-tech industry to develop a solution that can be integrated into a production environment. Our team includes experts in the fields of history, medicine, product, software, 3D modelling and data science.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Google Colab, Jupyter Notebook, PyTorch, AWS. All can be done remotely.

 

What skills are necessary for this project?

Data analytics / statistics, Computational models, Data mining / Machine learning, Databases

 

Interested candidates should be at postdoc level. Orly Lewis is looking for 1 visiting scientist, working on the project together with the team.

Avi Ostfeld - Technion

Investigation on the effect of pipe leakages on water quality in water distribution systems (WDS)

 

Water distribution systems security. Development and application of early warning detection system methodologies for monitoring stations allocation. Evolutionary optimization. Development and application of evolutionary optimization techniques (e.g., Genetic Algorithms, Ant Colony, Cross Entropy) to water distribution systems and water resources management. Reliability of water distribution systems. Development and application of reliability models for water distribution systems reliability simulation and management. Management of multi-quality water distribution systems. Integration of Geographical Information Systems with water resources models. Modeling and management of surface water quantity and quality. Development and application of machine learning techniques in water resources management.

 

https://ostfeld.net.technion.ac.il/

ostfeld@cv.technion.ac.il

 

What is the data science project's research question?

To estimate the effect of leakage uncertainty (location and volume) on the water quality estimates at the consumer stations.

 

What data will be worked on?

The student can work on open-source leakage data obtained from BattLeDIM 2020 (https://battledim.ucy.ac.cy/).

 

What tasks will this project involve?

The student would have to perform the following tasks:
1) Conduct a literature survey to identify the water quality parameters that are affected by leakages.
2) Write MATLAB/Python code to perform leakage simulations along with water quality estimates through EPANET.
3) Perform multiple simulations to quantify the effect of the location as well as the volume of leakages on water quality.
4) Determine whether water quality sensors can aid accurate leak localization.

 

What makes this project interesting to work on?

Leakages in water supply systems are inevitable, yet there has not been a simulation study estimating the effect of leakages on water quality. Leak localization techniques mostly use pressure sensors. There has not been any investigation into using water quality sensors, or a combination of pressure and water quality sensors, to estimate the leak location more accurately. This investigation can aid in a better understanding of the dynamics involved in the water quality estimates in real distribution systems. This study can also aid in improving the design and management strategies of water distribution networks.

What infrastructure, programs and tools will be used? Can they be used remotely?

A workstation with high processing capabilities
EPANET software
MATLAB or Python
Yes, they can be used remotely

 

 Avi Ostfeld is looking for 1 visiting scientist, working on the project individually.

Rami Puzis - BGU

Mining biases in language models 

 

At the Complex Networks Analysis Lab at Ben-Gurion University (CNALAB@BGU) we tackle research problems in diverse domains using a combination of methods from graph theory and machine learning. Complex Networks are found in cyber security, social networks, communication networks and the Internet, biological networks, financial networks, text analytics and more. Scientific programmers working at the CNA Lab @ BGU develop generic software tools and libraries to analyze the structure of networks derived from the various problem domains. Graduate research students apply these tools to investigate specific problems in their domain of interest.

 

https://faramirp.wixsite.com/puzis

puzis@post.bgu.ac.il

 

What is the data science project's research question?

To what extent do the biases in trained language models reflect the state of mind of text authors?

 

What data will be worked on?

A large, unique corpus of tweets by healthcare professionals before and during the COVID-19 outbreak. The data is not open source.

 

What tasks will this project involve?

Fine-tuning transformers from HuggingFace based on subsets of the data.
Formulating queries for bias quantification.
Evaluating and analyzing the biases.
Writing a report for publication.   

 

What makes this project interesting to work on?

Unique data. Significant problem with important consequences for the well-being of healthcare professionals during long-term crises.

 

What is the expected outcome?

Contribution to research paper.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

PyTorch, Transformers (by HuggingFace), GPU cluster

 

What skills are necessary for this project?

Deep learning, High-performance computing, NLP experience / BERT 

 

Interested candidates should be at PhD level. Rami Puzis is looking for 2 visiting scientists, working on the project together with the team.

Rami Puzis - BGU

Graph neural networks for centrality learning 

 

At the Complex Networks Analysis Lab at Ben-Gurion University (CNALAB@BGU) we tackle research problems in diverse domains using a combination of methods from graph theory and machine learning. Complex Networks are found in cyber security, social networks, communication networks and the Internet, biological networks, financial networks, text analytics and more. Scientific programmers working at the CNA Lab @ BGU develop generic software tools and libraries to analyze the structure of networks derived from the various problem domains. Graduate research students apply these tools to investigate specific problems in their domain of interest.

 

https://faramirp.wixsite.com/puzis

puzis@post.bgu.ac.il

 

What is the data science project's research question?

How can attention in graph neural networks be utilized to mimic agent routing? What should the architecture of transformers for graph neural networks look like in order to compute diverse centrality measures? How can we speed up the convergence of learning a routing function for centrality computation?

 

What data will be worked on?

Artificial random graphs and real networks from open datasets. The data is open source.

 

What tasks will this project involve?

Development of the graph neural network architectures. Performing experiments. Analyzing results. Writing a report toward publication.      
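Training a network to predict centrality requires ground-truth values to learn from. The following is a minimal sketch, assuming an unweighted, connected, undirected graph stored as an adjacency dictionary, of how closeness centrality can be computed with breadth-first search. The project lists NetworkX, which provides such measures ready-made; this pure-Python version, with its made-up toy graph, is only meant to show what a target label looks like.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness_centrality(graph):
    """Closeness = (n - 1) / sum of distances to all other nodes (connected graph)."""
    n = len(graph)
    result = {}
    for node in graph:
        total = sum(bfs_distances(graph, node).values())
        result[node] = (n - 1) / total if total > 0 else 0.0
    return result


# Toy path graph a - b - c (made up for illustration).
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = closeness_centrality(graph)
```

In the toy path graph the middle node is the most central one, which is the kind of ranking a learned model would be expected to reproduce.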

 

What makes this project interesting to work on?

Unique solution to an open problem. Deep dive into the development of a new neural network architecture for graphs.
Note: our preliminary results show that the approach for learning centrality measures using transformers has good potential. 

 

What is the expected outcome?

Contribution to research paper and to software development.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

PyTorch, NetworkX, open network datasets

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Deep learning, Complex networks

 

Interested candidates should be at PhD level. Rami Puzis is looking for 1 visiting scientist, working on the project individually.

Roni Porat - Hebrew University

The role of anger expressions in promoting social justice

 

My research examines micro- and meso-level forces that motivate behavior and societal change. I am specifically interested in the forces that can reduce prejudice, violence, and discrimination. My first line of research considers the role of emotion in motivating violence on the one hand, and collective action on the other. In my second line of work, I take a wide perspective that goes well beyond psychological theory to examine broad policy-related questions about the roots, causes, and remedies of group-based prejudice, inequality, and also of sexual violence.

 

https://www.poratroni.com/

roni.porat@mail.huji.ac.il

 

What is the data science project's research question?

Are anger expressions by members of disadvantaged groups beneficial for social justice?

 

What data will be worked on?

This study will test the associations between expressing anger and viral sharing on Twitter. I plan to analyze a dataset that consists of over 1 million tweets collected recently by Dr. Amit Goldenberg during the #metoo demonstrations that took place in the US and in other major capitals across the world.

 

What tasks will this project involve?

This project will involve natural language processing analysis, categorizing whether and which emotion was expressed in each tweet. Using recent advances (Wang, Hale, Adelani, et al., 2019), I will identify the gender of the person posting each tweet, as well as their race. The analysis will provide insight as to whether anger expressed by a member of a disadvantaged group (gender, race, and their intersection) is associated with viral sharing and liking.

 

What makes this project interesting to work on?

The role of anger in promoting social justice has been debated for years. While some argue that anger is critical for disadvantaged groups demanding social equality, others warn that anger is purely destructive for human interactions and threatens our moral lives. The debate over anger’s social role in promoting equality has occupied the minds of many, from ancient Greek philosophers like Aeschylus, who thought of anger as evil and destructive, to political revolutionists like Malcolm X, who perceived anger as a catalyst for social change.
Previous work was conducted in laboratories, using vignette designs and unrealistic stimuli. This project will be the first to test the role of anger in promoting social justice in the real world, looking at anger expressions in response to real-world events as they unfold.

 

What is the expected outcome?

Contribution to research paper.

 

What infrastructure, programs and tools will be used? Can they be used remotely?

R, VADER

 

What skills are necessary for this project?

Data analytics / statistics, Visualization, Databases

 

Interested candidates should be at PhD level. Roni Porat is looking for 1 visiting scientist, working on the project together with the team.

Ronny Bartsch - Bar Ilan University

Predicting team-development from social interactions through the analysis of physiological networks

 

Our research team represents a synergistic effort across disciplines including experts on statistical physics, nonlinear dynamics and signal processing (Dr. Bartsch’s lab), and experts on biobehavioral research, developmental psychology and group interactions (Dr. Gordon’s lab). Our teams work together to forge a more crystallized understanding of the physiological factors that influence group bonding and team performance. 

 

https://ilushgordon.wixsite.com/ilanit-gordon-lab

roni.bartsch@biu.ac.il

 

 

What is the data science project's research question?

In the proposed project, we aim to predict the quality of team performance by studying and analyzing physiological signals measured from individuals while they are interacting in group tasks. The project will employ statistical physics approaches on a large database comprising ~400 individuals. Our main research question is: "Can we use physiological signals to predict which groups will perform best?"

 

What data will be worked on?

The database includes multiple physiological, behavioral and psychological assessments of group interactions during well-defined experiments. These experiments included several tasks in which group members had to interact with each other for ~10 minutes (e.g., by making a group decision, drumming together to a certain beat, discussing political views). As group members interacted, they were continuously monitored via video and audio recordings from several angles. Simultaneously, multi-channel physiological signals of cardiac and respiratory dynamics as well as electrodermal activity were recorded from each subject, which makes it possible to determine states of the autonomic nervous system.

 

What tasks will this project involve?

The project will involve data mining tasks, developing visualization tools and basic network analysis algorithms. 
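One simple way to turn several recorded signals into a network is to link two signals whenever their correlation is strong. The sketch below assumes this construction; it is not stated to be the lab's method, and the member names, sample values and the 0.8 threshold are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # would divide by zero for a constant signal


def correlation_network(signals, threshold=0.8):
    """Return edges (a, b, r) for every pair whose |correlation| reaches the threshold."""
    edges = []
    for a, b in combinations(sorted(signals), 2):
        r = pearson(signals[a], signals[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Made-up heart-rate samples for three group members.
signals = {
    "member_1": [60, 62, 65, 63, 61],
    "member_2": [70, 72, 75, 73, 71],
    "member_3": [80, 79, 80, 81, 80],
}
edges = correlation_network(signals)
```

The resulting edge list can then be passed to any graph library for further network analysis or visualization.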

 

What makes this project interesting to work on?

This project is unique because it applies data science techniques to real-life data and will thus allow us to make meaningful predictions about how group interactions unfold, based solely on individuals' physiological functions.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Physiological and psychological data are de-identified, so remote work is possible. All kinds of programming languages can be used: Matlab, Python, C++, Java, etc.

 

Is the project open source?

No

 

What skills are necessary for this project?

Data analytics / statistics, Data mining / Machine learning, Visualization, Databases

 

Interested candidates should be at PhD-level.  Ronny Bartsch is looking for 1 visiting scientist, working on the project with the team.

Tamar Friedlander - Hebrew University of Jerusalem

Flowering prediction in olive trees using past winter temperatures and flowering data

 

We are a computational-theoretical group using mathematical models, computer simulations and data analysis to study problems in biology and evolution. We currently have 3 lines of research in the lab including evolution of a plant mating system, evolution on high-dimensional fitness landscapes and flowering prediction based on winter temperatures (offered project).

 

https://www.friedlander-lab.net/ 

tamar.friedlander@mail.huji.ac.il

 

 

What is the data science project's research question?

In many trees, flowering in spring (which then determines yield) strongly depends on a sufficiently cold winter, and warm winters are known to disrupt flowering. While the temperatures trees need for optimal flowering are known, it is not known what effect warm periods of different lengths have. As the Mediterranean winter often includes warm periods, a more quantitative understanding of the relation between temperature and yield is needed.

 

What data will be worked on?

We have data on olive flowering and temperature (both ambient and controlled) at different sites in Israel from the past 7 years, which we would like to analyze for that purpose. The data includes several trees at each location, with details of per-branch and per-tree flowering.

 

What tasks will this project involve?

There are several options, and the exact task can be discussed with the candidate. Examples are: 1) analysis of flowering variation within and between trees at the same site and construction of a probabilistic model, 2) analysis of temperature profiles to obtain rules for critical temperature periods affecting flowering (e.g., if the temperature was above some value for more than 2 days), 3) construction of a web-based interface in which users can upload their own data and obtain a flowering prediction.
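The kind of rule mentioned in option 2 can be sketched in a few lines of Python. The example below finds the longest run of consecutive days above a threshold and checks whether it exceeds a given number of days. The threshold values and the temperature series are made up; the actual critical values are exactly what the project would have to determine.

```python
def longest_warm_spell(daily_temps, threshold):
    """Length (in days) of the longest run of consecutive days above `threshold`."""
    longest = current = 0
    for temp in daily_temps:
        current = current + 1 if temp > threshold else 0
        longest = max(longest, current)
    return longest


def has_critical_warm_period(daily_temps, threshold, max_days=2):
    """True if the temperature stayed above `threshold` for more than `max_days` days."""
    return longest_warm_spell(daily_temps, threshold) > max_days


# Made-up daily maximum temperatures in degrees Celsius.
temps = [14.0, 21.5, 22.0, 23.1, 15.0, 24.0, 13.0]
warm_winter_flag = has_critical_warm_period(temps, threshold=20.0)
```

A flag of this kind could serve as one input feature for a probabilistic flowering model.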

 

What makes this project interesting to work on?

This is a pressing topic for farmers in Israel: the past winter was very warm, resulting in low yield. This is only expected to worsen with global warming.
The data can be used in different ways and interested students can use a variety of methodologies here, such as: statistics, parameter estimation, modeling and numerical optimization (not mandatory but depending on the task chosen).

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

We have a server in the lab and if more computer power is needed the university cluster can also be used. 

 

Is the project open source?

No

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Computational models, High-performance computing, optional: parameter estimation, numerical optimization 

 

Interested candidates should be at Postdoc-level, but PhD students at advanced stage can also apply.  Tamar Friedlander is looking for 1 visiting scientist, working on the project.

Udi Sommer - Tel Aviv University

A Paradigm Shift in the Understanding of Political Campaigns based on AI Tools

 

Research on a range of topics in political science using AI, DS and NLP tools. In particular, interest in polarization, COVID-19 & politics, judicial politics and political behavior. 

 

https://people.socsci.tau.ac.il/mu/udis/

udi.sommer@gmail.com

 

 

What is the data science project's research question?

How should we think of and analyze political campaigns? We will produce a large training set to train existing NLP tools towards classifying TV and social media ads on the target audience axis (base vs. center ads), using political science students, M-Turk and rules-of-thumb based ground truth (e.g., comparison to existing ideology scores like DW-Nominate). The tagged dataset will enable us to run state-of-the-art NLP algorithms (e.g., text classification and word embedding). In turn, these will enable us to confirm or refute our key theoretical contention about separate target audiences for whom separate political campaign strategies exist, which in turn produce separate bodies of campaign ads.

 

What data will be worked on?

Data for political ads is stored in accessible archives. Television ads are accessible through the Wesleyan Media Project (https://mediaproject.wesleyan.edu) and Facebook political ads through the Facebook political ads API. Our plan for tagging is based on a three-pronged approach:
1. Use political science students as RAs for tagging data – their advantage is in having the basic knowledge and understanding of the subject matter. The disadvantage is limited scalability to a large training set.
2. Use Mechanical-Turk marketplace for crowdsourcing tagging – this would require rigid and specific tagging instructions, as participants have no particular knowledge or understanding of the subject matter.
3. Use indirect methods to obtain ground truth. For example, we will tag ads from non-competitive races as base ads. Likewise, we will use the DW-NOMINATE scores of the candidate whose ad we score, etc.

 

What tasks will this project involve?

Such a tagged dataset will enable us to run state-of-the-art NLP algorithms (text classification, word embedding and so forth). In turn, these will enable further transformative research on several fronts. First, the actual success or failure of text classification based on the base/center dataset could confirm or refute our key theoretical contention: that indeed these are separate target audiences and that they experience different political campaigns. Achieving successful classification on text datasets would confirm that ads directed at the base are different from those aimed at the center, with key insights for our understanding of polarization. Second, successful classification would enable us to classify the entire universe of political ads and run robust statistics on these two groups. Consequently, we will be able to explore a range of different related questions: Are base/center ads more positive or negative? At what point during the campaign are they deployed? How do they use music and visuals? Do they last longer? etc. (many of those fields already exist in Wesleyan based on less sophisticated methods and less data, but it should still facilitate the process). Finally, NLP models trained on this dataset could inform us about the language used in the ads: which words/bigrams/n-grams are used most, which words are used in the same context (from word embedding), etc.

 

What makes this project interesting to work on?

This is a project with a game-changing impact that will harness the power of NLP to delve into the study of political campaigns in a polarized era, a question that is doubly urgent in light of the crisis many democracies are experiencing. This high impact project will promote public interest in democracy as it touches on key elements of the democratic system. Plus, its implications for our understanding of political polarization and the prospects of bridging some of its gaps in the post-Trump and post-COVID era, are momentous.

 

What is the expected outcome?

Contribution to research paper, possibly also a book project

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python etc. All can be used remotely. 

 

Is the project open source?

Yes

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Data mining / Machine learning

 

Interested candidates should be at PhD-level.  Udi Sommer is looking for several visiting scientists, working on the project together with the team.

Yael Allweil - Technion

Developing a Digital Humanities Approach to Future Archiving

 

The HousingLab research group, led by Assoc. Prof. Yael Allweil, involves expanding three realms of knowledge production: (i) Architectural history of Israel and Palestine as a housing project, (ii) History and theory of housing as architecture, and (iii) Developing archival theory and methods for digital visual primary sources. HousingLab combines a humanities inquiry of the history of architecture as cultural production, with critical social science of housing as socio-economic need, and design theory and methods. HousingLab develops research methods that address the large scale and multifaceted nature of housing, and devises design theory and design methods to ensure that planning and design target housing needs for the greater public facing the global housing crisis. Looking into the future of architecture history, HousingLab engages in developing archival methods for visual digital data as primary sources.

 

https://architecture.technion.ac.il/research/labs/housinglab-history-and-future-of-living-research-group/

aryael@technion.ac.il

 

 

What is the data science project's research question?

Computer vision capabilities of the built environment using the Google Street View dataset and Convolutional Neural Networks

 

What data will be worked on?

Large-scale visual repositories of the built environment from Google Street View

 

What tasks will this project involve?

1. Developing methods and techniques to ‘read’  features of the built environment - its buildings, streetscapes and other elements - using various methods, including convolutional neural networks.
2. Applying machine learning-based image classification algorithms.
3. Collecting and labelling records - usually images - to identify desired features from various sources such as social media sites like Instagram or Twitter, or mapping sites like Google Street View.
4. Training models which can identify these features in unlabeled images.     

 

What makes this project interesting to work on?

This is an opportunity to work in an interdisciplinary environment and gain experience in developing machine learning models for researching the built environment and developing a better understanding of it. Students should expect to work in Python and JavaScript for training, testing, and extending models, developing data collection interfaces suitable for use by others. Students will learn to communicate across disciplines, work in a team, and contribute and develop their skills in solving problems using machine learning.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Students should expect to work in Python and JavaScript. Can be used remotely.

 

Is the project open source?

Yes

 

What skills are necessary for this project?

Computational models, Data mining / Machine learning, Deep learning, Visualization

 

Interested candidates should be at PhD level. Yael Allweil is looking for 2 visiting scientists, working on the project together with the team. Additional supervisor: Dean Jacob Grobman.

Yaron Orenstein - Ben-Gurion University

Deep Learning in Genomics

 

We are very excited to utilize the most advanced machine learning methods to generate more accurate protein-DNA, -RNA and -peptide binding models. The recent advancement in neural networks, termed deep learning, has attracted much attention in the computational biology field. We are applying it successfully to many high-throughput datasets, and plan to take it even further by incorporating several orthogonal sources to improve in vivo binding prediction.

 

https://wwwee.ee.bgu.ac.il/~cb/index.html

yaronore@bgu.ac.il

 

 

What is the data science project's research question?

In the project, we will work on one of the large publicly available high-throughput datasets, or a dataset from one of our collaborators, to develop a deep-learning-based model to predict a molecular phenomenon, and then interrogate the trained model for the molecular principles learned.

 

What data will be worked on?

High-throughput genomic data, i.e., hundreds of thousands of short DNA or RNA sequences accompanied by a label (either a number or a category).

 

What tasks will this project involve?

Designing a deep neural network for a genomic task, training the network after hyper-parameter optimisation, testing the network in cross-validation and on independent datasets, and interpreting the network for the principles it learned
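Before a neural network can be trained on sequence data, each sequence has to be turned into numbers. A common choice is one-hot encoding, sketched below in plain Python. The handling of unknown symbols and the example sequence are assumptions made for illustration, not details taken from the project.

```python
ALPHABET = "ACGT"


def one_hot(sequence, alphabet=ALPHABET):
    """Encode a DNA string as a list of rows, one row per base.

    Unknown symbols such as 'N' become an all-zero row (an assumption made
    here; other conventions exist).
    """
    index = {base: i for i, base in enumerate(alphabet)}
    rows = []
    for base in sequence.upper():
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


encoded = one_hot("ACGN")
```

For RNA sequences the same function can be called with `alphabet="ACGU"`.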

 

What makes this project interesting to work on?

Genomic data is a lot of fun - there's lots of available supervised data (in the millions in some cases), and each data point is quite small, making algorithmic development quicker than in other domains. Plus, you get to make a scientific contribution and discovery.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python with Keras, NumPy, scikit-learn and TensorFlow on our computing servers or another cloud service. Yes, it can be accessed remotely.

 

Is the project open source?

Yes

 

What skills are necessary for this project?

Data analytics / statistics, Computational models, Data mining / Machine learning, Deep learning, Visualization

 

Interested candidates should be at PhD level. Yaron Orenstein is looking for 1 visiting scientist, working on the project together with the team.

Yoav Ram - Tel Aviv University

Detecting high-order ecological interactions from growth curve data

 

We study population biology, including evolutionary biology, ecology, behaviour, and cultural evolution, using mathematical, computational, and statistical models and collaborations with empirical biologists.

 

http://www.yoavram.com

yoavram@tauex.tau.ac.il

 

 

What is the data science project's research question?

Can high-order ecological interactions, such as Allee effects, mutualism, and cross-feeding, be detected from growth curve data collected from both mono- and mixed cultures? This is a continuation of a previous project (Ram et al., PNAS 2019) in which we used similar data to predict the results of competitions when only first-order (logistic) interactions are considered.

 

What data will be worked on?

Growth curve data from experiments with bacteria and/or yeast

 

What tasks will this project involve?

Developing and testing a computational statistical method for model fitting and model selection applied to population-dynamic growth models such as the competitive Lotka–Volterra equations.
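To give a feel for the model family named above, here is a minimal Python sketch that integrates the two-species competitive Lotka-Volterra equations with simple Euler steps. All parameter values are made up, and a real analysis would more likely use a proper ODE solver and fit the parameters to measured growth curves.

```python
def simulate_competition(n1, n2, r1, r2, k1, k2, a12, a21, dt=0.01, steps=5000):
    """Integrate the two-species competitive Lotka-Volterra model with Euler steps.

    dN1/dt = r1 * N1 * (1 - (N1 + a12 * N2) / K1)
    dN2/dt = r2 * N2 * (1 - (N2 + a21 * N1) / K2)
    """
    trajectory = [(n1, n2)]
    for _ in range(steps):
        d1 = r1 * n1 * (1 - (n1 + a12 * n2) / k1)
        d2 = r2 * n2 * (1 - (n2 + a21 * n1) / k2)
        n1, n2 = n1 + dt * d1, n2 + dt * d2
        trajectory.append((n1, n2))
    return trajectory


# Made-up parameters. With a12 = a21 = 0 each species grows logistically on
# its own, which mimics two separate mono-cultures.
mono = simulate_competition(0.01, 0.01, r1=1.0, r2=0.5, k1=1.0, k2=1.0, a12=0.0, a21=0.0)

# Non-zero competition coefficients mimic a mixed culture.
mixed = simulate_competition(0.01, 0.01, r1=1.0, r2=0.5, k1=1.0, k2=1.0, a12=0.5, a21=0.5)
```

Comparing the final densities of `mono` and `mixed` shows how competition lowers the level each species settles at; higher-order interactions would require additional terms in the equations.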

 

What makes this project interesting to work on?

The project combines statistics, scientific computing, theoretical ecology, and interaction with empirical microbiologists, and, if successful, could be widely applied by microbiologists to detect interesting phenomena using simple and easy experiments.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Python/R and a high-performance computational cluster with hundreds of CPUs; everything can be done remotely.

 

Is the project open source?

Data will be produced by an experimental collaborator after the method is tested on simulated data.

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Computational models

 

Interested candidates should be at PhD-level. Yoav Ram is looking for 1 visiting scientist working individually on the project.

Sarah Keren - Israel Institute of Technology

Automated Environment Design for Promoting Collaboration of Autonomous AI Agents

 

In order to collaborate effectively, AI agents must be able to reason about the behaviors of other agents, learn to communicate effectively and form commitments with each other, learn to ask for help from other agents, and understand how they can offer valuable assistance to other agents. The proposed research aims at promoting these capabilities via automated environment design and at showing the relationship between collaboration within a group and its ability to adapt to unexpected events.

 

https://sarahkeren.wixsite.com/sarahkeren-academics

sarahk@cs.technion.ac.il

 

 

What is the data science project's research question?

Based on the effect communication protocols have on group resilience of RL agents, we intend to examine additional environment design methods for collaboration. Specifically, we intend to enable agents to automatically extract symbolic representations of their tasks and the tasks of other agents, and use them to reason about their own needs and the needs of others.

 

What data will be worked on?

We will use data from simulated multi-agent RL domains and from multi-robot systems.

 

What tasks will this project involve?

Establishing and evaluating a new way to support collaboration between autonomous AI agents (e.g., a novel communication protocol, a new way to coordinate the usage of shared resources, etc.).

 

What makes this project interesting to work on?

It combines theory and practice in a challenging and fascinating domain.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

All tools can be used remotely. The applicant is expected to be proficient in Python and will be able to choose between working on simulated multi-agent RL domains or multi-robot systems (in which case familiarity with ROS is required).

 

Is the project open source?

Yes

 

What skills are necessary for this project?

Computational models, Computer simulations, Data mining / Machine learning, Deep learning

 

Interested candidates should be at PhD-level. Sarah Keren is looking for several visiting scientists working on the project with the team.

Nadav Rappoport - Ben-Gurion University

Genetic risk scores model based on survival-analysis models

 

Our research interests lie in the areas of machine learning, big biomedical data, and clinical genomics. Currently, a tremendous amount of clinical data is being gathered from a variety of sources and types, including structured and unstructured data, Electronic Health Records (EHR), biomolecular, genetic, wearable, and other types of data. There is therefore a need to develop new machine learning and deep learning methods that can derive new insights from this type of data, in order to advance toward personalized medicine.

 

http://nadavrap.com

nadavrap@bgu.ac.il

 

 

What is the data science project's research question?

Develop a survival-analysis-based model for disease prediction using genetic data.

 

What data will be worked on?

The UK Biobank - a big dataset of genomic and phenotypic data

 

What tasks will this project involve?

1. Development of a pipeline.
2. Working on HPC for running many jobs.
3. Publishing a GitHub package.
4. Summarizing results in a publishable version.

 

What makes this project interesting to work on?

An exciting opportunity to work on big data that may change our understanding of how genetics and diseases are associated, using non-traditional models.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

The infrastructure is a high-performance computing system that is accessible remotely, though working in person is preferable at least part of the time.

 

Is the project open source?

Yes

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Computational models, Visualization, High-performance computing, Data warehousing

 

Interested candidates should be at least at MSc-level. Nadav Rappoport is looking for 1 visiting scientist working on the project together with the team.

Ofir Lindenbaum - Bar Ilan University

Self Supervised Learning for Tabular Data

 

Our main research goal is to develop automatic methods that would lead to novel scientific findings (in biology, physics, medicine, etc.). To achieve this goal we are currently developing deep learning methodologies that enable automatic learning from unstructured high-dimensional empirical observations (for example gene measurements). The lab is mainly focused on the following problems: feature selection, feature extraction, self-supervised learning, multimodal data fusion, and generative modeling.

 

https://www.eng.biu.ac.il/lindeno/

ofirlin@gmail.com

 

 

What is the data science project's research question?

How to develop self-supervised learning frameworks for unstructured tabular data.

 

What data will be worked on?

Biomedical, RNA data, clinical and more.

 

What tasks will this project involve?

Developing a novel self-supervised learning scheme suited for tabular biomedical data.
Experimenting with predictions on real data.

 

What makes this project interesting to work on?

Tabular data are the most commonly used form of data, yet neural-network-based methodologies for such data still do not deliver satisfactory performance. Designing a method that could improve predictions in biomedicine would be highly impactful and could lead to improved personalized medicine.

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Servers located at the data science center at Bar Ilan University, which can be accessed remotely.

 

Is the project open source?

Yes.

 

What skills are necessary for this project?

Computational models, Data mining / Machine learning, Deep learning

 

Interested candidates should be at PhD-level. Ofir Lindenbaum is looking for 1 visiting scientist working on the project together with the team.

Tirza Routtenberg - Ben-Gurion University

Estimation of the missing-mass by the maximum profile likelihood

 

We are interested in: 1) Signal processing and optimization with applications in smart grid; 2) Statistical signal processing; 3) Detection and estimation theory; and 4) Graph Signal Processing (GSP)

 

https://www.ee.bgu.ac.il/~tirzar/publications2.html

tirzar@bgu.ac.il

 

 

What is the data science project's research question?

How to estimate the unseen elements of a lexicon in a given dataset.

 

What data will be worked on?

Theoretical work, can be implemented on various data types

 

What tasks will this project involve?

Mathematical development of the methods, conducting simulations in Matlab/Python
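For orientation, the missing mass is the total probability of all symbols that never appear in the sample. A common baseline in such simulations is the classical Good-Turing estimator, which divides the number of symbols seen exactly once by the sample size. The sketch below shows that baseline only, not the maximum profile likelihood estimator this project proposes. The function name and the toy word list are hypothetical.

```python
from collections import Counter


def good_turing_missing_mass(samples):
    """Good-Turing estimate of the missing mass: N1 / n.

    N1 is the number of distinct symbols observed exactly once,
    n is the total number of observations.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


# Toy lexicon example: 8 tokens, 5 of which occur exactly once.
words = "the cat sat on the mat the end".split()
print(good_turing_missing_mass(words))  # 0.625
```

A simulation study would draw many samples from a known distribution and compare such estimates with the true missing mass.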

 

What makes this project interesting to work on?

Classical problem (stated by Laplace and by Turing), relevant for language processing

 

What is the expected outcome?

Contribution to research paper, Contribution to software development

 

What infrastructure, programs and tools will be used? Can they be used remotely?

Matlab or Python

 

Is the project open source?

It depends on the final data that we will use.

 

What skills are necessary for this project?

Data analytics / statistics, Scientific computation, Signal processing

 

Interested candidates should be at PhD-level. Tirza Routtenberg is looking for 1 visiting scientist working on the project individually.

Yotam Drier - Hebrew University of Jerusalem

Machine learning approaches for studying gene regulation and aberrations in cancer

 

We study the role of non-genic regulatory elements and how their disruption drives cancer. Beyond expanding our understanding of gene regulation and dysregulation in cancer, we aim to leverage this knowledge to predict novel therapeutic targets for the development of new drugs, and to develop models that predict patient outcome to help guide treatment plans for cancer patients. To achieve this, we combine cutting-edge experimental techniques with the development of new machine learning algorithms and big-data analytical approaches.

 

https://yotamdrier.ekmd.huji.ac.il/

yotam.drier@mail.huji.ac.il

 

 

What is the data science project's research question?

Develop computational methods to predict chromosomal topology and its alterations in cancer. Other related questions are also an option.

 

What data will be worked on?

Several options exist: publicly available data (Hi-C, ChIP-seq, DNA methylation, gene expression) and some in-house data we have generated in the lab. If the visiting student has their own data that they wish to analyze, we can discuss this further.

 

What tasks will this project involve?

Test existing deep learning approaches to predict chromosomal structure (e.g. DeepC, Akita), identify poorly predicted regions, improve existing methods by adding computational modeling and/or additional data, apply to cancer data. Other potential applications of machine learning and/or deep learning methods to analyze gene dysregulation may also be an option.

 

What makes this project interesting to work on?

Combination of cutting-edge techniques, new data, and important questions that can impact our understanding of fundamental biological questions with application to cancer therapy.

 

What is the expected outcome?

Contribution to research paper

 

What infrastructure, programs and tools will be used? Can they be used remotely?

The Hebrew University compute cluster, which can be used remotely.

 

Is the project open source?

Both

 

What skills are necessary for this project?

Computational models, Data mining / Machine learning

 

Interested candidates should be at Postdoc-level. Yotam Drier is looking for 1 visiting scientist working on the project individually.

Learn more

Join our Info Session

Save the date: Info session for applicants

Join our online info session with the Israeli hosts on March 3, noon CET, to learn more about them and their projects.

 

Sign up

 

More information on participation is available in our FAQs. Contact us: hida@helmholtz.de
