How do the existing algorithms differ and what are the trade-offs?
Dr. Dmitry Kobak will present recent work on manifold learning and low-dimensional visualization of single-cell transcriptomic data. Single-cell transcriptomics yields ever-growing datasets containing RNA expression levels for thousands of genes from up to millions of cells, often exhibiting rich hierarchical structure, both continuous and discrete. Common data analysis pipelines include dimensionality reduction for visualizing the data in two dimensions, most frequently performed using methods like t-SNE, UMAP, and ForceAtlas2. These methods are all examples of neighbor embeddings: their aim is to keep similar cells as neighbors in the embedding.
The most established algorithm, t-SNE, excels at revealing local structure in high-dimensional data, but often struggles to represent the global structure accurately. Dr. Kobak will discuss how far this applies to other neighbor embedding algorithms. He will show that changing the balance between the attractive and the repulsive forces yields a spectrum of embeddings, characterized by a simple trade-off: stronger attraction can better represent continuous manifold structures, while stronger repulsion can better represent discrete cluster structures. He will also demonstrate that prominent neighbor embedding algorithms can all be placed on this attraction-repulsion spectrum, and will elucidate further trade-offs, such as revealing coarser or finer cluster structure depending on the shape of the similarity kernel. Furthermore, he will demonstrate the influence of optimization parameters such as the learning rate and the initialization on the resulting embeddings. Finally, he will discuss how to construct two-dimensional embeddings of other kinds of data, including library data and image data.
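The attraction-repulsion spectrum described above can be illustrated with a deliberately simplified toy sketch (not the speaker's actual method): attractive spring forces pull each point toward its high-dimensional nearest neighbors, while Cauchy-kernel repulsive forces (as in t-SNE's heavy-tailed similarities) push all pairs apart. The `repulsion` weight below is a hypothetical knob that moves the embedding along the spectrum; names, parameters, and force definitions are illustrative assumptions, and real implementations (openTSNE, UMAP) use normalized similarities and far more efficient approximations.

```python
import numpy as np

def toy_neighbor_embedding(X, n_iter=200, lr=0.05, repulsion=1.0, k=3, seed=0):
    """Minimal neighbor-embedding sketch, for illustration only.

    Attraction acts along kNN edges of the high-dimensional data;
    repulsion acts between all pairs with a heavy-tailed (Cauchy) kernel.
    Increasing `repulsion` moves the embedding toward emphasizing
    discrete cluster structure; decreasing it favors continuous manifolds.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # k nearest neighbors in the high-dimensional space
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    knn = np.argsort(D, axis=1)[:, :k]

    Y = rng.normal(scale=1e-2, size=(n, 2))  # small random initialization
    for _ in range(n_iter):
        diff = Y[:, None, :] - Y[None, :, :]          # pairwise differences
        dist2 = (diff ** 2).sum(-1)
        # attraction: spring forces pulling points toward their kNN
        attr = np.stack([(Y[knn[i]] - Y[i]).sum(0) for i in range(n)])
        # repulsion: Cauchy-kernel forces between all pairs (t-SNE-like tail)
        rep = (diff / (1.0 + dist2)[..., None]).sum(axis=1)
        Y += lr * (attr + repulsion * rep)
    return Y

# toy data: two well-separated clusters in 10 dimensions
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (10, 10)),
               rng.normal(5, 0.1, (10, 10))])
Y = toy_neighbor_embedding(X, repulsion=1.0)
```

Running the sketch with a small `repulsion` tends to produce compact, connected layouts, while a large value spreads points into well-separated groups, which is the trade-off the talk formalizes.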