Adam Gayoso scite author profile

Characterization of cell fate probabilities in single-cell data with Palantir

Single-cell RNA sequencing (scRNA-seq) studies of differentiating systems have raised fundamental questions regarding the discrete versus continuous nature of both differentiation and cell fate. Here we present Palantir, an algorithm that models trajectories of differentiating cells, treating cell fate as a probabilistic process, and leverages entropy to measure cell plasticity along the trajectory. Palantir generates a high-resolution pseudotime ordering of cells and, for each cell state, assigns a probability of differentiating into each terminal state. We apply our algorithm to human bone marrow scRNA-seq data and detect important landmarks of hematopoietic differentiation. Palantir’s resolution enables the identification of key transcription factors that drive lineage fate choice and closely track when cells lose plasticity. We show that Palantir outperforms existing algorithms in identifying cell lineages and recapitulating gene expression trends during differentiation, is generalizable to diverse tissue types, and is well-suited to resolving less-studied differentiating systems.
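
The abstract says that each cell state receives a probability of reaching each terminal state and that entropy is used to measure plasticity. The short Python sketch below illustrates that idea only: it computes the Shannon entropy of a per-cell probability vector. It is not Palantir's implementation, and the function name `fate_entropy`, the cell labels, and the probability values are all hypothetical.

```python
import math


def fate_entropy(probs):
    """Shannon entropy (in nats) of one cell's terminal-state probabilities.

    Hypothetical helper for illustration; not part of the Palantir package.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalize defensively so the inputs need not sum exactly to 1.
    normalized = [p / total for p in probs]
    # Terms with p == 0 contribute nothing to the entropy and are skipped.
    return -sum(p * math.log(p) for p in normalized if p > 0)


# Made-up fate probabilities over three terminal states (assumption).
cells = {
    "early_progenitor": [1 / 3, 1 / 3, 1 / 3],  # no fate preference yet
    "biased": [0.8, 0.1, 0.1],                  # leaning toward one fate
    "committed": [1.0, 0.0, 0.0],               # fully committed
}

entropies = {name: fate_entropy(p) for name, p in cells.items()}

for name, value in entropies.items():
    print(f"{name}: {value:.3f}")
```

Under this reading, a cell with uniform fate probabilities has the highest entropy (most plastic) and a fully committed cell has entropy zero.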

Cell2location maps fine-grained cell types in spatial transcriptomics

Kleshchevnikov et al. 2022

Joint probabilistic modeling of single-cell multi-omic data with totalVI

et al. 2021

The paired measurement of RNA and surface proteins in single cells with CITE-seq is a promising approach to connect transcriptional variation with cell phenotypes and functions. However, combining these paired views into a unified representation of cell state is made challenging by the unique technical characteristics of each measurement. Here we present Total Variational Inference (totalVI; https://scvi-tools.org), a framework for end-to-end joint analysis of CITE-seq data that probabilistically represents the data as a composite of biological and technical factors including protein background and batch effects. To evaluate totalVI’s performance, we profiled immune cells from murine spleen and lymph nodes with CITE-seq, measuring over 100 surface proteins. We demonstrate that totalVI provides a cohesive solution for common analysis tasks like dimensionality reduction, the integration of datasets with different measured proteins, estimation of correlations between molecules, and differential expression testing.

Mapping single-cell data to reference atlases by transfer learning

et al. 2021

Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.

Interpretable factor models of single-cell RNA-seq via variational autoencoders

Gayoso² et al. 2020

Motivation: Single-cell RNA-seq makes possible the investigation of variability in gene expression among cells, and dependence of variation on cell type. Statistical inference methods for such analyses must be scalable, and ideally interpretable. Results: We present an approach based on a modification of a recently published highly scalable variational autoencoder framework that provides interpretability without sacrificing much accuracy. We demonstrate that our approach enables identification of gene programs in massive datasets. Our strategy, namely the learning of factor models with the auto-encoding variational Bayes framework, is not domain specific and may be useful for other applications. Availability and implementation: The factor model is available in the scVI package hosted at https://github.com/YosefLab/scVI/. Contact: v@nxn.se. Supplementary information: Supplementary data are available at Bioinformatics online.

A Python library for probabilistic analysis of single-cell omics data

et al. 2022

To the Editor: Methods for analyzing single-cell data (refs. 1-4) perform a core set of computational tasks. These tasks include dimensionality reduction, cell clustering, cell-state annotation, removal of unwanted variation, analysis of differential expression, identification of spatial patterns of gene expression, and joint analysis of multi-modal omics data. Many of these methods rely on likelihood-based models to represent variation in the data; we refer to these as 'probabilistic

The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans

Jones¹, Karkanias², Bulthaup³ et al. 2022. Science.

Molecular characterization of cell types using single-cell transcriptome sequencing is revolutionizing cell biology and enabling new insights into the physiology of human organs. We created a human reference atlas comprising nearly 500,000 cells from 24 different tissues and organs, many from the same donor. This atlas enabled molecular characterization of more than 400 cell types, their distribution across tissues, and tissue-specific variation in gene expression. Using multiple tissues from a single donor enabled identification of the clonal distribution of T cells between tissues, identification of the tissue-specific mutation rate in B cells, and analysis of the cell cycle state and proliferative potential of shared cell types across tissues. Cell type–specific RNA splicing was discovered and analyzed across tissues within an individual.

scvi-tools: a library for deep probabilistic analysis of single-cell omics data

Gayoso, Lopez, Xing et al. 2021. Preprint.

Probabilistic models have provided the underpinnings for state-of-the-art performance in many single-cell omics data analysis tasks, including dimensionality reduction, clustering, differential expression, annotation, removal of unwanted variation, and integration across modalities. Many of the models being deployed are amenable to scalable stochastic inference techniques, and accordingly they are able to process single-cell datasets of realistic and growing sizes. However, the community-wide adoption of probabilistic approaches is hindered by a fractured software ecosystem resulting in an array of packages with distinct and often complex interfaces. To address this issue, we developed scvi-tools (https://scvi-tools.org), a Python package that implements a variety of leading probabilistic methods. These methods, which cover many fundamental analysis tasks, are accessible through a standardized, easy-to-use interface with direct links to Scanpy, Seurat, and Bioconductor workflows. By standardizing the implementations, we were able to develop and reuse novel functionalities across different models, such as support for complex study designs through nonlinear removal of unwanted variation due to multiple covariates and reference-query integration via scArches. The extensible software building blocks that underlie scvi-tools also enable a developer environment in which new probabilistic models for single-cell omics can be efficiently developed, benchmarked, and deployed. We demonstrate this through a code-efficient reimplementation of Stereoscope for deconvolution of spatial transcriptomics profiles. By catering to both the end user and developer audiences, we expect scvi-tools to become an essential software dependency and serve to formulate a community standard for probabilistic modeling of single-cell omics.
