2018
DOI: 10.1101/478503
Preprint

Generative modeling and latent space arithmetics predict single-cell perturbation response across cell types, studies and species

Abstract: Accurately modeling cellular response to perturbations is a central goal of computational biology. While such modeling has been proposed based on statistical, mechanistic and machine learning models in specific settings, no generalization of predictions to phenomena absent from training data (‘out-of-sample’) has yet been demonstrated. Here, we present scGen, a model combining variational autoencoders and latent space vector arithmetics for high-dimensional single-cell gene expression data. In benchmarks acros…
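
The abstract names latent space vector arithmetic without spelling it out. A minimal sketch of the general idea follows, assuming latent codes have already been produced by some trained encoder. The function names, the synthetic arrays, and the fixed shift are all hypothetical illustrations and are not taken from scGen's code or API.

```python
import numpy as np


def perturbation_delta(z_control, z_perturbed):
    """Difference between the mean latent vectors of perturbed and control cells."""
    return z_perturbed.mean(axis=0) - z_control.mean(axis=0)


def predict_perturbed(z_unseen_control, delta):
    """Shift latent codes of unseen control cells by the estimated perturbation vector."""
    return z_unseen_control + delta


# Synthetic stand-ins for encoder outputs (assumption: 4 latent dimensions).
rng = np.random.default_rng(0)
latent_dim = 4
true_shift = np.array([1.0, -2.0, 0.5, 0.0])

# Cell type A: both control and perturbed cells are observed.
z_ctrl_a = rng.normal(size=(200, latent_dim))
z_pert_a = z_ctrl_a + true_shift

# Cell type B: only control cells are observed.
z_ctrl_b = rng.normal(loc=3.0, size=(50, latent_dim))

# Estimate the perturbation direction on A, then apply it to B.
delta = perturbation_delta(z_ctrl_a, z_pert_a)
z_pred_b = predict_perturbed(z_ctrl_b, delta)
```

In a real pipeline the shifted codes would then be passed through a decoder to obtain predicted expression values; that step is omitted here because it depends on the trained model.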

Citations: cited by 25 publications (33 citation statements)
References: 52 publications
“…This approach will confound the batch effect with biological differences between cell types or states that are not shared among datasets. Data integration methods such as Canonical Correlation Analysis (CCA; Butler et al , ), Mutual Nearest Neighbours (MNN; Haghverdi et al , ), Scanorama (preprint: Hie et al , ), RISC (preprint: Liu et al , ), scGen (preprint: Lotfollahi et al , ), LIGER (preprint: Welch et al , ), BBKNN (preprint: Park et al , ), and Harmony (preprint: Korsunsky et al , ) have been developed to overcome this issue. While data integration methods can also be applied to simple batch correction problems, we recommend to be wary of over‐correction given the increased degrees of freedom of non‐linear data integration approaches.…”
Section: Introduction | mentioning
confidence: 99%
“…First applications to scRNA‐seq are starting to emerge from dimensionality reduction to denoising (e.g. scVis: Ding et al , ; scGen: preprint: Lotfollahi et al , ; DCA: Eraslan et al , ). Recently, deep learning has been used to produce an embedded workflow that can fit the data, denoise it and perform downstream analysis such as clustering and differential expression within the framework of the model (scVI: Lopez et al , ).…”
Section: Introduction | mentioning
confidence: 99%
“…This would also help with transfer learning enabling modelling multiple data sets with the same model. As mentioned above, it should also be possible to combine encodings of the cells in the latent space and produce in-between cells like Lotfollahi et al (2018). We would also like to extent our investigation of what dimensions of the latent variables encode (Kinalis et al, 2019).…”
Section: Discussion | mentioning
confidence: 99%
“…In contrast, when analyzing and integrating samples of multiple conditions (i.e., disease vs. health), batch effect removal must be performed with caution. Several approaches have been proposed to reduce batch effect including mutual nearest neighbors (MNN) correct, seurat3, ResNet, Harmony, Scanorama, BBKNN, scGen, and so on. MNN Correct is used to identify the connections shared by two datasets.…”
Section: Limitation and Challenges Of Scrna‐seq Technology | mentioning
confidence: 99%
“…The second is how to remove batch effect 90 ResNet, 92 Harmony, 93 Scanorama, 94 BBKNN, 95 scGen, 96 and so on. MNN Correct is used to identify the connections shared by two datasets.…”
Section: Limitation and Challenges Of Scrna-seq Technology | mentioning
confidence: 99%