2019
DOI: 10.1186/s13059-019-1766-4

scAlign: a tool for alignment, integration, and rare cell identification from scRNA-seq data

Abstract: scRNA-seq dataset integration occurs in different contexts, such as the identification of cell type-specific differences in gene expression across conditions or species, or batch effect correction. We present scAlign, an unsupervised deep learning method for data integration that can incorporate partial, overlapping, or a complete set of cell labels, and estimate per-cell differences in gene expression across datasets. scAlign performance is state-of-the-art and robust to cross-dataset variation in cell type-s…

Cited by 86 publications (89 citation statements)
References 52 publications
“…To reduce the dimensionality of the dataset, we performed latent semantic indexing followed by singular value decomposition (Methods). Batch correction was performed using the deep neural network-based scAlign 29 to correct for technical sources of variance, including individual variation and processing method (Extended Data Fig. 1e-f, Methods).…”
Section: Chromatin States Define the Major Cell Types In The Developi…
mentioning
confidence: 99%
“…clustering), followed by examination of marker gene expression in the resulting clusters. To remove the effect of disease progression on clustering, we performed, prior to clustering, cross-sample alignment [27][28][29] of the data from each brain region using scAlign (see Methods), which learns a low-dimensional manifold (i.e. the alignment space) in which cells tend to cluster in a manner consistent with their biological function independent of technical and experimental factors 29.…”
mentioning
confidence: 99%
“…To remove the effect of disease progression on clustering, we performed, prior to clustering, cross-sample alignment [27][28][29] of the data from each brain region using scAlign (see Methods), which learns a low-dimensional manifold (i.e. the alignment space) in which cells tend to cluster in a manner consistent with their biological function independent of technical and experimental factors 29. Importantly, after identifying clusters in the alignment space, we used the original data for subsequent analyses involving examination of gene expression, such as identifying differentially expressed genes between clusters.…”
mentioning
confidence: 99%
“…In some cases, basic normalization 16,17 or batch correction 18,19 methods have been used to combine multiple scRNAseq datasets with limited success. Recently, several computational methods have been developed to address this challenge more comprehensively 20–25. General steps in these methods include feature selection/dimensionality reduction and quantitative learning for matching.…”
Section: Introduction
mentioning
confidence: 99%
“…Both query and reference cells are aligned in a search space projected by PCA-based dimensionality reduction and canonical correlation analysis, to transfer cluster labels through “anchors”. Among many others 23–25, these methods have focused on individual cell level strategies when comparing a query dataset to a reference dataset, not relying on clustering results to guide supervised feature selection or cluster-level matching.…”
Section: Introduction
mentioning
confidence: 99%