2021
DOI: 10.1101/gr.271874.120

A joint deep learning model enables simultaneous batch effect correction, denoising, and clustering in single-cell transcriptomics

Abstract: Recent developments of single-cell RNA-seq (scRNA-seq) technologies have led to enormous biological discoveries. As the scale of scRNA-seq studies increases, a major challenge in analysis is batch effects, which are inevitable in studies involving human tissues. Most existing methods remove batch effects in a low-dimensional embedding space. Although useful for clustering, batch effects are still present in the gene expression space, leaving downstream gene-level analysis susceptible to batch effects. Recent s…

Cited by 51 publications (35 citation statements)
References 29 publications
“…The gene expression was then normalized by total UMI counts for each cell, scaled by 10000, and transformed with the natural logarithm. To account for batch effects originating from sample collection processes, the CarDEC framework 88 was applied with standard parameters. Both highly variable genes (top 3000) and lowly variable genes were used for building the model, and donor resources were used as the batch key.…”
Section: GWAS Datasets (mentioning)
confidence: 99%
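
The normalization described in the statement above (divide by each cell's total UMI count, scale by 10000, take the natural logarithm) can be sketched as follows. This is a minimal illustration, not the citing study's code: the function name `normalize_log` and the toy count matrix are hypothetical, and the pseudocount of 1 (`log1p`) is an assumption made so that zero counts stay finite. The quoted statement says only "natural logarithm".

```python
import math


def normalize_log(counts, scale=10_000):
    """Library-size normalize and log-transform a cells x genes count matrix.

    counts: list of per-cell lists of raw UMI counts.
    scale:  target total per cell after normalization (10000 in the statement).
    Assumption: log(1 + x) is used so that zero counts map to 0.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # A cell with no counts cannot be rescaled; keep it all zeros.
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


# Toy example: 2 cells x 3 genes (made-up counts, for illustration only).
counts = [[1, 3, 0], [10, 0, 10]]
result = normalize_log(counts)
```

After this step every cell has the same scaled total before the log transform, so differences in sequencing depth between cells no longer dominate the expression values passed to batch correction.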
“…Imputation of unmeasured genes could be a strategy to overcome this limitation. Also, for integrated datasets from multiple origins with strong batch effects, since the MCA uses the original gene expression data, the batch effect correction should also be made on the original gene expression data, instead of the low-dimensional embeddings, using methods such as combat 46 , CarDEC 47 , or Scanorama 47 , etc. Our original motivation of developing GSDensity is to build a tool for 'pathway-centric' analysis, with which we can dive into the data directly from the angle of pathways of interest.…”
Section: Discussion (mentioning)
confidence: 99%
“…There are many well-established integration methods available to address batch effects in scRNA-seq datasets, for example Seurat, Harmony, and LIGER 16,52,53 . Additionally, some deep learning-based methods such as HDMC and CarDEC are also available 54,55 . In this study, we rigorously tested cell-type score-based integration via MASI across various single-cell platforms, cytoplasm/nuclei, research groups, conditions, and individuals.…”
Section: Discussion (mentioning)
confidence: 99%