2020
DOI: 10.1101/2020.05.22.111211
Preprint

Flexible comparison of batch correction methods for single-cell RNA-seq using BatchBench

Abstract: As the cost of single-cell RNA-seq experiments has decreased, an increasing number of datasets are now available. Combining newly generated and publicly accessible datasets is challenging due to non-biological signals, commonly known as batch effects. Although there are several computational methods available that can remove batch effects, evaluating which method performs best is not straightforward. Here we present BatchBench (https://github.com/cellgeni/batchbench), a modular and flexible pipeline for comp…

Cited by 12 publications (12 citation statements)
References: 41 publications
“…however, these methods typically rank outside the top three when used for complex real data scenarios, which is in agreement with recent benchmarks on simpler batch structures [10,24]. In contrast, on more complex integration tasks, BBKNN, Scanorama (embeddings), and scVI performed well.…”
Section: Discussion | Type: supporting
confidence: 87%
“…The diversity in output formats from data integration methods poses a challenge to fair benchmarking [24]. Although input data are consistently preprocessed, requirements on scaling and HVG selection also differ between methods.…”
Section: Single-cell Integration Benchmarking (Scib) | Type: mentioning
confidence: 99%
“…For both the combined human kidney and Tabula Muris datasets, many cell types from the individual datasets did not overlap. Moreover, it has been shown that merging datasets can distort expression profiles [74]. For these reasons, batch-correction methods were not used on the concatenated datasets, as this might remove biological signal in addition to batch effects.…”
Section: Datasets Used | Type: mentioning
confidence: 99%
“…A benchmarking study showed that LIGER, Seurat V3, and Harmony perform better than other existing methods through comprehensive comparisons among 14 state-of-the-art methods [17]. However, recent large-scale benchmarking studies [18,19] suggested that method performance is dependent on the complexity of the integration task and multiple batches introduce additional difficulties for those methods that perform well for two batches. Moreover, these methods did not explicitly distinguish technical variation from biological variation when aligning multiple single-cell datasets, which might mitigate biological variation as well when removing technical variation.…”
Section: Introduction | Type: mentioning
confidence: 99%