2020
DOI: 10.1186/s13059-019-1900-3
Benchmarking principal component analysis for large-scale single-cell RNA-sequencing

Abstract: Background: Principal component analysis (PCA) is an essential method for analyzing single-cell RNA-seq (scRNA-seq) datasets, but for large-scale scRNA-seq datasets, computation time is long and consumes large amounts of memory. Results: In this work, we review the existing fast and memory-efficient PCA algorithms and implementations and evaluate their practical application to large-scale scRNA-seq datasets. Our benchmark shows that some PCA algorithms based on Krylov subspace and randomized singular value dec…
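The abstract refers to randomized singular value decomposition as one family of fast, memory-efficient PCA algorithms. As a hedged illustration of the general idea (not the paper's specific implementations), the sketch below computes PCA via a randomized range finder with power iterations, in the style of Halko et al.; the function name, oversampling and iteration defaults, and the toy count matrix are all illustrative assumptions.

```python
import numpy as np

def randomized_pca(X, n_components, n_oversamples=10, n_iter=4, seed=0):
    """Sketch of randomized-SVD PCA on a cells-by-genes matrix.

    Illustrative only: real scRNA-seq pipelines operate on sparse,
    normalized matrices and center implicitly to preserve sparsity.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                 # center each gene
    k = n_components + n_oversamples        # oversampled sketch size
    # Range finder: random projection, then power iterations to
    # sharpen the captured subspace
    Q = np.linalg.qr(Xc @ rng.standard_normal((Xc.shape[1], k)))[0]
    for _ in range(n_iter):
        Q = np.linalg.qr(Xc.T @ Q)[0]
        Q = np.linalg.qr(Xc @ Q)[0]
    # Exact SVD of the small projected matrix
    U, s, Vt = np.linalg.svd(Q.T @ Xc, full_matrices=False)
    components = Vt[:n_components]          # top principal axes (genes)
    scores = Xc @ components.T              # cell embedding
    return scores, components

# Usage on a small random count-like matrix (hypothetical data)
X = np.random.default_rng(1).poisson(1.0, size=(200, 500)).astype(float)
scores, comps = randomized_pca(X, n_components=10)
```

The point of the sketch is that only a k-dimensional projection of the data ever enters the expensive SVD, which is what makes this class of algorithms tractable on large matrices.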

Cited by 82 publications (62 citation statements)
References 138 publications
“…Since the various PCA approaches and implementations were recently benchmarked in a similar context [17], we focused on widely used approaches that had not yet been compared: Seurat's PCA, scran's denoisePCA, and GLM-PCA [40]. When relevant, we combined them with sctransform normalization.…”
Section: Dimensionality Reduction
confidence: 99%
“…In addition, we did not compare any of the alignment and/or quantification methods used to obtain the count matrix, which was for instance discussed in [18]. Some steps, such as the implementation of the PCA, were also not explored in detail here as they have already been the object of recent and thorough study elsewhere [14,17]. We also considered only methods relying on Euclidean distance, while correlation was recently reported to be superior [57] and would require further investigation.…”
Section: Limitations and Open Questions
confidence: 99%
“…Linear regression using nonlinear iterative partial least-squares (NIPALS), eigen analysis, or singular value decomposition (SVD) are a few of the many ways to factorize or decompose a matrix. SVD is a basic matrix operation, and fast approximations of SVD, including IRLBA, are commonly applied to sc data [extensively reviewed by (45)]. (3) concatenating matrices and applying a matrix factorization, usually singular value decomposition (SVD); and (4) visualizing results.…”
Section: The Impact Of Data Preprocessing On Dimension Reduction
confidence: 99%
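The statement above notes that fast approximations of SVD, including IRLBA, are commonly applied to single-cell data. A minimal sketch of the same idea, assuming SciPy's Lanczos-based `svds` as a stand-in for IRLBA-style implicitly restarted methods; the matrix here is synthetic, and centering is skipped to keep the sparse structure intact, which real pipelines handle implicitly.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Synthetic sparse cells-by-genes matrix (hypothetical data)
rng = np.random.default_rng(0)
X = sparse_random(1000, 2000, density=0.05, format="csr",
                  random_state=0, data_rvs=rng.standard_normal)

# Lanczos-based truncated SVD: only the top k factors are computed,
# so the full dense decomposition is never materialized
U, s, Vt = svds(X, k=20)
order = np.argsort(s)[::-1]          # svds returns singular values ascending
U, s, Vt = U[:, order], s[order], Vt[order]

cell_embedding = U * s               # 1000 cells x 20 components
```

Because the solver only needs matrix-vector products with `X`, memory scales with the number of nonzeros rather than the full matrix size, which is why these Krylov-subspace methods suit large sparse count matrices.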