2021
DOI: 10.1093/bib/bbab304

Supervised application of internal validation measures to benchmark dimensionality reduction methods in scRNA-seq data

Abstract: A typical single-cell RNA sequencing (scRNA-seq) experiment will measure on the order of 20 000 transcripts and thousands, if not millions, of cells. The high dimensionality of such data presents serious complications for traditional data analysis methods and, as such, methods to reduce dimensionality play an integral role in many analysis pipelines. However, few studies have benchmarked the performance of these methods on scRNA-seq data, with existing comparisons assessing performance via downstream analysis …

Cited by 11 publications (10 citation statements)
References 61 publications
“…We assessed the performance of scPSD transformation, independent of any downstream analyses, in improving the cell-type clustering tendency and reducing the complexity of single-cell omics data. We previously proposed (14) a supervised application of internal validation measures (IVMs) such as silhouette score (SS) (39) and variance ratio criterion (VRC) (40), to quantify the compactness and separation of annotated cell-type clusters. We also defined a measure of the complexity of a multi-class dataset inspired by the Fisher's discriminant ratio (FDR) (28) (detailed in online Methods) to quantify the pairwise difference and dispersion of individual features among different cell types.…”
Section: Results (mentioning)
confidence: 99%
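
The statement above describes applying internal validation measures in a supervised way, that is, scoring the annotated cell-type labels rather than the output of a clustering algorithm. The following is a minimal sketch of that idea, not the cited authors' implementation. It uses the standard textbook definitions of the silhouette score and the variance ratio criterion (also known as the Caliński-Harabasz index). The function names, the choice of Euclidean distance, and the toy coordinates are assumptions made for illustration only.

```python
from collections import defaultdict
import math


def _sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _group_indices(labels):
    """Map each label to the list of row indices carrying it."""
    groups = defaultdict(list)
    for i, lab in enumerate(labels):
        groups[lab].append(i)
    return groups


def silhouette_score(points, labels):
    """Mean silhouette width, treating the given labels as the partition."""
    groups = _group_indices(labels)
    if len(groups) < 2:
        raise ValueError("need at least two labelled groups")
    widths = []
    for i, lab in enumerate(labels):
        own = groups[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton groups
            continue
        # a: mean distance to the other members of the same group
        a = sum(math.sqrt(_sqdist(points[i], points[j]))
                for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other group
        b = min(
            sum(math.sqrt(_sqdist(points[i], points[j])) for j in idx) / len(idx)
            for other, idx in groups.items() if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


def variance_ratio_criterion(points, labels):
    """Between-group over within-group dispersion, each scaled by its d.o.f."""
    groups = _group_indices(labels)
    n, k, dims = len(points), len(groups), len(points[0])
    if k < 2 or n <= k:
        raise ValueError("need 2 <= number of groups < number of points")
    overall = [sum(p[d] for p in points) / n for d in range(dims)]
    between = within = 0.0
    for idx in groups.values():
        centroid = [sum(points[i][d] for i in idx) / len(idx) for d in range(dims)]
        between += len(idx) * _sqdist(centroid, overall)
        within += sum(_sqdist(points[i], centroid) for i in idx)
    if within == 0:
        return math.inf
    return (between / (k - 1)) / (within / (n - k))


# Toy 2-D embedding (hypothetical coordinates) with two annotated cell types.
embedding = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
annotation = ["T cell", "T cell", "B cell", "B cell"]

ss = silhouette_score(embedding, annotation)
vrc = variance_ratio_criterion(embedding, annotation)
```

In this reading, an embedding in which the annotated cell types are compact and well separated yields a higher silhouette score and a higher variance ratio criterion than one in which they overlap, so both values can be compared across dimensionality reduction methods without running any downstream clustering.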
“…SCALE (single-cell ATAC-seq analysis via latent feature extraction) (13). DR methods have varied performance in separating biological clusters as per our recent comprehensive benchmarking (14) and often perform poorly in facilitating the detection of rare cell populations (15). Furthermore, the capacity of different DR methods in extracting features from other single-cell omics, beyond scRNA-sequencing data, is undetermined and yet to be assessed systematically.…”
Section: Introduction (mentioning)
confidence: 99%
“…While not all representation learning methods require the same pre-processing steps, it is necessary to carefully consider these in order to suppress spurious or irrelevant variation and make data more amenable to effective representation learning and subsequent analyses. 45 Because pre-processing impacts the performance of any subsequent representation learning tools, iterative refinement of pre-processing on the basis of downstream outcomes may be important as an investigation matures.…”
Section: Common Steps In Representation Learning-centered Single-cell... (mentioning)
confidence: 99%
“…Different approaches have been proposed, such as principal component analysis (PCA) [54], nonnegative matrix factorization (NMF) [55], and deep neural networks [56]. The detailed features of these methods have been described in other reviews [57, 58]. Zero inflation is another challenge of scRNA-seq data analysis.…”
Section: Application Of scRNA-seq In Time and Cancer Therapy (mentioning)
confidence: 99%