2018
DOI: 10.1073/pnas.1817715116

Semisoft clustering of single-cell data

Abstract (Significance): Growth typically involves differentiation of cells from progenitors into more specialized descendants, often involving lineages of pure and transitional cells to achieve final form. Recent technology has enabled estimation of gene expression profiles of single cells and these profiles theoretically differentiate pure cell types. What is missing from the analytical toolbox is an efficient technique to classify pure and transitional cells from their profiles. Here we propose semisoft clustering with …

Cited by 77 publications (69 citation statements)
References 32 publications
“… a, The optimal number of mesenchymal cell clusters was determined using the SOUP method 15, a semi-soft clustering algorithm designed to distinguish between distinct cell types and transition states between cell types. b, Main cluster identity from SOUP highlighted on the t-SNE from figure 1b (mesenchymal cell types only).…”
Section: Figure S1
Citation type: mentioning
confidence: 99%
“…Besides PCA, other DR techniques are also commonly used for cell clustering. For example, nonnegative matrix factorization (NMF) is used in SOUP [19]. Partial least squares is used in scPLS [20].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…We recommend that users consider performing clustering experiments on 500 to 1000 highly variable genes. In addition to CIDR and SIMLR, we also compared scDMFK with other commonly used scRNA-seq data clustering methods, such as Seurat (Satija et al, 2015), SC3 (Kiselev et al, 2017), Raceid (Herman and Grün, 2018), and SOUP (Zhu et al, 2019). Considering that the Seurat method cannot give a specific number of clusters in advance, we ran it several times with its parameter "resolution" changing from 0.5 to 1.5 by 0.1 and took the best ARI and NMI value as its result and recorded the corresponding estimated cluster number.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Most existing clustering algorithms customized for single-cell analysis do not model and denoise such data. Typically, they first learn the predefined distance measure and similarity metric based on an original data matrix directly or a reduced data matrix by simple linear dimension reduction methods, like PCA and ICA, and then utilize the traditional hard clustering methods, such as standard k-means clustering (Marco et al, 2014; Grün et al, 2016), graph-based spectral clustering (Wang et al, 2017; Zhu et al, 2019) or community detection (Levine et al, 2015; Satija et al, 2015), density-based clustering (Jiang et al, 2016), integrated learning clustering (Kiselev et al, 2017; Yang et al, 2018), and hierarchical clustering (Zeisel et al, 2015; Lin et al, 2017). However, in addition to the possibility of spurious identification of cell subtypes by separating dimension reduction and clustering, expensive computation limits their performance on large-scale datasets.…”
Section: Introduction
Citation type: mentioning
confidence: 99%