2020
DOI: 10.7554/elife.56879

Unsupervised machine learning reveals risk stratifying glioblastoma tumor cells

Abstract: A goal of cancer research is to reveal cell subsets linked to continuous clinical outcomes to generate new therapeutic and biomarker hypotheses. We introduce a machine learning algorithm, Risk Assessment Population IDentification (RAPID), that is unsupervised and automated, identifies phenotypically distinct cell populations, and determines whether these populations stratify patient survival. With a pilot mass cytometry dataset of 2 million cells from 28 glioblastomas, RAPID identified tumor cells whos…

Cited by 24 publications (45 citation statements)
References 76 publications
“…Here we used the t-SNE algorithm as a core method to reduce the dimensionality of the dataset and to visualize our data. t-SNE has been widely used in the unsupervised analysis of many types of biological data (Berman et al, 2014; Kollmorgen et al, 2020; Chen et al, 2020; Macosko et al, 2015; Kobak and Berens, 2019; Leelatian et al, 2020), including neural recordings (Dimitriadis et al, 2018). t-SNE minimizes the Kullback-Leibler divergence between a Gaussian distribution modeling pairwise distances between data points and a Student t-distribution modeling distances between the same points in a low (typically two) dimensional embedding (Van der Maaten and Hinton, 2008; Linderman and Steinerberger, 2019).…”
Section: Discussion (mentioning)
confidence: 99%
“…Marker Enrichment Modeling from the MEM package (https://github.com/cytolab/mem) was used to characterize feature enrichment in KNN region around each cell. MEM normally requires a comparison of a population against a reference control, such as a common reference sample (Diggins et al, 2017), all other cells (Diggins et al, 2018; Leelatian et al, 2020), or induced pluripotent stem cells (Greenplate et al, 2019). Here, a statistical reference point intended as a statistical null hypothesis was used as the MEM reference.…”
Section: MEM Analysis of Enriched Features (mentioning)
confidence: 99%
“…Analysis algorithms typically rely on aggregate statistics for groups of cells, but the process of grouping the cells works best with larger, established populations (Diggins et al, 2015; Irish et al, 2006; Saeys et al, 2016) or may include pre-filtering of cells by human experts (Greenplate et al, 2016a; Greenplate et al, 2019). Cytometry tools like SPADE (Bendall et al, 2011; Qiu et al, 2011), FlowSOM (Van Gassen et al, 2015), Phenograph (Levine et al, 2015), Citrus (Bruggner et al, 2014), and RAPID (Leelatian et al, 2020) generally work best to characterize cell subsets representing >1% of the sample and are less capable of capturing extremely rare cells or subsets distinguished by only a fraction of measured features. Tools like t-SNE (Amir el et al, 2013; Krijthe et al, 2015), opt-SNE (Belkina et al, 2019), and UMAP (Becht et al, 2018; McInnes et al, 2018) embed cells or learn a manifold and represent these transformations as algorithmically-generated axes.…”
Section: Introduction (mentioning)
confidence: 99%