2020
DOI: 10.1101/2020.08.03.233650
Preprint

Single-cell identity definition using random forests and recursive feature elimination

Abstract: Single-cell RNA sequencing (scRNA-seq) enables detailed examination of a cell's underlying regulatory networks and the molecular factors contributing to its identity. We developed scRFE (single-cell identity definition using random forests and recursive feature elimination, pronounced 'surf') with the goal of easily generating interpretable gene lists that can accurately distinguish observations (single cells) by their features (genes) given a class of interest. scRFE is an algorithm implemented as a Python pac…
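
The idea in the abstract, a random forest trained one class at a time with the least informative genes pruned recursively, can be illustrated with a minimal sketch. This is not scRFE's own API: it assumes scikit-learn's `RandomForestClassifier` and `RFE` as stand-ins, and the function name `one_vs_all_gene_lists`, the synthetic expression matrix, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE


def one_vs_all_gene_lists(X, labels, gene_names, n_genes=3, seed=0):
    """Return {class label: sorted gene names kept by random forest + RFE}.

    Hypothetical helper for illustration; not part of scRFE.
    """
    gene_names = np.asarray(gene_names)
    labels = np.asarray(labels)
    gene_lists = {}
    for cls in np.unique(labels):
        # One-versus-all target: 1 for cells of this class, 0 for all others.
        y = (labels == cls).astype(int)
        forest = RandomForestClassifier(n_estimators=100, random_state=seed)
        # Each round refits the forest and drops the 20% of the original
        # genes with the lowest importance, until n_genes remain.
        selector = RFE(forest, n_features_to_select=n_genes, step=0.2)
        selector.fit(X, y)
        gene_lists[str(cls)] = sorted(gene_names[selector.support_].tolist())
    return gene_lists


# Synthetic data (an assumption, not from the paper): 300 cells x 30 genes,
# three classes, each marked by a strong shift in three genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 30))
labels = np.array(["A"] * 100 + ["B"] * 100 + ["C"] * 100)
X[:100, 0:3] += 3.0     # genes 0-2 mark class A
X[100:200, 3:6] += 3.0  # genes 3-5 mark class B
X[200:, 6:9] += 3.0     # genes 6-8 mark class C
gene_names = [f"gene{i}" for i in range(30)]

gene_lists = one_vs_all_gene_lists(X, labels, gene_names)
```

Running the selection separately per class yields one short gene list per class value, which is the kind of interpretable output the abstract describes. On real data the number of genes to keep would need to be chosen by validation, not fixed in advance.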

Cited by 4 publications (5 citation statements)
References 27 publications (40 reference statements)
“…To investigate the relationship between the anatomical distribution of LSN subtypes and their transcriptional profiles, we performed spatial transcriptomics using MERFISH on coronal sections of P35 septum across 2 replicates (Figure 3A). We constructed a 500-gene panel by identifying the minimum number of genes necessary to define each LSN subtype using a one-versus-all random forest classifier on our snRNA-seq dataset (Park et al, 2020). We supplemented this list with genes that defined the transcriptionally similar pairs, as well as markers of MSNs, and non-neuronal cell types.…”
Section: Results | Citation type: mentioning
confidence: 99%
“…This was accomplished by performing pairwise differential gene expression analysis between cell groups. Finally, we employed the random forest classifier scRFE on the single-nucleus dataset to identify the sets of genes that best describe our cell groups (Park et al, 2020). This list was uploaded and approved for probe encoding by the Vizgen Gene Panel Design Portal.…”
Section: Methods Details | Citation type: mentioning
confidence: 99%
“…6a). To compare the transcriptome profiles of the disproportionate numbers of clone HIV-1 (537 cells) with non-clone HIV-1+ (78,199 cells), we employed single-cell identity definition using random forests and recursive feature elimination (scRFE) [43] with bootstrapping to identify key genes that were necessary and sufficient to differentiate clone HIV-1+ from non-clone HIV-1+. We identified 100 genes (Supplementary Table 8) that were as effective as the 5,000 most highly variable genes for differentiating clone HIV-1+ from non-clone HIV-1+ (Fig.…”
Section: Results | Citation type: mentioning
confidence: 99%
“…Advancements in single-cell technologies enabled high-dimensional immune profiling to dissect the heterogeneous states of immune cells, high-resolution capture of rare cells, T cell clonality, and identification of upstream drivers of immune dysregulation [32][33][34][35][36][37]. Further, computational techniques including supervised and unsupervised machine learning, network analysis, and statistical methodologies enable confident identification of higher-fidelity predictors of different cellular states from the sparse and highly complex single-cell multi-modal data [38][39][40][41][42][43].…”
Citation type: mentioning
confidence: 99%
“…Furthermore, using this model, we could isolate several putatively annotated molecules that contribute to discriminating macrophages upon different polarization. Although both LSC‐MS and RF are well‐established and have been largely utilized within their respective fields, the connection of these two provides an alternative to existing approaches given its effective performance on sparse, high‐dimensional data with collinear features and straightforward understandability for single‐cell data (Park et al, 2020). Despite the limitations in sample size and the inherent issues associated with single‐cell metabolomics; the study successfully demonstrates the applicability of untargeted LSC‐MS metabolomic profiling combined with RF for macrophage analysis.…”
Section: Discussion | Citation type: mentioning
confidence: 99%