2022
DOI: 10.1111/nph.18053

Identification of new marker genes from plant single‐cell RNA‐seq data using interpretable machine learning methods

Abstract: An essential step in the analysis of single‐cell RNA sequencing data is to classify cells into specific cell types using marker genes. In this study, we have developed a machine learning pipeline called single‐cell predictive marker (SPmarker) to identify novel cell‐type marker genes in the Arabidopsis root. Unlike traditional approaches, our method uses interpretable machine learning models to select marker genes. We have demonstrated that our method can: assign cell types based on cells that were l…

Cited by 15 publications (7 citation statements)
References 54 publications
“…To overcome this, we recommend use of multiplexed probe sets to simultaneously label cell-type specific RNA. Machine learning algorithms applied to single cell transcriptome datasets have recently expanded the pool of suitable gene candidates ( Yan et al , 2022 ).…”
Section: Results (mentioning)
confidence: 99%
“…On the first point, there are several recent approaches that have described the expression profiles of root cells and have tried to establish the developmental trajectories of the different cell types at a single-cell resolution not only for WT but also for some mutants and ambient conditions [19,28,36,79,98,111,147] and some of these approaches even take advantage of machine-learning methods to generate better quality data to identify cell-type marker genes from scRNA-seq data [143]. This approach would allow us to extend our GRN with key regulators that were not incorporated in our model by reconstructing regulatory networks based on massive data [142, 145] and to compare previously proposed conserved loops and incorporate new ones that are able to describe the dynamic behavior of epidermal differentiation in different developmental contexts. These new interactions can be validated in the context of our model to assess their relevance.…”
Section: Discussion (mentioning)
confidence: 99%
“…Single-cell RNA sequencing (scRNA-seq) technology has been widely used in identifying and characterizing various cell types in numerous tissues from different plant species [1][2][3][4][5][6]. Efforts to apply this technology in the model plant Arabidopsis have benefited from the extensive knowledge of cell-type identity markers [7,8]. Contrastingly, accurate labeling of cell types in other plant species remains a challenge due to the scarcity of known cell-type marker genes [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…One way to address this problem is to increase the number of marker genes (i.e., N > 200) with the same statistical method but reduce the specificity threshold of these genes. To increase the number of markers without sacrificing the specificity, we have demonstrated that applying feature selection in machine learning enabled us to identify markers distinct from those identified by statistical approaches [7]. Therefore, we employed two additional machine-learning (ML) methods (SHAP-RF and SVM) to identify the top 200 marker genes for each cell-type cluster in each species.…”
Section: Introduction (mentioning)
confidence: 99%