Classification tree is a widely used machine learning method. It has multiple implementations as R packages; rpart, ctree, evtree, tree and C5.0. The details of these implementations are not the same, and hence their performances differ from one application to another. We are interested in their performance in the classification of cells using the single-cell RNA-Sequencing data. In this paper, we conducted a benchmark study using 22 Single-Cell RNA-sequencing data sets. Using cross-validation, we compare packages’ prediction performances based on their Precision, Recall, F1-score, Area Under the Curve (AUC). We also compared the Complexity and Run-time of these R packages. Our study shows that rpart and evtree have the best Precision; evtree is the best in Recall, F1-score and AUC; C5.0 prefers more complex trees; tree is consistently much faster than others, although its complexity is often higher than others.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.