2018
DOI: 10.1002/bimj.201700243

Making complex prediction rules applicable for readers: Current practice in random forest literature and recommendations

Abstract: Ideally, prediction rules should be published in such a way that readers may apply them, for example, to make predictions for their own data. While this is straightforward for simple prediction rules, such as those based on the logistic regression model, it is much more difficult for complex prediction rules derived by machine learning tools. We conducted a survey of articles reporting prediction rules that were constructed using the random forest algorithm and published in PLOS ONE in 2014-2015 in the field…
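As an illustration of what "applicable for readers" could mean in practice, the sketch below shows one possible way to hand readers a usable random forest rule by serializing the fitted model so it can be reloaded and applied to independent data. It assumes a scikit-learn/joblib workflow with toy data; the file name and settings are hypothetical and not drawn from the surveyed articles.

```python
# Minimal sketch: serializing a fitted random forest so readers can apply it,
# assuming a scikit-learn workflow (illustrative only, not the surveyed pipelines).
import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy training data standing in for a study's original data.
X_train, y_train = make_classification(n_samples=200, n_features=10, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)

# Persisting the fitted forest lets readers reproduce predictions on their own data
# without retraining; they only need the model file and the feature ordering.
joblib.dump(rf, "rf_prediction_rule.joblib")

# A reader applying the published rule to new observations:
loaded_rf = joblib.load("rf_prediction_rule.joblib")
new_probabilities = loaded_rf.predict_proba(X_train[:5])[:, 1]
print(new_probabilities)
```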

Cited by 17 publications (6 citation statements); references 28 publications (47 reference statements).
“…Methods yielding models which are sparse with regard to the number of variables and number of omics types are often considered preferable from a practical perspective. Interpretation and practical application of the model to the prediction of independent data are easier with regression-based methods yielding coefficients that reflect the effects of variables on the outcome than with machine learning algorithms [8].…”
Section: Methods (mentioning)
confidence: 99%
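To illustrate the contrast this statement draws, here is a minimal sketch of how a reader can apply a published regression-based rule from the reported coefficients alone, which is exactly what is harder to do with a random forest. The intercept and coefficient values are made up for illustration and are not taken from any cited study.

```python
# Minimal sketch: applying a published logistic regression rule to one new observation
# using only the reported intercept and coefficients (values are hypothetical).
import numpy as np

intercept = -1.2
coefficients = np.array([0.8, -0.5, 0.03])  # hypothetical published coefficients
x_new = np.array([1.0, 2.0, 50.0])          # one independent observation

linear_predictor = intercept + coefficients @ x_new
predicted_probability = 1.0 / (1.0 + np.exp(-linear_predictor))
print(predicted_probability)
```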
“…RF is a regression-based classification algorithm that aggregates many decision trees trained on randomly sampled subsets of a complex dataset.24,25 When developing the RF model, a grid search was used to determine the best combination of tuning parameters, including number of trees and number of features at each split. These models were chosen based on previous machine learning studies that focused on binary classifications.…”
Section: Methods (mentioning)
confidence: 99%
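A minimal sketch of the tuning strategy this statement describes, assuming scikit-learn: a grid search over the number of trees and the number of features considered at each split. The parameter grid and data are placeholders, not the cited study's actual settings.

```python
# Minimal sketch: grid search over random forest tuning parameters
# (number of trees, features per split), with placeholder values.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=20, random_state=1)

param_grid = {
    "n_estimators": [100, 500],          # number of trees
    "max_features": ["sqrt", 0.3, 0.5],  # features considered at each split
}

search = GridSearchCV(RandomForestClassifier(random_state=1), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```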
“…15 Random forest is a regression-based classification algorithm with the aggregation of a large number of decision trees trained on randomly sampled subsets of a complex dataset.16,17 Random forest applies randomness when building each individual tree, and thus prediction by grouping could be more accurate than any individual tree. Both models were trained and tested using cross-validation with 80% of the data as a training set and 20% as a validation set.…”
Section: Methods (mentioning)
confidence: 99%
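A minimal sketch of the evaluation scheme this statement describes, again assuming scikit-learn: 80% of the data is used for training (with cross-validation on that portion) and 20% is held out for validation. All settings are illustrative assumptions rather than the cited study's configuration.

```python
# Minimal sketch: 80/20 train-validation split with cross-validation on the training part.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=500, n_features=15, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

rf = RandomForestClassifier(n_estimators=500, random_state=2)
cv_scores = cross_val_score(rf, X_train, y_train, cv=5)  # internal cross-validation

rf.fit(X_train, y_train)
print(cv_scores.mean(), rf.score(X_test, y_test))  # CV accuracy vs. held-out accuracy
```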