2017
DOI: 10.1155/2017/6817627

A Variable Impacts Measurement in Random Forest for Mobile Cloud Computing

Abstract: Recently, the importance of mobile cloud computing has increased. Mobile devices can collect personal data from various sensors within a short period of time, and this sensor-based data contains valuable information about users. Advanced computation power and data analysis technology based on cloud computing provide an opportunity to classify massive sensor data into given labels. The random forest algorithm is known as a black-box model whose internal process is difficult to interpret. In this paper, we pr…

Cited by 45 publications (23 citation statements)
References 14 publications
“…Variable importance was verified based on the calculated mean decrease in accuracy [16] (in the case of RF) and Olden’s method output [17] (in the case of ANN). The data was split randomly into two groups: one half of the data (50%) comprised the training set (used for model learning), while the other half was used for testing (the previously developed model was used to make predictions on this new data; the observed rates of false positive, false negative, true positive, and true negative were used to evaluate performance by class).…”
Section: Methods
Mentioning (confidence: 99%)
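A minimal sketch of the procedure quoted above (a random 50/50 split, mean-decrease-in-accuracy importance, and class-wise evaluation from the confusion matrix), assuming scikit-learn's permutation_importance as a stand-in for the randomForest MDA routine cited as [16]; the dataset and parameters are synthetic placeholders, not those of the citing study.

# Random forest with permutation-based mean decrease in accuracy (MDA)
# on a synthetic dataset; 50% training / 50% testing as described above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)

# MDA: how much test accuracy drops when each feature is permuted.
mda = permutation_importance(
    rf, X_test, y_test, scoring="accuracy", n_repeats=30, random_state=0
)
for idx in mda.importances_mean.argsort()[::-1]:
    print(f"feature {idx}: {mda.importances_mean[idx]:.4f}")

# Class-wise evaluation: true negative, false positive, false negative, true positive.
tn, fp, fn, tp = confusion_matrix(y_test, rf.predict(X_test)).ravel()
print(tn, fp, fn, tp)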
“…While we have attempted to reduce training time and the potential for overfitting with careful feature selection methods, random forest modeling has inherent limitations, which include high model complexity requiring computational resources and longer training periods than other machine learning frameworks. We use Mean Decrease Accuracy for feature selection, which has been known to have limitations due to the multicollinearity problem (variable impact calculation is less accurate when there are high numbers of correlated variables) (Hur et al., 2017). Future studies with significantly larger sample sizes will be required to improve upon this framework for general student athlete injury risk.…”
Section: Discussion
Mentioning (confidence: 99%)
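The multicollinearity caveat in the statement above can be illustrated with a small synthetic sketch (all data and settings here are illustrative assumptions): when two features are near-duplicates, the forest can fall back on one while the other is permuted, so permutation-style MDA understates each feature's individual impact.

# Two nearly identical informative features plus one noise feature: permuting
# either informative feature alone barely hurts accuracy, so its MDA is diluted.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
signal = rng.normal(size=n)
x_a = signal + 0.01 * rng.normal(size=n)   # informative feature
x_b = signal + 0.01 * rng.normal(size=n)   # near-duplicate of x_a
noise = rng.normal(size=n)                 # irrelevant feature
X = np.column_stack([x_a, x_b, noise])
y = (signal > 0).astype(int)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=20, random_state=0)
print(result.importances_mean)  # x_a and x_b split importance that neither shows alone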
“…In future studies, gradient tree boosting, another tree-based machine learning model, could be considered to reduce the computational resources necessary for random forest modeling. Additionally, new methods of feature selection that solve the multicollinearity problem, such as the Shapley Value method (Hur et al., 2017), could be considered for feature selection in future studies.…”
Section: Discussion
Mentioning (confidence: 99%)
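A Shapley-value-based feature ranking of the kind pointed to above could look like the following sketch; it relies on the third-party shap package, which is an assumption of this illustration (the cited work's exact Shapley computation may differ), and the dataset is again a synthetic placeholder.

# Rank features by mean absolute Shapley value from a tree explainer.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X)
# shap's return shape varies by version: a list of per-class arrays, or a
# (samples, features) / (samples, features, classes) array.
vals = shap_values[1] if isinstance(shap_values, list) else shap_values
if vals.ndim == 3:
    vals = vals[:, :, 1]
importance = np.abs(vals).mean(axis=0)     # mean |SHAP| per feature
print(importance.argsort()[::-1])          # features ranked most to least important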
“…By doing so, the importance of a variable in predicting the response is quantified by evaluating the difference of how much including or excluding that variable decreases or increases accuracy [18][19][20]. This difference is referred to as the Mean Decrease Accuracy (MDA), and is computed by the formula shown in Equation 3 [21,22].…”
Section: Unsupervised Variable Selection
Mentioning (confidence: 99%)
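The excerpt does not reproduce Equation 3 itself; under the usual out-of-bag (OOB) permutation scheme, a standard formulation of the mean decrease accuracy for a variable x_j, which may differ from the cited equation in normalization, is:

% Common MDA formulation (not necessarily the cited Equation 3): the average,
% over all trees, of the increase in OOB error when x_j is randomly permuted.
\[
\mathrm{MDA}(x_j) \;=\; \frac{1}{n_{\mathrm{tree}}}
\sum_{t=1}^{n_{\mathrm{tree}}}
\left( \mathrm{err}^{\mathrm{OOB}}_{t,\pi_j} - \mathrm{err}^{\mathrm{OOB}}_{t} \right)
\]
% where err^OOB_t is the out-of-bag error of tree t and err^OOB_{t,pi_j} is that
% error after randomly permuting the values of x_j; the sum is often divided by the
% standard deviation of the per-tree differences to normalize the score.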