Sensitivity Analysis of k-Fold Cross Validation in Prediction Error Estimation

Rodríguez, Juan Diego; Pérez, Aritz; Lozano, José A.

doi:10.1109/tpami.2009.187

Cited by 1,425 publications

(675 citation statements)

References 9 publications

Supporting

Mentioning

656

Contrasting

Unclassified


“…Each set is used exactly once as the test set while the remaining data is used as the training set (Rodríguez et al, 2010). Based on the fivefold cross-validation method to examine the prediction models produced by the above algorithms, the whole database was randomly divided into ten distinct parts because of the amount of test data is 10% of the whole database (Lin et al, 2006).…”

Section: Evaluation Of the Predictive Performance

mentioning

confidence: 99%

Ensemble artificial neural networks applied to predict the key risk factors of hip bone fracture for elders

Liu

Cui

Chou

et al. 2015

Biomedical Signal Processing and Control


Hip bone fracture is one of the most important causes of morbidity and mortality in the elder adults. It is necessary to establish a prediction model to provide suggestions for elders. A total of 725 subjects were involved, including 228 patients with first low-trauma hip fracture and 497 ages-, sex-, and living area-matched controls (215 from the same hospital and 282 from community). All the subjects were interviewed with the same questionnaire, and the answers of the interviewees were recorded to be the database. Three-layer back-propagation Artificial Neural Networks (ANN) models were applied for females and males separately in this study to predict the risk of hip bone fracture for elders. Furthermore, to improve the accuracies and the generalizations of the models, the ensemble ANNs method was applied. To understand variables contributions and find the important variables for predicting hip fracture, sensitivity analysis and connection weights approach were applied. In this study, three ANNs prediction models were tested with different architectures. With the fivefold crossvalidation method evaluating the performances, one of the three models turned out to be the best prediction model and achieved a big success of prediction. The best area under the receiver operating characteristic (ROC) curve and the accuracy of the prediction model are 0.91 ± 0.028 (mean ± SD) and 0.85 ± 0.029 for females, while for males are 0.99 ± 0.015 and 0.93 ± 0.020. With the method of sensitivity analysis and connection weights, input variables were ranked according to contributions/importance, and the top 10 variables show great proportion of contribution to predict hip fracture. The top 10 important variables causing hip fracture for both females and males are similar to our previous results got from logistic regression model and other related researches. 
In conclusion, ANNs were successfully used to establish models for predicting the risk of hip bone fracture in female and male elderly adults separately, and they identified the top 10 important variables among the 74 input variables. This study verified the performance of ANNs as a highly complex prediction model.
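
The citation statement above describes k-fold cross-validation as using each set exactly once for testing while the remaining data is used for training. The following is a minimal sketch of that partitioning step only, not the citing authors' code; the function name `k_fold_splits`, its parameters, and the sample count in the usage note are illustrative assumptions.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper for illustration: indices are shuffled once, dealt
    into k folds, and each fold serves exactly once as the test set.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # The training set is the union of all folds except fold i.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx
```

With `k=5` and 10 samples, for example, each test fold holds 2 samples (20% of the data) and each training set holds the other 8; a 10% test share would correspond to `k=10`.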

“…Therefore, an optimal process based on a genetic algorithm (GA) is used to identify the best parameter values. This optimization uses the accuracy of the training dataset as the fitness function, and applies K-fold cross-validation 21 to analyze the variable generalization ability of each generation. The program flow of the GA used in the proposed method is shown in Fig.…”

Section: Optimization Of SVM Parameters

mentioning

confidence: 99%

An SVM approach with alternating current potential drop technique to classify pits and cracks on the bottom of a metal plate

Gan

Wan

et al. 2016

AIP Advances


The alternating current potential drop (ACPD) is a nondestructive technique that is widely used to detect and size defects in conductive material. This paper describes a combined ACPD and support vector machine (SVM) approach to accurately recognize typical defects on the bottom surface of a metal plate, i.e., pits and cracks. We first conducted a simulation study, and then, based on ACPD, measured five voltage ratios between the test region and reference region. The analysis of finite simulation data enables the binary classification of two kinds of defects. To obtain an accurate separating hyperplane, key parameters of the SVM classifier were optimized using a genetic algorithm with training data from the simulations. Based on the optimized SVM classifier, reliable estimates of the defects in a metal plate were then obtained. The recognition results of the simulation dataset show that the trained and optimized SVM model has a high classification accuracy, and the metal plate experiment also indicates that the model has good precision in actual defect classification. © 2016 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license.


“…The repeated r times k-cv consists of estimating the error as the average of r k-cv estimations with different random partitions into folds. This method considerably reduces the variance of the error estimation [41].…”

Section: Multi-dimensional Classification Evaluation

mentioning

confidence: 99%

“…The classification accuracies, which have been estimated via 20 runs of 5-fold non-stratified cross validation (20×5cv) [41], are shown in Table 6. The results of using the four different feature sets (unigrams, unigrams + bigrams, PoS, and the ASOMO features) in conjunction with the three different learning approaches (multiple uni-dimensional, Cartesian class variable, and multi-dimensional classifiers) in a supervised framework are shown.…”

mentioning

confidence: 99%
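
The two statements above describe repeated k-fold cross-validation: the error is the average of r k-cv estimates, each computed on a different random, non-stratified partition. The sketch below illustrates that averaging under stated assumptions; the function names, the `fit`/`predict` callback interface, and the majority-class toy learner are hypothetical and are not taken from the cited or citing papers.

```python
import random
from statistics import mean


def kfold_error(X, y, fit, predict, k, rng):
    """Misclassification rate from one non-stratified k-fold partition."""
    idx = list(range(len(y)))
    rng.shuffle(idx)  # a fresh random partition on every call
    folds = [idx[i::k] for i in range(k)]
    errors = 0
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train], [y[j] for j in train])
        errors += sum(predict(model, X[j]) != y[j] for j in test)
    return errors / len(y)


def repeated_kfold_error(X, y, fit, predict, k=5, r=20, seed=0):
    """Average of r k-fold error estimates (r x k-cv)."""
    rng = random.Random(seed)
    return mean(kfold_error(X, y, fit, predict, k, rng) for _ in range(r))


# Toy learner (assumption, for illustration only): always predict the
# most frequent training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model
```

Calling `repeated_kfold_error(X, y, fit_majority, predict_majority, k=5, r=20)` mirrors the 20×5cv setting quoted above. Averaging over r partitions is what reduces the variance that comes from any single random split into folds.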


Approaching Sentiment Analysis by using semi-supervised learning of multi-dimensional classifiers

Ortigosa-Hernández

Rodríguez

Alzate

et al. 2012

Neurocomputing


Sentiment Analysis is defined as the computational study of opinions, sentiments and emotions expressed in text. Within this broad field, most of the work has been focused on either Sentiment Polarity classification, where a text is classified as having positive or negative sentiment, or Subjectivity classification, in which a text is classified as being subjective or objective. However, in this paper, we consider instead a real-world problem in which the attitude of the author is characterised by three different (but related) target variables: Subjectivity, Sentiment Polarity, Will to Influence, unlike the two previously stated problems, where there is only a single variable to be predicted. For that reason, the (uni-dimensional) common approaches used in this area yield suboptimal solutions to this problem. In order to bridge this gap, we propose, for the first time, the use of the novel multi-dimensional classification paradigm in the Sentiment Analysis domain. This methodology is able to join the different target variables in the same classification task so as to take advantage of the potential statistical relations between them. In addition, and in order to take advantage of the huge amount of unlabelled information available nowadays in this context, we propose the extension of the multi-dimensional classification framework to the semi-supervised domain. Experimental results for this problem show that our semi-supervised multi-dimensional approach outperforms the most common Sentiment Analysis approaches, concluding that our approach is beneficial to improve the recognition rates for this problem, and in extension, could be considered to solve future Sentiment Analysis problems.

