María-Dolores Cubiles-de-la-Vega scite author profile

The main models of machine learning are briefly reviewed and considered for building a classifier to identify the Fragile X Syndrome (FXS). We have analyzed 172 patients potentially affected by FXS in Andalusia (Spain) and, by means of a DNA test, each member of the data set is known to belong to one of two classes: affected, not affected. The whole predictor set, formed by 40 variables, and a reduced set with only nine predictors significantly associated with the response are considered. Four alternative base classification models have been investigated: logistic regression, classification trees, multilayer perceptron and support vector machines. For both predictor sets, the best accuracy, considering both the mean and the standard deviation of the test error rate, is achieved by the support vector machines, confirming the increasing importance of this learning algorithm. Three ensemble methods - bagging, random forests and boosting - were also considered, amongst which the bagged versions of support vector machines stand out, especially when they are constructed with the reduced set of predictor variables. The analysis of the sensitivity, the specificity and the area under the ROC curve agrees with the main conclusions extracted from the accuracy results. All of these models can be fitted by free R programs.

Keywords: fragile X syndrome, support vector machines, multilayer perceptron, classification trees, logistic regression, ensemble methods, R system


Identification of outlier bootstrap samples

García

Pino‐Mejías

Muñoz-Pichardo

et al. 1997

Journal of Applied Statistics


We define a variation of Efron's method II based on the outlier bootstrap sample concept. A criterion for the identification of such samples is given, with which a variation in the bootstrap sample generation algorithm is introduced. The results of several simulations are analyzed in which, in comparison with Efron's method II, a higher degree of closeness to the estimated quantities can be observed.


Consistency of the reduced bootstrap for sample means

Jiménez-Gamero

García

Cubiles-de-la-Vega

2006

Statistics & Probability Letters


Bagging Classification Models with Reduced Bootstrap

Pino‐Mejías

Cubiles-de-la-Vega

López-Coello

et al. 2004


Abstract. Bagging is an ensemble method proposed to improve the predictive performance of learning algorithms, being especially effective when applied to unstable predictors. It is based on the aggregation of a certain number of prediction models, each one generated from a bootstrap sample of the available training set. We introduce an alternative method for bagging classification models, motivated by the reduced bootstrap methodology, where the generated bootstrap samples are forced to have a number of distinct original observations between two values k_1 and k_2. Five choices for k_1 and k_2 are considered, and the five resulting models are empirically studied and compared with bagging on three real data sets, employing classification trees and neural networks as the base learners. This comparison reveals for this reduced bagging technique a trend to diminish the mean and the variance of the error rate.
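The sampling constraint described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the samples are forced into the range, so the sketch assumes simple rejection sampling, treats the bounds k_1 and k_2 as inclusive, and uses hypothetical names (`reduced_bootstrap_indices`, `max_tries`).

```python
import random


def reduced_bootstrap_indices(n, k1, k2, rng=None, max_tries=10_000):
    """Draw n indices with replacement from range(n), accepting a draw only
    if its number of distinct indices lies in [k1, k2] (assumed inclusive).

    Rejection sampling is an assumption made for illustration; the source
    abstract does not specify the generation mechanism.
    """
    if not (1 <= k1 <= k2 <= n):
        raise ValueError("require 1 <= k1 <= k2 <= n")
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Ordinary bootstrap draw: n indices sampled with replacement.
        sample = [rng.randrange(n) for _ in range(n)]
        # Keep the draw only if its count of distinct observations is in range.
        if k1 <= len(set(sample)) <= k2:
            return sample
    raise RuntimeError("no admissible sample found; widen [k1, k2] or raise max_tries")


def reduced_bagging_samples(n, k1, k2, n_models, seed=None):
    """Return one constrained index sample per base model to be fitted."""
    rng = random.Random(seed)
    return [reduced_bootstrap_indices(n, k1, k2, rng) for _ in range(n_models)]
```

Each returned index list would be used to select the training rows for one base learner (a classification tree or neural network in the abstract), with the fitted models then aggregated as in ordinary bagging. An ordinary bootstrap sample of size n contains about 0.632 n distinct observations on average, so ranges far from that value make rejection sampling slow.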

