Background: Asthma is a heterogeneous clinical disorder. Methods for objective identification of disease subtypes will focus on clinical interventions and help identify causative pathways. Few studies have explored phenotypes at a molecular level.
The PROMISE workshop seeks to deliver useful, usable, verifiable, and repeatable models to the software engineering community. To provide a sound and realistic basis for creating predictive models, and to allow researchers to conduct repeatable software engineering experiments, we maintain the PROMISE repository, a growing collection that now contains 57 empirically based data sets.
An important step in building effective predictive models is applying one or more sampling techniques. Traditional sampling techniques include random, stratified, systematic, and cluster sampling. The problem with these techniques is that they focus on the class attribute rather than the non-class attributes. For example, if a test instance's nearest neighbor in the training set belongs to the opposite class, the instance seems doomed to misclassification. To illustrate this problem, this paper conducts 20 experiments on five NASA defect data sets (CM1, JM1, KC1, KC2, PC1) using two learners (J48 and Naïve Bayes). Each data set is divided into three groups: a training set, a "nice" neighbor test set, and a "nasty" neighbor test set. Under a nearest neighbor approach, "nice neighbors" are those test instances whose closest training instance belongs to the same class, and "nasty neighbors" are those whose closest training instance belongs to the opposite class. The "nice" experiments average 94 percent accuracy and the "nasty" experiments average 20 percent accuracy. Based on these results, a new nearest neighbor sampling technique is proposed.
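The nice/nasty partition described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not name a distance metric, so Euclidean distance is assumed here, and the function names `nearest_label` and `split_nice_nasty` as well as the toy data are hypothetical.

```python
import math


def nearest_label(instance, train_X, train_y):
    """Return the class label of the training instance closest to `instance`.

    Euclidean distance is an assumption; the source does not specify a metric.
    """
    best_index = min(
        range(len(train_X)),
        key=lambda i: math.dist(instance, train_X[i]),
    )
    return train_y[best_index]


def split_nice_nasty(train_X, train_y, test_X, test_y):
    """Partition test instances by the class of their nearest training neighbor.

    "Nice" instances share a class with their nearest training neighbor;
    "nasty" instances have a nearest training neighbor of a different class.
    """
    nice, nasty = [], []
    for features, label in zip(test_X, test_y):
        if nearest_label(features, train_X, train_y) == label:
            nice.append((features, label))
        else:
            nasty.append((features, label))
    return nice, nasty


# Toy data (hypothetical): two numeric attributes per module, binary defect label.
train_X = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_y = ["clean", "clean", "defective"]
test_X = [(0.5, 0.4), (4.8, 5.1), (4.5, 4.5), (0.9, 1.2)]
test_y = ["clean", "defective", "clean", "defective"]

nice, nasty = split_nice_nasty(train_X, train_y, test_X, test_y)
print("nice:", nice)
print("nasty:", nasty)
```

In this toy example, the two test instances that sit next to a same-class training instance land in the nice set, while the other two land in the nasty set. A learner trained on `train_X` could then be scored separately on each set to reproduce the kind of comparison the abstract reports.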