Undersampling is a widely adopted method for dealing with imbalanced pattern classification problems. Current methods mainly depend on either random resampling of the majority class or resampling at the decision boundary. Random undersampling fails to take informative samples in the data into consideration, while resampling at the decision boundary is sensitive to class overlap. Both techniques ignore the distribution information of the training dataset. In this paper, we propose a diversified sensitivity-based undersampling method. Samples of the majority class are clustered to capture the distribution information and enhance the diversity of the resampling. A stochastic sensitivity measure is applied to select samples from both clusters of the majority class and the minority class. By iteratively clustering and sampling, a balanced set of samples yielding high classifier sensitivity is selected. The proposed method yields good generalization capability on 14 UCI datasets.
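The cluster-then-select idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it runs a single round rather than iterating, it assumes plain k-means as the clustering step (the abstract does not name one), and it takes the selection criterion as a caller-supplied `score` function standing in for the stochastic sensitivity measure. The names `kmeans_labels` and `undersample_once` are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Point = Sequence[float]


def _sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_labels(points: List[Point], k: int, n_iter: int = 20, seed: int = 0) -> List[int]:
    """Plain Lloyd's k-means (assumed clustering step); returns a cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: _sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def undersample_once(
    majority: List[Point],
    n_keep: int,
    score: Callable[[Point], float],
    seed: int = 0,
) -> List[Point]:
    """Cluster the majority class into n_keep groups and keep the top-scoring sample of each."""
    labels = kmeans_labels(majority, n_keep, seed=seed)
    selected = []
    for c in range(n_keep):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        if members:
            selected.append(max(members, key=score))
    return selected


# Toy majority class with two well-separated groups; the score is a placeholder.
majority = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
selected = undersample_once(majority, n_keep=2, score=lambda p: p[0])
```

Setting `n_keep` to the size of the minority class gives a balanced training set in which every majority-class cluster is represented, which is the diversity property the abstract describes.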
In this paper, we focus on data-driven approaches to human activity recognition (HAR). Data-driven approaches rely on good-quality data during training; however, there is a shortage of high-quality, large-scale, and accurately annotated HAR datasets for recognizing activities of daily living (ADLs) within smart environments. The contributions of this paper are an improvement in the quality of an openly available HAR dataset for the purpose of data-driven HAR and a new ensemble of neural networks as a data-driven HAR classifier. Specifically, we propose a homogeneous ensemble neural network approach for recognizing activities of daily living within a smart home setting. Four base models were generated and integrated using a support function fusion method, which involved computing an output decision score for each base classifier. This work also explored several approaches to resolving conflicts between the base models. Experimental results demonstrated that distributing data at a class level greatly reduces the number of conflicts that occur between the base models, leading to increased performance prior to the application of conflict resolution techniques. Overall, the best HAR performance of 80.39% was achieved by distributing data at a class level in conjunction with a conflict resolution approach that calculates the difference between the highest and second-highest predictions of each conflicting model and awards the final decision to the model with the highest differential value.
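The margin-based conflict resolution rule described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: each base model is assumed to output one score per class, and the function names `top_two_margin` and `resolve_conflict` are hypothetical.

```python
from typing import Sequence, Tuple


def top_two_margin(scores: Sequence[float]) -> Tuple[int, float]:
    """Return (predicted class index, highest score minus second-highest score)."""
    if len(scores) < 2:
        raise ValueError("need at least two class scores")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, runner_up = order[0], order[1]
    return best, scores[best] - scores[runner_up]


def resolve_conflict(model_scores: Sequence[Sequence[float]]) -> int:
    """Return the class chosen by the model with the largest top-two margin."""
    decisions = [top_two_margin(scores) for scores in model_scores]
    predicted = {cls for cls, _ in decisions}
    if len(predicted) == 1:
        # All models agree, so there is no conflict to resolve.
        return decisions[0][0]
    winner_class, _ = max(decisions, key=lambda d: d[1])
    return winner_class


# Three hypothetical base models scoring three classes.
model_scores = [
    [0.50, 0.45, 0.05],  # picks class 0 with a narrow margin
    [0.10, 0.85, 0.05],  # picks class 1 with a wide margin
    [0.40, 0.35, 0.25],  # picks class 0 with a narrow margin
]
final_class = resolve_conflict(model_scores)
```

In this toy input a simple majority vote would return class 0, while the margin rule returns class 1 because the second model is far more decisive than the other two.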
The training of a multilayer perceptron neural network (MLPNN) concerns the selection of its architecture and connection weights via the minimization of both the training error and a penalty term. Different penalty terms have been proposed to control the smoothness of the MLPNN for better generalization capability. However, controlling its smoothness using, for instance, the norm of weights or the Vapnik-Chervonenkis dimension cannot distinguish individual MLPNNs with the same number of free parameters or the same norm. In this paper, to enhance generalization capability, we propose a stochastic sensitivity measure (ST-SM) to realize a new penalty term for MLPNN training. The ST-SM determines the expectation of the squared output differences between the training samples and the unseen samples located within their Q-neighborhoods for a given MLPNN. It provides a direct measurement of the MLPNN's output fluctuations, i.e., its smoothness. We adopt a two-phase Pareto-based multiobjective training algorithm that minimizes both the training error and the ST-SM as two objective functions. Experiments on 20 UCI data sets show that the MLPNNs trained by the proposed algorithm yield better accuracies on testing data than several recent and classical MLPNN training methods.
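The quantity the ST-SM measures can be approximated numerically. The sketch below is a Monte Carlo estimate, not the paper's formulation: it assumes the Q-neighborhood of a sample is the hypercube of points whose coordinates each differ from it by at most `q`, and that unseen samples are uniformly distributed inside it. The function name `stochastic_sensitivity` and its parameters are hypothetical, and `model` is any callable that maps an input vector to a scalar output.

```python
import random
from typing import Callable, Sequence


def stochastic_sensitivity(
    model: Callable[[Sequence[float]], float],
    samples: Sequence[Sequence[float]],
    q: float,
    n_draws: int = 200,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of E[(model(x + dx) - model(x))^2] with |dx_i| <= q."""
    rng = random.Random(seed)
    total = 0.0
    for x in samples:
        fx = model(x)
        for _ in range(n_draws):
            # Draw one unseen point from the assumed hypercube neighborhood of x.
            perturbed = [xi + rng.uniform(-q, q) for xi in x]
            total += (model(perturbed) - fx) ** 2
    return total / (len(samples) * n_draws)


# Stand-in "networks": a constant, a gently sloped and a steeply sloped function.
points = [[0.0], [1.0], [2.0]]
constant = stochastic_sensitivity(lambda x: 1.0, points, q=0.5)
flat = stochastic_sensitivity(lambda x: 0.1 * x[0], points, q=0.5)
steep = stochastic_sensitivity(lambda x: 1.0 * x[0], points, q=0.5)
```

A smoother model fluctuates less inside the neighborhoods and so receives a smaller value, which is why the measure can act as a penalty alongside the training error.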