This article describes how the enormous volume of data in the Internet of Things (IoT) calls for efficient data mining models for information extraction, classification, and the discovery of hidden patterns. Case-based reasoning (CBR) is a learning, mining, and problem-solving approach that solves a new problem by relating it to similar problems solved in the past. One issue with CBR is how to weight features when measuring the similarity among cases in order to retrieve similar past cases. Neural network (NN) pruning is a popular method that extracts feature weights from a trained neural network without losing much generality of the training set, using four mechanisms: sensitivity, activity, saliency, and relevance. However, training an NN on imbalanced data biases the classifier towards the majority class. Therefore, this article proposes a hybrid CBR model that combines random undersampling (RUS) with a cost-sensitive back-propagation neural network in an IoT environment to deal with the feature weighting problem in imbalanced data. The proposed model is validated on six real-life datasets. The experimental results show that the proposed model performs better than other feature weighting methods.
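To make two of the ingredients named in this abstract concrete, the following is a minimal sketch of random undersampling followed by feature-weighted case retrieval. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy cases, and the weight values are all hypothetical, and the weights are hard-coded placeholders rather than values extracted from a trained cost-sensitive neural network.

```python
import math
import random


def random_undersample(cases, labels, seed=0):
    """Randomly drop cases so every class keeps as many as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for case, label in zip(cases, labels):
        by_class.setdefault(label, []).append(case)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # Sample without replacement down to the minority class size.
        for case in rng.sample(group, target):
            balanced.append((case, label))
    return balanced


def weighted_distance(a, b, weights):
    """Feature-weighted Euclidean distance between two cases."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def retrieve(query, case_base, weights, k=1):
    """Return the k stored (case, label) pairs closest to the query."""
    ranked = sorted(case_base, key=lambda item: weighted_distance(query, item[0], weights))
    return ranked[:k]


# Hypothetical toy case base: four majority ("neg") and two minority ("pos") cases.
cases = [(0.1, 5.0), (0.2, 1.0), (0.15, 3.0), (0.3, 4.0), (0.9, 2.0), (0.8, 4.5)]
labels = ["neg", "neg", "neg", "neg", "pos", "pos"]

# Placeholder feature weights (assumption): in the abstract these would come
# from the trained network, not from a hand-written list.
weights = [0.9, 0.1]

balanced = random_undersample(cases, labels)
nearest = retrieve((0.85, 1.0), balanced, weights, k=1)
```

With the first feature weighted heavily, the query is matched to a minority case even though its second feature is closer to one of the majority cases, which is the effect feature weighting is meant to have on retrieval.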
Class imbalance is a well-investigated topic that addresses the performance degradation of standard learning models caused by an uneven distribution of classes in a data space. Cluster-based undersampling is a popular solution in this domain; it eliminates majority class instances from a definite number of clusters to balance the training data. However, distance-based elimination of instances is often affected by the underlying data distribution. Recently, ensemble learning techniques have emerged as an effective solution owing to their weighted learning of rare instances. In this article, a boosting-aided adaptive cluster-based undersampling technique is proposed to facilitate the elimination of learning-insignificant majority class instances from the clusters, detected through the AdaBoost ensemble learning model. The proposed work is validated against seven existing cluster-based undersampling techniques on six binary datasets and three classification models. The experimental results establish the effectiveness of the proposed technique over the existing methods.
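The selection step described in this abstract can be pictured with a rough sketch. The abstract does not give the exact elimination rule, so everything here is an assumption made for illustration: cluster assignments and per-instance boosting weights are taken as precomputed inputs, a higher weight is read as "harder to learn, therefore worth keeping", and each cluster keeps a share proportional to its size. The function name and the toy values are hypothetical.

```python
def undersample_by_weight(clusters, weights, n_keep):
    """Pick majority-class indices to keep, dropping low-weight ones per cluster.

    clusters: cluster id for each majority instance.
    weights:  assumed boosting weight for each majority instance
              (higher = harder to learn, so more worth keeping).
    n_keep:   approximate number of majority instances to retain; the
              actual count may differ slightly because of rounding.
    """
    members = {}
    for idx, cluster_id in enumerate(clusters):
        members.setdefault(cluster_id, []).append(idx)

    total = len(clusters)
    kept = []
    for idxs in members.values():
        # Each cluster keeps a share proportional to its size, at least one.
        quota = max(1, round(n_keep * len(idxs) / total))
        ranked = sorted(idxs, key=lambda i: weights[i], reverse=True)
        kept.extend(ranked[:quota])
    return sorted(kept)


# Hypothetical toy input: six majority instances in two clusters.
clusters = [0, 0, 0, 0, 1, 1]
weights = [0.05, 0.30, 0.10, 0.20, 0.25, 0.10]

kept_idx = undersample_by_weight(clusters, weights, n_keep=3)
```

In practice the weights could be taken from the sample-weight distribution of a fitted boosting model and the clusters from any clustering algorithm; both choices are left open here because the abstract does not specify them.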