Abstract: In internetworking systems, huge amounts of data are generated, scattered, and processed across the network. Data mining techniques are used to discover unknown patterns in the underlying data. A traditional classification model classifies data based on past labelled data. However, in many current applications, data grows in size and its patterns fluctuate, so new features may appear in the data. This occurs in many applications, such as sensor networks, banking and telecommunication systems, the financial domain, and electricity usage and pricing driven by demand and supply. Such changes in the data distribution reduce classification accuracy: some patterns may be discovered as frequent while others tend to disappear, leading to misclassification. Traditional classification techniques may not be suitable for mining such data, because the distribution generating the items can change over time, so data from the past may become irrelevant or even false for the current prediction. To handle such varying patterns, a concept drift mining approach is used to improve the accuracy of classification techniques. In this paper, we propose an ensemble approach for improving classifier accuracy. The ensemble classifier is applied to three different data sets. We investigated different features for the different chunks of data, which are then given to the ensemble classifier. We observed that the proposed approach improves classifier accuracy across different chunks of data.
In recent times, the enormous growth of real-world data and its usage has raised the issue of processing data to extract meaningful patterns. Because of the huge volume and diversity of the data, traditional data mining algorithms lose accuracy. Due to data drift, a model with good accuracy may not perform well on generalized data. It is therefore necessary to handle the causes of drift and its impact on model accuracy. Existing approaches use a fixed-size sliding window. In contrast, our approach uses both fixed-window and adaptive-window approaches to detect concept drift. We use the maximum likelihood estimation technique, a CUSUM chart, a simple moving average, and a cross-correlation technique to detect a change in the concept. We analyze the impact of variable-size data chunks on different ensemble models. Our approach improves classifier accuracy through better feature selection and an evolution method. The combined approach of variable-sized samples and weighted ensemble classifiers not only detects the change in the concept but is also applied for drift detection. Under the data drift strategy, we comprehensively compare classifier performance on the electricity dataset referred to by the research community.
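As a rough illustration of how a CUSUM chart can flag a change in the concept, the sketch below runs a one-sided upper CUSUM over per-chunk error rates. It is not the authors' implementation: the function name `cusum_drift_index`, the parameters `target`, `slack`, and `threshold`, and the example error rates are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def cusum_drift_index(
    values: Sequence[float],
    target: float,
    slack: float,
    threshold: float,
) -> Optional[int]:
    """Return the index of the first value at which a one-sided upper
    CUSUM statistic exceeds `threshold`, or None if it never does.

    `target` is the expected in-control level (hypothetical: the error
    rate observed before any drift), and `slack` is the allowance
    subtracted from each deviation so that small fluctuations do not
    accumulate.
    """
    s = 0.0
    for i, x in enumerate(values):
        # Accumulate only deviations above target + slack; reset at zero.
        s = max(0.0, s + (x - target - slack))
        if s > threshold:
            return i
    return None


# Illustrative per-chunk error rates (made up): stable near 0.10,
# then jumping to about 0.30 from chunk 5 onward.
chunk_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.30, 0.32, 0.31, 0.29]
drift_at = cusum_drift_index(chunk_errors, target=0.10, slack=0.02, threshold=0.3)
print(drift_at)  # index of the chunk where drift is signalled
```

In this sketch the detector signals one chunk after the jump, because the cumulative sum needs two elevated chunks to cross the threshold. A lower threshold would react sooner at the cost of more false alarms.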