THE ENSEMBLE METHOD DEVELOPMENT OF CLASSIFICATION OF THE COMPUTER SYSTEM STATE BASED ON DECISIONS TREES
Abstract. The subject of this article is the study of methods for identifying the state of a computer system. The purpose of the article is to develop a method for classifying the anomalous state of a computer system based on ensemble methods. Task: to investigate the use of decision tree algorithms (REPTree, Random Tree, J48, HoeffdingTree, DecisionStump) and of bagging and boosting ensembles of decision trees for identifying the anomalous state of a computer system by analyzing operating system events. The methods used are artificial intelligence, machine learning and ensemble classification methods. The following results were obtained: methods for identifying the anomalous state of a computer system were investigated, namely bagging, boosting and the classifiers REPTree, Random Tree, J48, HoeffdingTree and DecisionStump. A set of individual classifiers and classifier ensembles was developed. Each algorithm was trained and cross-validated, and the performance of the developed classifiers was evaluated. The research proposes an ensemble method for classifying the computer system state based on the J48 decision tree algorithm. Conclusions. The scientific novelty of the obtained results consists in creating an ensemble method for classifying the state of a computer system based on a decision tree, which makes it possible to increase the reliability and speed of classification.
Keywords: computer system; decision trees; ensemble methods; boosting; bagging; operating system events; anomalous state.
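As a rough illustration of the bagging idea named in this abstract, the sketch below trains several one-level decision trees (decision stumps) on bootstrap samples and combines them by majority vote. It is not the authors' implementation: the abstract proposes an ensemble built on J48, whereas this sketch substitutes a simple stump learner written in plain Python. The feature pairs (reads per second, writes per second), the labels and all function names are hypothetical and chosen only for the example.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, labels):
    """Fit a one-level tree: the (feature, threshold) split with the fewest errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split (all rows identical): always predict the majority class.
        maj = majority(labels)
        return (0, float("inf"), maj, maj)
    return best[1:]

def stump_predict(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def train_bagging(rows, labels, n_estimators=15, seed=0):
    """Bagging: each stump is trained on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return ensemble

def bagging_predict(ensemble, row):
    """Combine the stumps by majority vote."""
    return majority([stump_predict(s, row) for s in ensemble])

# Hypothetical, well-separated toy data: (reads per second, writes per second).
rows = [(10, 5), (12, 7), (9, 4), (11, 6), (90, 80), (95, 85), (88, 79), (92, 90)]
labels = ["normal"] * 4 + ["anomalous"] * 4
ensemble = train_bagging(rows, labels)
print(bagging_predict(ensemble, (8, 3)), bagging_predict(ensemble, (100, 95)))
```

Boosting differs in that the learners are trained sequentially, with each round giving more weight to the samples misclassified so far; it is not shown here.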
Context. The problem of identifying the state of a computer system was investigated. The object of the research is the process of identifying the computer system state. The subject of the research is the means and methods of identifying the computer system state. Objective. The purpose of the work is to develop a method for identifying the computer system state. Method. A method for identifying the computer system state has been developed. It combines a procedure for grouping unlabeled initial data with machine learning based on the «Isolation Forest» algorithm, which makes it possible to identify the computer system state and to determine the name of the process that initiated the abnormal state. To collect statistical data in the form of operating system events, a data collection method and accompanying software have been proposed and developed. The collected events have been analyzed, and the analysis showed that read and write operations are the most informative. To build a single dataset, read and write operations are matched with the process name and combined into one array of event groups, so that the process causing the abnormal state of the computer system can be singled out. As a result of the research, the «Isolation Forest» algorithm has been selected as a component of the method for identifying the computer system state. The accuracy and efficiency of the developed method have been assessed. Results. The developed method has been implemented and investigated on the problem of identifying anomalies in the functioning of computer systems. Conclusions. The experiments carried out confirmed the efficiency of the proposed method. This allows us to recommend the method for practical use as an express method that improves the efficiency of identifying the computer system state.
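To make the «Isolation Forest» step more concrete, here is a minimal sketch of the algorithm in plain Python: random trees isolate points by random splits, and points that are isolated after few splits receive a high anomaly score. It is a simplified illustration, not the authors' software. The per-process (read operations, write operations) pairs, the process names and the function names are hypothetical, and the sketch stops splitting a node as soon as the randomly chosen feature is constant there.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    f = rng.randrange(len(points[0]))
    lo = min(p[f] for p in points)
    hi = max(p[f] for p in points)
    if lo == hi:  # simplification: stop when the chosen feature is constant
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[f] < split]
    right = [p for p in points if p[f] >= split]
    return ("node", f, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, point, depth=0):
    """Depth at which the point lands, adjusted for unsplit points left in the leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, f, split, left, right = tree
    child = left if point[f] < split else right
    return path_length(child, point, depth + 1)

def fit_forest(points, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees; needs at least two points."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(points, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size

def anomaly_score(forest, point):
    """Score in (0, 1]; values close to 1 indicate easily isolated (anomalous) points."""
    trees, size = forest
    mean_path = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(size))

# Hypothetical event groups: process name -> (read operations, write operations).
events = {f"proc_{i:02d}": (100 + i % 5, 50 + i // 5) for i in range(30)}
events["suspect_proc"] = (5000, 4000)

forest = fit_forest(list(events.values()))
scores = {name: anomaly_score(forest, xy) for name, xy in events.items()}
most_anomalous = max(scores, key=scores.get)
print(most_anomalous)
```

Keeping the process name as the dictionary key mirrors the idea in the abstract of matching read and write operations with the process name, so that the highest-scoring entry points back to a process.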
Areas for further research may lie in the creation of the ensemble of fuzzy trees based on the proposed method and optimization of this software implementation.
The subject of the article is the study of methods for determining the informativeness of attributes. The aim of the article is to improve the quality of computer system state classification by selecting the most informative features. Objective: to explore methods for selecting the optimal informative features for identifying the computer system state, based on an analysis of Windows operating system events. The methods used are machine learning methods, ensemble methods and methods for selecting the optimal informative features. The following results were obtained: Windows operating system events were analyzed, and methods for selecting the optimal informative features were investigated, namely wrapper methods (Wrappers), embedded methods (Embedded) and filter methods (Filters). The informativeness of the features was assessed and features were selected for identifying the computer system state. An ensemble method for classifying the computer system state, based on bagging and the J48 decision tree, was developed to evaluate the effectiveness of the selected features. The dependence of the classification accuracy on the selected features was investigated, and the set of attributes that provides the maximum classification accuracy was determined. Conclusions. The scientific novelty of the results lies in the analysis of Windows operating system events, the assessment of their informativeness and the selection of features for identifying the computer system state.
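Of the three families of feature selection methods mentioned above, the filter approach is the simplest to sketch: each feature is scored independently of any classifier and the features are then ranked. The example below uses information gain as the score. The abstract does not state which scoring criterion the authors applied, so this choice is an assumption, and the feature names, the discretized values and the labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in label entropy after grouping the samples by a discrete feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain([r[i] for r in rows], labels)
             for i, name in enumerate(names)}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical discretized event counts per observation window.
names = ["read_ops", "write_ops", "registry_ops"]
rows = [
    ("low", "low", "low"), ("low", "low", "high"),
    ("low", "high", "low"), ("low", "low", "high"),
    ("high", "high", "low"), ("high", "high", "high"),
    ("high", "low", "low"), ("high", "high", "high"),
]
labels = ["normal"] * 4 + ["anomalous"] * 4

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

A wrapper method would instead retrain a classifier on candidate feature subsets and keep the subset with the best validation accuracy, while an embedded method relies on importance values produced during model training itself.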