Machine learning is rapidly emerging as a valuable technology thanks to its ability to learn patterns from large data sets and solve problems that are impractical to model with conventional programming logic. As machine learning techniques become more mainstream, they are being applied to a wider range of application domains. These algorithms are now trusted to make critical decisions in security-sensitive and adversarial environments such as healthcare, fraud detection, and network security, where mistakes can be extremely costly. They are also a critical component of most modern autonomous systems. However, the data-driven approach these methods rely on can become a weakness if the training data are corrupted, whether maliciously or accidentally. Models that use online learning or periodic retraining to learn new patterns and account for changes in the data distribution are particularly susceptible to corruption through model drift. In this scenario, specially crafted data points are added to the training set over time to adversely influence the system, inducing model drift that leads to incorrect classifications. Our work explores the resistance of various machine learning algorithms to such attacks. In this paper we present an experimental framework for measuring the susceptibility of anomaly detection algorithms to model drift, and we report preliminary results for several machine learning algorithms commonly used in intrusion detection research.
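To make the attack concrete, the following is a minimal sketch (not the paper's actual framework) of how incremental poisoning can induce drift in a toy anomaly detector that retrains on every point it accepts as benign. The z-score detector, the 3-sigma threshold, and the "boiling frog" injection schedule are all illustrative assumptions: the attacker repeatedly submits points that sit just inside the current acceptance region, gradually widening and shifting it until a previously flagged target value slips through.

```python
import statistics

class OnlineZScoreDetector:
    """Toy anomaly detector (illustrative only): flags points whose z-score
    exceeds a threshold, then retrains on every point it accepts as benign."""

    def __init__(self, baseline, threshold=3.0):
        self.data = list(baseline)
        self.threshold = threshold

    def is_anomaly(self, x):
        mu = statistics.mean(self.data)
        sigma = statistics.pstdev(self.data) or 1e-9  # avoid divide-by-zero
        return abs(x - mu) / sigma > self.threshold

    def observe(self, x):
        """Online learning: accepted points join the training set."""
        if self.is_anomaly(x):
            return False            # rejected: flagged as anomalous
        self.data.append(x)
        return True                 # accepted: treated as benign

# Benign readings cluster around 10.0.
detector = OnlineZScoreDetector(baseline=[10.0 + 0.1 * i for i in range(-5, 6)])

target = 25.0                       # value the attacker wants accepted
print(detector.observe(target))     # False: rejected while the model is clean

# Boiling-frog poisoning: each injected point stays just inside the
# acceptance boundary, so the detector retrains on it and drifts.
for step in range(200):             # safety cap for the sketch
    if detector.observe(target):    # does the target slip through yet?
        break
    mu = statistics.mean(detector.data)
    sigma = statistics.pstdev(detector.data)
    detector.observe(mu + 0.9 * detector.threshold * sigma)

print(detector.observe(target))     # True: the model has drifted
```

Because each poisoned point is individually unremarkable (its z-score is below the threshold at submission time), the attack evades the very mechanism meant to catch it; the experimental framework in the paper measures how many such injections different algorithms tolerate before misclassifying.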