Machine learning is rapidly emerging as a valuable technology thanks to its ability to learn patterns from large data sets and solve problems that are impossible to model using conventional programming logic. As machine learning techniques become more mainstream, they are being applied to a wider range of application domains. These algorithms are now trusted to make critical decisions in secure and adversarial environments such as healthcare, fraud detection, and network security, in which mistakes can be incredibly costly. They are also a critical component to most modern autonomous systems. However, the data driven approach utilized by these machine learning methods can prove to be a weakness if the data on which the models rely are corrupted by either nefarious or accidental means. Models that utilize on-line learning or periodic retraining to learn new patterns and account for data distribution changes are particularly susceptible to corruption through model drift. In modeling this type of scenario, specially crafted data points are added to the training set over time to adversely influence the system, inducing model drift which leads to incorrect classifications. Our work is focused on exploring the resistance of various machine learning algorithms to such an approach. In this paper we present an experimental framework designed to measure the susceptibility of anomaly detection algorithms to model drift. We also exhibit our preliminary results using various machine learning algorithms commonly found in intrusion detection research.
This paper presents the design and implementation of an adaptive open-set speaker identification system with genetic learning classifier systems. One of the challenging problems in using learning classifier systems for numerical problems is the knowledge representation. The voice samples are a series of real numbers that must be encoded in a classifier format. We investigate several different methods for representing voice samples for classifier systems and study the efficacy of the methods. We also identify several challenges for learning classifier systems in the speaker identification problem and introduce new methods to improve the learning and classification abilities of the systems. Experimental results show that our system successfully learns 200 voice features at the accuracies of 60% to 80%, which is considered a strong result in the speaker identification community. This research presents the feasibility of using learning classifier systems for the speaker identification problem.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.