Abstract: Anomaly detection in sequence data is increasingly important in application domains such as credit card fraud detection, health care, and intrusion detection in cyber security. Among existing anomaly detection approaches, Markov chain techniques are widely adopted because they are simple to implement and have few parameters. However, the short memory of a classical Markov model ignores interactions among data points, while the long memory of a higher-order Markov model obscures the relationship between previous data and the current test data and reduces the reliability of the model. Moreover, neither model can adequately describe sequences that follow a trend. In this paper, we propose an anomaly detection approach based on a dynamic Markov model. The approach segments sequence data with a sliding window. Within each window, we define the states of the data according to their values and then build a higher-order Markov model of suitable order, balancing the memory length and keeping up with the trend of the sequence. In addition, an anomaly substitution strategy is proposed to prevent detected anomalies from affecting model building and to keep anomaly detection running continuously. Experimental results on simulated and real-world datasets demonstrate that the proposed approach improves the adaptability and stability of anomaly detection in sequence data.
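The pipeline described above (sliding window, value-based states, a higher-order Markov model, anomaly substitution) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the equal-width binning, the fixed `order` (the paper selects a suitable order, which this sketch does not attempt), the smoothing constant `alpha`, the probability `threshold`, and the substitution rule that replaces a flagged point with the mean of the window values in the most likely next state.

```python
from collections import Counter, defaultdict


def to_states(values, lo, hi, n_states):
    """Map values to equal-width bins over [lo, hi]; out-of-range values are clipped."""
    width = (hi - lo) / n_states or 1.0  # avoid a zero width for constant windows
    return [min(n_states - 1, max(0, int((v - lo) / width))) for v in values]


def detect_anomalies(series, window=20, n_states=4, order=1,
                     threshold=0.05, alpha=0.1):
    """Return indices whose transition probability falls below `threshold`.

    Assumes order >= 1 and order < window. All parameter names are hypothetical.
    """
    data = list(series)  # working copy; flagged points are substituted here
    flagged = []
    for t in range(window, len(data)):
        hist = data[t - window:t]
        lo, hi = min(hist), max(hist)
        states = to_states(hist, lo, hi, n_states)

        # Count transitions (previous `order` states) -> next state in the window.
        counts = defaultdict(Counter)
        for i in range(order, len(states)):
            counts[tuple(states[i - order:i])][states[i]] += 1

        context = tuple(states[-order:])
        nxt = counts[context]
        state_t = to_states([data[t]], lo, hi, n_states)[0]

        # Additively smoothed estimate of P(state_t | context).
        total = sum(nxt.values())
        prob = (nxt[state_t] + alpha) / (total + alpha * n_states)

        if prob < threshold:
            flagged.append(t)
            # Assumed substitution rule: keep the anomaly out of later windows.
            if nxt:
                likely = nxt.most_common(1)[0][0]
                members = [v for v, s in zip(hist, states) if s == likely]
                data[t] = sum(members) / len(members)
            else:
                data[t] = hist[-1]
    return flagged
```

Because the model is rebuilt from each window, the state boundaries follow the local range of the data, which is one simple way to track a trend. Substituting flagged points means later windows are estimated from values the model considers plausible rather than from the anomaly itself.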
Anomaly detection in time series is a popular research topic with a wide range of applications and a wealth of published results. However, most existing methods still miss many anomalies and raise many false alarms. Inspired by the concept of interval sets, this paper proposes an anomaly detection algorithm called probability interval, which detects anomalous data in time series from a new perspective. In the proposed algorithm, a time series is divided into several subsequences. Each subsequence is regarded as an interval set determined by its value space and its boundary. The similarity between interval sets is measured using interval operations and the point probability distributions of the interval bounds. Based on the similarity results, an anomaly score is then defined. Experimental results on artificial and real datasets indicate that the proposed algorithm has better discriminative performance than the piecewise aggregate approximation method and greatly reduces the false alarm rate.
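The general idea of treating subsequences as intervals and scoring them by similarity can be sketched as follows. This is a simplified stand-in, not the paper's probability interval measure: it represents each subsequence only by its `[min, max]` range, uses the ratio of overlap length to hull length as the similarity, and defines the anomaly score as one minus the mean similarity to all other subsequences. The point probability distributions of the interval bounds mentioned in the abstract are not modelled, and the function names and `seg_len` parameter are hypothetical.

```python
def interval_similarity(a, b):
    """Overlap length divided by hull length for two closed intervals."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    overlap = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    hull = max(a_hi, b_hi) - min(a_lo, b_lo)
    return overlap / hull if hull > 0 else 1.0  # identical point intervals


def interval_anomaly_scores(series, seg_len):
    """Score each non-overlapping subsequence; a higher score is more anomalous."""
    segments = [series[i:i + seg_len]
                for i in range(0, len(series) - seg_len + 1, seg_len)]
    intervals = [(min(s), max(s)) for s in segments]
    scores = []
    for i, a in enumerate(intervals):
        sims = [interval_similarity(a, b)
                for j, b in enumerate(intervals) if j != i]
        scores.append(1.0 - sum(sims) / len(sims) if sims else 0.0)
    return scores


# Usage: the third subsequence occupies a different value range from the rest.
series = [0, 1, 0, 1, 0, 1, 0, 1, 5, 9, 5, 9, 0, 1, 0, 1]
scores = interval_anomaly_scores(series, seg_len=4)
most_anomalous = scores.index(max(scores))
```

A range-only representation ignores how values are distributed inside each interval, which is presumably what the probability distributions in the proposed algorithm address.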