2014 IEEE International Conference on Data Mining
DOI: 10.1109/icdm.2014.45

RS-Forest: A Rapid Density Estimator for Streaming Anomaly Detection

Abstract: Anomaly detection in streaming data is of high interest in numerous application domains. In this paper, we propose a novel one-class semi-supervised algorithm to detect anomalies in streaming data. Underlying the algorithm is a fast and accurate density estimator implemented by multiple fully randomized space trees (RS-Trees), named RS-Forest. The piecewise constant density estimate of each RS-Tree is defined on the tree node into which an instance falls. Each incoming instance in a data stream is scored by th…
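The abstract's central idea, a piecewise constant density estimate read off the tree node into which an instance falls and averaged over several randomized trees, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the names `Node`, `Forest`, `update`, and `density` are hypothetical, the uniform random cut rule and the count / (n × volume) estimate are generic choices, and the paper's scoring and streaming model-update procedure are not reproduced because the abstract is truncated here.

```python
import random


class Node:
    """One axis-aligned box in a randomly partitioned space (hypothetical helper)."""

    def __init__(self, lo, hi, depth, max_depth, rng):
        self.lo, self.hi = lo, hi          # lower / upper corner of this box
        self.count = 0                     # instances that have fallen into this leaf
        self.left = self.right = None
        if depth < max_depth:
            # Assumption: pick a random dimension and a uniform random cut point.
            self.dim = rng.randrange(len(lo))
            self.cut = rng.uniform(lo[self.dim], hi[self.dim])
            left_hi = list(hi)
            left_hi[self.dim] = self.cut
            right_lo = list(lo)
            right_lo[self.dim] = self.cut
            self.left = Node(list(lo), left_hi, depth + 1, max_depth, rng)
            self.right = Node(right_lo, list(hi), depth + 1, max_depth, rng)

    def leaf_for(self, x):
        """Walk down to the leaf whose box receives x."""
        node = self
        while node.left is not None:
            node = node.left if x[node.dim] < node.cut else node.right
        return node

    def volume(self):
        v = 1.0
        for a, b in zip(self.lo, self.hi):
            v *= (b - a)
        return v


class Forest:
    """Average of per-tree piecewise constant density estimates (a sketch)."""

    def __init__(self, lo, hi, n_trees=25, max_depth=6, seed=0):
        rng = random.Random(seed)
        # The tree structure is built from the bounding box alone, without data.
        self.trees = [Node(list(lo), list(hi), 0, max_depth, rng)
                      for _ in range(n_trees)]
        self.n = 0

    def update(self, x):
        """Record one instance by incrementing the leaf count in every tree."""
        self.n += 1
        for tree in self.trees:
            tree.leaf_for(x).count += 1

    def density(self, x):
        """Mean over trees of count / (n * leaf volume); 0.0 before any update."""
        if self.n == 0:
            return 0.0
        total = 0.0
        for tree in self.trees:
            leaf = tree.leaf_for(x)
            vol = leaf.volume()
            if vol > 0:                    # guard against a degenerate cut
                total += leaf.count / (self.n * vol)
        return total / len(self.trees)


if __name__ == "__main__":
    forest = Forest([0.0, 0.0], [1.0, 1.0])
    gen = random.Random(1)
    for _ in range(500):
        forest.update([gen.gauss(0.5, 0.03), gen.gauss(0.5, 0.03)])
    print("density near cluster:", forest.density([0.5, 0.5]))
    print("density far from cluster:", forest.density([0.02, 0.98]))
```

In this sketch a low density value marks an instance as a candidate anomaly; how the score is thresholded, and how the counts are refreshed as the stream evolves, is left open.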

Cited by 75 publications (46 citation statements); references 20 publications.
“…Other proposals that deal with the broader problem of outlier detection in data streams include detection of changes, e.g., [29]; consideration of discrete sequences, e.g., [30]; techniques that rely on estimating the deviation from the expected values in time-series, e.g., [31] and density, e.g., [32]; specialized techniques for sensor networks, e.g., [33], and probabilistic streams, e.g., [34,35]; and solutions for the high-dimensionality problem in streaming outlier detection, e.g., [36]. Distance-based outlier detection has been also considered in [37] without considering incremental outlier computation though, [38], which employs data editing techniques, and [39], which focuses on efficient correlation computation techniques for multiple time series.…”
Section: Symbol (mentioning)
confidence: 99%
“…Anguilli and Fassetti (2007) proposed a distance-based outlier detection algorithm to find the data instance anomalies over the data stream. Wu, Zhang, Fan, Edward, and Yu (2014) proposed a data structure called RS-Forest for modeling the density anomalies over data streams. Pham, Venkatesh, Lazarescu, and Budhaditya (2014) proposed a residual space analysis based method to detect the anomalies in a large-scale data stream network.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent works on anomaly detection are more focused on stream learning. Such examples include half‐space trees (HSTa) algorithm, randomized space Forest (RS‐Forest), ensemble of random cut trees algorithm, and subspace embedding–based methods. Most of these algorithms are inspired by HSTa algorithm, where random decision trees were built in advance without data.…”
Section: Related Work (mentioning)
confidence: 99%