2016 IEEE 16th International Conference on Data Mining (ICDM)
DOI: 10.1109/icdm.2016.0102

Incorporating Expert Feedback into Active Anomaly Discovery

Abstract: Anomaly detectors are often used to produce a ranked list of statistical anomalies, which are examined by human analysts in order to extract the actual anomalies of interest. Unfortunately, in real-world applications, this process can be exceedingly difficult for the analyst since a large fraction of high-ranking anomalies are false positives and not interesting from the application perspective. In this paper, we aim to make the analyst's job easier by allowing for analyst feedback during the investigation proc…

Cited by 111 publications (128 citation statements)
References 15 publications (8 reference statements)
“…To induce diversity among base detectors, distinct initialization hyperparameters (specifically, the number of neighbors, MinPts, used in each LOF detector) are randomly selected in the range [5, 200]. A narrower range (i.e., [10, 150]) is used for certain datasets due to limitations of size or computational cost. For GG AOM and GG MOA, Tables 2 and 3 summarize the ROC-AUC and mAP scores on the 20 datasets.…”
Section: Methods (mentioning)
confidence: 99%
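The diversification scheme quoted above — a random MinPts per LOF base detector, with the per-detector scores then combined — can be sketched as follows. The toy dataset, the ensemble size of 10, and the plain mean combination are assumptions for illustration, not details from the cited work:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Toy data: a dense Gaussian cluster plus a few scattered points.
X = np.vstack([rng.normal(0, 1, size=(200, 2)),
               rng.uniform(-8, 8, size=(5, 2))])

# Diversify base detectors by drawing MinPts (n_neighbors) at random,
# approximating the cited [5, 200] range.
scores = []
for _ in range(10):
    minpts = int(rng.integers(5, 200))
    lof = LocalOutlierFactor(n_neighbors=minpts)
    lof.fit(X)
    # Negate sklearn's convention so that higher = more anomalous.
    scores.append(-lof.negative_outlier_factor_)

ensemble_score = np.mean(scores, axis=0)  # simple average combination
top5 = np.argsort(ensemble_score)[-5:]    # highest-ranked candidates
```

Averaging is only one combination rule; the GG AOM and GG MOA variants named in the quote combine sub-group maxima and averages instead of a single flat mean.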
“…Since the ground truth is often absent in outlier mining [1], unsupervised detection methods are commonly used for this task [5,8,17]. However, unsupervised approaches are susceptible to generating high false positive and false negative rates [10]. To improve model accuracy and stability in these scenarios, recent research explores ensemble approaches to outlier detection [1,3,24,32].…”
Section: Introduction (mentioning)
confidence: 99%
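A minimal sketch of why ensembling can temper the false-positive and false-negative rates of any single unsupervised detector: normalize each detector's raw scores to comparable ranks, then average, so one detector's miscalibrated scale cannot dominate. The three score vectors below are invented for illustration:

```python
import numpy as np

def rank_normalize(scores):
    # Map raw outlier scores to [0, 1] ranks so heterogeneous
    # detectors are comparable before combining.
    ranks = np.argsort(np.argsort(scores))
    return ranks / (len(scores) - 1)

# Hypothetical raw scores from three unsupervised detectors on 5 points;
# note the wildly different scales.
detector_scores = np.array([
    [0.1, 0.2, 0.15, 0.9, 0.3],   # detector A flags point 3
    [10., 12., 11., 50., 60.],    # detector B flags points 3 and 4
    [1.0, 1.1, 0.9, 5.0, 1.2],    # detector C flags point 3
])

combined = np.mean([rank_normalize(s) for s in detector_scores], axis=0)
top = int(np.argmax(combined))    # the point most detectors agree on
```

Here the consensus point (index 3) outranks detector B's idiosyncratic favorite (index 4), which is the stabilizing effect the quoted passage attributes to ensembles.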
“…A few semi-supervised anomaly detection algorithms exist in the literature, such as the Lightweight On-line Detector of Anomalies (LODA) [18,19]. In [18], the authors use an unsupervised ensemble-based algorithm, which is further enhanced with an active learning phase in [19]. A semi-supervised accuracy-at-the-top loss function is used to update the weights of LODA based on user feedback.…”
Section: B. Semi-Supervised Anomaly Detection (mentioning)
confidence: 99%
“…A semi-supervised accuracy-at-the-top loss function is used to update the weights of LODA based on user feedback. The base anomaly detection algorithm is not limited to LODA; the authors also show that the active anomaly detection phase can be incorporated into tree-based algorithms [20]. In [21], on the other hand, the authors formulate anomaly detection as an optimization problem and propose a generalization using both labeled and unlabeled instances.…”
Section: B. Semi-Supervised Anomaly Detection (mentioning)
confidence: 99%
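As a loose, illustrative sketch of feedback-driven re-weighting of ensemble components — NOT the accuracy-at-the-top loss of the cited work; the additive update rule, learning rate, and per-detector scores below are all assumptions for the sketch:

```python
import numpy as np

def feedback_update(w, instance_scores, label, lr=0.1):
    """One simplified feedback step: boost the weights of base-score
    components that fired on a confirmed anomaly, damp them on a
    false positive. (Illustrative only; the cited papers optimize an
    accuracy-at-the-top loss rather than this plain additive rule.)"""
    direction = 1.0 if label == "anomaly" else -1.0
    w = w + lr * direction * instance_scores  # move toward/away from this profile
    w = np.clip(w, 0.0, None)                 # keep weights non-negative
    return w / (w.sum() + 1e-12)              # renormalize to a convex combination

# Three base-detector scores for one queried instance, uniform prior weights.
w = np.ones(3) / 3
instance_scores = np.array([0.9, 0.2, 0.8])   # hypothetical per-detector scores
w = feedback_update(w, instance_scores, "anomaly")
```

After the confirmed anomaly, detectors that scored it highly (components 0 and 2) gain weight relative to component 1, which is the qualitative behavior the quoted passage describes.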
“…Rather, it utilizes the fact that anomalous data are "few and different". Most anomaly detection algorithms find anomalies by modeling the distribution of the data's properties and isolating anomalies from the rest of the normal samples [4], [13], [5]. In an Isolation Forest, data is sub-sampled and processed in a tree structure based on random cuts in the values of randomly selected features in the dataset.…”
Section: Introduction (mentioning)
confidence: 99%
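The "few and different" intuition can be demonstrated with scikit-learn's standard Isolation Forest (the toy data here are an assumption for the sketch): points far from the bulk of the data take fewer random cuts to isolate and therefore receive lower anomaly scores:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# "Few and different": 300 normal points plus 3 far-away anomalies.
X = np.vstack([rng.normal(0, 1, size=(300, 2)),
               [[8.0, 8.0], [-9.0, 7.0], [9.0, -8.0]]])

# Each tree sub-samples the data and splits on a random feature at a
# random threshold; anomalies isolate in fewer cuts, so their average
# path length is short and their score is low.
iso = IsolationForest(n_estimators=100, random_state=0).fit(X)
scores = iso.score_samples(X)       # lower = more anomalous
suspects = np.argsort(scores)[:3]   # the three most isolated points
```

No distributional model of the normal class is fit explicitly, which is the contrast the quoted passage draws against density- or distance-based detectors.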