2015 IEEE Symposium Series on Computational Intelligence 2015
DOI: 10.1109/ssci.2015.224

Model-Based Outlier Detection for Object-Relational Data

Abstract: Outliers are anomalous and interesting objects that are notably different from the rest of the data. The outlier detection task has sometimes been considered as removing noise from the data. However, it is usually the significantly interesting deviations that are of most interest. Different outlier detection techniques work with various data formats. The outlier detection process needs to be sensitive to the nature of the underlying data. Most of the previous work on outlier detection was designed for proposit…

Cited by 8 publications (5 citation statements)
References 49 publications
“…It is important to detect the outliers efficiently and accurately to improve the reliability of WSN data. The general outlier detection methods can be classified into four classes: statistical-based methods [4][5][6], nearest neighbor-based methods [7,9], clustering-based methods [10][11][12], and classification-based methods [13][14][15][16][17]. Statistical-based methods capture the distribution of the data and evaluate how well the data instance matches the model.…”
Section: Related Work (mentioning)
confidence: 99%
“…Finally, collected datasets have high dimensionality and large scalability for certain cases, presenting issues for data processing. In the past several years, numerous methods have been proposed to perform outlier detection for WSNs [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19] (reviewed in section "Related work"). However, the majority of these can only address the first two challenges, and most of them cannot be directly applied to high-dimensional and large-scalability data because of the following issues [15]: (1) time-consuming: as the dimension of the input data vector increases, the number of feature subspaces increases exponentially, which results in an exponential search space; (2) low detection rate: the high proportion of irrelevant features in high-dimensional datasets unavoidably include noises, which makes the true outliers inconspicuous; and (3) high false alarm rate: in high-dimensional space, we can always determine at least one feature subspace for each point of a dataset that defines such a point as an outlier, that is, every data instance can be considered as an outlier under a particular circumstance.…”
Section: Introduction (mentioning)
confidence: 99%
“…In the third step, a binary classification algorithm is used to classify meta-alerts into attacks and false alarms. An extended statistical unsupervised method has been used to detect outliers in object-relational data (Riahi et al., 2015). For this purpose, a metric was introduced based on the likelihood ratio of vectors of population association and individual association.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Outlier detection typically leverages the data distribution of a single column to detect errors (e.g., any datapoint that is more than 3 standard deviation is an outlier) [28]. Some works also study how to leverage the relationships between multiple columns to detect outliers [17,18,40,48]. However, none of them allows a user to specify a set of SCs explicitly and then guides the user to detect the errors based on the SCs.…”
Section: Related Work (mentioning)
confidence: 99%
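
The single-column rule mentioned in the last citation statement (flag any data point more than 3 standard deviations from the mean) can be illustrated with a minimal Python sketch. The function name `three_sigma_outliers`, the parameter `k`, and the sample column are assumptions made for illustration; they do not come from the cited works, and this is not the model-based method of the paper itself.

```python
from statistics import mean, pstdev


def three_sigma_outliers(values, k=3.0):
    """Return the values lying more than k standard deviations from the mean.

    Hypothetical helper for illustration only; not taken from any cited paper.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the column
    if sigma == 0:
        return []  # constant column: no value deviates from the mean
    return [v for v in values if abs(v - mu) > k * sigma]


# Made-up column: twenty ordinary readings and one extreme reading.
column = [10.0] * 20 + [100.0]
print(three_sigma_outliers(column))  # prints [100.0]
```

Because the extreme value itself inflates both the mean and the standard deviation, a rule of this kind can miss outliers in very short columns, which is one reason the statements above turn to multi-column and model-based approaches.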