2011 Second International Conference on Emerging Applications of Information Technology 2011
DOI: 10.1109/eait.2011.25

An Outlier Detection Method Based on Clustering

Abstract: In this paper we propose a clustering-based method to capture outliers. We apply the K-means clustering algorithm to divide the data set into clusters. Points lying near the centroid of a cluster are unlikely outlier candidates, so we prune such points from each cluster. Next, we calculate a distance-based outlier score for the remaining points. Pruning considerably reduces the computation needed to calculate the outlier scores. Based on the outlier scor…
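The three steps named in the abstract (cluster, prune, score) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract is truncated and does not specify the pruning rule or the score, so the sketch assumes that a point is pruned when its distance to its centroid is at most the cluster's mean distance, and that the outlier score is the mean distance to the `n_neighbors` nearest points. The function names and the `init` parameter (caller-supplied starting centroids, used here to keep the run deterministic) are hypothetical.

```python
import math


def kmeans(points, init, iters=100):
    """Plain Lloyd's K-means; `init` is a list of starting centroids."""
    centroids = [tuple(c) for c in init]
    k = len(centroids)

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    labels = [nearest(p) for p in points]
    return centroids, labels


def outlier_scores(points, init, n_neighbors=3):
    """Return {point index: score} for the points that survive pruning."""
    centroids, labels = kmeans(points, init)
    to_centroid = [math.dist(p, centroids[l]) for p, l in zip(points, labels)]
    scores = {}
    for c in range(len(centroids)):
        idx = [i for i, l in enumerate(labels) if l == c]
        if not idx:
            continue
        # Assumed pruning rule: drop points no farther from the centroid
        # than the cluster's mean distance.
        cutoff = sum(to_centroid[i] for i in idx) / len(idx)
        for i in idx:
            if to_centroid[i] <= cutoff:
                continue  # pruned: too close to the centroid to be a candidate
            # Assumed score: mean distance to the n_neighbors nearest points.
            others = sorted(
                math.dist(points[i], q) for j, q in enumerate(points) if j != i
            )
            closest = others[:n_neighbors]
            scores[i] = sum(closest) / len(closest)
    return scores
```

Only the unpruned points pay for the neighbor search, which is where the computational saving described in the abstract would come from. Sorting the returned scores in descending order gives a ranking of outlier candidates.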

Cited by 75 publications (41 citation statements)
References 16 publications (17 reference statements)
“…Clustering methodology requires the definition of a similar properties between patterns, which is not easy to specify without having any prior knowledge about cluster shapes [8]. Clustering is not a new concept but data clustering specifically for outlier detection is a recent discipline under continuous development.…”
Section: B Clustering - mentioning
confidence: 99%
“…of data used, size of the dataset etc [33]. Now with the tremendous growth of data, best outlier detection algorithms have to be applied on large data sets [8].…”
Section: International Journal Of Computer Applications (0975-8887) - mentioning
confidence: 99%
“…Figure 5 depicts the techniques to solve the outliers issue. Several papers make frequent use of unsupervised learning (i.e., partitional, density and hierarchical algorithms) and statistical methods [19][20][21][22][23][24]; lesser extent the supervised learning (i.e., variations of decision tree, k-nn and support vector machine algorithms) and genetic algorithms [25][26][27]. …”
Section: Outliers - mentioning
confidence: 99%