15th International Conference on Advanced Computing and Communications (ADCOM 2007) 2007
DOI: 10.1109/adcom.2007.74

Unbalanced data classification using extreme outlier elimination and sampling techniques for fraud detection

Abstract: Detecting fraud in a highly overlapped and imbalanced fraud dataset is a challenging task. To solve this problem, we propose a new approach called extreme outlier elimination and hybrid sampling technique. The k Reverse Nearest Neighbors (kRNN) concept is used as a data cleaning method for eliminating extreme outliers in minority regions. Hybrid sampling technique, a combination of SMOTE to over-sample the minority data (fraud samples) and random undersampling to under-sample the majority data (non-fraud samples…
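The three steps named in the abstract can be sketched roughly as follows. This is a minimal illustration in pure Python, not the authors' implementation: the abstract is truncated and does not state the exact kRNN outlier criterion, so the rule used here (drop a minority point that no other minority point counts among its k nearest neighbours) is an assumption, and the function names `knn_indices`, `krnn_counts`, `drop_extreme_outliers`, `smote`, and `random_undersample` are hypothetical. The SMOTE step follows the standard interpolation idea of placing a synthetic point on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    dists = sorted((math.dist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]


def krnn_counts(points, k):
    """For each point, count how many other points have it among their k nearest neighbours."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1
    return counts


def drop_extreme_outliers(minority, k):
    """Assumed criterion: a point with zero reverse nearest neighbours is an extreme outlier."""
    counts = krnn_counts(minority, k)
    return [p for p, c in zip(minority, counts) if c > 0]


def smote(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating towards a random nearest neighbour."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(knn_indices(minority, i, k))
        gap = rng.random()  # position along the segment between the two points
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j])))
    return synthetic


def random_undersample(majority, n_keep, rng):
    """Keep a random subset of the majority class, sampled without replacement."""
    return rng.sample(majority, n_keep)


# Toy data (made up for illustration): four clustered fraud points and one far-away point.
rng = random.Random(0)
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
majority = [(rng.uniform(0, 5), rng.uniform(0, 5)) for _ in range(40)]

cleaned = drop_extreme_outliers(minority, k=2)
synthetic = smote(cleaned, n_new=6, k=2, rng=rng)
kept_majority = random_undersample(majority, n_keep=len(cleaned) + len(synthetic), rng=rng)
```

In this toy run the isolated point is removed before oversampling, so no synthetic samples are interpolated towards it, and the majority class is then cut down to the size of the enlarged minority class. The target class ratio and the value of k are free choices here, not values taken from the paper.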

Cited by 49 publications (19 citation statements)
References 11 publications (10 reference statements)
“…To solve the problem of highly unbalanced database and overlapping in [7] propose a new approach called elimination of points at the end of the border and hybrid sampling technique. The concept of k-neighbors is used as a cleaning method to remove data points from the end of the border in minority regions.…”
Section: Related Work (mentioning)
confidence: 99%
“…First, fraud is a rare event because the legitimate claims almost always outnumber the fraudulent ones. For instance, more than 80% of the papers reviewed in Phua et al () have skewed data with less than 30% fraud. The sparsity of the fraud data can be addressed with methods such as non‐negative matrix factorisation, singular value decomposition and principal component analysis (Zhu et al , ).…”
Section: Medical Claims Data (mentioning)
confidence: 99%
“…Despite the wide adoption of fraud detection methods in these domains, the level of attention given to medical fraud assessment has been relatively limited (Phua et al , ). Some of the aforementioned methods are applicable for detection of fraudulent medical claims.…”
Section: Introduction (mentioning)
confidence: 99%
“…The predictions of the majority class have a high possibility to get good performance, whereas the predictions of minority classes generally have poor performance. Class imbalance is prevalent in many real-world applications [12,40,41] such as bioinformatics, anomaly detection, intrusion detection, fraud detection and especially in medical diagnosis. These applications usually focus on the minority class.…”
Section: Introduction (mentioning)
confidence: 99%