2017
DOI: 10.26483/ijarcs.v8i7.4530

Rare Class Problem in Data Mining: Review

Abstract: The class imbalance problem is attracting considerable attention from researchers. Many real-life applications generate imbalanced data sets, and the imbalanced nature of the data makes classification difficult. Dealing with such imbalanced datasets is one of the biggest challenges in data mining. An imbalanced dataset is one in which the ratio of positive to negative classes is not balanced. The class with the larger number of samples is known as the majority class, and the class that is havin…

Citations: cited by 6 publications (4 citation statements)
References: 20 publications (21 reference statements)
“…RUS attempts to balance the class distribution by randomly removing an instance of the majority class. This generates the problem of losing valuable information [15]. Because of its efficiency, simplicity, and speed, RUS has shown very good performance, and it is used in boosting for these reasons [7,11].…”
Section: Data-level Methods (mentioning)
confidence: 99%
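
The random undersampling (RUS) idea quoted above can be illustrated with a minimal sketch. It is not taken from the cited papers: the function name `random_undersample`, the fixed seed, and the toy data are assumptions made for illustration only. The sketch randomly discards majority-class samples until every class is as small as the rarest one, which also shows where the information loss mentioned in the statement comes from.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Hypothetical helper: shrink every class to the size of the smallest class
    by keeping a random subset of its samples (the rest are discarded)."""
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is reduced to the size of the rarest class.
    target = min(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        kept = rng.sample(xs, target)  # sampling without replacement
        out_x.extend(kept)
        out_y.extend([y] * target)
    return out_x, out_y


# Toy data (assumed): 5 positive samples and 95 negative samples.
samples = list(range(100))
labels = [1 if i < 5 else 0 for i in range(100)]

X_bal, y_bal = random_undersample(samples, labels)
print(Counter(labels))  # class counts before undersampling
print(Counter(y_bal))   # class counts after undersampling
```

In this toy run, 90 of the 95 majority samples are thrown away, so anything a classifier could have learned from them is lost.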
“…One-class learning techniques can be used at the algorithm level to recognize instances of one class and deny others. This approach optimizes the performance of the learning algorithm on unseen data [15].…”
Section: Algorithm-level Approaches (mentioning)
confidence: 99%
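
As a rough illustration of the one-class learning idea in the statement above, the sketch below fits a model on examples of a single target class only and rejects anything that falls too far from them. This is one simple centroid-and-radius variant chosen for illustration, not the method of the cited papers; the class name `CentroidOneClass`, the `quantile` parameter, and the toy points are all assumptions.

```python
import math


class CentroidOneClass:
    """Hypothetical one-class model: accept a point if it lies within a
    radius of the centroid of the training (target-class) points."""

    def __init__(self, quantile=1.0):
        # Fraction of training points that must fall inside the radius (assumed knob).
        self.quantile = quantile

    def fit(self, points):
        n = len(points)
        dim = len(points[0])
        # Centroid of the target class, computed per coordinate.
        self.centroid = [sum(p[d] for p in points) / n for d in range(dim)]
        # Radius = distance of the training point at the chosen quantile.
        dists = sorted(math.dist(p, self.centroid) for p in points)
        idx = min(n - 1, int(self.quantile * n))
        self.radius = dists[idx]
        return self

    def predict(self, point):
        # True means "recognized as the target class", False means "denied".
        return math.dist(point, self.centroid) <= self.radius


# Toy target-class points (assumed); no examples of other classes are needed.
target_points = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.05, 0.2), (-0.1, -0.15)]
model = CentroidOneClass().fit(target_points)

near = model.predict((0.0, 0.05))  # close to the training points
far = model.predict((5.0, 5.0))    # far from the training points
print(near, far)
```

Because training uses only one class, a model of this kind does not depend on the ratio between classes, which is why one-class techniques are discussed as an algorithm-level option for imbalanced data.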
“…For instance, we determined throughout our research, when applying sampling in machine learning, a balanced ratio (50:50) is not as beneficial for Big Data as it is for smaller datasets, at least in the Medicare fraud detection domain. Studies that focus on class rarity are far less common [2,37]. With regards to the research presented in this paper, we limit our discussion to studies that employ Big Data for studying class imbalance in relation to rarity.…”
Section: Related Work (mentioning)
confidence: 99%
“…The analysis drawn from a comparative study of each reported research solution is shown in Table 4. Dongre and Malik (2017) reported that data adjusting provide the better solution than other techniques. But Dongre and Malik (2017) suggest that a hybrid approach gives the best solution for class imbalance learning.…”
Section: Cost-sensitive Boosting (mentioning)
confidence: 99%