2018
DOI: 10.1155/2018/6275435

A Framework of Rebalancing Imbalanced Healthcare Data for Rare Events’ Classification: A Case of Look-Alike Sound-Alike Mix-Up Incident Detection

Abstract: Identifying rare but significant healthcare events in massive unstructured datasets has become a common task in healthcare data analytics. However, imbalanced class distribution in many practical datasets greatly hampers the detection of rare events, as most classification methods implicitly assume an equal occurrence of classes and are designed to maximize the overall classification accuracy. In this study, we develop a framework for learning healthcare data with imbalanced distribution via incorporating diff…

citations
Cited by 46 publications
(28 citation statements)
references
References 34 publications
(48 reference statements)
“…This assumption leads to a tendency to favor the majority class when applied to an imbalanced dataset, which can decrease the accuracy in classifying minority occurrences.[37] Because this is a frequently encountered scenario, prior groups have studied various rebalancing strategies.[37,38] Previous analysis has shown an improvement in prediction accuracy when applying rebalancing methods, as long as the dataset is sufficient in size.…”
Section: Discussion (mentioning)
confidence: 99%
“…Many problems in the study to detect a disease, more prioritizing the measurement in the case of recall [14]. In this study, recall is important because high recall means that the CNN Model made has a slight error rate in detecting a person who affected by pneumonia or tuberculosis.…”
Section: Model Evaluation Of Undersampling and Oversampling Dataset (mentioning)
confidence: 99%
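
The point about recall in the statement above can be illustrated with a minimal sketch. The labels below are hypothetical (95 negatives, 5 positives) and are not taken from any of the cited studies; the "model" is a degenerate one that always predicts the majority class, chosen only to show how accuracy and recall can diverge on imbalanced data.

```python
def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

The classifier scores 95% accuracy while missing every positive case, which is why recall on the minority class is often reported alongside accuracy in disease detection tasks.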
“…In this case, classification accuracy (A) can mislead to select the best performing model. Techniques to select the best model for data with class imbalance are: Choosing the performance metrics those that focus on the minority class, oversampling the minority class using SMOTE to rebalance the class, undersampling the majority class to rebalance the class and selecting classification algorithms such as those that penalize misclassification errors differently [Zhao et al. (2018)]. The classification algorithms such as LR, SVM, MLP and K-NN are used for creating classification model.…”
Section: Fig 2 Class Distribution (mentioning)
confidence: 99%
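
Two of the rebalancing strategies named in the statement above, oversampling the minority class and undersampling the majority class, can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses plain random duplication and random removal rather than SMOTE (which synthesizes new minority samples by interpolating between neighbors), and the function names `random_oversample` and `random_undersample`, the toy data, and the seed are all hypothetical choices for this sketch, not the method of the cited paper.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class matches the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            i = rng.choice(idx)
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: 18 majority samples (0) and 2 minority samples (1).
X = [[float(i)] for i in range(20)]
y = [0] * 18 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print(Counter(y_over))   # both classes now have 18 samples
print(Counter(y_under))  # both classes now have 2 samples
```

In practice, resampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class distribution.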