2019
DOI: 10.3390/app9245287

Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach

Abstract: Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making addressed at reducing the incidence rate of infections. In this field, it is ne…

Cited by 19 publications (16 citation statements)
References 48 publications

“…For severely imbalanced classification problems, Sánchez-Hernández et al. [47] suggest G-mean, the geometric mean of true positive rate (TPR) and true negative rate (TNR) (Eq. (1)), as an objective measure of predictive power.…”
Section: Results (mentioning)
confidence: 99%
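
The G-mean described in the statement above can be computed directly from confusion-matrix counts. The following is a minimal sketch in Python, assuming binary labels with 1 as the positive (minority) class; the function name `g_mean` and the toy labels are illustrative and do not come from the cited papers.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of true positive rate and true negative rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against a class that is absent from y_true.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)

# Toy data (hypothetical): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
always_negative = [0] * len(y_true)

score = g_mean(y_true, y_pred)
trivial_score = g_mean(y_true, always_negative)
```

A classifier that always predicts the majority class reaches 80% accuracy on this toy data but a G-mean of 0, which is why the measure suits severely imbalanced problems.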
“…Both Bagging and boosting adopt this voting approach, but they derive the models in different ways. In Bagging, the models have equal weight, whereas in boosting, the base weak learners receive more weights, and the simple weak learners are combined into a more complex strong ensemble [25]. In this study, Bagging and AdaBoost approaches were implemented on the dataset to improve the classification performance of decision tree.…”
Section: Machine Learning Techniques (mentioning)
confidence: 99%
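
The contrast drawn above between equal-weight and weighted voting can be illustrated with a small Python sketch. It assumes predictions encoded as +1/-1 and uses the classic discrete AdaBoost weight, alpha = 0.5 * ln((1 - error) / error), as one possible weighting scheme; the function names, predictions, and error rates are hypothetical and are not taken from the citing study.

```python
import math

def majority_vote(predictions):
    """Bagging-style vote: every model's +1/-1 prediction counts equally."""
    return 1 if sum(predictions) >= 0 else -1

def adaboost_alpha(error):
    """Classic AdaBoost model weight; assumes 0 < error < 1."""
    return 0.5 * math.log((1 - error) / error)

def weighted_vote(predictions, errors):
    """Boosting-style vote: models with lower error get a larger say."""
    score = sum(adaboost_alpha(e) * p for p, e in zip(predictions, errors))
    return 1 if score >= 0 else -1

# Hypothetical predictions of three models for one example,
# with hypothetical weighted training errors for each model.
preds = [1, -1, -1]
errors = [0.10, 0.40, 0.45]

bagging_label = majority_vote(preds)
boosting_label = weighted_vote(preds, errors)
```

Here the two weaker models outvote the accurate one under equal weights, while the weighted vote follows the model with the lowest error.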
“…There is no complete agreement between different authors about when a dataset is considered imbalanced. In this paper, we consider that a dataset is imbalanced when the imbalance ratio (IR) is higher than 1.5, since previous experiments [4,79] have shown that above this threshold, the classification of examples of the minority class(es) is usually significantly lower than the examples of the majority class(es). In addition, many studies in the literature on imbalanced classification start from this IR value to select the datasets [14].…”
Section: Datasets (mentioning)
confidence: 99%
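
The 1.5 threshold quoted above can be checked with a few lines of Python. This sketch assumes the common definition of the imbalance ratio as the size of the largest class divided by the size of the smallest class, which the quoted statement does not spell out; the function names and label lists are illustrative.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Largest class count divided by smallest class count (assumed definition)."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to compute an imbalance ratio")
    return max(counts.values()) / min(counts.values())

def is_imbalanced(labels, threshold=1.5):
    """True when the imbalance ratio is strictly higher than the threshold."""
    return imbalance_ratio(labels) > threshold

# Hypothetical label lists: 8 vs 2 examples, and 6 vs 4 examples.
skewed = [0] * 8 + [1] * 2
borderline = [0] * 6 + [1] * 4

skewed_ir = imbalance_ratio(skewed)
borderline_flag = is_imbalanced(borderline)
```

With these lists, the 8-to-2 split has an IR of 4.0 and counts as imbalanced, while the 6-to-4 split sits exactly at 1.5 and does not.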
“…However, these conclusions cannot be generalized since the datasets used in the study have a low imbalance ratio and most of them have a low number of attributes and examples. In [4], a clustering-based undersampling strategy combined with ensemble classifiers is proposed and tested with only one dataset in a medical application domain. An ensemble approach is also presented in [43] for assisted reproductive technology outcome prediction.…”
Section: Related Work (mentioning)
confidence: 99%