2019
DOI: 10.3233/jifs-182656
|View full text |Cite
|
Sign up to set email alerts
|

Bootstrapping and multiple imputation ensemble approaches for classification problems

Abstract: Presence of missing values in a dataset can adversely affect the performance of a classifier. Single and Multiple Imputation are normally performed to fill in the missing values. In this paper, we present several variants of combining single and multiple imputation with bootstrapping to create ensembles that can model uncertainty and diversity in the data, and that are robust to high missingness in the data. We present three ensemble strategies: bootstrapping on incomplete data followed by (i) single imputatio… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
11
0

Year Published

2019
2019
2023
2023

Publication Types

Select...
7
1
1

Relationship

0
9

Authors

Journals

citations
Cited by 19 publications
(12 citation statements)
references
References 32 publications
0
11
0
Order By: Relevance
“…MV and ICD continue to be prevalent in numerous real-world problems and across many application areas [6], [7], including water quality anomaly detection domain. Consequently, these occurrences have continued to generate lots of attention from researchers because the majority of conventional predictive machine learning algorithms are not developed to handle these challenge in data, because they assume completeness of data and a balanced class distribution [8], [9]. As a result, predictive or classification algorithms perform sub-optimally on these kinds of datasets if not properly handled, resulting in bias, inaccurate and lowquality predictive performance of the classifiers [7], [8], [10].…”
Section: Introductionmentioning
confidence: 99%
“…MV and ICD continue to be prevalent in numerous real-world problems and across many application areas [6], [7], including water quality anomaly detection domain. Consequently, these occurrences have continued to generate lots of attention from researchers because the majority of conventional predictive machine learning algorithms are not developed to handle these challenge in data, because they assume completeness of data and a balanced class distribution [8], [9]. As a result, predictive or classification algorithms perform sub-optimally on these kinds of datasets if not properly handled, resulting in bias, inaccurate and lowquality predictive performance of the classifiers [7], [8], [10].…”
Section: Introductionmentioning
confidence: 99%
“…Nevertheless, there is still a long way until such systems will reach industry-ready stages, both in terms of practical accuracy and affordability. While current research shows promising results predominantly in cattle and pigs, there are still many avenues to be explored for better automation of the whole BW estimation process, such as 1) the ability of CV and ML/DL hybrid BW predictive systems to cope with missing information possibly via imputation and data enhancement approaches as suggested by ( Khan et al, 2019 ), 2) the generalization power and the breed-agnostic prediction for any given species, 3) the automatic detection and identification of an animal remotely, and 4) the automatic recognition and adaptability based on an animal’s development stage, posture, and position within a sensor’s field of view.…”
Section: Discussionmentioning
confidence: 99%
“…Approaches in literature on missing values handling using ensemble methods are discussed in the following. Authors in [122], proposed a bootstrapping ensemble to model uncertainty and variety in the data that has high missingness. They performed an extensive evaluation of their approach by varying the missingness ratio of the missing data.…”
Section: Ensemble Methodsmentioning
confidence: 99%