2022
DOI: 10.48550/arxiv.2206.08478
Preprint

Classification of datasets with imputed missing values: does imputation quality matter?

Abstract: Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods, followed by classification of the now complete, imputed, samples. The focus of the machine learning researcher is then to optimise the downstream classification performance. In this study, we highlight that it is imperative to consider the quality of the imputation. We demons…

Citations: cited by 2 publications (1 citation statement)
References: 37 publications
“…Most important of all, the data management or completeness of data for automated decision making plays an essential role when it comes to statistics and machine learning. The integration between statistics and machine learning can be used to train automated models with imputed missing values in the data, which would improve the generalizability and robustness of models [ 100 ].…”
Section: Discussion (citation type: mentioning)
confidence: 99%