2008
DOI: 10.1007/978-3-540-79299-4_7

Missing Value Imputation Based on Data Clustering

Abstract: Missing values widely exist in many real-world datasets, which hinders the performing of advanced data analytics. Properly filling these missing values is crucial but challenging, especially when the missing rate is high. Many approaches have been proposed for missing value imputation (MVI), but they are mostly heuristics-based, lacking a principled foundation and do not perform satisfactorily in practice. In this paper, we propose a probabilistic framework based on deep generative models for MVI. Under this f…

Cited by 75 publications (54 citation statements)
References 22 publications
“…The most popular method in statistics is the regression imputation method. Common regression methods include the parametric methods (such as linear regression and the nonlinear imputation method) and the non-parametric methods (such as kernel imputation in [28]). The parametric regression imputations are superior if a dataset can be adequately modeled parametrically, or if users can correctly specify the parametric forms for the dataset.…”
Section: Related Work
mentioning
confidence: 99%
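The parametric regression imputation described in the statement above can be illustrated with a minimal sketch. It assumes a single predictor and a straight-line model fitted by ordinary least squares on the complete pairs; the function names `fit_line` and `regression_impute` and the sample values are hypothetical and do not come from any of the cited papers.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_impute(xs, ys):
    """Replace None entries in ys with predictions from a line fitted
    on the pairs where y is observed (xs is assumed fully observed)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [slope * x + intercept if y is None else y for x, y in zip(xs, ys)]


# Illustrative data: the third y value is missing.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, None, 8.0]
filled = regression_impute(xs, ys)
```

As the statement notes, this kind of imputation is only appropriate when the chosen parametric form fits the data; a misspecified model would produce biased fill-in values.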
“…Another commonly used and efficient imputation is the k nearest neighbor imputation (called kNN imputation, or kNNI), which is one of the hot deck techniques used to compensate for missing data [3,28]. It uses only the k most relevant complete instances in the dataset for imputing a missing datum.…”
mentioning
confidence: 99%
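The kNN imputation idea quoted above, using only the k most relevant complete instances to fill a missing datum, might look roughly like the sketch below. It assumes numeric attributes, Euclidean distance over the observed attributes, and the mean of the neighbours as the fill-in value; these choices, the name `knn_impute`, and the sample rows are illustrative assumptions rather than the method of any cited paper.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("kNN imputation needs at least one complete row")
    imputed = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            imputed.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(candidate):
            # Euclidean distance restricted to the attributes observed in `row`.
            return math.sqrt(sum((row[j] - candidate[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=dist)[:k]
        filled = list(row)
        for j in missing:
            filled[j] = sum(n[j] for n in neighbours) / len(neighbours)
        imputed.append(filled)
    return imputed


# Illustrative data: the last row is missing its second attribute.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [5.0, 6.0, 7.0],
    [1.05, None, 3.1],
]
result = knn_impute(data, k=2)
```

Here the two closest complete rows are the first two, so the missing value is filled with the average of their second attribute, and the distant third row has no influence.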
“…A grade table is constructed where courses are columns and students are rows, courses are labeled from T1, students are labeled from S1. The missing value in the table should be processed. The techniques of missing value imputation are: list wise deletion, mean imputation and some types of hot-deck imputation [8][9][10]. The listwise deletion is used to deal with missing value in the paper.…”
Section: Advances In Computer Science Research (Acsr) Volume 73
mentioning
confidence: 99%
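To make the difference between two of the techniques named above concrete, the following sketch applies listwise deletion and mean imputation to a small grade table with students as rows and courses as columns. The grade values and the function names `listwise_delete` and `mean_impute` are invented for illustration and are not taken from the cited paper.

```python
def listwise_delete(table):
    """Drop every row that contains at least one missing value."""
    return [row for row in table if None not in row]


def mean_impute(table):
    """Replace each missing value with the mean of its column's observed values."""
    n_cols = len(table[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in table if row[j] is not None]
        # A column with no observed values cannot be imputed and stays None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in table]


# Hypothetical grades: rows are students S1..S3, columns are courses T1..T3.
grades = [
    [80, 90, 70],    # S1
    [60, None, 75],  # S2
    [None, 85, 65],  # S3
]
deleted = listwise_delete(grades)
imputed = mean_impute(grades)
```

Listwise deletion keeps only S1 in this example, while mean imputation keeps all three students at the cost of filling each gap with a column average.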
“…Parametric methods like Nearest Neighbour [4][10] [25] have been used for the prediction of missing attribute(s). Non-parametric technique such as empirical likelihood [32], clustering [26], Semi-parametric techniques [21] [33] have also been applied for missing data imputation. Techniques like mixture model clustering [9], machine learning [12] have been used for imputing missing data.…”
Section: Related Work
mentioning
confidence: 99%