2007
DOI: 10.1109/fuzzy.2007.4295430

Using Entropy to Impute Missing Data in a Classification Task

Abstract: In real applications, part of the data is usually missing, but most data analysis and data mining techniques can only deal with complete data. In this paper, a new taxonomy of imputation methods is proposed. Within this taxonomy, a new technique based on entropy measures is introduced. Its behaviour is studied through an empirical comparative analysis.

Cited by 12 publications (7 citation statements)
References 9 publications
“…Therefore, the 'supervised missing imputation' as been applied by means of the method 'Cmean'. This simple method has proved to be very effective, consists in imputing the mean for continuous variables, or the most repeated for categorical variables (Delavallade and Dang, 2007;Little and Rubin, 2002). In the supervised variant of 'Cmean', the imputed values are the mean of the values, in those cases that have the same recruitment level.…”
Section: Supervised Classification Based Methodology
Citation type: mentioning
confidence: 99%
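
The 'Cmean' description quoted above (mean for continuous variables, most frequent value for categorical ones, computed per class in the supervised variant) can be illustrated with a minimal sketch. The function name `cmean_impute`, its parameters, and the row-as-dict data layout are assumptions made for this illustration; they do not come from the cited papers, and missing values are assumed to be encoded as `None`.

```python
from collections import Counter, defaultdict
from statistics import mean


def cmean_impute(rows, labels, column, categorical=False, supervised=True):
    """Return copies of `rows` with None values in `column` filled in.

    Hypothetical helper: fills with the mean (continuous) or the most
    frequent value (categorical). With supervised=True the statistic is
    computed only over rows sharing the same class label.
    """
    # Collect observed values per class label, or under one shared key
    # when the unsupervised variant is requested.
    observed = defaultdict(list)
    for row, label in zip(rows, labels):
        if row[column] is not None:
            observed[label if supervised else None].append(row[column])

    def fill_value(values):
        if categorical:
            return Counter(values).most_common(1)[0][0]
        return mean(values)

    fills = {key: fill_value(values) for key, values in observed.items()}

    imputed = []
    for row, label in zip(rows, labels):
        new_row = dict(row)  # leave the caller's data untouched
        if new_row[column] is None:
            key = label if supervised else None
            # If a class has no observed value, the gap is left as None.
            new_row[column] = fills.get(key)
        imputed.append(new_row)
    return imputed
```

For example, calling `cmean_impute(rows, labels, "age")` fills each missing `age` with the mean age of the rows that share its class label, while `supervised=False` falls back to the mean over all rows.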
“…Delavallade and Dang [17] propose a new technique, based on the entropy measure, that finds a distribution value with more discrimination power for each missing value. Besides, they propose a new taxonomy for the methods, dividing them into: observation space or variable space, iterative or noniterative, local information or global information, stochastic or deterministic, prediction model or class information.…”
Section: Procedures Based On Direct Manipulation Of Missing Data
Citation type: mentioning
confidence: 99%
“…It consists of the imputation of missing data using complete objects in a small neighborhood of the incomplete ones. In [5], a new approach is proposed, using the entropy to estimate the missing values.…”
Section: Introduction
Citation type: mentioning
confidence: 99%