2011
DOI: 10.5120/2619-3544

Evaluation of Three Simple Imputation Methods for Enhancing Preprocessing of Data with Missing Values

Abstract: One of the important stages of data mining is preprocessing, where the data is prepared for different mining tasks. Often, the real-world data tends to be incomplete, noisy, and inconsistent. It is very common that the data are not obtainable for every observation of every variable. So the presence of missing variables is obvious in the data set. A most important task when preprocessing the data is, to fill in missing values, smooth out noise and correct inconsistencies. This paper presents the missing value p…

Cited by 40 publications (26 citation statements)
References 18 publications
“…If such values are not calculated by using appropriate means or incomplete records are not removed, they can cause poor analytical results [6, 7]. A number of methods can be used to replace the missing values by using measures like mean and median [51]. …”
Section: Challenges In Implementing Data Mining Process For Clinic
Citation type: mentioning
confidence: 99%
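
The mean and median replacement mentioned in the statement above can be sketched as follows. This is a minimal illustration, not the cited paper's implementation: the function name `impute`, the `strategy` parameter, and the sample `ages` list are assumptions made for the example, and missing entries are assumed to be encoded as `None` or NaN.

```python
import math
from statistics import mean, median


def impute(values, strategy="mean"):
    """Fill missing entries (None or NaN) with the mean or median of the
    observed entries. Hypothetical helper, for illustration only."""

    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    # The fill value is computed from the observed entries only.
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("no observed values to impute from")

    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    # Observed entries are kept as they are; only the gaps are filled.
    return [fill if is_missing(v) else v for v in values]


# Made-up attribute column with two missing entries and one outlier.
ages = [30, None, 50, 40, None, 100]
mean_filled = impute(ages, "mean")
median_filled = impute(ages, "median")
```

With an outlier such as 100 in the column, the mean fill is pulled upward while the median fill is not, which is one common reason to prefer the median for skewed attributes.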
“…We will show the accuracy of our algorithm based on the evaluation standard of RandIndex [20]. For uncertain data set $D$ (includes $N$ objects), let $T=\{T_1, T_2, \ldots, T_k\}$ represent the original clusters, and $C=\{C_1, C_2, \ldots, C_m\}$ be the clusters produced by a clustering algorithm.…”
Section: A Evaluation Standard
Citation type: mentioning
confidence: 99%
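
The quoted statement is cut off before it gives a formula, so the sketch below uses the standard textbook definition of the Rand index, which may differ in detail from the citing paper's formulation: the fraction of object pairs on which the original clustering T and the produced clustering C agree (both place the pair together, or both place it apart). The function name `rand_index` and the label-list encoding of the clusters are assumptions for illustration.

```python
from itertools import combinations


def rand_index(true_labels, pred_labels):
    """Standard Rand index: share of object pairs on which two clusterings
    agree. Each list holds one cluster label per object."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    n = len(true_labels)
    if n < 2:
        raise ValueError("need at least two objects to form a pair")

    agree = 0
    total = 0
    for i, j in combinations(range(n), 2):
        same_in_true = true_labels[i] == true_labels[j]
        same_in_pred = pred_labels[i] == pred_labels[j]
        # A pair counts as an agreement when both clusterings group it
        # together or both keep it apart.
        if same_in_true == same_in_pred:
            agree += 1
        total += 1
    return agree / total


# Four objects: the produced clustering splits the second original cluster.
score = rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

Because only pairwise co-membership is compared, the measure does not depend on how clusters are named, and the number of produced clusters m may differ from the number of original clusters k.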
“…Three algorithms have been presented below, that compute missing values and their attributes (M) in dataset (D) (Somasundaram and Nedunchezhian 2011).…”
Section: Algorithms For Computation Of Missing Values
Citation type: mentioning
confidence: 99%
“…2nd approach of mean attribute value substitution method is time consuming & expensive, but gives best results for missing values problem. 3rd approach of random attribute value substitution method causes distortion in data distributions by assuming that all missing values are with the same value, however this method still manages to provide comparable results (Somasundaram and Nedunchezhian 2011). Weak point of these techniques is the need of strong model assumptions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%