2021
DOI: 10.7717/peerj-cs.619
|View full text |Cite
|
Sign up to set email alerts
|

Advanced methods for missing values imputation based on similarity learning

Abstract: The real-world data analysis and processing using data mining techniques often are facing observations that contain missing values. The main challenge of mining datasets is the existence of missing values. The missing values in a dataset should be imputed using the imputation method to improve the data mining methods’ accuracy and performance. There are existing techniques that use k-nearest neighbors algorithm for imputing the missing values but determining the appropriate k value can be a challenging task. T… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
2
1

Citation Types

0
5
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
7
2

Relationship

0
9

Authors

Journals

citations
Cited by 21 publications
(5 citation statements)
references
References 41 publications
0
5
0
Order By: Relevance
“…The easiest way to deal with the missing data is to remove the corresponding entries completely [47], but that would lead to loss of crucial information. Another method is to impute the missing values with the mean value of the available data [48], however that would not preserve the relationships between inputs and outputs.…”
Section: Techniques To Handle Erroneous and Missing Datamentioning
confidence: 99%
“…The easiest way to deal with the missing data is to remove the corresponding entries completely [47], but that would lead to loss of crucial information. Another method is to impute the missing values with the mean value of the available data [48], however that would not preserve the relationships between inputs and outputs.…”
Section: Techniques To Handle Erroneous and Missing Datamentioning
confidence: 99%
“…In this study, an iterative sequential imputation process was executed via regression with the lightgbm algorithm. Other techniques have been proposed such as a hybrid missing data imputation method incorporating records similarity using the global correlation structure by using k-nearest neighbors and iterative imputation algorithms 86 or by merits integration of decision trees and fuzzy clustering into an iterative learning approach. 87 A quantile-based discretization function was performed in this study to discretize features into bins.…”
Section: Discussionmentioning
confidence: 99%
“…Khaled M. Fouad. et al [14] proposed a method incorporating KNN and iterative imputation algorithms which can impute missing data depended on the similarity between records.…”
Section: Discussionmentioning
confidence: 99%