Shell-neighbor method and its application in missing data imputation

Zhang, Shichao

doi:10.1007/s10489-009-0207-6

Cited by 111 publications

(56 citation statements)

References 24 publications

“…The simplest way is imputation with constant values, zeros, random values, mean values (over all data set [18], over the class of the data item [9]). The more-sophisticated techniques are nearest-neighbor selection [36,37], Expectation-Maximization (EM) algorithm [8] or hot-deck [23] and cold-deck [13] techniques to avoid imputation of non-existing values.…”

Section: Imputation (mentioning)

confidence: 99%

Comparison of incomplete data handling techniques for neuro-fuzzy systems

Sikora

Simiński

2014

csci

“…A non-parametric method is applied when the relationship between the conditional attributes is unknown. Parametric methods like Nearest Neighbour [4][10][25] have been used for the prediction of missing attribute(s). Non-parametric techniques such as empirical likelihood [32] and clustering [26], and semi-parametric techniques [21][33], have also been applied for missing data imputation.…”

Section: Related Work (mentioning)

confidence: 99%

Imputation And Classification Of Missing Data Using Least Square Support Vector Machines – A New Approach In Dementia Diagnosis

Sivapriya

Kamal

Thavavel

2012

IJARAI

Abstract: This paper presents a comparison of different data imputation approaches used in filling missing data and proposes a combined approach to accurately estimate missing attribute values in a patient database. The present study suggests a more robust technique that is likely to supply a value closer to the one that is missing for effective classification and diagnosis. Initially, the data is clustered and the z-score method is used to select possible values of an instance with missing attribute values. Then a multiple imputation method using LSSVM (Least Squares Support Vector Machine) is applied to select the most appropriate values for the missing attributes. Five imputed datasets have been used to demonstrate the performance of the proposed method. Experimental results show that our method outperforms conventional methods of multiple imputation and mean substitution. Moreover, the proposed method CZLSSVM (Clustered Z-score Least Square Support Vector Machine) has been evaluated in two classification problems for incomplete data. The efficacy of the imputation methods has been evaluated using an LSSVM classifier. Experimental results indicate that classification accuracy increases with CZLSSVM in the case of missing attribute value estimation. It is found that CZLSSVM outperforms other data imputation approaches such as decision trees, rough sets, artificial neural networks, K-NN (K-Nearest Neighbour), and SVM. Further, it is observed that CZLSSVM yields 95 per cent accuracy and better prediction capability than the other methods included and tested in the study.

“…It exploits the k relevant instances of the data and offers simplicity, ease of implementation, and high accuracy [9]. Also, the fuzzy rule-based approach is widely used in data imputation; it sculpts the linguistic model structure, which tends to evaluate the value of missing data, and mitigates the dimension reduction problem [10].…”

Section: Introduction (mentioning)

confidence: 99%

Grey Fuzzy Neural Network-Based Hybrid Model for Missing Data Imputation in Mixed Database

Kuppusamy

Paramasivam

2017

IJIES

Nowadays, missing data imputation is a novel paradigm for replacing a missing attribute with an imputed value. Missing data occur due to biased information or non-response of the system. In the medical domain, imputing both categorical and numerical data becomes a major challenge. In this paper, the Grey Fuzzy Neural Network is proposed for missing data imputation in the mixed database. Initially, the WLI fuzzy clustering mechanism is utilized to generate the different clusters in which the medical data are grouped together. Then, we intend to integrate the Grey Wolf Optimizer (GWO) with the ANFIS network model, termed the Grey Fuzzy Neural Network (GFNN). The proposed method is mainly used to determine the optimal parameters to design the membership function. Finally, the hybrid prediction model is used to find the imputed data for both categorical and numerical attributes. In the hybrid prediction model, the categorical data are imputed by the distance measure. The experimental results are validated, and performance is analysed by metrics such as MSE and RMSE using a MATLAB implementation. The proposed GFNN attains a lower MSE of 0.13 and RMSE of 0.35, which indicates that it imputes the missing attributes of the mixed database effectively.
