2020
DOI: 10.11591/eei.v9i2.2090

A comparative study of different imputation methods for daily rainfall data in east-coast Peninsular Malaysia

Abstract: Rainfall data are the most significant values in hydrology and climatology modelling. However, the datasets are prone to missing values due to various issues. This study aspires to impute the rainfall missing values by using various imputation methods such as Replace by Mean, Nearest Neighbor, Random Forest, Non-linear Iterative Partial Least Squares (NIPALS) and Markov Chain Monte Carlo (MCMC). Daily rainfall datasets from 48 rainfall stations across east-coast Peninsular Malaysia were used in this study. The…
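
As a rough illustration of the simplest method named in the abstract, Replace by Mean, the sketch below fills the gaps in one station's series with the mean of its observed values. The function name `impute_mean`, the use of `None` as the missing-value marker, and the sample figures are assumptions made for this example; none of them come from the paper.

```python
from statistics import mean


def impute_mean(series):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("cannot impute: series has no observed values")
    fill = mean(observed)
    return [fill if x is None else x for x in series]


# Hypothetical daily rainfall (mm) for one station; None marks a missing day.
rain = [0.0, 12.5, None, 3.0, None, 8.5]
filled = impute_mean(rain)
```

Every gap receives the same value, so this approach preserves the series mean but flattens day-to-day variability, which is one reason comparative studies also test methods that use information from other stations or variables.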

Cited by 10 publications (3 citation statements)
References 19 publications (27 reference statements)
“…For Khayati et al ( 2020 ), they implemented an experiment with twelve algorithms and indices, and the result allowed to identify the limitations and that the algorithms do not offer a high accuracy and methodology of how to analyze the algorithms. Also, for Nor et al ( 2020 ), a comparative model of imputation methods for daily rainfall data was implemented and based on linear and non-linear stochastic and index statistic methods to determine the performance of the algorithms and that the random forest–coupled method with multiple linear regression (RF-MLR) shows adequate performance for time series data loss imputation. And Afrifa-Yamoah et al ( 2020 ) implemented a model based on a Kalman filter, an autoregressive integrated moving average (ARIMA), on twelve-monthly time series, on variables such as hourly temperature, humidity, and wind speed resulting in good indicators under indices such as root mean square error and symmetric mean absolute percentage error and recommended to apply it to other meteorological variables.…”
Section: Introduction (mentioning)
confidence: 99%
“…Missing data reconstruction is crucial, especially in an event where all available resources, including partial information, must be used. The lack of particular data can pose severe problems in hydrological studies, resulting in uncertainty and low efficiency of water resource systems [4][5][6].…”
Section: Introduction (mentioning)
confidence: 99%
“…The most convenient method for dealing with missing data is to delete the entire observations with partial data and analyse the remaining complete data [6]. On the other hand, deleted data may result in discontinuous data, resulting in information loss and skewed conclusions.…”
Section: Introduction (mentioning)
confidence: 99%
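
The deletion approach described in the last citation statement can be sketched as follows: any observation (here, one day across several stations) with at least one missing value is dropped, and only the complete records are analysed. The helper `listwise_delete`, the three-station layout, and the sample values are hypothetical and serve only to illustrate the idea.

```python
def listwise_delete(rows):
    """Keep only the rows in which every value is observed (not None)."""
    return [row for row in rows if all(v is not None for v in row)]


# Hypothetical daily rainfall (mm); one row per day, one column per station.
records = [
    (0.0, 1.2, 0.0),
    (5.5, None, 4.0),   # one station missing, so the whole day is dropped
    (None, 0.0, 0.0),   # likewise dropped
    (2.0, 3.1, 2.7),
]
complete = listwise_delete(records)
kept_fraction = len(complete) / len(records)
```

In this toy case two missing values out of twelve remove half of the days, which illustrates the discontinuity and information loss the statement warns about.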