A Hybrid Method for Interpolating Missing Data in Heterogeneous Spatio-Temporal Datasets

Deng, Min; Zide, Fan; Liu, Qiliang; Gong, Jianya

doi:10.3390/ijgi5020013

Cited by 21 publications

(14 citation statements)

References 37 publications


“…It has been found that the accuracy of the monitoring results obtained by the automatic gravimetric method meets the requirements of the standard manual monitoring method (Cheng, Gong, & Pan, ). To reduce the effect of missing data on the prediction results, a dataset with fewer missing values was selected, and some individual missing values were estimated using a spatio‐temporal interpolation method (Deng et al, ). Figure shows the spatial distribution of these monitoring sites.…”

Section: Results (mentioning)

confidence: 99%

“…The non‐separated approach aims to define a joint space–time covariance structure (Heuvelink & Griffith, ). When a space–time covariance structure is defined, kriging methods can be used to obtain the best linear unbiased prediction at a given location by computing a weighted average of the known values at its neighboring sites (Deng, Fan, Liu, & Gong, ). Although space–time dependence is considered in space–time geostatistics, the nonlinearities and nonstationarities of a space–time series cannot be well handled by space–time geostatistics.…”

Section: Related Work (mentioning)

confidence: 99%
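The weighted-average prediction described in the statement above can be illustrated with a minimal ordinary kriging sketch. This is an illustration under stated assumptions, not the method of the cited paper: the exponential covariance model, the `sill` and `rng` parameters, and the treatment of time as a third coordinate on the same scale as space are all hypothetical choices made for the example.

```python
import numpy as np

def exp_cov(h, sill=1.0, rng=10.0):
    """Assumed exponential covariance model (hypothetical parameters)."""
    return sill * np.exp(-h / rng)

def ordinary_kriging(coords, values, target, sill=1.0, rng=10.0):
    """Predict the value at `target` as a weighted average of known values.

    coords : (n, d) array of site coordinates, e.g. (x, y, t) with time
             already rescaled to be comparable to space (an assumption).
    values : (n,) array of observed values at those sites.
    target : (d,) coordinates of the location to predict.
    Returns (prediction, weights); the weights sum to 1.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Pairwise distances between known sites, and from each site to the target.
    d_sites = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_target = np.linalg.norm(coords - target, axis=1)

    # Ordinary kriging system: covariances plus a Lagrange multiplier row and
    # column that enforce the unbiasedness constraint (weights sum to 1).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_cov(d_sites, sill, rng)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_cov(d_target, sill, rng)

    solution = np.linalg.solve(A, b)
    weights = solution[:n]
    return float(weights @ values), weights

if __name__ == "__main__":
    sites = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 1, 2]]
    obs = [10.0, 12.0, 11.0, 15.0]
    pred, w = ordinary_kriging(sites, obs, [0.5, 0.5, 1.0])
    print(pred, w)
```

In practice the covariance model would be fitted to the data rather than fixed in advance, and a joint space-time covariance would usually not reduce to a single distance metric as it does here.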


Heterogeneous Space–Time Artificial Neural Networks for Space–Time Series Prediction

Deng, Yang, Liu, et al. 2017, Transactions in GIS (Self Cite)

Space-time series prediction plays a key role in the domain of geographic data mining and knowledge discovery. In general, the existing methods of space-time series prediction can be divided into two main categories: statistical methods and machine learning methods. Comparatively, machine learning methods have obvious advantages with respect to handling nonlinear problems. However, space-time dependence and the heterogeneity of space-time data are not well addressed by the existing machine learning methods. Because of this limitation, an accurate prediction of a space-time series is still a challenging problem. Therefore, in this study, both space-time dependence and heterogeneity are incorporated into the feedback artificial neural network, and heterogeneous space-time artificial neural networks (HSTANNs) are developed for space-time series prediction. First, to handle spatial heterogeneity, space-time series clustering is used to divide the study area into a set of homogeneous sub-areas. Then, a space-time autocorrelation analysis is employed to explore the space-time dependence structure of the dataset. Finally, an HSTANN is established for each sub-area. Further, HSTANNs are applied to predict the concentrations of fine particulate matter (PM2.5) in Beijing-Tianjin-Hebei. The experimental results show that when compared with other methods, the accuracy of the forecasting results is considerably improved by using HSTANNs.


“…To compare different imputation models with cross-validation, we used three indices to measure the actual prediction accuracy, namely, the standardized allocation error (SAE) [11], the mean square error (MSE) and the coefficient of determination (R²) [21, 35]. All these indices compare the model-predicted values with observed values.…”

Section: Methods (mentioning)

confidence: 99%

“…The R² is an index for assessing the agreement between observed and estimated values, with the value ranging from 0 for complete disagreement to 1 for perfect agreement. Scatterplots were created to compare the observed values and estimated values in the cross-validation [1, 21].…”

Section: Methods (mentioning)

confidence: 99%
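Two of the indices named in the statements above, MSE and R², can be computed as in the following minimal sketch. The function names are hypothetical, and R² is taken here as 1 minus the ratio of residual to total sum of squares, which is one common definition and is assumed rather than confirmed by the citing paper. The SAE is omitted because its exact definition is not given in the quoted text.

```python
import numpy as np

def mse(observed, predicted):
    """Mean square error between observed and model-predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean((o - p) ** 2))

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    ss_res = np.sum((o - p) ** 2)          # residual sum of squares
    ss_tot = np.sum((o - o.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

if __name__ == "__main__":
    held_out = [1.0, 2.0, 3.0, 4.0]    # values withheld in cross-validation
    imputed = [2.0, 3.0, 4.0, 5.0]     # values estimated by some model
    print(mse(held_out, imputed), r_squared(held_out, imputed))
```

Under this definition R² equals 1 for perfect agreement but can fall below 0 when predictions are worse than the observed mean; a definition based on the squared correlation coefficient would stay within the 0 to 1 range described in the quoted statement.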

“…Thus large-scale official statistics data usually have spatial structures, particularly spatial autocorrelation [19]. Moreover, temporal autocorrelation, in which observations that are temporally close to each other tend to be similar, is also likely to be inherent in official statistics data [20, 21]. On the one hand, information about spatial and/or temporal structures can be utilized for estimating missing values, especially when other information, such as that from samples and auxiliary data, is unavailable.…”

Section: Introduction (mentioning)

confidence: 99%


Estimating missing values in China’s official socioeconomic statistics using progressive spatiotemporal Bayesian hierarchical modeling

Song, Yang, Shi, et al. 2018, Sci Rep

Due to a large number of missing values, both spatially and temporally, China has not published a complete official socioeconomic statistics dataset at the county level, which is the country’s basic scale of official statistics data collection. We developed a procedure to impute the missing values under the Bayesian hierarchical modeling framework. The procedure incorporates two novelties. First, it takes into account spatial autocorrelations and temporal trends for those easier-to-impute variables with small missing percentages. Second, it further uses the first-step complete variables as covariate information to improve the modeling of more-difficult-to-impute variables with large missing percentages. We applied this progressive spatiotemporal (PST) method to China’s official socioeconomic statistics during 2002–2011 and compared it with four other widely used imputation methods, including k-nearest neighbors (kNN), expectation maximization (EM), singular value decomposition (SVD) and random forest (RF). The results show that the PST method outperforms these methods, thus proving the effects of sophisticatedly incorporating the additional spatial and temporal information and progressively utilizing the covariate information. This study has an outcome that allows China to construct a complete socioeconomic dataset and establishes a methodology that can be generally useful for estimating missing values in large spatiotemporal datasets.


Fuzzy-based missing value imputation technique for air pollution data

2022
