Abstract: Space-time interpolation is widely used to estimate missing or unobserved values in a dataset integrating both spatial and temporal records. Although space-time interpolation plays a key role in space-time modeling, existing methods were mainly developed for space-time processes that exhibit stationarity in space and time. It is still challenging to model heterogeneity of space-time data in the interpolation model. To overcome this limitation, in this study, a novel space-time interpolation method considering …
“…It has been found that the accuracy of the monitoring results obtained by the automatic gravimetric method meets the requirements of the standard manual monitoring method (Cheng, Gong, & Pan, ). To reduce the effect of missing data on the prediction results, a dataset with fewer missing values was selected, and some individual missing values were estimated using a spatio‐temporal interpolation method (Deng et al, ). Figure shows the spatial distribution of these monitoring sites.…”
Section: Results (mentioning)
confidence: 99%
“…The non‐separated approach aims to define a joint space–time covariance structure (Heuvelink & Griffith, ). When a space–time covariance structure is defined, kriging methods can be used to obtain the best linear unbiased prediction at a given location by computing a weighted average of the known values at its neighboring sites (Deng, Fan, Liu, & Gong, ). Although space–time dependence is considered in space–time geostatistics, the nonlinearities and nonstationarities of a space–time series cannot be well handled by space–time geostatistics.…”
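For readers unfamiliar with the "weighted average" wording in the excerpt above, the ordinary kriging predictor can be written as follows. This is the standard textbook form under an assumed second-order stationary space-time covariance function C, not necessarily the exact formulation used in the cited works; the symbols s_i, t_i, \lambda_i and \mu are introduced here for illustration.

```latex
% Prediction at the unobserved space-time point (s_0, t_0)
% from n neighboring observations Z(s_i, t_i):
\hat{Z}(s_0, t_0) = \sum_{i=1}^{n} \lambda_i \, Z(s_i, t_i),
\qquad \sum_{i=1}^{n} \lambda_i = 1 .

% The weights \lambda_i and the Lagrange multiplier \mu solve
\sum_{j=1}^{n} \lambda_j \, C(s_i - s_j,\; t_i - t_j) + \mu
  = C(s_i - s_0,\; t_i - t_0),
\qquad i = 1, \dots, n .
```

Because C depends only on space and time lags, the weights are the same wherever the same neighbor configuration occurs, which is one way to see why nonstationary series are hard to handle in this framework.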
Space-time series prediction plays a key role in the domain of geographic data mining and knowledge discovery. In general, the existing methods of space-time series prediction can be divided into two main categories: statistical methods and machine learning methods. Comparatively, machine learning methods have obvious advantages with respect to handling nonlinear problems. However, space-time dependence and the heterogeneity of space-time data are not well addressed by the existing machine learning methods. Because of this limitation, accurate prediction of a space-time series is still a challenging problem. Therefore, in this study, both space-time dependence and heterogeneity are incorporated into the feedback artificial neural network, and heterogeneous space-time artificial neural networks (HSTANNs) are developed for space-time series prediction. First, to handle spatial heterogeneity, space-time series clustering is used to divide the study area into a set of homogeneous sub-areas. Then, a space-time autocorrelation analysis is employed to explore the space-time dependence structure of the dataset. Finally, an HSTANN is established for each sub-area. Further, HSTANNs are applied to predict the concentrations of fine particulate matter (PM2.5) in Beijing-Tianjin-Hebei. The experimental results show that, compared with other methods, HSTANNs considerably improve the accuracy of the forecasting results.
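The first two steps described in this abstract (grouping stations into homogeneous sub-areas, then measuring space-time dependence within each) can be illustrated with a minimal sketch. It is not the authors' method: a greedy correlation-based grouping stands in for their space-time series clustering, a mean lagged cross-correlation stands in for their autocorrelation analysis, and the per-sub-area neural network is not shown. The function names, the threshold of 0.8, and the toy station data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def group_series(series, threshold=0.8):
    """Stand-in for clustering: a station joins the first group whose
    seed station it correlates with at or above the threshold."""
    groups = []  # each group is a list of station names; element 0 is the seed
    for name, values in series.items():
        for group in groups:
            if pearson(series[group[0]], values) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def lagged_dependence(series, group, lag=1):
    """Mean correlation between each station at time t and every other
    station in the group at time t - lag (lag must be >= 1)."""
    scores = [
        pearson(series[a][lag:], series[b][:-lag])
        for a in group
        for b in group
        if a != b
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: stations A and B rise together, C falls.
stations = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [2, 4, 6, 8, 10, 12],
    "C": [6, 5, 4, 3, 2, 1],
}
sub_areas = group_series(stations)
dependence = {tuple(g): lagged_dependence(stations, g) for g in sub_areas}
```

In a fuller pipeline, the lags and neighbors with strong dependence would decide which inputs feed the model trained for each sub-area.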
“…To compare different imputation models with cross-validation, we used three indices to measure the actual prediction accuracy, namely, the standardized allocation error (SAE) [11], the mean square error (MSE) and the coefficient of determination (R²) [21, 35]. All these indices compare the model-predicted values with observed values.…”
Section: Methods (mentioning)
confidence: 99%
“…The R² is an index for assessing the agreement between observed and estimated values, with the value ranging from 0 for complete disagreement to 1 for perfect agreement. Scatterplots were created to compare the observed values and estimated values in the cross-validation [1, 21].…”
Section: Methods (mentioning)
confidence: 99%
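Two of the indices named in the excerpts above, MSE and R², can be computed as in the following sketch. It assumes the common definition R² = 1 - SS_res / SS_tot, which can fall below 0 when predictions are worse than the observed mean; the excerpt's stated range of 0 to 1 may instead reflect a squared-correlation definition, and which one the cited paper used is not stated here. SAE is omitted because its definition is not given in the excerpt. The function names and sample values are hypothetical.

```python
def mean_square_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out values from one cross-validation fold.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

mse = mean_square_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

In cross-validation, these would be evaluated on values that were deliberately masked before imputation and then compared across imputation models.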
“…Thus large-scale official statistics data usually have spatial structures, particularly spatial autocorrelation [19]. Moreover, temporal autocorrelation, in which observations that are temporally close to each other tend to be similar, is also likely to be inherent in official statistics data [20, 21]. On the one hand, information about spatial and/or temporal structures can be utilized for estimating missing values, especially when other information, such as that from samples and auxiliary data, is unavailable.…”
Due to a large number of missing values, both spatially and temporally, China has not published a complete official socioeconomic statistics dataset at the county level, which is the country’s basic scale of official statistics data collection. We developed a procedure to impute the missing values under the Bayesian hierarchical modeling framework. The procedure incorporates two novelties. First, it takes into account spatial autocorrelations and temporal trends for the easier-to-impute variables with small missing percentages. Second, it uses the variables completed in the first step as covariate information to improve the modeling of the more-difficult-to-impute variables with large missing percentages. We applied this progressive spatiotemporal (PST) method to China’s official socioeconomic statistics during 2002–2011 and compared it with four other widely used imputation methods: k-nearest neighbors (kNN), expectation maximization (EM), singular value decomposition (SVD) and random forest (RF). The results show that the PST method outperforms these methods, demonstrating the benefit of incorporating the additional spatial and temporal information and progressively utilizing the covariate information. This study allows China to construct a complete socioeconomic dataset and establishes a methodology that can be generally useful for estimating missing values in large spatiotemporal datasets.