ABSTRACT: Time series data in practical applications frequently contain missing values caused by sensor malfunctions, network failures, outliers, and similar problems. To handle missing values in time series, and to address the lack of temporal modelling in many machine learning models, we propose a spatiotemporal prediction framework based on missing value processing algorithms and a deep recurrent neural network (DRNN). Using a missing tag and a missing interval to represent time series patterns, we implement three different missing value fixing algorithms, which are then incorporated into a deep neural network consisting of LSTM (Long Short-Term Memory) layers and fully connected layers. Real-world air quality and meteorological datasets from the Jingjinji area, China, are used for model training and testing. Deep feed-forward neural networks (DFNN) and gradient boosting decision trees (GBDT) are trained as baseline models against the proposed DRNN. The performance of the three missing value fixing algorithms and of the different machine learning models is evaluated and analysed. Experiments show that the proposed DRNN framework outperforms both DFNN and GBDT, validating the capacity of the proposed framework. Our results also provide useful insights into the different strategies for handling missing values.
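The missing tag and missing interval mentioned in the abstract can be illustrated with a small sketch. The abstract does not give exact definitions, so this assumes that the tag is a 0/1 indicator of a missing reading and that the interval counts the consecutive steps since the last observed value. The abstract also does not name the three fixing algorithms; `forward_fill` below is a hypothetical stand-in for one simple strategy, not the authors' method. Function names are illustrative only.

```python
import numpy as np


def missing_tag_and_interval(values):
    """Return (tag, interval) for a 1-D series where NaN marks a missing value.

    Assumed definitions (not taken from the paper):
      tag[t]      = 1 if values[t] is missing, else 0
      interval[t] = number of consecutive steps since the last observed value
                    (0 at every observed step)
    """
    values = np.asarray(values, dtype=float)
    tag = np.isnan(values).astype(int)
    interval = np.zeros(len(values), dtype=int)
    gap = 0
    for t, is_missing in enumerate(tag):
        # The gap grows while readings are missing and resets on an observation.
        gap = gap + 1 if is_missing else 0
        interval[t] = gap
    return tag, interval


def forward_fill(values):
    """Hypothetical fixing strategy: carry the last observed value forward.

    A missing value at the very start of the series is left as NaN.
    """
    filled = np.asarray(values, dtype=float).copy()
    for t in range(1, len(filled)):
        if np.isnan(filled[t]):
            filled[t] = filled[t - 1]
    return filled


if __name__ == "__main__":
    series = [35.0, np.nan, np.nan, 42.0, np.nan]
    tag, interval = missing_tag_and_interval(series)
    # Stack the filled value, tag and interval as per-step input features.
    features = np.column_stack([forward_fill(series), tag, interval])
    print(features)
```

Stacking the filled value with its tag and interval gives a recurrent model the repaired reading together with how long the sensor has been silent, which is one plausible way such features could be fed to LSTM layers.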
Estimating exposure to fine particulate matter (PM<sub>2.5</sub>) requires surface concentration data at high spatial resolution. Aerosol optical depth (AOD), one of the MODIS products, is used to monitor ground-level PM<sub>2.5</sub> concentration indirectly. In this research, AOD was derived at a fine spatial resolution of 1×1 km using an algorithm that takes local aerosol models and conditions into account. Because the AOD-PM<sub>2.5</sub> relationship varies spatially, a regional-scale geographically weighted regression (GWR) model was then developed to derive daily seamless surface concentrations of PM<sub>2.5</sub> over Beijing, Tianjin and Hebei. For this purpose, various combinations of explanatory variables were investigated on the basis of data availability; the best combination, comprising AOD, planetary boundary layer (PBL) height, mean relative humidity (RH) in the boundary layer, mean temperature in the boundary layer, wind speed and pressure, was selected for the proposed GWR model over the study area. The results show that our model produces surface concentrations of PM<sub>2.5</sub> with an annual RMSE of 18.6 μg/m<sup>3</sup>. The feasibility of the model for estimating air pollution levels was also assessed, and high agreement between the model and ground monitoring was observed, which demonstrates the capability of MODIS AOD and the proposed model to estimate ground-level PM<sub>2.5</sub>.
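The idea behind geographically weighted regression can be sketched as a separate weighted least-squares fit at each prediction location, with nearby monitoring stations weighted more heavily. The sketch below assumes a Gaussian distance kernel with a fixed bandwidth; the abstract does not state which kernel or bandwidth the authors used, and the function name and argument layout are illustrative only.

```python
import numpy as np


def gwr_predict(coords, X, y, target_coord, target_x, bandwidth):
    """Predict y at one location with a locally weighted linear regression.

    coords       : (n, 2) station coordinates
    X            : (n, p) explanatory variables at the stations (e.g. AOD)
    y            : (n,)   observed response at the stations (e.g. PM2.5)
    target_coord : (2,)   coordinates of the prediction location
    target_x     : (p,)   explanatory variables at the prediction location
    bandwidth    : Gaussian kernel bandwidth (assumed fixed, same units as coords)
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Gaussian kernel: stations closer to the target get larger weights.
    dist = np.linalg.norm(coords - np.asarray(target_coord, dtype=float), axis=1)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)

    # Weighted least squares via row scaling by sqrt(weight).
    design = np.column_stack([np.ones(len(X)), X])
    root_w = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)

    # Apply the local coefficients (intercept first) to the target's variables.
    target_row = np.concatenate([[1.0], np.atleast_1d(target_x)])
    return float(target_row @ beta)


if __name__ == "__main__":
    # Synthetic data for illustration only, not values from the study.
    coords = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    X = [[0.1], [0.5], [0.9], [1.3]]
    y = [2.0 + 3.0 * row[0] for row in X]
    print(gwr_predict(coords, X, y, [0.5, 0.5], [0.7], bandwidth=1.0))
```

Because the coefficients are refitted for every target location, the estimated AOD-PM<sub>2.5</sub> relationship is allowed to change across the region, which is the property the abstract relies on.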