Low-cost air quality sensors are promising supplements to regulatory monitors for PM2.5 exposure assessment. However, little has been done to incorporate low-cost sensor measurements in large-scale PM2.5 exposure modeling. We conducted spatially varying calibration and developed a downweighting strategy to optimize the use of low-cost sensor data in PM2.5 estimation. In California, PurpleAir low-cost sensors were paired with Air Quality System (AQS) regulatory stations, and calibration of the sensors was performed by geographically weighted regression. The calibrated PurpleAir measurements were then given lower weights according to their residual errors and fused with AQS measurements into a random forest model to generate 1 km daily PM2.5 estimates. The calibration reduced PurpleAir’s systematic bias to ∼0 μg/m³ and residual errors by 36%. Increased sensor bias was found to be associated with higher temperature and humidity, as well as longer operating time. The weighted prediction model outperformed the AQS-based prediction model with an improved random cross-validation (CV) R² of 0.86, an improved spatial CV R² of 0.81, and a lower prediction error. The temporal CV R² did not improve due to the temporal discontinuity of PurpleAir. The inclusion of PurpleAir data allowed the predictions to better reflect PM2.5 spatial details and hotspots.
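As a rough illustration of the calibration and downweighting steps described in this abstract, the sketch below fits a locally weighted line (reference = intercept + slope × sensor) around a target location using a Gaussian distance kernel, which is the core idea of geographically weighted regression. The function names, the single-predictor form, the 50 km bandwidth, and the inverse-variance weighting rule are all assumptions made for illustration; the abstract does not specify the authors' covariates, kernel, or weighting formula.

```python
import math

def gwr_calibrate(target_xy, sites, bandwidth_km=50.0):
    """Fit reference = intercept + slope * sensor, weighted by distance to target_xy.

    sites: list of (x_km, y_km, sensor_pm25, reference_pm25) collocated pairs.
    The Gaussian kernel and the default bandwidth are illustrative assumptions.
    Returns (intercept, slope).
    """
    tx, ty = target_xy
    sw = swx = swy = swxx = swxy = 0.0
    for x, y, sensor, ref in sites:
        d2 = (x - tx) ** 2 + (y - ty) ** 2
        w = math.exp(-0.5 * d2 / bandwidth_km ** 2)  # nearer pairs count more
        sw += w
        swx += w * sensor
        swy += w * ref
        swxx += w * sensor * sensor
        swxy += w * sensor * ref
    mean_x = swx / sw
    mean_y = swy / sw
    var_x = swxx / sw - mean_x ** 2
    cov_xy = swxy / sw - mean_x * mean_y
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residual_weight(residual_sd, reference_sd=1.0):
    """Hypothetical downweighting rule: inverse variance relative to reference monitors, capped at 1."""
    return min(1.0, (reference_sd / residual_sd) ** 2)

# Synthetic collocated pairs where reference = 1 + 0.5 * sensor exactly.
pairs = [(0, 0, 10, 6), (10, 0, 20, 11), (0, 10, 30, 16), (10, 10, 40, 21)]
intercept, slope = gwr_calibrate((0, 0), pairs)
sensor_weight = residual_weight(2.0)  # a sensor twice as noisy as the reference
```

In a fused model, a weight like `sensor_weight` would typically be passed as a per-sample weight when fitting the prediction model, so that calibrated sensor readings inform the fit without dominating the regulatory measurements.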
Satellite aerosol optical depth (AOD) has been widely employed to evaluate ground fine particle (PM2.5) levels, but snow and cloud cover often lead to a large proportion of non-randomly missing AOD values. As a result, fully covered and unbiased PM2.5 estimates are hard to generate. Among the current approaches to the data gap issue, few have considered the cloud-AOD relationship and none have considered the snow-AOD relationship. This study examined the impacts of snow and cloud cover on AOD and PM2.5 and made full-coverage PM2.5 predictions by considering these impacts. To estimate missing AOD values, daily gap-filling models with snow/cloud fractions and meteorological covariates were developed using the random forest algorithm. By using these models in New York State, a daily AOD data set with 1-km resolution and complete coverage was generated. The "out-of-bag" R² of the gap-filling models averaged 0.93 with an interquartile range from 0.90 to 0.95. Subsequently, a random forest-based PM2.5 prediction model with the gap-filled AOD and covariates was built to produce fully covered PM2.5 estimates. A ten-fold cross-validation for the prediction model showed good performance with an R² of 0.82. In the gap-filling models, the snow fraction was more important in the snow season than in the rest of the year. The prediction models fitted with and without the snow fraction also showed discernible changes in PM2.5 patterns, further confirming the significance of this parameter. Compared with methods that do not consider snow and cloud cover, our PM2.5 prediction surfaces showed more spatial detail and reflected small-scale terrain-driven PM2.5 patterns. The proposed methods can be generalized to areas with extensive snow/cloud cover and large proportions of missing satellite AOD data for predicting PM2.5 levels with high resolution and complete coverage.
It is well recognized that exposure to fine particulate matter (PM2.5) affects health adversely, yet few studies from South America have documented such associations due to the sparsity of PM2.5 measurements. Lima’s topography and aging vehicular fleet result in severe air pollution, and there are too few monitors to effectively quantify PM2.5 levels for epidemiologic studies. We developed an advanced machine learning model to estimate daily PM2.5 concentrations at a 1 km² spatial resolution in Lima, Peru from 2010 to 2016. We combined aerosol optical depth (AOD), meteorological fields from the European Centre for Medium-Range Weather Forecasts (ECMWF), parameters from the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem), and land use variables to fit a random forest model against ground measurements from 16 monitoring stations. Overall cross-validation R² (and root mean square prediction error, RMSE) for the random forest model was 0.70 (5.97 μg/m³). Mean PM2.5 for ground measurements was 24.7 μg/m³ while mean estimated PM2.5 was 24.9 μg/m³ in the cross-validation dataset. The mean difference between ground and predicted measurements was −0.09 μg/m³ (Std.Dev. = 5.97 μg/m³), with 94.5% of observations falling within 2 standard deviations of the difference, indicating good agreement between ground measurements and predicted estimates. Surface downwards solar radiation, temperature, relative humidity, and AOD were the most important predictors, while percent urbanization, albedo, and cloud fraction were the least important predictors. Comparison of monthly mean measurements between ground and predicted PM2.5 shows good precision and accuracy from our model. Furthermore, mean annual maps of PM2.5 show consistently lower concentrations along the coast and higher concentrations in the mountains, resulting from prevailing coastal winds blown from the Pacific Ocean in the west. Our model allows for construction of long-term historical daily PM2.5 measurements at 1 km² spatial resolution to support future epidemiological studies.
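The agreement statistics quoted in this abstract (mean difference, its standard deviation, and the share of observations within 2 standard deviations) can be computed along the following lines. This is only a sketch: the function name is hypothetical, the sign convention (ground minus predicted) and the use of the sample standard deviation are assumptions, and the numbers are synthetic rather than the study's data.

```python
import statistics

def agreement_summary(observed, predicted):
    """Return (mean difference, SD of differences, share within mean ± 2 SD).

    Differences are taken as observed minus predicted (an assumed convention).
    """
    diffs = [o - p for o, p in zip(observed, predicted)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    low, high = mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff
    within = sum(low <= d <= high for d in diffs) / len(diffs)
    return mean_diff, sd_diff, within

# Synthetic values in μg/m³, for illustration only.
ground = [10.0, 12.0, 14.0, 16.0]
model = [11.0, 11.0, 15.0, 15.0]
mean_diff, sd_diff, within = agreement_summary(ground, model)
```

A mean difference near zero with most differences inside the ±2 SD band is what the abstract reads as good agreement between ground measurements and predictions.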
Regulatory monitoring networks are often too sparse to support community-scale PM2.5 exposure assessment, while emerging low-cost sensors have the potential to fill in the gaps. To date, few studies, if any, have used low-cost sensor measurements to improve PM2.5 prediction with high spatiotemporal resolution based on statistical models. Imperial County in California is an exemplary region with sparse Air Quality System (AQS) monitors and a community-operated low-cost network entitled Identifying Violations Affecting Neighborhoods (IVAN). This study aims to evaluate the contribution of IVAN measurements to the quality of PM2.5 prediction. We adopted the Random Forest algorithm to estimate daily PM2.5 concentrations at 1-km spatial resolution using three different PM2.5 datasets (AQS-only, IVAN-only, and AQS/IVAN combined). The results showed that the integration of low-cost sensor measurements is an effective way to significantly improve the quality of PM2.5 prediction, with an increase in cross-validation (CV) R² of ~0.2. The IVAN measurements also contributed to the increased importance *
Ambient exposure to fine particulate matter (PM2.5) is one of the top global health concerns. We estimate the PM2.5-related health benefits of emission reduction over New York State (NYS) from 2002 to 2012 using seven publicly available PM2.5 products that include information from ground-based observations, remote sensing and chemical transport models. While these PM2.5 products differ in spatial patterns, they show consistent decreases in PM2.5 by 28%-37% from 2002 to 2012. We evaluate these products using two sets of independent ground-based observations from the New York City Community Air Quality Survey (NYCCAS) Program for an urban area, and the Saint Regis Mohawk Tribe Air Quality Program for a remote area. Inclusion of satellite remote sensing improves the representativeness of surface PM2.5 in the remote area. Of the satellite-based products, only the statistical land use regression approach captures some of the spatial variability across New York City measured by NYCCAS. We estimate the PM2.5-related mortality burden by applying an integrated exposure-response function to the different PM2.5 products. The multi-product mean PM2.5-related mortality burden over NYS decreased by 5660 deaths (67%) from 8410 (95% confidence interval (CI): 4570-12 400) deaths in 2002 to 2750 (CI: 700-5790) deaths in 2012. We estimate a 28% uncertainty in the state-level PM2.5 mortality burden due to the choice of PM2.5 products, but such uncertainty is much smaller than the uncertainty (130%) associated with the exposure-response function.
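The reported decrease in mortality burden can be checked with simple arithmetic on the two central estimates quoted in this abstract. The helper name below is hypothetical; only the 2002 and 2012 figures come from the text.

```python
def absolute_and_percent_decrease(start, end):
    """Return (absolute decrease, percent decrease relative to start)."""
    decrease = start - end
    return decrease, 100.0 * decrease / start

deaths_2002 = 8410  # multi-product mean from the abstract
deaths_2012 = 2750  # multi-product mean from the abstract
decrease, percent = absolute_and_percent_decrease(deaths_2002, deaths_2012)
print(decrease, round(percent))  # 5660 67
```

This check uses the central estimates only; it does not propagate the confidence intervals given in the abstract.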
Background: Studies of PM2.5 health effects are influenced by the spatiotemporal coverage and accuracy of exposure estimates. The use of satellite remote sensing data such as aerosol optical depth (AOD) in PM2.5 exposure modeling has increased recently in the US and elsewhere in the world. However, few studies have addressed this issue in southern California due to challenges with reflective surfaces and complex terrain. Methods: We examined the factors affecting the associations with satellite AOD using a two-stage spatial statistical model. The first stage estimated the temporal PM2.5/AOD relationships using a linear mixed effects model at 1 km resolution. The second stage accounted for spatial variation using geographically weighted regression. Goodness of fit for the final model was evaluated by comparing the daily PM2.5 concentrations generated by cross-validation (CV) with observations. These methods were applied to a region of southern California spanning from Los Angeles to San Diego. Results: Mean predicted PM2.5 concentration for the study domain was 8.84 µg m−3. Linear regression between CV predicted PM2.5 concentrations and observations had an R² of 0.80 and RMSE of 2.25 µg m−3. The ratio of PM2.5 to PM10 proved an important variable in modifying the AOD/PM2.5 relationship (β = 14.79, p ≤ 0.001). Including this ratio improved model performance significantly (a 0.10 increase in CV R² and a 0.56 µg m−3 decrease in CV RMSE). Discussion: Utilizing the high-resolution MAIAC AOD, fine-resolution PM2.5 concentrations can be estimated where measurements are sparse. This study adds to the current literature using remote sensing data to achieve better exposure data in the understudied region of southern California. Overall, we demonstrate the usefulness of MAIAC AOD and the importance of considering coarser particles in dust-prone areas.
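The goodness-of-fit figures in this abstract come from regressing CV predictions against observations. For a simple linear regression, R² equals the squared Pearson correlation, so both metrics can be sketched as below. The function name is hypothetical, the RMSE is assumed here to be the root mean square of the raw prediction errors (the abstract does not say whether it is that or the regression's residual error), and the values are synthetic.

```python
import math

def cv_r2_and_rmse(observed, predicted):
    """Return (R² of a simple linear regression of observed on predicted, RMSE of raw errors)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (var_o * var_p)  # squared Pearson correlation
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return r2, rmse

# Synthetic held-out observations and CV predictions, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
r2, rmse = cv_r2_and_rmse(obs, pred)
```

The synthetic example shows why both metrics are reported together: predictions with a constant offset are perfectly correlated with observations yet still carry a nonzero RMSE.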