Spatiotemporally resolved particulate matter (PM) estimates are
essential for reconstructing long- and short-term exposures in epidemiological
research. Improved estimates of PM2.5 and PM10 concentrations were produced over Italy for 2013–2015 using
satellite remote-sensing data and an ensemble modeling approach. The
following modeling stages were used: (1) missing values of the satellite-based
aerosol optical depth (AOD) product were imputed using a spatiotemporal
land-use random-forest (RF) model incorporating AOD data from atmospheric
ensemble models; (2) daily PM estimations were produced using four
modeling approaches: linear mixed effects, RF, extreme gradient boosting,
and a chemical transport model, the flexible air quality regional
model. The filled-in MAIAC AOD and additional spatial and
temporal predictors were used as inputs to the first three models;
(3) a geographically weighted generalized additive model (GAM) ensemble
model was used to fuse the estimations from the four models by allowing
the weights of each model to vary over space and time. The GAM ensemble
model outperformed the four separate models, decreasing the cross-validated
root mean squared error by 1–42%, depending on the model. The
spatiotemporally resolved PM estimations produced by the suggested
model can be applied in future epidemiological studies across Italy.
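The fusion and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' geographically weighted GAM: it assumes the local weights for one location and day have already been estimated, and it only shows how four model predictions would be combined and how a relative reduction in cross-validated RMSE is computed. All function names and numeric values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def weighted_ensemble(predictions, weights):
    """Fuse per-model predictions for one location-day using local weights.

    The weights are normalized here, so they need not sum to one.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per model prediction")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * p for w, p in zip(weights, predictions)) / total


def rmse_reduction_percent(rmse_single, rmse_ensemble):
    """Relative RMSE decrease of the ensemble versus a single model, in percent."""
    return 100.0 * (rmse_single - rmse_ensemble) / rmse_single


# Hypothetical PM2.5 predictions (ug/m3) from four models at one location-day,
# with hypothetical local weights standing in for the ensemble model's output.
model_predictions = [12.0, 15.0, 14.0, 18.0]
local_weights = [0.1, 0.4, 0.4, 0.1]
fused = weighted_ensemble(model_predictions, local_weights)
```

In the approach described in the abstract the weights vary over space and time, so a function like `weighted_ensemble` would be called with a different weight vector for each grid cell and day.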

Satellite-derived estimates of aerosol optical depth (AOD) are key
predictors in particulate air pollution models. The multi-step retrieval
algorithms that estimate AOD also produce quality control variables, but these
have not been systematically used to address the measurement error in AOD. We
compare three machine-learning methods, namely random forests, gradient boosting, and
extreme gradient boosting (XGBoost), to characterize and correct measurement
error in the Multi-Angle Implementation of Atmospheric Correction (MAIAC) 1
× 1 km AOD product for Aqua and Terra satellites across the
Northeastern/Mid-Atlantic USA versus collocated measures from 79 ground-based
AERONET stations over 14 years. Models included 52 quality control, land use,
meteorology, and spatially-derived features. Variable importance measures
suggest relative azimuth, AOD uncertainty, and the AOD difference in
30–210 km moving windows are among the most important features for
predicting measurement error. XGBoost outperformed the other machine-learning
approaches, decreasing the root mean squared error in withheld testing data by
43% and 44% for Aqua and Terra, respectively. After correction using XGBoost, the correlation
between collocated AOD and daily PM2.5 monitors across the region
increased by 10 and 9 percentage points for Aqua and Terra, respectively. We demonstrate how
machine learning with quality control and spatial features substantially
improves satellite-derived AOD products for air pollution modeling.
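The correction idea can be sketched in a few lines, under stated assumptions. Here the measurement error is assumed to be defined as satellite AOD minus collocated AERONET AOD, and a one-feature least-squares line stands in for XGBoost purely to keep the example self-contained. The feature (an AOD uncertainty value), the function names, and all numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("feature has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_aod(satellite_aod, feature, slope, intercept):
    """Subtract the predicted measurement error from the satellite AOD."""
    predicted_error = slope * feature + intercept
    return satellite_aod - predicted_error


# Hypothetical collocated training samples.
aod_uncertainty = [0.1, 0.2, 0.3, 0.4]      # quality-control feature
aeronet_aod = [0.10, 0.20, 0.15, 0.30]      # ground-based reference
satellite_aod = [0.16, 0.31, 0.31, 0.51]    # satellite retrievals

# Assumed error definition: satellite minus reference.
errors = [s - a for s, a in zip(satellite_aod, aeronet_aod)]

# Stand-in for the machine-learning model that predicts the error.
slope, intercept = fit_line(aod_uncertainty, errors)

corrected = [
    correct_aod(s, u, slope, intercept)
    for s, u in zip(satellite_aod, aod_uncertainty)
]
```

In practice the error model would be trained on many features and evaluated on withheld data, as the abstract describes; this toy data is constructed so that the error is exactly linear in the single feature.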

Nitrogen dioxide (NO2) remains an important traffic-related
pollutant associated with both short- and long-term health effects.
We aim to model daily average NO2 concentrations in Switzerland
using a multistage framework in which mixed-effect models downscale
satellite measurements and random forest models incorporate local
sources. Spatial and temporal predictor variables include data from
the Ozone Monitoring Instrument, Copernicus Atmosphere Monitoring
Service, land use, and meteorological variables. We derived robust
models explaining ∼58% (R² range, 0.56–0.64) of the variation in measured NO2 concentrations
using mixed-effect models at a 1 × 1 km resolution. The random
forest models explained ∼73% (R² range, 0.70–0.75) of the overall variation in the residuals
at a 100 × 100 m resolution. This is one of the first studies
showing the potential of using earth observation data to develop robust
models with fine-scale spatial (100 × 100 m) and temporal (daily)
variation of NO2 across Switzerland from 2005 to 2016.
The novelty of this study is in demonstrating that methods originally
developed for particulate matter can also successfully be applied
to NO2. The predicted NO2 concentrations will
be made available to facilitate health research in Switzerland.
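The two-stage logic can be made concrete with a small sketch: a first-stage prediction is refined by adding a second-stage prediction of its residuals, and R² is computed as one minus the ratio of residual to total sum of squares. This is only an illustration of the arithmetic, not the study's models. The function names and all values are hypothetical, and the residual predictions are supplied directly rather than produced by a fitted random forest.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have no variance")
    return 1.0 - ss_res / ss_tot


def combine_stages(stage1_pred, residual_pred):
    """Final estimate = coarse first-stage prediction + predicted residual."""
    return [a + b for a, b in zip(stage1_pred, residual_pred)]


# Hypothetical daily NO2 values (ug/m3) at four monitoring sites.
observed = [10.0, 20.0, 30.0, 40.0]
stage1 = [14.0, 18.0, 28.0, 44.0]          # stand-in for the 1 x 1 km stage
residual_pred = [-3.0, 1.0, 2.0, -3.0]     # stand-in for the 100 x 100 m stage

final = combine_stages(stage1, residual_pred)
r2_stage1 = r_squared(observed, stage1)
r2_final = r_squared(observed, final)
```

Note that the second R² reported in the abstract refers to variation in the first-stage residuals, which would correspond to calling `r_squared` on the observed residuals and `residual_pred` rather than on the final estimates.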