Satellite-derived estimates of aerosol optical depth (AOD) are key
predictors in particulate air pollution models. The multi-step retrieval
algorithms that estimate AOD also produce quality control variables, but these
have not been systematically used to address the measurement error in AOD. We
compare three machine-learning methods (random forests, gradient boosting, and
extreme gradient boosting, or XGBoost) to characterize and correct measurement
error in the Multi-Angle Implementation of Atmospheric Correction (MAIAC) 1
× 1 km AOD product for the Aqua and Terra satellites across the
Northeastern/Mid-Atlantic USA relative to collocated measurements from 79 ground-based
AERONET stations over 14 years. Models included 52 quality control, land use,
meteorology, and spatially-derived features. Variable importance measures
suggest relative azimuth, AOD uncertainty, and the AOD difference in
30–210 km moving windows are among the most important features for
predicting measurement error. XGBoost outperformed the other machine-learning
approaches, decreasing the root mean squared error in withheld testing data by
43% and 44% for Aqua and Terra, respectively. After correction using XGBoost, the
correlation between collocated AOD and daily PM2.5 monitor measurements across the
region increased by 10 and 9 percentage points for Aqua and Terra, respectively. We demonstrate how
machine learning with quality control and spatial features substantially
improves satellite-derived AOD products for air pollution modeling.
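
The correction workflow summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the single feature `qc_feature` is hypothetical, measurement error is assumed to be satellite AOD minus the ground reference, and a one-variable least-squares fit stands in for XGBoost with 52 features. It shows only the general pattern of learning the error from covariates, subtracting the predicted error, and comparing RMSE on withheld data.

```python
import math
import random


def rmse(pred, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(0)
n = 400

# Synthetic stand-ins (assumptions): reference AOD from a ground station,
# one quality-control feature, and satellite AOD whose error depends on it.
reference = [rng.uniform(0.05, 0.6) for _ in range(n)]
qc_feature = [rng.uniform(0.0, 1.0) for _ in range(n)]
satellite = [r + 0.2 * f - 0.05 + rng.gauss(0, 0.02)
             for r, f in zip(reference, qc_feature)]

train, test = range(0, 300), range(300, n)

# The modeling target is the measurement error, not AOD itself.
error_train = [satellite[i] - reference[i] for i in train]
intercept, slope = fit_line([qc_feature[i] for i in train], error_train)

# Correct withheld observations by subtracting the predicted error.
corrected = [satellite[i] - (intercept + slope * qc_feature[i]) for i in test]
truth = [reference[i] for i in test]

rmse_raw = rmse([satellite[i] for i in test], truth)
rmse_corrected = rmse(corrected, truth)
reduction_pct = 100 * (1 - rmse_corrected / rmse_raw)

print(f"RMSE raw={rmse_raw:.4f} corrected={rmse_corrected:.4f} "
      f"reduction={reduction_pct:.1f}%")
```

The size of the reduction printed here reflects only how the synthetic error was constructed and says nothing about the 43% and 44% reported for Aqua and Terra.
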
Aim: We compared predictive modeling approaches to estimate placental methylation using cord blood methylation. Materials & methods: We performed locus-specific methylation prediction using both linear regression and support vector machine models with 174 matched pairs of 450k arrays. Results: At most CpG sites, both approaches gave poor predictions in spite of a misleading improvement in array-wide correlation. CpG islands and gene promoters, but not enhancers, were the genomic contexts where the correlation between measured and predicted placental methylation levels achieved higher values. We provide a list of 714 sites where both models achieved an R² ≥ 0.75. Conclusion: The present study indicates the need for caution in interpreting cross-tissue predictions. Few methylation sites can be predicted between cord blood and placenta.
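
One way an array-wide correlation can look strong while site-level prediction stays poor is sketched below. This is an illustration on synthetic data under assumptions of my own, not the study's analysis: each simulated site has its own mean methylation level shared by both tissues, and within a site the two tissues vary independently. Pooling all sites yields a high correlation driven by differences between site means, whereas the per-site R² (for simple linear regression, the squared Pearson correlation) stays near zero. The site and pair counts are arbitrary.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
n_sites, n_pairs = 200, 50  # arbitrary sizes for the illustration

site_r2 = []
pooled_cord, pooled_placenta = [], []

for _ in range(n_sites):
    # Site-specific mean methylation, shared by both tissues (assumption).
    site_mean = rng.uniform(0.1, 0.9)
    # Within a site, the tissues vary independently across individuals,
    # so cord blood carries no information about placenta at that site.
    cord = [site_mean + rng.gauss(0, 0.03) for _ in range(n_pairs)]
    placenta = [site_mean + rng.gauss(0, 0.03) for _ in range(n_pairs)]

    # R^2 of a simple linear regression equals the squared correlation.
    site_r2.append(pearson(cord, placenta) ** 2)
    pooled_cord.extend(cord)
    pooled_placenta.extend(placenta)

array_wide_r = pearson(pooled_cord, pooled_placenta)
median_site_r2 = sorted(site_r2)[n_sites // 2]

print(f"array-wide r = {array_wide_r:.3f}")
print(f"median per-site R^2 = {median_site_r2:.3f}")
```

Under these assumptions the pooled correlation mostly measures how well site means agree across tissues, which is why a per-site metric is the more informative check on locus-specific prediction.
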