Abstract. Low-cost sensing strategies hold the promise of denser air quality monitoring networks, which could significantly improve our understanding of personal air pollution exposure. Additionally, low-cost air quality sensors could be deployed to areas where limited monitoring exists. However, low-cost sensors are frequently sensitive to environmental conditions and pollutant cross-sensitivities, which have historically been poorly addressed by laboratory calibrations, limiting their utility for monitoring. In this study, we investigated different calibration models for the Real-time Affordable Multi-Pollutant (RAMP) sensor package, which measures CO, NO2, O3, and CO2. We explored three methods: (1) laboratory univariate linear regression, (2) empirical multiple linear regression, and (3) machine-learning-based calibration models using random forests (RF). Calibration models were developed for 16-19 RAMP monitors (varied by pollutant) using training and testing windows spanning August 2016 through February 2017 in Pittsburgh, PA, US. The random forest models matched (CO) or significantly outperformed (NO2, CO2, O3) the other calibration models, and their accuracy and precision were robust over time for testing windows of up to 16 weeks. Following calibration, average mean absolute error on the testing data set from the random forest models was 38 ppb for CO (14 % relative error), 10 ppm for CO2 (2 % relative error), 3.5 ppb for NO2 (29 % relative error), and 3.4 ppb for O3 (15 % relative error), and Pearson r versus the reference monitors exceeded 0.8 for most units. Model performance is explored in detail, including a quantification of model variable importance, accuracy across different concentration ranges, and performance in a range of monitoring contexts including the National Ambient Air Quality Standards (NAAQS) and the US EPA Air Sensors Guidebook recommendations of minimum data quality for personal exposure measurement.
A key strength of the RF approach is that it accounts for pollutant cross-sensitivities. This highlights the importance of developing multipollutant sensor packages (as opposed to single-pollutant monitors); we determined this is especially critical for NO2 and CO2. The evaluation reveals that only the RF-calibrated sensors meet the US EPA Air Sensors Guidebook recommendations of minimum data quality for personal exposure measurement. We also demonstrate that the RF-model-calibrated sensors could detect differences in NO2 concentrations between a near-road site and a suburban site less than 1.5 km away. From this study, we conclude that combining RF models with carefully controlled state-of-the-art multipollutant sensor packages as in the RAMP monitors appears to be a very promising approach to address the poor performance that has plagued low-cost air quality sensors.
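The evaluation metrics quoted in the abstract above (mean absolute error, relative error, and Pearson r against a reference monitor) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the sample values are hypothetical, and the definition of relative error as MAE divided by the mean reference concentration is an assumption, since the abstract does not define it.

```python
import math


def mean_absolute_error(reference, calibrated):
    """Average absolute difference between calibrated and reference values."""
    return sum(abs(c - r) for r, c in zip(reference, calibrated)) / len(reference)


def relative_error(reference, calibrated):
    """MAE normalized by the mean reference concentration (assumed definition)."""
    mean_reference = sum(reference) / len(reference)
    return mean_absolute_error(reference, calibrated) / mean_reference


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical O3 values in ppb (made up for illustration, not study data).
reference = [20.0, 30.0, 40.0, 50.0]
calibrated = [22.0, 27.0, 43.0, 48.0]

mae = mean_absolute_error(reference, calibrated)
rel = relative_error(reference, calibrated)
r = pearson_r(reference, calibrated)
print(f"MAE = {mae:.1f} ppb, relative error = {rel:.1%}, r = {r:.3f}")
```

Reporting all three together is useful because MAE alone does not show whether an error is large relative to typical ambient levels, and a high r can coexist with a systematic bias.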
Assessing the intracity spatial distribution and temporal variability in air quality can be facilitated by a dense network of monitoring stations. However, the cost of implementing such a network can be prohibitive if traditional high-quality, expensive monitoring systems are used. To this end, the Real-time Affordable Multi-Pollutant (RAMP) monitor has been developed, which can measure up to five gases including the criteria pollutant gases carbon monoxide (CO), nitrogen dioxide (NO2), and ozone (O3), along with temperature and relative humidity. This study compares various algorithms to calibrate the RAMP measurements including linear and quadratic regression, clustering, neural networks, Gaussian processes, and hybrid random forest-linear regression models. Using data collected by almost 70 RAMP monitors over periods ranging up to 18 months, we recommend the use of limited quadratic regression calibration models for CO, neural network models for NO, and hybrid models for NO2 and O3 for any low-cost monitor using electrochemical sensors similar to those of the RAMP. Furthermore, generalized calibration models may be used instead of individual models with only a small reduction in overall performance. Generalized models also transfer better when the RAMP is deployed to other locations. For long-term deployments, it is recommended that model performance be re-evaluated and new models developed periodically, due to the noticeable change in performance over periods of a year or more. This makes generalized calibration models even more useful since only a subset of deployed monitors are needed to build these new models. These results will help guide future efforts in the calibration and use of low-cost sensor systems worldwide. Published by Copernicus Publications on behalf of the European Geosciences Union.
Abstract. Low-cost sensing strategies hold the promise of denser air quality monitoring networks, which could significantly improve our understanding of personal air pollution exposure. Additionally, low-cost air quality sensors could be deployed to areas where limited monitoring exists. However, low-cost sensors are frequently sensitive to environmental conditions and pollutant cross-sensitivities, which have historically been poorly addressed by laboratory calibrations, limiting their utility for monitoring. In this study, we investigated different calibration models for the Real-time Affordable Multi-Pollutant (RAMP) sensor package, which measures CO, NO2, O3, and CO2. We explored three methods: 1) laboratory univariate linear regression, 2) empirical multivariate linear regression, and 3) machine-learning-based calibration models using random forests (RF). Calibration models were developed for 19 RAMP monitors using training and testing windows spanning August 2016 through February 2017 in Pittsburgh, PA. The random forest models matched (CO) or significantly outperformed (NO2, CO2, O3) the other calibration models, and their accuracy and precision were robust over time for testing windows of up to 16 weeks. Following calibration, average mean absolute error on the testing dataset from the random forest models was 38 ppb for CO (14 % relative error), 10 ppm for CO2 (2 % relative error), 3.5 ppb for NO2 (29 % relative error), and 3.4 ppb for O3 (15 % relative error), and Pearson r versus the reference monitors exceeded 0.8 for most units. Model performance is explored in detail, including a quantification of model variable importance, accuracy across different concentration ranges, and performance in a range of monitoring contexts including the National Ambient Air Quality Standards (NAAQS) and the US EPA Air Sensors Guidebook recommendations of minimum data quality for personal exposure measurement.
A key strength of the RF approach is that it accounts for pollutant cross-sensitivities. This highlights the importance of developing multipollutant sensor packages (as opposed to single-pollutant monitors); we determined this is especially critical for NO2 and CO2. The evaluation reveals that only the RF-calibrated sensors meet the US EPA Air Sensors Guidebook recommendations of minimum data quality for personal exposure measurement. We also demonstrate that the RF-model-calibrated sensors could detect differences in NO2 concentrations between a near-road site and a suburban site less than 1.5 km away. From this study, we conclude that combining RF models with the RAMP monitors appears to be a very promising approach to address the poor performance that has plagued low-cost air quality sensors.
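Why cross-sensitivities favor multipollutant packages can be illustrated with a small synthetic experiment. The sketch below is not the study's calibration model: it uses ordinary least squares (via NumPy) rather than a random forest, and every signal, coefficient, and noise level is an assumption made up for illustration. It compares a univariate calibration that sees only the raw NO2 signal with a multivariate one that also sees a collocated O3 signal, temperature, and relative humidity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic "truth" and environment (all values are made up for illustration).
true_no2 = rng.uniform(5, 40, n)   # ppb, stands in for the reference monitor
o3 = rng.uniform(10, 60, n)        # ppb, stands in for a collocated O3 signal
temp = rng.uniform(0, 30, n)       # degrees C
rh = rng.uniform(20, 90, n)        # percent

# Hypothetical raw NO2 sensor signal: responds to NO2 but also to O3, T and RH.
raw_no2 = 1.5 * true_no2 + 0.6 * o3 + 0.3 * temp + 0.05 * rh + 4.0
raw_no2 += rng.normal(0.0, 0.5, n)


def design(*columns):
    """Stack predictor columns next to an intercept column."""
    return np.column_stack([np.ones(len(columns[0])), *columns])


def fit_and_score(X, y, n_train):
    """Least-squares fit on the first n_train rows, MAE on the remaining rows."""
    coef, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    pred = X[n_train:] @ coef
    return float(np.mean(np.abs(pred - y[n_train:])))


n_train = 350
mae_univariate = fit_and_score(design(raw_no2), true_no2, n_train)
mae_multivariate = fit_and_score(design(raw_no2, o3, temp, rh), true_no2, n_train)

print(f"univariate MAE:   {mae_univariate:.2f} ppb")
print(f"multivariate MAE: {mae_multivariate:.2f} ppb")
```

In this toy setup the interference is linear by construction, so a linear model with the extra predictors removes it almost entirely. Real sensor responses need not be linear, which is one reason a non-parametric model such as a random forest may be preferred.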
Abstract. Assessing the intra-city spatial distribution and temporal variability of air quality can be facilitated by a dense network of monitoring stations. However, the cost of implementing such a network can be prohibitive if traditional high-quality, expensive monitoring systems are used. To this end, the Real-time Affordable Multi-Pollutant (RAMP) monitor has been developed, which can measure up to five gases including the criteria pollutant gases carbon monoxide (CO), nitrogen dioxide (NO2), and ozone (O3), along with temperature and relative humidity. This study compares various algorithms to calibrate the RAMP measurements including linear and quadratic regression, clustering, neural networks, Gaussian processes, and random forests. Using data collected by more than sixty RAMP monitors over periods ranging up to eighteen months, it was found that quadratic regression models or a hybrid of random forest and linear models tend to be the most effective calibration models overall. In specific cases, other types of models can have comparable or even superior performance. Furthermore, generalized calibration models may be used instead of individual models with only a small reduction in overall performance. For long-term deployments, it is recommended that new models be developed each year, due to the noticeable change in performance when models developed for one year were used to process data collected in the subsequent year. This makes annually developed generalized calibration models even more useful since only a subset of deployed monitors are needed to build these models. These results will help guide future efforts in the calibration and use of low-cost sensor systems worldwide.