Abstract. We developed a two-stage model called the random-forest–generalised additive
model (RF–GAM), based on satellite data, meteorological factors, and other
geographical covariates, to predict the surface 8 h O3 concentrations
across the remote Tibetan Plateau. The 10-fold cross-validation result
suggested that RF–GAM showed excellent performance, with the highest
R2 value (0.76) and lowest root-mean-square error (RMSE)
(14.41 µg m−3), compared with seven other machine-learning models. The
predictive performance of RF–GAM showed a significant seasonal
discrepancy, with the highest R2 value observed in summer (0.74),
followed by winter (0.69) and autumn (0.67), and the lowest in spring
(0.64). Additionally, ground-observed O3 data that were not used for model training, collected
from open-access websites, were applied to test the transferability of the
novel model and confirmed that the model was robust in predicting the surface
8 h O3 concentration during other periods (R2 = 0.67, RMSE = 25.68 µg m−3). RF–GAM was then used to predict the daily 8 h
O3 level over the Tibetan Plateau during 2005–2018 for the first time. The
estimated O3 concentration increased slowly,
from 64.74±8.30 µg m−3 in 2005 to 66.45±8.67 µg m−3 in 2015, and then decreased from this peak to 65.87±8.52 µg m−3 during 2015–2018. In addition, the estimated 8 h
O3 concentrations exhibited notable spatial variation, with the highest
values in some cities on the northern Tibetan Plateau, such as Huangnan (73.48±4.53 µg m−3) and Hainan (72.24±5.34 µg m−3), followed by cities in the central region, including Lhasa
(65.99±7.24 µg m−3) and Shigatse (65.15±6.14 µg m−3); the lowest O3 concentration occurred in Aba (55.17±12.77 µg m−3), a city on
the southeastern Tibetan Plateau.
Based on the 8 h O3 critical value (100 µg m−3) provided by
the World Health Organization (WHO), we further estimated the annual mean number of
nonattainment days over the Tibetan Plateau. Most
cities on the Tibetan Plateau had excellent air quality, while several
cities (e.g. Huangnan, Haidong, and Guoluo) still experienced more than
40 nonattainment days each year and warrant more attention in order to
alleviate local O3 pollution. The results shown herein confirm that the
novel hybrid model improves the prediction accuracy and can be applied to
assess the potential health risk, particularly in remote regions with
few monitoring sites.
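
As an illustration of the nonattainment-day estimate described above, the following is a minimal sketch of how the annual mean number of nonattainment days could be counted from daily 8 h O3 estimates. It is not the authors' code. The function name, the input layout (a mapping from date to concentration), and the choice to treat only values strictly above 100 µg m−3 as nonattainment are assumptions made for this example, and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Critical 8 h O3 value quoted in the abstract (µg m−3).
WHO_8H_O3_LIMIT = 100.0


def annual_mean_nonattainment_days(daily_o3):
    """Return the mean number of days per year above the WHO limit.

    daily_o3 maps a datetime.date to the estimated daily 8 h O3
    concentration (µg m−3) for one city or grid cell.
    Assumption: a day is nonattaining only if it is strictly above the limit.
    """
    if not daily_o3:
        raise ValueError("daily_o3 must contain at least one day")
    exceedances = defaultdict(int)
    for day, concentration in daily_o3.items():
        # Adding a bool (0 or 1) also registers years with no exceedance.
        exceedances[day.year] += concentration > WHO_8H_O3_LIMIT
    return sum(exceedances.values()) / len(exceedances)


# Hypothetical values for illustration only; not data from the study.
example = {
    date(2017, 6, 1): 112.0,
    date(2017, 6, 2): 95.5,
    date(2017, 6, 3): 104.3,
    date(2018, 6, 1): 100.0,  # equal to the limit, so not counted
    date(2018, 6, 2): 101.2,
}
print(annual_mean_nonattainment_days(example))
```

In this sketch, years are averaged with equal weight, so a year with no exceedances still lowers the mean as long as it has at least one day of data in the input.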