High tropospheric ozone concentrations exceeding the allowable level have been frequently reported in Malaysia. This study proposes machine learning models to predict tropospheric ozone concentration in major cities in Kuala Lumpur and Selangor, Malaysia. The models were developed using three years of historical data for different parameters as inputs to predict 24-hour and 12-hour tropospheric ozone concentration. Three machine learning algorithms were investigated: Linear Regression, Neural Network and Boosted Decision Tree. The results revealed that wind speed, humidity, nitrogen oxide, carbon monoxide and nitrogen dioxide have a significant influence on ozone formation. Boosted Decision Tree outperformed the Linear Regression and Neural Network algorithms at all stations. Model performance improved when the 12-hour dataset was used instead of the 24-hour dataset, with R² values of 0.91, 0.88 and 0.87 for the three investigated stations. To assess the uncertainties of the Boosted Decision Tree model, the 95% prediction uncertainty (95PPU) and the d-factor were introduced. The 95PPU values were about 94.4%, 93.4% and 96.7%, and the d-factors were 0.001015, 0.001016 and 0.001124, for stations S1, S2 and S3, respectively. The results provide a reliable prediction model that mimics actual ozone concentration at different locations in Malaysia.
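The abstract above reports 95PPU and d-factor values without stating how they were computed. As a minimal sketch, assuming the common convention in which the 95PPU band spans the 2.5th to 97.5th percentiles of an ensemble of predictions, the coverage is the share of observations inside that band, and the d-factor is the mean band width divided by the standard deviation of the observations, the two quantities could be obtained as follows. The function names, the ensemble layout and the use of the population standard deviation are illustrative assumptions, not details taken from the study.

```python
import statistics


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def ppu95_and_d_factor(observed, ensemble):
    """Return (coverage in percent, d-factor).

    ensemble[k][i] is the k-th ensemble member's prediction for time step i
    (hypothetical layout chosen for this sketch).
    """
    n = len(observed)
    lower, upper = [], []
    for i in range(n):
        column = sorted(member[i] for member in ensemble)
        lower.append(percentile(column, 2.5))   # lower edge of the 95PPU band
        upper.append(percentile(column, 97.5))  # upper edge of the 95PPU band

    # Share of observations bracketed by the band.
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    coverage = 100.0 * inside / n

    # Mean band width, normalised by the spread of the observations
    # (population standard deviation is an assumption here).
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    d_factor = mean_width / statistics.pstdev(observed)
    return coverage, d_factor
```

Under this convention, a coverage close to 100% combined with a small d-factor indicates a narrow band that still brackets most observations.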
Accurate prediction of tropospheric ozone (O₃) concentration requires investigating the performance of a variety of artificial intelligence techniques, such as machine learning, deep learning and hybrid models. This research aims to predict the hourly ozone trend effectively with fewer input variables. The prediction uses diverse data on air pollutants (NO₂, NOₓ, CO, SO₂) and meteorological parameters (wind speed and humidity). The historical datasets were collected from three sites in Malaysia. The methodology followed two paths, standalone and hybrid models, in which hourly-averaged datasets were applied under a scenario analysis of five time horizons with different input combinations. For evaluation, all models were tested with five performance indicators and illustrated on a modified Taylor diagram. The sensitivity of the input variables was quantified. Additionally, an uncertainty analysis was conducted to assess the models' confidence level in association with the Willmott Index. Based on R², the results indicated that XGBoost has higher accuracy than MLP and SVR, while LSTM and CNN outperform XGBoost. In terms of robustness and accuracy, the proposed hybrid model performs best among all of the above-mentioned techniques. The proposed model achieved the highest R² (93.48%), the highest 95% confidence degree (98.16%) and a narrower confidence interval width (0.0014195).
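Two of the indicators named above, R² and the Willmott Index, can be illustrated with a short sketch. It assumes that R² means the coefficient of determination (some studies report the squared Pearson correlation instead) and that the Willmott Index is the original index of agreement, d = 1 − Σ(O − P)² / Σ(|P − Ō| + |O − Ō|)². The abstract does not say which variants were used, so the function names and both definitions are assumptions.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def willmott_index(observed, predicted):
    """Willmott index of agreement, original form (assumed variant).

    Ranges from 0 (no agreement) to 1 (perfect agreement).
    """
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    denominator = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator
```

For example, with observations `[1, 2, 3, 4]` and predictions `[2, 2, 3, 3]`, these definitions give R² = 0.6 and a Willmott Index of 0.8, which shows that the two indicators are not interchangeable.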
Accurately predicting meteorological parameters such as air temperature and humidity plays a crucial role in air quality management. This study proposes different machine learning algorithms, namely Gradient Boosting Tree (GBT), Random Forest (RF), Linear Regression (LR) and two artificial neural network (ANN) architectures (multi-layer perceptron, MLP-NN, and radial basis function, RBF-NN), for predicting air temperature (T) and relative humidity (Rh). Daily data over 24 years for the Kuala Terengganu station were obtained from the Malaysia Meteorological Department. The results showed that MLP-NN performed well relative to the other models in predicting daily T and Rh, with R of 0.7132 and 0.633, respectively. For monthly prediction of T, the MLP-NN model also gave a standard deviation closer to that of the actual values and can be used to predict monthly T with R of 0.8462. For monthly Rh, in contrast, the RBF-NN model was more efficient than the other models, with R of 0.7113. To validate the two trained ANN architectures, MLP-NN and RBF-NN, both were applied to an unseen dataset of observations from the region. The results indicated that both ANN architectures have good potential to predict daily and monthly T and Rh values with acceptable accuracy.
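The daily and monthly R values above can be related through a small sketch. It assumes that R is the Pearson correlation coefficient and that monthly series are obtained by averaging daily values within each calendar month. Neither assumption is confirmed by the abstract, and the record layout `(year, month, value)` is hypothetical.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def monthly_means(records):
    """Average daily values per (year, month).

    records: iterable of (year, month, value) tuples (hypothetical layout).
    Returns a dict keyed by (year, month) in chronological order.
    """
    buckets = defaultdict(list)
    for year, month, value in records:
        buckets[(year, month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}
```

Applying `monthly_means` to both the observed and the predicted daily series, and then `pearson_r` to the resulting values, gives a monthly R that can be compared with the daily one.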