Prediction of Air Quality Index Using Machine Learning Techniques: A Comparative Analysis

Gupta, N. Srinivasa; Mohta, Yashvi; Heda, Khyati; Armaan, Raahil; Valarmathi, B.; Arulkumaran, G.

doi:10.1155/2023/4916267

Cited by 35 publications

(5 citation statements)

References 38 publications


“…The AQI [20] of New Delhi, Bangalore, Kolkata, and Hyderabad has been calculated using three different techniques: support vector regression (SVR), random forest regression (RFR), and CatBoost regression (CR). Random forest regression yields lower root mean square error (RMSE) values in Bangalore (0.5674), Kolkata (0.1403), and Hyderabad (0.3826) and higher accuracy in comparison to SVR and CatBoost regression for Kolkata (90.9700%) and Hyderabad (78.3672%), while CatBoost regression yields lower RMSE values in New Delhi (0.2792) and the highest accuracy in New Delhi (79.8622%) and Bangalore (68.6860%).…”

Section: Literature Survey; citation type: mentioning

confidence: 99%

A Machine Learning Approach for Environmental Assessment on Air Quality and Mitigation Strategy

Shetty, Seema, Sowmya et al. 2024, Journal of Engineering


Air pollution has a significant impact on environment resulting in consequences such as global warming and acid rain. Toxic emissions from vehicles are one of the primary sources of pollution. Assessment of air pollution data is critical in order to assist residents in locating the safest areas in the city that are ideal for life. In this work, density-based spatial clustering of applications with noise (DBSCAN) is used which is among the widely used clustering algorithms in machine learning. It is not only capable of finding clusters of various sizes and shapes but can also detect outliers. DBSCAN takes in two important input parameters—Epsilon (Eps) and Minimum Points (MinPts). Even the slightest of variations in the parameter values fed to DBSCAN makes a big difference in the clustering. There is a need to find Eps value in as minimum time as possible. In this work, the goal is to find the Eps value in less time. For this purpose, a search tree technique is used for finding the Eps input to the DBSCAN algorithm. Predicting air pollution is a complex task due to various challenges associated with the dynamic and multifaceted nature of the atmosphere such as meteorological variability, local emissions and sources, data quality and availability, and emerging pollutants. Extensive experiments prove that the search tree approach to find Eps is quicker and efficient in comparison to the widely used KNN algorithm. The time reduction to find Eps makes a significant impact as the dataset size increases. The input parameters are fed to DBSCAN algorithm to obtain clustering results.
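The abstract names the k-nearest-neighbour (KNN) approach as the usual way to choose Eps, but it does not describe the search tree technique in enough detail to reproduce it. As a rough illustration of the KNN baseline only, the sketch below computes each point's distance to its k-th nearest neighbour and sorts the results; Eps is commonly read off where this sorted curve bends sharply. The function name `k_distances` and the sample points are assumptions made for illustration and are not taken from the paper.

```python
import math


def k_distances(points, k):
    """Return every point's distance to its k-th nearest neighbour, sorted ascending.

    Brute-force O(n^2) version, meant only to show the idea on small inputs.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Hypothetical 2-D readings: three close points and one far outlier.
sample = [(0, 0), (0, 1), (1, 0), (10, 10)]
curve = k_distances(sample, k=1)
```

In this toy input the last value in `curve` is much larger than the others, which is the kind of jump that the k-distance heuristic uses to separate dense regions from outliers. The quadratic cost of this brute-force version is consistent with the abstract's point that finding Eps becomes expensive as the dataset grows.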


“…In the RF, each node is split using the optimal splitter chosen from a subset of predictors. At every node, random predictors are utilized, and this element of randomness offers overfit protection (Alamsyah & Salma, 2018; Schonlau & Zou, 2020; Yarragunta et al, 2021; Hai et al, 2022; Benifa et al, 2022; Ravindiran et al, 2023; Baladjay et al, 2023; Gupta et al, 2023; Elvin, 2024; Aram et al, 2024). When presented with new data, each DT makes its own prediction.…”

Section: Random Forest (Rf); citation type: mentioning

confidence: 99%

“…Following this, the testing dataset is employed to introduce novel inputs to the system, thereby evaluating its precision and efficacy. This testing phase holds significant importance as it verifies the model's capability to apply learned knowledge to novel or previously unseen data (Ameer et al, 2019;Simu et al, 2020;Yarragunta et al, 2021;Gupta et al, 2023). Through this study endeavour, it is expected to identify the most accurate predictive model for forecasting employee turnover using air pollution data, while also contributing valuable insights for future research endeavours in this field.…”

Section: Introduction; citation type: mentioning

confidence: 96%

Knowledge Management Approach in Comparative Study of Air Pollution Prediction Model

Rohajawati, Setyodewi, Tresnanto et al. 2024, Appl. Comput. Sci.


This study utilizes knowledge management (KM) to highlight a documentation-centric approach that is enhanced through artificial intelligence. Knowledge management can improve the decision-making process for predicting models that involved datasets, such as air pollution. Currently, air pollution has become a serious global issue, impacting almost every major city worldwide. As the capital and a central hub for various activities, Jakarta experiences heightened levels of activity, resulting in increased vehicular traffic and elevated air pollution levels. The comparative study aims to measure the accuracy levels of the naïve bayes, decision trees, and random forest prediction models. Additionally, the study uses evaluation measurements to assess how well the machine learning performs, utilizing a confusion matrix. The dataset’s duration is three years, from 2019 until 2021, obtained through Jakarta Open Data. The study found that the random forest achieved the best results with an accuracy rate of 94%, followed by the decision tree at 93%, and the naïve bayes had the lowest at 81%. Hence, the random forest emerges as a reliable predictive model for prediction of air pollution.
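The abstract reports accuracy figures derived from a confusion matrix. As a minimal sketch of that calculation, assuming categorical air-quality labels, the code below counts (actual, predicted) pairs and divides the correctly classified count by the total. The label names and the helper names `confusion_matrix` and `accuracy` are hypothetical and do not come from the study or from any library.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return Counter(zip(y_true, y_pred))


def accuracy(cm):
    """Share of samples on the diagonal, i.e. where actual == predicted."""
    total = sum(cm.values())
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(n for (actual, pred), n in cm.items() if actual == pred)
    return correct / total


# Hypothetical labels, not data from the study.
actual = ["good", "good", "moderate", "unhealthy"]
predicted = ["good", "moderate", "moderate", "unhealthy"]
cm = confusion_matrix(actual, predicted)
acc = accuracy(cm)
```

Running the same two functions on the predictions of each model would give comparable accuracy figures of the kind the abstract lists for the three classifiers.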


“…Air quality prediction accuracy could be enhanced by using advanced machine learning algorithms like CatBoost [21]. CatBoost is a type of gradient-boosting algorithm adept at working with complex, high-dimensional data sets like those used for modeling urban air quality [22].…”

Section: Introduction; citation type: mentioning

confidence: 99%

Urban Air Quality Classification Using Machine Learning Approach to Enhance Environmental Monitoring

Idroes, Noviandy, Maulana et al. 2023, Leuser J. Environ. Stud.


Urban areas worldwide grapple with environmental challenges, notably air pollution. DKI Jakarta, Indonesia's capital city, is emblematic of this struggle, where rapid urbanization contributes to increased pollutants. This study employed the CatBoost machine learning algorithm, known for its resistance to overfitting and capability to handle missing data, to predict urban air quality based on pollutant levels from 2010 to 2021. The dataset, sourced from Jakarta's air quality monitoring stations, includes pollutants such as PM10, SO2, CO, O3, and NO2. After preprocessing, we used 80% of the data for training and 20% for testing. The model displayed high accuracy (0.9781), precision (0.9722), and recall (0.9728). The feature importance chart revealed O3 (Ozone) as the top influencer of air quality predictions, followed by PM10. Our findings highlight the dominant pollutants affecting urban air quality in Jakarta, Indonesia and emphasizing the need for targeted strategies to reduce their concentrations and ensure a cleaner and healthier urban environment.
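The abstract describes an 80%/20% train/test split and reports precision and recall, without stating how the split was drawn or how the metrics were averaged across classes. The sketch below shows one plausible reading under stated assumptions: a seeded random shuffle for the split, and precision and recall computed for a single class treated as positive. The helper names `split_train_test` and `precision_recall` and the sample labels are hypothetical.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical example: ten record ids split 80/20, then metrics on toy labels.
train, test = split_train_test(range(10))
prec, rec = precision_recall(["a", "a", "b", "b"], ["a", "b", "b", "b"], "b")
```

For time-ordered monitoring data a chronological split may be more appropriate than a random shuffle; the abstract does not say which was used.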
