2022
DOI: 10.3390/app12136465

A Method of Pruning and Random Replacing of Known Values for Comparing Missing Data Imputation Models for Incomplete Air Quality Time Series

Abstract: The data obtained from air quality monitoring stations, which are used to carry out studies using data mining techniques, present the problem of missing values. This paper describes a research work on missing data imputation. Among the most common methods, the method that best imputes values to the available data set is analysed. It uses an algorithm that randomly replaces all known values in a dataset once with imputed values and compares them with the actual known values, forming several subsets. Data from s…
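
The evaluation idea in the abstract (hide each known value once, impute it, and compare against the true value across several subsets) can be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: the function names `interpolate_gaps` and `prune_and_replace_rmse`, the `n_subsets` parameter, the use of linear interpolation as a stand-in imputer, and the choice of RMSE as the score are all assumptions made for the example.

```python
import math
import random


def interpolate_gaps(values):
    """Stand-in imputer (assumption): fill None entries by linear interpolation.

    Gaps at the start or end of the series copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def prune_and_replace_rmse(series, impute, n_subsets=5, seed=0):
    """Hide every known value exactly once, impute it, and return the RMSE.

    The known positions are shuffled and split into n_subsets disjoint
    groups; each group is pruned (set to None) in turn and re-imputed.
    """
    if n_subsets < 2:
        raise ValueError("n_subsets must be at least 2")
    rng = random.Random(seed)
    known = [i for i, v in enumerate(series) if v is not None]
    if len(known) < 2:
        raise ValueError("need at least two known values")
    rng.shuffle(known)
    groups = [known[k::n_subsets] for k in range(n_subsets)]

    squared_errors = []
    for group in groups:
        if not group:
            continue
        pruned = list(series)
        for i in group:
            pruned[i] = None  # prune a known value
        imputed = impute(pruned)
        # Compare the imputed replacement with the actual known value.
        squared_errors.extend((imputed[i] - series[i]) ** 2 for i in group)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical hourly readings with two genuinely missing values.
    readings = [12.0, 13.5, None, 15.0, 14.2, 13.8, None, 12.9, 12.1, 11.7]
    print(prune_and_replace_rmse(readings, interpolate_gaps, n_subsets=4))
```

Any other imputer with the same list-in, list-out shape could be passed as `impute`, so several candidate models can be scored on the same pruned subsets and ranked by their error.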

Citations: cited by 5 publications (4 citation statements)
References: 57 publications (66 reference statements)
“…Kalman prediction is used for various types of physiological data like heart rate variability and body weight variability [69][70][71] as well as for other time series data [72]. Luis Alfonso et al [73] used data imputation techniques on air quality data and evaluated the results using RMSE values. The study compared the performance of the Kalman smoothing algorithm with other imputation methods such as kNN and RF (Random Forest).…”
Section: Kalman Prediction
Citation type: mentioning
confidence: 99%
“…Additionally, this research introduces a random pruning and replacement method for known values to compare missing data imputation models. This approach offers a practical means to evaluate and choose appropriate imputation models for incomplete datasets, potentially proving valuable in the context of recovering lost microclimate data in the UB Forest [12].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Anomaly detection frameworks for wearable device data, emphasizing the increasing use of wearable devices in clinical studies, demonstrate the need for robust imputation methods in this domain [15]. Furthermore, proposes a random pruning and replacement method of known values to compare missing data imputation models, providing insight into the behavior of different imputation methods for incomplete air quality time series [12]. These studies collectively underscore the need for efficient, accurate, and domain-specific imputation methods to address missing data across various fields, from proteomics to health records and environmental sensing.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…They compared and analyzed the prediction performance of three ML algorithms and discovered that the RF algorithm was the most accurate in predicting meteorological factors and air pollutants. Menéndez García et al [21] designed an RF model to forecast the air quality of Swiss meteorological factors and air pollutants. They proved that RF had excellent accuracy in air quality prediction.…”
Section: Introduction
Citation type: mentioning
confidence: 99%