2020
DOI: 10.14778/3377369.3377383

Mind the gap

Abstract: Recording sensor data is seldom a perfect process. Failures in power, communication or storage can leave occasional blocks of data missing, affecting not only real-time monitoring but also compromising the quality of near- and off-line data analysis. Several recovery (imputation) algorithms have been proposed to replace missing blocks. Unfortunately, little is known about their relative performance, as existing comparisons are limited to either a small subset of relevant algorithms or to very few datasets or o…

Cited by 54 publications (16 citation statements)
References 52 publications
“…The imputation performance of each method was evaluated by measuring the MAE and RMSE of the imputed and original data. MAE and RMSE are defined as: (5) In addition, the proposed method was compared with the experimental results of existing methods, such as locf, nocb, nearest, linear, spline, mean, median, and knn. In the initial parameter-setting stage, the window size was set to five for the mean and median methods using the windows.…”
Section: Results (mentioning)
confidence: 99%
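
The equation referenced as (5) is not reproduced in the excerpt above. As a minimal sketch, assuming the standard definitions MAE = (1/n) Σ|x_i − x̂_i| and RMSE = sqrt((1/n) Σ(x_i − x̂_i)²), the evaluation could look like the following. The helper names (`locf`, `nocb`, `mae`, `rmse`) and the choice to score only the positions that were missing are assumptions made for illustration, not details taken from the citing paper.

```python
import math

def locf(series):
    """Last observation carried forward; leading gaps stay None."""
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out

def nocb(series):
    """Next observation carried backward; trailing gaps stay None."""
    return locf(series[::-1])[::-1]

def mae(truth, imputed, mask):
    """Mean absolute error over the positions flagged as missing."""
    errs = [abs(t - p) for t, p, m in zip(truth, imputed, mask) if m]
    return sum(errs) / len(errs)

def rmse(truth, imputed, mask):
    """Root mean squared error over the positions flagged as missing."""
    sq = [(t - p) ** 2 for t, p, m in zip(truth, imputed, mask) if m]
    return math.sqrt(sum(sq) / len(sq))

# Toy series with a missing block of length two (hypothetical data).
truth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [1.0, 2.0, None, None, 5.0, 6.0]
mask = [v is None for v in observed]

for name, method in (("locf", locf), ("nocb", nocb)):
    filled = method(observed)
    print(name, mae(truth, filled, mask), rmse(truth, filled, mask))
```

Scoring only the masked positions keeps the observed values, which every method reproduces exactly, from diluting the error.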
“…Missing values can lead to biased results and affect the performance of machine learning algorithms [1,3,4]. In particular, "blackouts" are extreme missing scenarios, in which all the sensors are quiet simultaneously, causing widespread and aligned missing blocks [5]. Until recently, few algorithms have imputed missing blocks with high accuracy in blackouts [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…Moreover, these methods do not contend well with high data sparsity. In addition, there exist deep learning studies for imputating multi-variate time series data [10], [11], [13], [17], [36], [44], [56]. For instance, Che et al [13] incorporate masking and time-lag mechanisms into a vanilla GRU and impute nulls based on a weighted combination of the last observation and a global mean.…”
Section: Bisim (mentioning)
confidence: 99%
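
The "weighted combination of the last observation and a global mean" described above can be illustrated with a small sketch. It assumes a unit sampling interval, so the time lag is simply the number of steps since the last observed value, and it uses a fixed `decay_rate` where the GRU-based model would learn its decay parameters. The function name `decay_impute` and its parameters are hypothetical and are not taken from Che et al.

```python
import math

def decay_impute(series, global_mean, decay_rate=0.5):
    """Fill None entries with gamma * last + (1 - gamma) * global_mean,
    where gamma = exp(-decay_rate * gap) shrinks as the gap grows."""
    out, last, gap = [], None, 0
    for v in series:
        if v is not None:
            last, gap = v, 0          # observed value resets the time lag
            out.append(v)
            continue
        gap += 1
        if last is None:
            out.append(global_mean)   # nothing observed yet: fall back to the mean
        else:
            gamma = math.exp(-decay_rate * gap)
            out.append(gamma * last + (1.0 - gamma) * global_mean)
    return out

# With decay_rate = ln 2 the weight on the last observation halves each step.
filled = decay_impute([None, 10.0, None, None], global_mean=0.0,
                      decay_rate=math.log(2))
print(filled)
```

Short gaps stay close to the last observation, while long gaps drift toward the global mean, which is the behaviour the citation statement attributes to the masking and time-lag mechanisms.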
“…Notably, their study also evaluated performance in downstream Machine Learning (ML) tasks, finding that the imputation rendered a 10–20% performance increase. Khayati et al [20] focused on sensor time series imputation, comparing 16 recovery algorithms on six public and two synthetic datasets, including block missings, which are more reflective of WSN data characteristics. Their findings suggested that the optimal recovery method often depends on dataset-specific characteristics.…”
Section: Introduction (mentioning)
confidence: 99%