Imputation of Missing Values in Time Series Using an Adaptive-Learned Median-Filled Deep Autoencoder

Pan, Zhuofu; Wang, Yalin; Wang, Kai; Chen, Hongtian; Yang, Chunhua; Gui, Weihua

doi:10.1109/tcyb.2022.3167995

Cited by 38 publications

(13 citation statements)

References 29 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Some other works (Che et al 2018;Cao et al 2018;Tang et al 2020) integrate the missing prediction as an intermediate step in time series prediction. In terms of auto-encoder-based methods, some other works (Miranda et al 2011(Miranda et al , 2012Pan et al 2022) take the missing parts as random noises and recover the missing value with the output of the delicately-designed auto-encoder. Moreover, with the recent advancement of generative adversarial theory, many works(Weihan 2020; Luo et al 2019;Yoon, Jordon, and Schaar 2018) follow the basic generative adversarial idea with the utilization of deep learning neural networks to train the specifically structured generators and discriminators and generate the value of the missing parts.…”

Section: Related Work Missing Data Imputationmentioning

confidence: 99%

DBT-DMAE: An Effective Multivariate Time Series Pre-Train Model under Missing Data

Zhang¹,

Yang²,

Li³

2022

Preprint

View full text Add to dashboard Cite

Multivariate time series(MTS) is a universal data type related to many practical applications. However, MTS suffers from missing data problems, which leads to degradation or even collapse of the downstream tasks, such as prediction and classification. The concurrent missing data handling procedures could inevitably arouse the biased estimation and redundancy-training problem when encountering multiple downstream tasks. This paper presents a universally applicable MTS pre-train model, DBT-DMAE, to conquer the abovementioned obstacle. First, a missing representation module is designed by introducing dynamic positional embedding and random masking processing to characterize the missing symptom. Second, we proposed an auto-encoder structure to obtain the generalized MTS encoded representation utilizing an ameliorated TCN structure called dynamic-bidirectional-TCN as the basic unit, which integrates the dynamic kernel and time-fliping trick to draw temporal features effectively. Finally, the overall feed-in and loss strategy is established to ensure the adequate training of the whole model. Comparative experiment results manifest that the DBT-DMAE outperforms the other state-of-the-art methods in six real-world datasets and two different downstream tasks. Moreover, ablation and interpretability experiments are delivered to verify the validity of DBT-DMAE's substructures.

show abstract

Section: Related Work Missing Data Imputationmentioning

confidence: 99%

DBT-DMAE: An Effective Multivariate Time Series Pre-Train Model under Missing Data

Zhang¹,

Yang²,

Li³

2022

Preprint

View full text Add to dashboard Cite

show abstract

“…State-of-the-art imputation methods include principal component analysis (PCA) [9], [10] based on machine learning, expectation maximization based on statistical methods, and autoencoders and generative adversarial nets (GAN) based on deep learning [11]- [15]. Each method is more applicable than the others in certain situations based on its advantages and disadvantages.…”

Section: Introductionmentioning

confidence: 99%

“…For example, expectation maximization requires assumptions about data distribution and cannot be applied to a dataset with a mixture of continuous and categorical variables [16], whereas an autoencoder can be used for estimating missing data when part of the dataset is missing, but requires a complete dataset for training [17]. On the other hand, generative adversarial imputation nets (GAIN), the latest technique for data imputation based on GAN, exhibit excellent data imputation performance even when complete data are unavailable [11]- [15].…”

Section: Introductionmentioning

confidence: 99%

Missing Data Imputation Algorithm for Transmission Systems Based on Multivariate Imputation With Principal Component Analysis

et al. 2022

View full text Add to dashboard Cite

As the importance of utility condition is increasingly acknowledged, the use of asset management technologies in the electric power industry has rapidly grown. The global trend of asset management follows risk management that accounts for the probability and consequences of failures. Because asset management systems tend to be composed of various legacy systems, each of which is installed and designed to collect data according to a certain data type and acquisition purpose, it is necessary to develop a system that cleans and integrates data acquired from each legacy system. This study explores the development of an asset management system for a transmission system as a representative linear asset consisting of different segments in a sequence. First, the configurations and characteristics of linear asset datasets are analyzed. Second, an automatic data cleaning system, equipped with six types of data cleaning functions for extracting dirty data from entire datasets, is proposed. An algorithm for data imputation, which is essential for estimating the remaining useful life, is developed based on principal component analysis-iterative algorithm (PCA-IA). Afterward, the performance of the proposed system is verified using actual data with the help of the Korea Electric Power Corporation (KEPCO). Specifically, to evaluate the performance of the proposed system, an automatic cleaning process is demonstrated using actual legacy datasets.

show abstract

“…However, they are powerless in the face of deep learning-based observers and classifiers structured by complex multilayer nonlinearities. [16][17][18][19] In the field of explainable artificial intelligence, 20,21 attribution algorithms 22 are techniques for enhancing the interpretability of deep networks. It aims to evaluate the contribution of network inputs to the outputs based on the learned knowledge from a well-trained classification model.…”

Section: Introductionmentioning

confidence: 99%

“…These methods provide effective FI for shallow learning‐based fault detection indices. However, they are powerless in the face of deep learning‐based observers and classifiers structured by complex multilayer nonlinearities 16‐19 …”

Section: Introductionmentioning

confidence: 99%

Layer‐wise contribution‐filtered propagation for deep learning‐based fault isolation

Pan

Wang

et al. 2022

Intl J Robust & Nonlinear

Self Cite

View full text Add to dashboard Cite

Deep learning is gradually mainstreaming into data‐driven methods, relying on the advantages of extracting complicated nonlinear features. However, the black‐box property makes its decision rules non‐transparent, resulting in difficulty in attribution tasks, which aim to backtrack the contribution of network inputs to the outputs. Fault isolation and localization are techniques for diagnosing the root cause of system failures, which have a consistent objective with attribution for a deep learning‐based fault observer or classifier. Unfortunately, most fault isolation methods are based on shallow learning methods. Also, many attribution algorithms are linear without considering the influence of nonlinear activation functions. The related concerns motivate us to propose a new approach, namely layer‐wise contribution‐filtered propagation (LCP), for deep learning‐based fault isolation. In LCP, reasonable contributions are defined based on the influence of each layer input on maximizing the absolute output activation. A symbolic function is designed to identify neurons with negative contributions, which are then filtered and forbidden to backpropagate to the previous layer. By guiding correct attribution, LCP is available for any nonlinear activation functions and their combinations. It also provides a solution for fault isolation with stacked sample inputs, in which one single variable has several attributions associated with different times. Finally, two chemical simulations verify the effectiveness of the proposed method.

show abstract

Imputation of Missing Values in Time Series Using an Adaptive-Learned Median-Filled Deep Autoencoder

Cited by 38 publications

References 29 publications

DBT-DMAE: An Effective Multivariate Time Series Pre-Train Model under Missing Data

DBT-DMAE: An Effective Multivariate Time Series Pre-Train Model under Missing Data

Missing Data Imputation Algorithm for Transmission Systems Based on Multivariate Imputation With Principal Component Analysis

Layer‐wise contribution‐filtered propagation for deep learning‐based fault isolation

Contact Info

Product

Resources

About