An Integrative DTW-based imputation method for gene expression time series data

Kostadinova, Elena; Boeva, Veselka; Boneva, Liliana; Tsiporkova, Elena

doi:10.1109/is.2012.6335145

Cited by 7 publications

(5 citation statements)

References 20 publications

“…The results showed that the combination of MICE and RF was more efficient than original methods for multivariate imputation. K-Nearest Neighbors (k-NN)-based imputation is also a popular method for completing missing values such as [11, 26, 27, 30-32]. This approach identifies the most similar patterns in the space of available features to impute missing data.…”

Section: Classical Multivariate Imputation Methods (mentioning)

confidence: 99%
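
As a rough illustration of the k-NN idea in the statement above, the following Python sketch fills each missing entry with the mean of that feature over the k most similar rows. The function name `knn_impute`, the distance (root mean squared difference over the features observed in both rows), and the toy data are assumptions made for this example; they are not taken from the cited papers.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows where each None is replaced by the mean of that
    feature over the k nearest rows that observe it (hypothetical helper)."""

    def distance(a, b):
        # Compare only the features that are observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which feature j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy data (assumed): the second row is missing its third feature.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [8.0, 9.0, 10.0],
]
filled = knn_impute(data, k=2)
```

Here the two rows closest to the incomplete one are the first and the third, so the missing entry becomes the mean of their third features; the distant fourth row does not contribute.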

“…Step a and Step b for data
(32) return position of -the most similar window to
(33) end for
(34) Replace the missing values at the position by average vector of the window after and the one previous
(35) end for
(36) end for
(37) return -imputed time series
This makes it possible to find out windows that have the most similar dynamics and shape to the queries. …”

Section: Fuzzy-weighted Similarity Measure Between Subsequences (mentioning)

confidence: 99%

A New Fuzzy Logic-Based Similarity Measure Applied to Large Gap Imputation for Uncorrelated Multivariate Time Series

Phan

Bigand

Caillault

2018

Applied Computational Intelligence and Soft Computing

The completion of missing values is a prevalent problem in many domains of pattern recognition and signal processing. Analyzing incomplete data may lead to a loss of power and unreliable results, especially for large missing subsequence(s). Therefore, this paper aims to introduce a new approach for filling successive missing values in low/uncorrelated multivariate time series which allows managing a high level of uncertainty. To this end, we propose using a novel fuzzy weighting-based similarity measure. The proposed method involves three main steps. Firstly, for each incomplete signal, the data before a gap and the data after this gap are considered as two separate reference time series, each with its own query window. We then find the most similar subsequence to the subsequence before this gap and the most similar one to the subsequence after the gap. To find these similar windows, we build a new similarity measure based on fuzzy grades of basic similarity measures and on fuzzy logic rules. Finally, we fill in the gap with the average values of the window following the first of these matches and the one preceding the second. The experimental results have demonstrated that the proposed approach outperforms the state-of-the-art methods in case of multivariate time series having low/noncorrelated data but effective information on each signal.
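
The final averaging step described in this abstract can be sketched as follows. This is a minimal illustration only: the fuzzy similarity measure itself is not reproduced, so the two match positions are simply passed in by the caller. The function name `average_fill`, its parameters, the choice of windows as long as the gap, and the toy series are all assumptions made for the example.

```python
def average_fill(series, gap_start, gap_len, pos_before, pos_after):
    """Fill a gap with the element-wise mean of two windows (hypothetical helper).

    pos_before: start of the window judged most similar to the query before the gap.
    pos_after:  start of the window judged most similar to the query after the gap.
    Both windows are assumed to have the same length as the gap.
    """
    if pos_after < gap_len:
        raise ValueError("no full window precedes pos_after")
    # Window that follows the match of the pre-gap query.
    following = series[pos_before + gap_len:pos_before + 2 * gap_len]
    # Window that precedes the match of the post-gap query.
    preceding = series[pos_after - gap_len:pos_after]
    if len(following) != gap_len or len(preceding) != gap_len:
        raise ValueError("match windows must leave room for a full neighbour window")
    filled = list(series)
    filled[gap_start:gap_start + gap_len] = [
        (f + p) / 2 for f, p in zip(following, preceding)
    ]
    return filled


# Toy signal (assumed): the gap sits at indices 4 and 5.
series = [1.0, 2.0, 3.0, 4.0, None, None, 7.0, 8.0,
          3.0, 4.0, 5.2, 6.2, 4.8, 5.8, 7.0, 8.0]
# Assumed matches: [3.0, 4.0] at index 8 resembles the pre-gap query,
# and [7.0, 8.0] at index 14 resembles the post-gap query.
filled = average_fill(series, gap_start=4, gap_len=2, pos_before=8, pos_after=14)
```

In this toy case the gap is filled with the mean of `[5.2, 6.2]` and `[4.8, 5.8]`. In the method the abstract describes, the two positions would come from the fuzzy-weighted similarity search rather than being supplied by hand.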

“…The abovementioned approaches are widely used in various fields, such as multimedia, healthcare, and finance [29]. These approaches have been applied in major research topics including earthquake prediction [30], terrestrial ecosystem dynamics [31], stock-price data, exchange-rate analysis [32], and bioinformatics [33]. However, they have not been specialized for binary time-series.…”

Section: Introduction (mentioning)

confidence: 99%

Novel Features for Binary Time Series Based on Branch Length Similarity Entropy

Lee

Park

2021

Entropy

Branch length similarity (BLS) entropy is defined in a network consisting of a single node and branches. In this study, we mapped the binary time-series signal to the circumference of the time circle so that the BLS entropy can be calculated for the binary time-series. We obtained the BLS entropy values for “1” signals on the time circle. The set of these values is the BLS entropy profile. We selected the local maximum (minimum) point, slope, and inflection point of the entropy profile as the characteristic features of the binary time-series and investigated their significance. The local maximum (minimum) point indicates the time at which the rate of change in the signal density becomes zero. The slope and inflection points correspond to the degree of change in the signal density and the time at which the signal density changes occur, respectively. Moreover, we show that the characteristic features can be widely used in binary time-series analysis by characterizing the movement trajectory of Caenorhabditis elegans. We also mention the problems that need to be explored mathematically in relation to the features and propose candidates for additional features based on the BLS entropy profile.

“…DTW-cost is used as distance metric instead of pointwise distance measurements. Kostadinova et al. [27] proposed an Integrative DTW-Based Imputation algorithm that is particularly suited for the estimation of missing values in gene expression time series data using multiple related information in datasets. This algorithm identifies an appropriate set of estimation matrices by using DTW-cost distance in order to measure similarities between gene expression matrices.…”

Section: Introduction (mentioning)

confidence: 99%

Which DTW method applied to marine univariate time series imputation

Phan

Caillault

Lefebvre

et al. 2017

OCEANS 2017 - Aberdeen

Abstract: Missing data are ubiquitous in all domains of applied sciences. Processing datasets containing missing values can lead to a loss of efficiency and unreliable results, especially for large missing sub-sequence(s). Therefore, the aim of this paper is to build a framework for filling missing values in univariate time series and to perform a comparison of different similarity metrics used for the imputation task. This allows us to suggest the most suitable methods for the imputation of marine univariate time series. In the first step, the missing data are completed on various mono-dimensional time series. To fill a missing sub-sequence (gap) in a time series, we first find the most similar sub-sequence to the sub-sequence before (resp. after) this gap according to a Dynamic Time Warping (DTW)-cost. Then we complete the gap with the next (resp. previous) sub-sequence of the most similar one. Through experimental results on 5 different datasets, we conclude that i) DTW gives the best results when considering the accuracy of imputation values and ii) the Adaptive Feature Based DTW (AFBDTW) metric yields imputation values whose shape is very similar to that of the true values.
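
The gap-filling procedure outlined in this abstract can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses the textbook dynamic-programming DTW with an absolute-difference local cost, takes a query window as long as the gap, searches only the fully observed data before the gap, and handles only the "before the gap" direction. The names `dtw_cost` and `fill_gap_dtw` and the toy series are hypothetical.

```python
def dtw_cost(a, b):
    """Classic O(len(a) * len(b)) DTW cost with |x - y| as the local distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def fill_gap_dtw(series, gap_start, gap_len):
    """Copy into the gap the window that follows the best DTW match of the
    sub-sequence just before the gap (assumes all earlier values are observed)."""
    query = series[gap_start - gap_len:gap_start]
    best_pos, best_cost = None, float("inf")
    # A candidate window and the window following it must both end before the gap.
    for pos in range(0, gap_start - 2 * gap_len + 1):
        cost = dtw_cost(query, series[pos:pos + gap_len])
        if cost < best_cost:
            best_pos, best_cost = pos, cost
    if best_pos is None:
        raise ValueError("not enough data before the gap to search")
    filled = list(series)
    filled[gap_start:gap_start + gap_len] = series[best_pos + gap_len:best_pos + 2 * gap_len]
    return filled


# DTW tolerates a time shift that a pointwise distance would penalize.
shifted = dtw_cost([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])

# Toy periodic series (assumed) with a gap of length 2 at indices 10 and 11.
series = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, None, None, 0, 1, 2, 3]
filled = fill_gap_dtw(series, gap_start=10, gap_len=2)
```

On the toy series the query `[0, 1]` matches an earlier `[0, 1]` exactly, so the gap receives the `[2, 3]` that followed it. The symmetric variant (matching the sub-sequence after the gap and copying the window that precedes the match) follows the same pattern.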
