Son Hoang Dau scite author profile

Abstract: Linear erasure codes with local repairability are desirable for distributed data storage systems. An [n, k, d] code having all-symbol (r, δ)-locality, denoted as (r, δ)_a, is considered optimal if it also meets the minimum Hamming distance bound. The existing results on the existence and the construction of optimal (r, δ)_a codes are limited to only the special case of δ = 2, and to only two small regions within this special case, namely, m = 0 or m ≥ (v + δ − 1) > (δ − 1), where m = n mod (r + δ − 1) and v = k mod r. This paper investigates the existence conditions and presents deterministic constructive algorithms for optimal (r, δ)_a codes with general r and δ. First, a structure theorem is derived for general optimal (r, δ)_a codes, which helps illuminate some of their structural properties. Next, the entire problem space with arbitrary n, k, r and δ is divided into eight different cases (regions) with regard to the specific relations of these parameters. For two cases, it is rigorously proved that no optimal (r, δ)_a code can exist. For four other cases, optimal (r, δ)_a codes are shown to exist, deterministic constructions are proposed, and a lower bound on the field size required for these algorithms to work is provided. Our new constructive algorithms not only cover more cases, but for the same cases where previous algorithms exist, the new constructions require a considerably smaller field, which translates to potentially lower computational complexity. Our findings substantially enrich the knowledge on (r, δ)_a codes, leaving only two cases in which the existence of optimal codes is yet to be determined.
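As a rough illustration of the parameters named in this abstract, the Python sketch below computes m = n mod (r + δ − 1) and v = k mod r, and checks whether a parameter set falls in the two previously studied regions. The distance bound itself is not spelled out in the abstract; the code assumes the commonly cited form d ≤ n − k + 1 − (⌈k/r⌉ − 1)(δ − 1). The function names are hypothetical and are not taken from the paper.

```python
def lrc_parameters(n, k, r, delta):
    """Return (m, v, d_bound) for an [n, k] code with (r, delta) locality.

    d_bound assumes the bound d <= n - k + 1 - (ceil(k/r) - 1) * (delta - 1),
    which the abstract refers to but does not state explicitly.
    """
    if not (1 <= r <= k <= n and delta >= 2):
        raise ValueError("expected 1 <= r <= k <= n and delta >= 2")
    m = n % (r + delta - 1)          # m = n mod (r + delta - 1)
    v = k % r                        # v = k mod r
    ceil_k_over_r = -(-k // r)       # integer ceiling of k / r
    d_bound = n - k + 1 - (ceil_k_over_r - 1) * (delta - 1)
    return m, v, d_bound


def in_previously_studied_region(m, v, delta):
    """True if m = 0, or m >= v + delta - 1 > delta - 1 (the latter forces v > 0)."""
    return m == 0 or (v > 0 and m >= v + delta - 1)


if __name__ == "__main__":
    m, v, d_bound = lrc_parameters(n=12, k=6, r=3, delta=2)
    print(m, v, d_bound, in_previously_studied_region(m, v, 2))
```

The helper only classifies parameters; it does not construct a code or reproduce the paper's eight-case division.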

Error Correction for Index Coding With Side Information

Dau

Skachek

Chee

2013

IEEE Trans. Inform. Theory

129

Abstract: A problem of index coding with side information was first considered by Y. Birk and T. Kol (IEEE INFOCOM, 1998). In the present work, a generalization of the index coding scheme, where transmitted symbols are subject to errors, is studied. Error-correcting methods for such a scheme, and their parameters, are investigated. In particular, the following question is discussed: given the side information hypergraph of an index coding scheme and the maximal number of erroneous symbols δ, what is the shortest length of a linear index code, such that every receiver is able to recover the required information? This question turns out to be a generalization of the problem of finding a shortest-length error-correcting code with a prescribed error-correcting capability in classical coding theory. The Singleton bound and two other bounds, referred to as the α-bound and the κ-bound, for the optimal length of a linear error-correcting index code (ECIC) are established. For large alphabets, a construction based on concatenation of an optimal index code with an MDS classical code is shown to attain the Singleton bound. For smaller alphabets, however, this construction may not be optimal. A random construction is also analyzed. It yields another inexplicit bound on the length of an optimal linear ECIC. Further, the problem of error-correcting decoding by a linear ECIC is studied. It is shown that in order to correctly decode the desired symbol, the decoder is required to find one of the vectors belonging to an affine space containing the actual error vector. Syndrome decoding is shown to produce the correct output if the weight of the error pattern is less than or equal to the error-correcting capability of the corresponding ECIC. Finally, the notion of static ECIC, which is suitable for use with a family of instances of an index coding problem, is introduced. Several bounds on the length of static ECICs are derived, and constructions for static ECICs are discussed. Connections of these codes to weakly resilient Boolean functions are established.
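The concatenation idea in this abstract can be illustrated with a toy case, sketched below in Python. It assumes a binary setting in which receiver i wants bit x_i and already knows every other bit, so a single broadcast symbol (the XOR of all bits) is an index code of length 1. Repeating that symbol 2δ + 1 times uses a repetition code, which is MDS, giving total length 1 + 2δ. The scenario, the function names and the majority-vote decoder are assumptions chosen for illustration; they are not the paper's general construction or its syndrome decoder.

```python
from functools import reduce
from operator import xor


def encode(messages, delta):
    """Broadcast the XOR of all message bits, repeated 2*delta + 1 times."""
    s = reduce(xor, messages)            # length-1 index code for this toy instance
    return [s] * (2 * delta + 1)         # repetition code corrects up to delta errors


def decode(received, side_info_bits):
    """Recover the wanted bit from a possibly corrupted broadcast.

    side_info_bits holds the bits of all *other* receivers, which this
    receiver is assumed to know in advance.
    """
    # Majority vote undoes up to delta bit flips in the repeated symbol.
    s_hat = 1 if 2 * sum(received) > len(received) else 0
    # XOR-ing out the known bits leaves the wanted bit.
    return reduce(xor, side_info_bits, s_hat)


if __name__ == "__main__":
    msgs = [1, 0, 1, 1]
    word = encode(msgs, delta=1)
    word[1] ^= 1                         # one transmission error
    recovered = [decode(word, msgs[:i] + msgs[i + 1:]) for i in range(len(msgs))]
    print(recovered == msgs)
```

In this toy instance every receiver recovers its bit despite one corrupted transmission, using 3 = 1 + 2δ broadcast symbols.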

The UEA multivariate time series classification archive, 2018

Bagnall

Dau

Lines

et al. 2018

Preprint

117

Generating Synthetic Time Series to Augment Sparse Datasets

Forestier

Petitjean

Dau

et al. 2017

118

In machine learning, data augmentation is the process of creating synthetic examples in order to augment a dataset used to learn a model. One motivation for data augmentation is to reduce the variance of a classifier, thereby reducing error. In this paper, we propose new data augmentation techniques specifically designed for time series classification, where the space in which they are embedded is induced by Dynamic Time Warping (DTW). The main idea of our approach is to average a set of time series and use the average time series as a new synthetic example. The proposed methods rely on an extension of DTW Barycentric Averaging (DBA), the averaging technique that is specifically developed for DTW. In this paper, we extend DBA to be able to calculate a weighted average of time series under DTW. In this case, instead of each time series contributing equally to the final average, some can contribute more than others. This extension allows us to generate an infinite number of new examples from any set of given time series. To this end, we propose three methods that choose the weights associated with the time series of the dataset. We carry out experiments on the 85 datasets of the UCR archive and demonstrate that our method is particularly useful when the number of available examples is limited (e.g. 2 to 6 examples per class) using a 1-NN DTW classifier. Furthermore, we show that augmenting full datasets is beneficial in most cases, as we observed an increase in accuracy on 56 datasets, no effect on 7 and a slight decrease on only 22.
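The weighted averaging described in this abstract can be sketched roughly as follows. This is a minimal pure-Python illustration of a weighted DBA-style update for univariate series, assuming squared-difference local cost and caller-supplied weights; the three weighting schemes proposed in the paper are not reproduced here, and all function names are hypothetical.

```python
def dtw_path(a, b):
    """Return one optimal DTW alignment between a and b as a list of (i, j) pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, preferring the diagonal step on ties.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return path


def weighted_dba_update(average, series_set, weights):
    """One update: each average point becomes the weighted mean of its aligned points."""
    sums = [0.0] * len(average)
    totals = [0.0] * len(average)
    for series, w in zip(series_set, weights):
        for i, j in dtw_path(average, series):
            sums[i] += w * series[j]
            totals[i] += w
    # Keep the old value if no positive weight reached a position.
    return [s / t if t > 0 else old for s, t, old in zip(sums, totals, average)]


def weighted_dba(series_set, weights, iterations=10):
    """Iterate the update, starting from the series with the largest weight."""
    start = max(range(len(weights)), key=weights.__getitem__)
    average = [float(x) for x in series_set[start]]
    for _ in range(iterations):
        average = weighted_dba_update(average, series_set, weights)
    return average


if __name__ == "__main__":
    print(weighted_dba([[0.0, 0.0], [2.0, 2.0]], [1.0, 1.0]))
```

Different weight vectors applied to the same set of series give different synthetic averages, which is what allows new examples to be generated from a small dataset.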

On the Security of Index Coding With Side Information

Dau

Skachek

Chee

2012

IEEE Trans. Inform. Theory

Security aspects of the Index Coding with Side Information (ICSI) problem are investigated. Building on the results of Bar-Yossef et al. (2006), the properties of linear index codes are further explored. The notion of weak security, considered by Bhattad and Narayanan (2005) in the context of network coding, is generalized to block security. It is shown that the linear index code based on a matrix L, whose column space code C(L) has length n, minimum distance d and dual distance d⊥, is (d − 1 − t)-block secure (and hence also weakly secure) if the adversary knows in advance t ≤ d − 2 messages, and is completely insecure if the adversary knows in advance more than n − d⊥ messages. Strong security is examined under the conditions that the adversary: (i) possesses t messages in advance; (ii) eavesdrops at most µ transmissions; (iii) corrupts at most δ transmissions. We prove that for sufficiently large q, an optimal linear index code which is strongly secure against such an adversary has length κ_q + µ + 2δ. Here κ_q is a generalization of the min-rank over F_q of the side information graph for the ICSI problem in its original formulation in the work of Bar-Yossef et al.

Time series joins, motifs, discords and shapelets: a unifying view that exploits the matrix profile

Yeh

Zhu

Ulanova

et al. 2017

Data Min Knowl Disc

Son Hoang Dau

The UCR time series archive

Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View That Includes Motifs, Discords and Shapelets

Optimal Locally Repairable Linear Codes
