Levenshtein introduced the problem of constructing k-deletion correcting codes in 1966, proved that the optimal redundancy of those codes is O(k log N), and proposed an optimal redundancy single-deletion correcting code (using the so-called VT construction). However, the problem of constructing optimal redundancy k-deletion correcting codes remained open. Our key contribution is a solution to this longstanding open problem. We present a k-deletion correcting code that has redundancy 8k log n + o(log n) and encoding/decoding algorithms of complexity $O(n^{2k+1})$ for constant k.

Recently, an independent work [9] proposed a k-deletion code with O(k log n) redundancy and better complexity of poly(n, k). Compared to the 8k log n redundancy in this paper, the constant in [9] is not given explicitly, and the redundancy there is at least 200k log n. Moreover, the approaches in [9] and this paper are different.

Next we identify and describe our key ideas. The key building blocks in our code construction are: (i) generalizing the VT construction to k deletions by considering constrained sequences, (ii) separating the encoded vector into blocks and using concatenated codes, and (iii) a novel strategy to separate the vector into blocks by a single pattern.

In our previous work on 2-deletion codes [13], we generalized the VT construction. In particular, we proved that while the higher order parity checks $\sum_{i=1}^{n} i^j c_i \bmod (n^j + 1)$, $j = 0, 1, \ldots, t$, might not work in general, those parity checks work in the two-deletion case when the sequences are constrained to have no adjacent 1's.
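The parity checks mentioned above can be made concrete with a small sketch. The function names below are hypothetical and not taken from the paper. The decoder is a brute-force illustration of why the single VT check suffices for one deletion, not the paper's algorithm and not an efficient decoder.

```python
from typing import List, Optional


def vt_syndrome(c: List[int]) -> int:
    """Classic VT check: sum of i * c_i (positions 1-indexed) mod (n + 1)."""
    n = len(c)
    return sum(i * bit for i, bit in enumerate(c, start=1)) % (n + 1)


def higher_order_checks(c: List[int], t: int) -> List[int]:
    """Higher order checks: sum of i^j * c_i mod (n^j + 1) for j = 0..t."""
    n = len(c)
    return [
        sum(i**j * bit for i, bit in enumerate(c, start=1)) % (n**j + 1)
        for j in range(t + 1)
    ]


def correct_single_deletion(y: List[int], syndrome: int) -> Optional[List[int]]:
    """Brute-force recovery (illustration only): try every single-bit
    insertion into y and keep the words whose VT syndrome matches.
    For a word from a VT code, exactly one distinct candidate survives."""
    candidates = set()
    for pos in range(len(y) + 1):
        for bit in (0, 1):
            x = y[:pos] + [bit] + y[pos:]
            if vt_syndrome(x) == syndrome:
                candidates.add(tuple(x))
    if len(candidates) != 1:
        return None
    return list(candidates.pop())


if __name__ == "__main__":
    c = [1, 0, 1, 1, 0, 0, 1]
    s = vt_syndrome(c)
    print(s)                           # 7
    print(higher_order_checks(c, 2))   # [0, 7, 25]
    y = c[:3] + c[4:]                  # delete the bit at index 3
    print(correct_single_deletion(y, s) == c)  # True
```

The j = 1 check coincides with the VT syndrome, which is why the higher order checks are described as a generalization of the VT construction.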
In this paper we generalize this idea. Specifically, the higher order parity checks work for k = t/2 deletions when the sequences we need to protect satisfy the following constraint: the distance between any two adjacent 1's is at least k.

The fact that we can correct k deletions using the generalization of the VT construction on constrained sequences enables a concatenated code construction, which separates the sequence c into small blocks. Each block is protected by an inner code, usually a k-deletion code. All the blocks together are protected by an outer code, for example, a Reed-Solomon code. Separating and identifying the boundaries between blocks is one of the main challenges in the concatenated code construction. The work in [10], [11] resolved this issue by inserting markers between blocks. In [7], an approach that uses occurrences of short subsequences, called patterns, as markers was proposed. The success of decoding in existing approaches requires that the patterns cannot be destroyed or generated by k deletions/insertions.

Here, we improve the redundancy in [7] by using a single pattern to separate the blocks and allowing it to be destroyed or generated by deletions/insertions. The pattern, which we call synchronization patte...

This work was presented in part at the IEEE International Symposium on Information Theory, Paris, France, July 2019.
1 The notation $O_k$ denotes parameterized complexity, i.e., $O_k(N \log^4 N) = f(k)\,O(N \log^4 N)$ for some function f.
Interest in channel models in which the data is sent as an unordered set of binary strings has increased lately, due to emerging applications in DNA storage, among others. In this paper we analyze the minimal redundancy of binary codes for this channel under substitution errors, and provide several constructions, some of which are shown to be asymptotically optimal. The surprising result in this paper is that while the information vector is sliced into a set of unordered strings, the number of redundant bits required to correct errors is asymptotically equal to the number required in the classical error correcting paradigm.
1 The edit distance between two strings is the minimum number of deletions, insertions, and substitutions that turn one into the other.
2 As long as the number of insertions is not equal to the number of deletions, an event that occurs with negligible probability.
Construction of capacity-achieving deletion correcting codes has been a baffling challenge for decades. A recent breakthrough by Brakensiek et al., alongside novel applications in DNA storage, has reignited interest in this longstanding open problem. In spite of recent advances, the amount of redundancy in existing codes is still orders of magnitude away from being optimal. In this paper, a novel approach for constructing binary two-deletion correcting codes is proposed. In this approach, parity symbols are computed from indicator vectors (i.e., vectors that indicate the positions of certain patterns) of the encoded message, rather than from the message itself. Most interestingly, the parity symbols and the proof of correctness are a direct generalization of their counterparts in the Varshamov-Tenengolts construction. Our techniques require 7 log(n) + o(log(n)) redundant bits to encode an n-bit message, which is near-optimal.
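A minimal sketch of what an indicator vector looks like, assuming the patterns are the two-bit strings "10" and "01". The abstract does not name the patterns, so this choice, like the function names, is hypothetical. The sketch also checks that such an indicator vector never has two adjacent 1's, since an occurrence of a two-bit pattern with distinct bits cannot start at two consecutive positions.

```python
from typing import List, Sequence


def indicator(c: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return a length-n 0/1 vector whose entry i is 1 exactly when
    `pattern` occurs in c starting at index i (0-indexed).
    The pattern choice is an assumption made for illustration."""
    n, m = len(c), len(pattern)
    return [
        1 if i + m <= n and list(c[i:i + m]) == list(pattern) else 0
        for i in range(n)
    ]


def has_no_adjacent_ones(v: Sequence[int]) -> bool:
    """True when v contains no two consecutive 1's."""
    return all(not (a == 1 and b == 1) for a, b in zip(v, v[1:]))


if __name__ == "__main__":
    c = [1, 0, 1, 1, 0, 0, 1]
    ind_10 = indicator(c, [1, 0])
    ind_01 = indicator(c, [0, 1])
    print(ind_10)  # [1, 0, 0, 1, 0, 0, 0]
    print(ind_01)  # [0, 1, 0, 0, 0, 1, 0]
    print(has_no_adjacent_ones(ind_10), has_no_adjacent_ones(ind_01))  # True True
```

Parity symbols would then be computed over vectors like `ind_10` rather than over `c` itself. How the parities are defined is left to the paper.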