Linear-Time Algorithm for Long LCF with k Mismatches

Charalampopoulos, Panagiotis; Crochemore, Maxime; Iliopoulos, Costas S.; Kociumaka, Tomasz; Pissis, Solon P.; Radoszewski, Jakub; Rytter, Wojciech; Waleń, Tomasz

doi:10.4230/lipics.cpm.2018.23

Cited by 12 publications

(24 citation statements)

References 0 publications

“…Implementation and complexity. Our Algorithm 1, excluding the counting phase in the leaves, has exactly the same structure as Algorithm 1 in [9]. This is verified in detail in "Appendix A".…”

Section: Remark 39 (mentioning)

confidence: 87%

“…The following result from [9] shows that this abstract procedure can be implemented efficiently. In the statement below, we ignore the meaning of the resulting family of strings (which is important for computing the longest common substring of two strings with k mismatches) and focus only on its size and the complexity of its construction.…”

Section: A Application Of the Construction From [9] (mentioning)

confidence: 97%

“…In contributions 1 and 5, we apply recent advances in the Longest Common Substring with k Mismatches problem that were presented in [9,27], respectively (see also [34]). In particular, compared to [9], our contribution 1 requires a careful counting of substring pairs to avoid multiple counting and a thorough analysis of the space usage. Technically this is the most involved contribution.…”

Section: The All-pairs Hamming Distance Problem (mentioning)

confidence: 99%

“…This is verified in detail in "Appendix A". Proposition 13 from [9] provides a bound on the total number of the generated modified strings and an efficient implementation based on finger-search trees. We apply that proposition for a family $$\mathcal{F}$$ composed of substrings $$T_i^m$$ to obtain the following bounds.…”

Section: Remark 39 (mentioning)

confidence: 99%

“…In [9], a recursive procedure shown in Algorithm 2 was developed. This procedure takes as input a string P and a family $$\mathcal{F}_P$$ that consists of tuples (S, F, b) such that $$F \in \mathcal{F}$$ for some string family $$\mathcal{F}$$, S is a suffix of F of length $$|S| = |F| - |P|$$, and $$b = k - d_H(F, PS) \ge 0$$.…”

Section: A Application Of the Construction From [9] (mentioning)

confidence: 99%

Efficient Computation of Sequence Mappability

et al. 2022

Self Cite

Sequence mappability is an important task in genome resequencing. In the (k, m)-mappability problem, for a given sequence T of length n, the goal is to compute a table whose ith entry is the number of indices $$j \ne i$$ such that the length-m substrings of T starting at positions i and j have at most k mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $$k=1$$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that, for $$k=O(1)$$, works in $$O(n)$$ space and, with high probability, in $$O(n \cdot \min \{m^k,\log ^k n\})$$ time. Our algorithm requires a careful adaptation of the k-errata trees of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. Our technique can also be applied to solve the all-pairs Hamming distance problem introduced by Crochemore et al. [WABI 2017]. We further develop $$O(n^2)$$-time algorithms to compute all (k, m)-mappability tables for a fixed m and all $$k\in \{0,\ldots ,m\}$$ or a fixed k and all $$m\in \{k,\ldots ,n\}$$. Finally, we show that, for $$k,m = \Theta (\log n)$$, the (k, m)-mappability problem cannot be solved in strongly subquadratic time unless the Strong Exponential Time Hypothesis fails. This is an improved and extended version of a paper presented at SPIRE 2018.
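To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of the (k, m)-mappability table. It is not the algorithm from the paper (which adapts k-errata trees); it simply checks every pair of length-m substrings directly, so it takes O(n² · m) time in the worst case. The function names `hamming_at_most` and `mappability_naive` are hypothetical, and positions are 0-based here.

```python
def hamming_at_most(t, i, j, m, k):
    """Return True if t[i:i+m] and t[j:j+m] differ in at most k positions."""
    mismatches = 0
    for a in range(m):
        if t[i + a] != t[j + a]:
            mismatches += 1
            if mismatches > k:
                return False  # stop early once the budget is exceeded
    return True


def mappability_naive(t, m, k):
    """Brute-force (k, m)-mappability table (illustrative, not the paper's method).

    Entry i counts the start positions j != i whose length-m substring is
    within Hamming distance k of the length-m substring starting at i.
    """
    starts = len(t) - m + 1  # number of length-m substrings
    if starts <= 0:
        return []
    table = [0] * starts
    for i in range(starts):
        for j in range(i + 1, starts):
            # Hamming distance is symmetric, so each pair is checked once
            # and credited to both entries.
            if hamming_at_most(t, i, j, m, k):
                table[i] += 1
                table[j] += 1
    return table
```

For example, for T = "aabaa" and m = 2 the substrings are "aa", "ab", "ba", "aa", so `mappability_naive("aabaa", 2, 0)` counts only the two identical copies of "aa". A sketch like this is mainly useful as a reference oracle when testing a faster implementation on small inputs.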

Efficient Computation of Sequence Mappability

Alzamel, Charalampopoulos, Iliopoulos, et al. 2018

String Processing and Information Retrieval

Self Cite

Sequence mappability is an important task in genome re-sequencing. In the (k, m)-mappability problem, for a given sequence T of length n, our goal is to compute a table whose ith entry is the number of indices $$j \ne i$$ such that length-m substrings of T starting at positions i and j have at most k mismatches. Previous works on this problem focused on heuristic approaches to compute a rough approximation of the result or on the case of $$k = 1$$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that works in $$O(n \min \{m^k, \log ^{k+1} n\})$$ time and $$O(n)$$ space for $$k = O(1)$$. It requires a careful adaptation of the technique of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. We also show $$O(n^2)$$-time algorithms to compute all results for a fixed m and all $$k = 0, \ldots , m$$ or a fixed k and all $$m = k, \ldots , n-1$$. Finally we show that the (k, m)-mappability problem cannot be solved in strongly subquadratic time for $$k, m = \Theta (\log n)$$ unless the Strong Exponential Time Hypothesis fails.

Quantum Algorithms for Longest Common Substring with a Gap

Gibney, Hossen 2024

Lecture Notes in Computer Science
