The longest common substring with $k$-mismatches problem is to find, given
two strings $S_1$ and $S_2$, a longest substring $A_1$ of $S_1$ and $A_2$ of
$S_2$ such that the Hamming distance between $A_1$ and $A_2$ is $\le k$. We
introduce a practical $O(nm)$ time and $O(1)$ space solution for this problem,
where $n$ and $m$ are the lengths of $S_1$ and $S_2$, respectively. This
algorithm can also be used to compute the matching statistics with
$k$-mismatches of $S_1$ and $S_2$ in $O(nm)$ time and $O(m)$ space. Moreover,
we also present a theoretical solution for the $k = 1$ case which runs in $O(n
\log m)$ time, assuming $m\le n$, and uses $O(m)$ space, improving over the
existing $O(nm)$ time and $O(m)$ space bound of Babenko and Starikovskaya.
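As an illustration of how the $O(nm)$ time and $O(1)$ space bounds can be met, the following is a minimal sketch, not necessarily the authors' exact procedure. It assumes a sliding window over each diagonal of the $n \times m$ alignment grid: the right end advances one position at a time, and the left end advances past the oldest mismatch whenever the window holds more than $k$ mismatches. The function name `lcs_k_mismatches` and its return convention are hypothetical.

```python
def lcs_k_mismatches(s1: str, s2: str, k: int) -> tuple[int, int, int]:
    """Return (length, start_in_s1, start_in_s2) for a longest pair of
    equal-length substrings of s1 and s2 with Hamming distance <= k.

    Illustrative sketch: O(n*m) time, O(1) extra space.
    """
    n, m = len(s1), len(s2)
    best = (0, 0, 0)
    # A diagonal d = i - j pairs s1[i] with s2[j]; d ranges over -(m-1) .. n-1.
    for d in range(-(m - 1), n):
        i0, j0 = (d, 0) if d >= 0 else (0, -d)
        length = min(n - i0, m - j0)
        left = 0          # left end of the current window on this diagonal
        mismatches = 0    # mismatches inside the window [left, right]
        for right in range(length):
            if s1[i0 + right] != s2[j0 + right]:
                mismatches += 1
            # Shrink from the left until at most k mismatches remain.
            while mismatches > k:
                if s1[i0 + left] != s2[j0 + left]:
                    mismatches -= 1
                left += 1
            if right - left + 1 > best[0]:
                best = (right - left + 1, i0 + left, j0 + left)
    return best
```

Both window ends move monotonically along a diagonal, so each diagonal costs time linear in its length, and the only state kept is a constant number of integers.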
Given a pattern $P$ and a text $T$, both strings over a binary alphabet, the
binary jumbled string matching problem consists in telling whether any
permutation of $P$ occurs in $T$. The indexed version of this problem, i.e.,
preprocessing a string to efficiently answer such permutation queries, is hard
and has been studied in the last few years. Currently the best bounds for this
problem are $O(n^2/\log^2 n)$ (with $O(n)$ space and $O(1)$ query time) and
$O(r^2\log r)$ (with $O(|L|)$ space and $O(\log|L|)$ query time), where $r$ is
the length of the run-length encoding of $T$ and $|L| = O(n)$ is the size of
the index. In this paper we present new results for this problem. Our first
result is an alternative construction of the index by Badkobeh et al. that
obtains a trade-off between space and time complexity. The index takes
$O(r^2\log k + n/k)$ time to build, answers queries in $O(\log k)$ time, and
uses $O(n/k + |L|)$ space, where $k$ is a parameter. The second result is an
$O(n^2 \log^2 w / w)$ algorithm (with $O(n)$ space and $O(1)$ query time), based on
word-level parallelism, where $w$ is the word size in bits.
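To make the indexed problem concrete, here is a minimal sketch of the simplest linear-space index with constant-time queries. It relies on a standard property of binary strings that the abstract above does not state: sliding a fixed-length window by one position changes its number of 1s by at most one, so the counts attained by windows of a given length form a contiguous interval. Storing the minimum and maximum count per length is therefore enough. The construction below is the naive quadratic one, not either of the faster algorithms discussed above, and the names `build_index` and `occurs` are hypothetical.

```python
def build_index(text: str) -> tuple[list[int], list[int]]:
    """For each window length, store the min and max number of 1s.

    Naive O(n^2)-time construction; the index itself uses O(n) space.
    """
    n = len(text)
    prefix = [0] * (n + 1)  # prefix[i] = number of 1s in text[:i]
    for i, c in enumerate(text):
        prefix[i + 1] = prefix[i] + (c == "1")
    lo = [0] * (n + 1)
    hi = [0] * (n + 1)
    for length in range(1, n + 1):
        counts = [prefix[i + length] - prefix[i] for i in range(n - length + 1)]
        lo[length] = min(counts)
        hi[length] = max(counts)
    return lo, hi


def occurs(index: tuple[list[int], list[int]], pattern: str) -> bool:
    """Report whether some permutation of the binary pattern occurs in the text."""
    lo, hi = index
    length = len(pattern)
    if length == 0:
        return True
    if length >= len(lo):  # pattern is longer than the indexed text
        return False
    ones = pattern.count("1")
    return lo[length] <= ones <= hi[length]
```

For example, with `index = build_index("0110100")`, the call `occurs(index, "001")` is true because the window `010` is a permutation of the pattern, while `occurs(index, "111")` is false because no window of length 3 contains three 1s.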
We consider how to index strings, trees and graphs for jumbled pattern matching when we are asked to return a match if one exists. For example, we show how, given a tree containing two colours, we can build a quadratic-space index with which we can find a match in time proportional to the size of the match. We also show how we need only linear space if we are content with approximate matches.