Sequence mappability is an important task in genome resequencing. In the (k, m)-mappability problem, for a given sequence T of length n, the goal is to compute a table whose ith entry is the number of indices $$j \ne i$$
j
≠
i
such that the length-m substrings of T starting at positions i and j have at most k mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $$k=1$$
k
=
1
. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that, for $$k=O(1)$$
k
=
O
(
1
)
, works in $$O(n)$$
O
(
n
)
space and, with high probability, in $$O(n \cdot \min \{m^k,\log ^k n\})$$
O
(
n
·
min
{
m
k
,
log
k
n
}
)
time. Our algorithm requires a careful adaptation of the k-errata trees of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. Our technique can also be applied to solve the all-pairs Hamming distance problem introduced by Crochemore et al. [WABI 2017]. We further develop $$O(n^2)$$
O
(
n
2
)
-time algorithms to compute all (k, m)-mappability tables for a fixed m and all $$k\in \{0,\ldots ,m\}$$
k
∈
{
0
,
…
,
m
}
or a fixed k and all $$m\in \{k,\ldots ,n\}$$
m
∈
{
k
,
…
,
n
}
. Finally, we show that, for $$k,m = \Theta (\log n)$$
k
,
m
=
Θ
(
log
n
)
, the (k, m)-mappability problem cannot be solved in strongly subquadratic time unless the Strong Exponential Time Hypothesis fails. This is an improved and extended version of a paper presented at SPIRE 2018.