Proceedings of the Thirty-Fifth ACM Symposium on Theory of Computing - STOC '03 2003
DOI: 10.1145/780587.780590

A sublinear algorithm for weakly approximating edit distance

Abstract: We show how to determine whether the edit distance between two given strings is small in sublinear time. Specifically, we present a test which, given two n-character strings A and B, runs in time o(n) and with high probability returns "CLOSE" if their edit distance is O(n^α), and "FAR" if their edit distance is Ω(n), where α is a fixed parameter less than 1. Our algorithm for testing the edit distance works by recursively subdividing the strings A and B into smaller substrings and looking for pairs of substri…
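To make the CLOSE/FAR gap in the abstract concrete, here is a minimal sketch that labels a pair of strings using the exact edit distance. It is not the sublinear test from the paper: it uses the classic quadratic-time dynamic program, and the names `edit_distance` and `gap_label`, the constant 1 in front of n^α, and the `far_fraction` constant standing in for Ω(n) are all assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Exact edit (Levenshtein) distance via the classic O(len(a) * len(b)) DP."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def gap_label(a: str, b: str, alpha: float = 0.5, far_fraction: float = 0.1) -> str:
    """Label a pair as CLOSE, FAR, or UNDETERMINED.

    Assumed thresholds (hypothetical constants, not taken from the paper):
    CLOSE if distance <= n**alpha, FAR if distance >= far_fraction * n.
    Anything in between falls in the gap where the test promises nothing.
    """
    n = max(len(a), len(b))
    d = edit_distance(a, b)
    if d <= n ** alpha:
        return "CLOSE"
    if d >= far_fraction * n:
        return "FAR"
    return "UNDETERMINED"
```

A gap test only has to be correct on inputs that fall on one side or the other of the gap; the `UNDETERMINED` branch marks the inputs on which either answer would be acceptable.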

Cited by 26 publications (40 citation statements)
References 4 publications

“…It is well known that for two strings x, y ∈ V drawn independently at random, with probability Ω(1) they satisfy ed(x, y) ≥ Ω(n). (Indeed, for every x ∈ {0, 1}^n, the number of strings y ∈ {0, 1}^n within edit distance (say) n/10 from x is (3n)^{n/10} = o(2^n); see [7, Lemma 8] and [22, Lemma 4.4].) We thus get…”
Section: We Now Wish To Upper Bound Pr
Citation type: mentioning
confidence: 95%
“…If such a sketching technique exists, it gives a significant saving in storage space (since the sketches are much smaller than the original data items) as well as running time. Sketch constructions have been developed for a number of purposes, including estimating set membership [5], estimating similarity of sets [6], estimating distinct elements and vector norms [1,12], as well as estimating string edit distance [4]. It is shown in [8] how sketch constructions can be derived from rounding techniques used in approximation algorithms.…”
Section: Compact Data Structures
Citation type: mentioning
confidence: 99%
“…image database is that a group of researchers defined 32 sets of similar images for this image collection⁴. These sets represent different categories and have different number of similar images in each set.…”
Section: Algorithm
Citation type: mentioning
confidence: 99%
“…It seems possible to estimate probabilistically, with sublinear complexity, whether the L-distance of two sequences is 'small' or 'large'; see Batu et al (2003). Whether an improvement of this rather coarse result or even a replacement of the L-distance is possible, with at most linear complexity and a non-probabilistic outcome, seems open.…”
Section: Introduction
Citation type: mentioning
confidence: 99%