2005
DOI: 10.1145/1103963.1103971

The greedy algorithm for the minimum common string partition problem

Abstract: In the Minimum Common String Partition problem (MCSP), we are given two strings on input, and we wish to partition them into the same collection of substrings, minimizing the number of the substrings in the partition. This problem is NP-hard, even for a special case, denoted 2-MCSP, where each letter occurs at most twice in each input string. We study a greedy algorithm for MCSP that at each step extracts a longest common substring from the given strings. We show that the approximation ratio of this algorithm …
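The greedy step described in the abstract can be illustrated with a short sketch. This is not the authors' code: the function name `greedy_mcsp`, the `(start_a, start_b, length)` block representation, the tie-breaking rule (first longest match found), and the quadratic dynamic-programming search are all assumptions made for illustration. It assumes the two inputs contain the same multiset of letters, which is what makes a common partition possible.

```python
def greedy_mcsp(a: str, b: str) -> list[tuple[int, int, int]]:
    """Greedy common partition sketch (hypothetical helper, not from the paper).

    Repeatedly extracts a longest common substring of the still-unmarked
    parts of `a` and `b`. Returns blocks as (start_a, start_b, length).
    """
    if sorted(a) != sorted(b):
        raise ValueError("inputs must contain the same multiset of letters")

    used_a = [False] * len(a)  # positions of a already placed in a block
    used_b = [False] * len(b)  # positions of b already placed in a block
    blocks = []
    remaining = len(a)

    while remaining > 0:
        best_len, best_i, best_j = 0, 0, 0
        # prev[j] = length of the longest common unmarked run ending at
        # a[i-2] and b[j-1]; marked positions reset the run to 0.
        prev = [0] * (len(b) + 1)
        for i in range(1, len(a) + 1):
            cur = [0] * (len(b) + 1)
            if not used_a[i - 1]:
                for j in range(1, len(b) + 1):
                    if not used_b[j - 1] and a[i - 1] == b[j - 1]:
                        cur[j] = prev[j - 1] + 1
                        if cur[j] > best_len:  # ties: keep the first found
                            best_len = cur[j]
                            best_i, best_j = i - cur[j], j - cur[j]
            prev = cur

        # Mark the extracted substring in both strings.
        for k in range(best_len):
            used_a[best_i + k] = True
            used_b[best_j + k] = True
        blocks.append((best_i, best_j, best_len))
        remaining -= best_len

    return blocks


if __name__ == "__main__":
    # "abc" is extracted first, then the leftover "ab".
    print(greedy_mcsp("abcab", "ababc"))  # [(0, 2, 3), (3, 0, 2)]
```

The number of returned blocks is the size of the partition the greedy strategy finds. It is a heuristic: as the abstract indicates, the result is only guaranteed to be within some approximation ratio of the optimum, not to be optimal.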

Cited by 36 publications
(32 citation statements)
References 12 publications
“…The NP-hardness holds even if the instance is restricted to have multiplicity at most k, i.e., an instance of k-MCSP, for any k ≥ 2. Approximation algorithms for MCSP have been recently investigated in [9, 44, 45, 46]. In particular, [9] presents an approximation algorithm based on a new graphical structure called pair-match graphs.…”
Section: Minimum Common Substring Partition (mentioning)
confidence: 99%
“…Choosing an optimal set of strings c_j might be intractable, since even if the strings are restricted to be the prefixes or suffixes of words in the text, the problem of finding the set is NP-complete [7], and other similar problems of devising a code have also been shown to be NP-complete in [5,10,6]. A natural approach is thus to suggest heuristical solutions and compare their efficiencies.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, when the input genome contains some genes which appear twice, even the one-sided scaffold filling problem becomes NP-complete. The latter problem has a close connection with the Minimum Common String Partition (MCSP) problem [2,3,4,7,8,9]. This paper is organized as follows.…”
Section: Introduction (mentioning)
confidence: 99%