Abstract. Given two comparative maps, that is, two sequences of markers each representing a genome, the Maximal Strip Recovery problem (MSR) asks to extract a largest sequence of markers from each map such that the two extracted sequences are decomposable into non-overlapping strips (or synteny blocks). This aims at defining a robust set of synteny blocks between different species, which is a key to understanding the evolution process since their last common ancestor. In this paper, we add a fundamental constraint to …
“…In this section, we present a (d + 1.5)-approximation algorithm for the two minimization problems CMSR-d and δ-gap-CMSR-d. Recall that 2d-approximation algorithms [2,8,1] were known for the two maximization problems MSR-d and δ-gap-MSR-d. Let k be the number of deleted markers in an optimal solution. Then the number of single-markers in the input maps is at most (2d + 1)k because each single-marker is either deleted or adjacent to a deleted marker.…”
Section: An Approximation Algorithm for CMSR-d and δ-gap-CMSR-d (mentioning)
confidence: 99%
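The counting argument in the statement above can be illustrated with a small sketch. It is not the cited paper's algorithm: it assumes markers are encoded as nonzero signed integers, that every map contains each marker exactly once, and that a single-marker is a marker belonging to no strip of length at least 2 in the input maps. The function names `adjacencies`, `single_markers`, and `min_deletions_lower_bound` are hypothetical. Since the number s of single-markers satisfies s ≤ (2d + 1)k, any solution must delete at least ⌈s / (2d + 1)⌉ markers.

```python
import math


def adjacencies(genome):
    """Return consecutive marker pairs of one map, in both reading directions.

    A pair (a, b) read on the reverse strand appears as (-b, -a).
    """
    pairs = set()
    for a, b in zip(genome, genome[1:]):
        pairs.add((a, b))
        pairs.add((-b, -a))
    return pairs


def single_markers(maps):
    """Markers (as absolute values) that lie in no strip of the input maps.

    Assumption: a marker lies in a strip iff it takes part in at least one
    adjacency that is present in every map.
    """
    shared = set.intersection(*(adjacencies(g) for g in maps))
    in_strip = {abs(m) for pair in shared for m in pair}
    return {abs(m) for m in maps[0]} - in_strip


def min_deletions_lower_bound(maps):
    """Lower bound on k derived from s <= (2d + 1) * k."""
    d = len(maps)
    s = len(single_markers(maps))
    return math.ceil(s / (2 * d + 1))


maps = [
    [1, 2, 3, 4, 5, 6],
    [1, 2, 6, -5, -4, 3],
]
print(sorted(single_markers(maps)))      # markers 3 and 6 are in no strip
print(min_deletions_lower_bound(maps))   # ceil(2 / 5)
```

In this toy instance with d = 2, the strips are 1 2 and 4 5 (the latter reversed in the second map), so the bound only certifies that at least one deletion is needed; deleting both single-markers happens to be a feasible solution.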
“…For the four variants of the maximal strip recovery problem, MSR-d, CMSR-d, δ-gap-MSR-d, and δ-gap-CMSR-d, several hardness results have been obtained [2,9,6,1,7,8], and a variety of algorithms have been developed, including heuristics [10], approximation algorithms [2,1,5], and FPT algorithms [9,5]. For example, it is known that MSR-d admits a 2d-approximation algorithm for any d ≥ 2 [2,8], and that δ-gap-MSR-d admits a 2d-approximation algorithm for any d ≥ 2 and δ ≥ 1 and a 1.8-approximation algorithm for d = 2 and δ = 1 [1].…”
Section: Introduction (mentioning)
confidence: 99%
“…Compared to the approximation upper bound of 2d [2,1,8] for the two maximization problems MSR-d and δ-gap-MSR-d, which almost matches (at least asymptotically) the current best lower bound of Ω(d/ log d) [8], our upper bound of d + 1.5 for the two minimization problems CMSR-d and δ-gap-CMSR-d is still far away from the constant lower bound in [8]. It is an intriguing question whether CMSR-d and δ-gap-CMSR-d admit approximation algorithms with constant ratios independent of d.…”
An essential task in comparative genomics is usually to decompose two or more genomes into synteny blocks, that is, segments of chromosomes with similar contents. In this paper, we study the Maximal Strip Recovery problem (MSR) [Zheng et al. 07], which aims at finding an optimal decomposition of a set of genomes into synteny blocks, amidst possible noise and ambiguities. We present a panel of new or improved FPT and approximation algorithms for the MSR problem and its variants. Our main results include the first FPT algorithm for the variant δ-gap-MSR-d, an FPT algorithm for CMSR-d and δ-gap-CMSR-d running in time O(2.360^k poly(nd)), where k is the number of markers or genes considered as erroneous, and a (d + 1.5)-approximation algorithm for CMSR-d and δ-gap-CMSR-d.
“…The idea is actually quite simple and has been used many times previously [21,19,10]. Note that any strip of length l > 3 is a concatenation of shorter strips of lengths 2 and 3, for example, 4 = 2 + 2, 5 = 2 + 3, etc.…”
Section: A Polynomial-time 2d-approximation for MSR-d (mentioning)
confidence: 99%
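The decomposition noted in the statement above (every strip length greater than 3 is a sum of 2s and 3s) can be checked with a short sketch. The function name `split_into_2s_and_3s` is hypothetical and the code only illustrates the arithmetic, not the cited approximation algorithm.

```python
def split_into_2s_and_3s(length):
    """Split a strip length (>= 2) into parts of size 2 and 3.

    Even lengths become all 2s; odd lengths become 2s plus one final 3.
    """
    if length < 2:
        raise ValueError("a strip has at least two markers")
    parts = [2] * (length // 2)
    if length % 2 == 1:
        parts[-1] = 3  # absorb the leftover marker into the last part
    return parts


for length in (4, 5, 7):
    print(length, split_into_2s_and_3s(length))
```

For the lengths in the quoted examples this gives 4 = 2 + 2 and 5 = 2 + 3, so an algorithm that handles only strips of lengths 2 and 3 loses nothing by ignoring longer ones.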
“…Bulteau, Fertin, and Rusu [10] recently proposed a restricted variant of Maximal Strip Recovery called δ-gap-MSR, which is MSR-2 with the additional constraint that at most δ markers may be deleted between any two adjacent markers of a strip in each genomic map. We now define δ-gap-MSR-d and δ-gap-CMSR-d as the restricted variants of the two problems MSR-d and CMSR-d, respectively, with the additional δ-gap constraint.…”
In comparative genomics, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks that are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given d genomic maps as sequences of gene markers, the objective of MSR-d is to find d subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant d ≥ 2, a polynomial-time 2d-approximation for MSR-d was previously known. In this paper, we show that for any d ≥ 2, MSR-d is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating MSR-d for all d ≥ 2. In particular, we show that MSR-d is NP-hard to approximate within Ω(d / log d). From the other direction, we show that the previous 2d-approximation for MSR-d can be optimized into a polynomial-time algorithm even if d is not a constant but is part of the input. We then extend our inapproximability results to several related problems including CMSR-d, δ-gap-MSR-d, and δ-gap-CMSR-d.