Abstract: Given a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT. A sufficient condit…
“…By requiring that induced subtrees of the input trees are compatible, and not strictly isomorphic, MCT usually leads to selecting a larger set of leaves than allowed by MAST. Note that another variant of MAST has been recently proposed to build phylogenetic supertrees, where input trees have different leaf sets [Berry and Nicolas 2004].…”
Section: Introduction (mentioning)
confidence: 99%
“…linear time, significantly improving on the former O(kn^4) time algorithm of [Amir and Keselman 1997], refined into an O(kn^3) algorithm in [Berry and Nicolas 2004]. The improvement in the complexity results from ordering the subtrees of two compared trees such that conflicting triples of leaves are readily identified from the minimum and maximum leaves contained in a subtree.…”
Given a set of leaf-labeled trees with identical leaf sets, the well-known Maximum Agreement Subtree (MAST) problem consists in finding a subtree homeomorphically included in all input trees and with the largest number of leaves. MAST and its variant called Maximum Compatible Tree (MCT) are of particular interest in computational biology. This paper presents a linear-time approximation algorithm to solve the complement version of MAST, namely identifying the smallest set of leaves to remove from input trees to obtain isomorphic trees. We also present an O(n^2 + kn) algorithm to solve the complement version of MCT. For both problems, we thus achieve significantly lower running times than previously known algorithms. Fast running times are especially important in phylogenetics where large collections of trees are routinely produced by resampling procedures, such as the nonparametric bootstrap or Bayesian MCMC methods.
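To make the complement version of MAST concrete, the sketch below searches exhaustively for a smallest set of leaves whose removal leaves all rooted input trees isomorphic. It is an illustration only, not the linear-time approximation algorithm of the paper: it runs in exponential time. The nested-tuple tree encoding and the names `leaves`, `clusters` and `smallest_removal` are assumptions made for this example. It relies on the fact that two rooted leaf-labelled trees on the same leaf set are isomorphic exactly when they have the same set of clusters.

```python
from itertools import combinations

# Assumed encoding: a rooted tree is a nested tuple, a leaf is any non-tuple label.

def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clusters(tree, keep):
    """Clusters of the tree restricted to the leaves in `keep`.

    Intersecting every cluster with `keep` and dropping empty ones gives
    the clusters of the restricted tree with unary nodes suppressed.
    """
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            below = frozenset([node]) & keep
        else:
            below = frozenset().union(*(visit(child) for child in node))
        if below:
            found.add(below)
        return below

    visit(tree)
    return found

def smallest_removal(trees):
    """Brute force: a smallest leaf set to remove so all trees agree."""
    labels = sorted(leaves(trees[0]))
    for size in range(len(labels) + 1):
        for removed in combinations(labels, size):
            keep = frozenset(labels) - frozenset(removed)
            reference = clusters(trees[0], keep)
            if all(clusters(t, keep) == reference for t in trees[1:]):
                return set(removed)

# Two caterpillar trees that differ only in the positions of leaves 3 and 4.
t1 = (((1, 2), 3), 4)
t2 = (((1, 2), 4), 3)
print(smallest_removal([t1, t2]))
```

Because subsets are tried in increasing size, the first agreeing subset found is a minimum one; when several minimum sets exist, only the first in enumeration order is returned.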
“…Following the notation and terminology of [7], a tree T has a leaf set L(T) in bijection with a label set. Each internal node (including the root) has at least two children.…”
Section: Definitions (mentioning)
confidence: 99%
“…Consider the trees T_1, T_2 and T_3 (Figure 5) where L(T_1) = {1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15}, L(T_2) = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, L(T_3) = {7, 8, 9, 10, 20, 21, 22, 23, 24}, where…”
Section: Supertree For a Family Of Trees With Refinement (mentioning)
confidence: 99%
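The three leaf sets quoted above show the supertree setting, where input trees only partially overlap. As a small illustration, the following sketch computes the pairwise overlaps of those leaf sets. The dictionary layout and the helper name `pairwise_overlaps` are assumptions for this example; the tree shapes themselves are not given in the excerpt, so only the leaf sets are used.

```python
from itertools import combinations

# Leaf sets copied from the quoted excerpt; tree topologies are not available.
leaf_sets = {
    "T1": {1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15},
    "T2": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    "T3": {7, 8, 9, 10, 20, 21, 22, 23, 24},
}

def pairwise_overlaps(sets_by_name):
    """Map each pair of tree names to the leaves the two trees share."""
    return {
        (a, b): sets_by_name[a] & sets_by_name[b]
        for a, b in combinations(sorted(sets_by_name), 2)
    }

overlaps = pairwise_overlaps(leaf_sets)
all_leaves = set().union(*leaf_sets.values())

for pair, shared in overlaps.items():
    print(pair, sorted(shared))
print(len(all_leaves), "distinct leaves in total")
```

Here T1 and T3 share no leaves and are connected only through T2, so any supertree on all leaves has to relate them via the leaves each shares with T2.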
“…Semple and Steel [6] proposed MinCutSupertree algorithm to build a rooted supertree for a family of rooted weighted binary trees. Berry and Nicholas [7] propose MergeTrees, an algorithm for constructing a supertree compatible with two binary trees by grafting "specific subtrees" or "specific leaves" of one tree onto the other. For |T| ≥ 3, they apply MergeTrees repeatedly to pairs of trees, each time reducing |T| by 1.…”
Abstract. We describe a new supertree algorithm that extends the type of information that can be used for phylogenetic inference. Its input is a set of constraints that expresses the hierarchical relationships in a family of given phylogenies and/or other relations between clusters of sets of species. The output of the algorithm is a multifurcating rooted supertree which satisfies all constraints. Moreover, if there are contradictions in the set of constraints, the corresponding part of the supertree is identified and its set of constraints is displayed so that the user may decide to modify or keep it. Our algorithm is not affected by the order in which the input phylogenies or other constraints are presented. We apply our method to a number of data sets.
Given a set of leaf-labelled trees with identical leaf sets, the well-known MAST problem consists of finding a subtree homeomorphically included in all input trees and with the largest number of leaves. MAST and its variant called MCT are of particular interest in computational biology. This paper presents positive and negative results on the approximation of MAST, MCT and their complement versions, denoted CMAST and CMCT. For CMAST and CMCT on rooted trees we give 3-approximation algorithms achieving significantly lower running times than those previously known. In particular, the algorithm for CMAST runs in linear time. The approximation threshold for CMAST, resp. CMCT, is shown to be the same whenever collections of rooted trees or of unrooted trees are considered. Moreover, hardness of approximation results are stated for CMAST, CMCT and MCT on small number of trees, and for MCT on unbounded number of trees.