Let T be an unrooted binary tree with n distinctly labelled leaves. A convex character on T, a notion that takes its name from the field of phylogenetics, is a partition of the leaves such that the minimal spanning subtrees induced by the blocks of the partition are mutually disjoint. In earlier work Kelk and Stamoulis (Advances in Applied Mathematics 84 (2017), pp. 34-46) defined g_k(T) as the number of convex characters where each block has at least k leaves. Exact expressions were given for g_1 and g_2, where the topology of T turns out to be irrelevant, and it was noted that for k ≥ 3 topological neutrality no longer holds. In this article, for every k ≥ 3 we describe tree topologies achieving the maximum and minimum values of g_k and determine corresponding expressions and exponential bounds for g_k. Finally, we reflect briefly on possible algorithmic applications of these results.
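To make the definition concrete, the following is a minimal brute-force sketch that counts convex characters with every block of size at least k on a small tree. It is an illustration only, not the method of the article: it enumerates all set partitions, so it is exponential and only usable for a handful of leaves. The function names (`set_partitions`, `spanning_subtree`, `count_convex`, `build_adj`) and the adjacency-dictionary representation of the tree are assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def spanning_subtree(adj, block):
    """Vertex set of the minimal subtree of the tree `adj` connecting `block`."""
    root = block[0]
    parent = {root: None}
    queue = deque([root])
    while queue:  # breadth-first search to record parent pointers
        v = queue.popleft()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    vertices = set()
    for leaf in block:
        # Walk towards the root until we reach a vertex already collected.
        v = leaf
        while v is not None and v not in vertices:
            vertices.add(v)
            v = parent[v]
    return vertices


def count_convex(adj, leaves, k=1):
    """Count partitions of `leaves` that are convex on `adj`, all blocks >= k."""
    count = 0
    for partition in set_partitions(list(leaves)):
        if any(len(block) < k for block in partition):
            continue
        subtrees = [spanning_subtree(adj, block) for block in partition]
        if all(s.isdisjoint(t) for s, t in combinations(subtrees, 2)):
            count += 1
    return count


def build_adj(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


# Quartet tree ab|cd: leaves a, b hang off interior vertex u; c, d off v.
quartet = build_adj([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])
print(count_convex(quartet, "abcd", k=1))  # 13
print(count_convex(quartet, "abcd", k=2))  # 2
```

On the quartet, 13 of the 15 set partitions of the leaves are convex (only ac|bd and ad|bc fail), and with k = 2 only the single-block partition and ab|cd remain.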
Phylogenetic trees are used to model evolution: leaves are labelled to represent contemporary species ("taxa") and interior vertices represent extinct ancestors. Informally, convex characters are measurements on the contemporary species in which the subset of species (both contemporary and extinct) that share a given state form a connected subtree. In [19] it was shown how to efficiently count, list and sample certain restricted subfamilies of convex characters, and algorithmic applications were given. We continue this work in a number of directions. First, we show how combining the enumeration of convex characters with existing parameterised algorithms can be used to speed up exponential-time algorithms for the maximum agreement forest problem in phylogenetics. Second, we revisit the quantity g_2(T), defined as the number of convex characters on T in which each state appears on at least 2 taxa. We use this to give an algorithm with running time O(φ^n · poly(n)), where φ ≈ 1.6181 is the golden ratio and n is the number of taxa in the input trees, for computation of maximum parsimony distance on two-state characters. By further restricting the characters counted by g_2(T) we open an interesting bridge to the literature on enumeration of matchings. By crossing this bridge we improve the running time of the aforementioned parsimony distance algorithm to O(1.5895^n · poly(n)), and obtain a number of new results in themselves relevant to enumeration of matchings on at-most binary trees.
The rooted subtree prune and regraft (rSPR) distance between two rooted binary phylogenetic trees is a well-studied measure of topological dissimilarity that is NP-hard to compute. Here we describe an improved linear kernel for the problem. In particular, we show that if the classical subtree and chain reduction rules are augmented with a modified type of chain reduction rule, the resulting trees have at most 9k − 3 leaves, where k is the rSPR distance; and that this bound is tight. The previous best-known linear kernel had size O(28k). To achieve this improvement we introduce cyclic generators, which can be viewed as cyclic analogues of the generators used in the phylogenetic networks literature. As a corollary to our main result we also give an improved weighted linear kernel for the minimum hybridization problem on two rooted binary phylogenetic trees.