Summary We describe eMPRess, a software program for phylogenetic tree reconciliation under the duplication-transfer-loss model that systematically addresses the problems of choosing event costs and selecting representative solutions, enabling users to make more robust inferences. Availability and implementation eMPRess is freely available at http://www.cs.hmc.edu/empress. Supplementary information Supplementary data are available at Bioinformatics online.
BackgroundMaximum parsimony reconciliation in the duplication-transfer-loss model is a widely-used method for analyzing the evolutionary histories of pairs of entities such as hosts and parasites, symbiont species, and species and genes. While efficient algorithms are known for finding maximum parsimony reconciliations, the number of such reconciliations can be exponential in the size of the trees. Since these reconciliations can differ substantially from one another, making inferences from any one reconciliation may lead to conclusions that are not supported, or may even be contradicted, by other maximum parsimony reconciliations. Therefore, there is a need to find small sets of best representative reconciliations when the space of solutions is large and diverse.ResultsWe provide a general framework for hierarchical clustering the space of maximum parsimony reconciliations. We demonstrate this framework for two specific linkage criteria, one that seeks to maximize the average support of the events found in the reconciliations in each cluster and the other that seeks to minimize the distance between reconciliations in each cluster. We analyze the asymptotic worst-case running times and provide experimental results that demonstrate the viability and utility of this approach.ConclusionsThe hierarchical clustering algorithm method proposed here provides a new approach to find a set of representative reconciliations in the potentially vast and diverse space of maximum parsimony reconciliations.
Gene trees can differ from species trees due to a variety of biological phenomena, the most prevalent being gene duplication, horizontal gene transfer, gene loss, and coalescence. To explain topological incongruence between the two trees, researchers apply reconciliation methods, often relying on a maximum parsimony framework. However, while several studies have investigated the space of maximum parsimony reconciliations (MPRs) under the duplication-loss and duplicationtransfer-loss models, the space of MPRs under the duplication-losscoalescence (DLC) model remains poorly understood. To address this problem, we present new algorithms for computing the size of MPR space under the DLC model and sampling from this space uniformly at random. Our algorithms are efficient in practice, with runtime polynomial in the size of the species and gene tree when the number of genes that map to any given species is fixed, thus proving that the MPR problem is fixedparameter tractable. We have applied our methods to a biological data set of 16 fungal species to provide the first key insights in the space of MPRs under the DLC model. Our results show that a plurality reconciliation, and underlying events, are likely to be representative of MPR space.
BackgroundReconciliation methods are widely used to explain incongruence between a gene tree and species tree. However, the common approach of inferring maximum parsimony reconciliations (MPRs) relies on user-defined costs for each type of event, which can be difficult to estimate. Prior work has explored the relationship between event costs and maximum parsimony reconciliations in the duplication-loss and duplication-transfer-loss models, but no studies have addressed this relationship in the more complicated duplication-loss-coalescence model.ResultsWe provide a fixed-parameter tractable algorithm for computing Pareto-optimal reconciliations and recording all events that arise in those reconciliations, along with their frequencies. We apply this method to a case study of 16 fungi to systematically characterize the complexity of MPR space across event costs and identify events supported across this space.ConclusionThis work provides a new framework for studying the relationship between event costs and reconciliations that incorporates both macro-evolutionary events and population effects and is thus broadly applicable across eukaryotic species.
BackgroundMaximum parsimony reconciliation in the duplication-transfer-loss model is widely used in studying the evolutionary histories of genes and species and in studying coevolution of parasites and their hosts and pairs of symbionts. While efficient algorithms are known for finding maximum parsimony reconciliations, the number of reconciliations can grow exponentially in the size of the trees. An understanding of the space of maximum parsimony reconciliations is necessary to determine whether a single reconciliation can adequately represent the space or whether multiple representative reconciliations are needed.ResultsWe show that for any instance of the reconciliation problem, the distribution of pairwise distances can be computed exactly by an efficient polynomial-time algorithm with respect to several different distance metrics. We describe the algorithm, analyze its asymptotic worst-case running time, and demonstrate its utility and viability on a large biological dataset.ConclusionsThis result provides new insights into the structure of the space of maximum parsimony reconciliations. These insights are likely to be useful in the wide range of applications that employ reconciliation methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.