Reconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods are available for many variants of this problem, new advances in phylogeny inference will be required if we are to continue making effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear programming (ILP) formulations to find the most parsimonious phylogenetic tree from a set of binary variation data. One method uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has, however, proven extremely efficient in practice on datasets that are well beyond the reach of the available provably efficient methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times competitive with fast heuristics that cannot guarantee optimality. An alternative formulation establishes that the problem can be solved with a polynomial-sized ILP. We further present a web server, based on the exponential-sized ILP, that performs fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human genome.
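To make the parsimony objective concrete, the following is a minimal sketch that scores a fixed tree topology on binary data with Fitch's algorithm. It is not the ILP formulation described in the abstract, which searches over trees; it only counts the mutations a given tree requires. The function name `fitch`, the nested-tuple tree encoding, and the four taxa are illustrative assumptions.

```python
# Sketch only: Fitch small-parsimony scoring of a FIXED rooted binary tree.
# The tree encoding and sample data are hypothetical, not from the paper.

def fitch(node, leaf_states):
    """Return (per-site state sets, mutation count) for the subtree at `node`.

    node: a leaf name (str) or a 2-tuple (left_subtree, right_subtree).
    leaf_states: dict mapping leaf name -> string of binary characters.
    """
    if isinstance(node, str):
        return [{c} for c in leaf_states[node]], 0
    left, right = node
    left_sets, left_cost = fitch(left, leaf_states)
    right_sets, right_cost = fitch(right, leaf_states)
    sets, cost = [], left_cost + right_cost
    for a, b in zip(left_sets, right_sets):
        common = a & b
        if common:
            sets.append(common)   # children agree on some state: no mutation
        else:
            sets.append(a | b)    # children disagree: one mutation at this site
            cost += 1
    return sets, cost


# Hypothetical binary variation data: four taxa, three sites.
STATES = {"A": "000", "B": "001", "C": "011", "D": "111"}

score_1 = fitch((("A", "B"), ("C", "D")), STATES)[1]
score_2 = fitch((("A", "C"), ("B", "D")), STATES)[1]
print(score_1, score_2)
```

Under these assumptions the first topology needs fewer mutations than the second, so it is the more parsimonious of the two. A full maximum parsimony method must additionally search over topologies, which is the part the ILP formulations address.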
We present a comprehensive survey of combinatorial algorithms and theorems about lattice protein folding models obtained in the almost 15 years since the publication in 1995 of the first protein folding approximation algorithm with mathematically guaranteed error bounds [60]. The results presented here are mainly about the HP-protein folding model introduced by Ken Dill in 1985 [37]. The main topics of this survey include: approximation algorithms for linear-chain and side-chain lattice models, as well as off-lattice models, NP-completeness theorems about a variety of protein folding models, contact map structure of self-avoiding walks and HP-folds, combinatorics and algorithmics for side-chain models, bi-sphere packing and the Kepler conjecture, and the protein side-chain self-assembly conjecture. As an appealing bridge between continuous and discrete mathematics, whose hybrid nature is a cornerstone of the mathematical difficulty of the protein folding problem, we show how work on contact-map decomposition of 2D self-avoiding walks [56] can build upon the exact formula for counting RNA contacts by Mike Waterman and collaborators [96], leading to renewed hope for analytical closed-form approximations for the statistical mechanics of protein folding in lattice models. We also include in this paper a few new results, research directions within reach of rigorous results, and a set of open problems that merit future exploration.
Dedicated to Michael Waterman on the occasion of his 67th birthday.
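As an illustration of the quantity that HP-model algorithms optimize, here is a minimal sketch that counts H-H contacts for a fold on the 2D square lattice, assuming the standard convention that a contact is a pair of hydrophobic (H) residues on adjacent lattice sites that are not consecutive in the chain. The function name `hh_contacts` and the example sequence are hypothetical.

```python
# Sketch only: count H-H contacts of an HP fold on the 2D square lattice.
# Assumes the usual convention: energy = -(number of H-H contacts).

def hh_contacts(sequence, coords):
    """sequence: string over {'H', 'P'}; coords: list of (x, y) lattice points."""
    if len(sequence) != len(coords):
        raise ValueError("one coordinate is needed per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    index_at = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each unordered pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


# Hypothetical 4-residue chain folded around a unit square, then stretched out.
square = hh_contacts("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
line = hh_contacts("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
print(square, line)
```

The square fold brings the two H residues together, while the straight fold leaves them apart, which is the distinction an HP folding algorithm tries to exploit.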
In the traveling salesman path problem, we are given a set of cities, traveling costs between city pairs, and fixed source and destination cities. The objective is to find a minimum cost path from the source to the destination visiting all cities exactly once. In this paper, we study polyhedral and combinatorial properties of a variant we call the traveling salesman walk problem, in which the objective is to find a minimum cost walk from the source to the destination visiting all cities at least once. We first characterize traveling salesman walk perfect graphs, graphs for which the convex hull of incidence vectors of traveling salesman walks can be described by linear inequalities. We show these graphs have a description by way of forbidden minors and also characterize them constructively. We also address the asymmetric traveling salesman path problem (ATSPP) and give an O(√n)-approximation algorithm for this problem.
Mathematics Subject Classification (2000): 68Q25 · 68R10 · 90C05 · 90C27
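The difference between the path and walk variants can be seen on a tiny instance. The sketch below is a brute-force illustration, not the paper's polyhedral method or its approximation algorithm. It assumes nonnegative costs and distinct source and destination, and it solves the walk variant by solving the path variant on shortest-path distances, since a walk may revisit cities. The names `min_path`, `min_walk`, and `COST` are hypothetical.

```python
# Sketch only: brute-force traveling salesman path vs. walk on a small instance.
from itertools import permutations


def min_path(cost, source, dest):
    """Cheapest source-to-dest route visiting every city exactly once."""
    inner = [c for c in range(len(cost)) if c not in (source, dest)]
    best = None
    for perm in permutations(inner):
        order = (source, *perm, dest)
        total = sum(cost[a][b] for a, b in zip(order, order[1:]))
        if best is None or total < best:
            best = total
    return best


def metric_closure(cost):
    """All-pairs shortest-path distances (Floyd-Warshall)."""
    n = len(cost)
    dist = [row[:] for row in cost]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def min_walk(cost, source, dest):
    """Cheapest source-to-dest walk visiting every city at least once."""
    return min_path(metric_closure(cost), source, dest)


# Hypothetical star-shaped instance: city 1 is a cheap hub, other links cost 10.
COST = [
    [0, 1, 10, 10],
    [1, 0, 1, 1],
    [10, 1, 0, 10],
    [10, 1, 10, 0],
]
print(min_path(COST, 0, 3), min_walk(COST, 0, 3))
```

On this instance the walk 0, 1, 2, 1, 3 passes through the hub twice and is much cheaper than any route that visits each city exactly once.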