Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.
Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics. The combination of increasingly large genetic data sets and increases in computing power is facilitating the development of more sophisticated phylogenetic and phylodynamic methods. Big data sets allow us to answer complex questions. However, since the required analyses are highly specific to the particular data set and question, a black-box method is not sufficient anymore. Instead, biologists are required to be actively involved with modeling decisions during data analysis. The modular design of the Bayesian phylogenetic software package BEAST 2 enables, and in fact enforces, this involvement. At the same time, the modular design enables computational biology groups to develop new methods at a rapid rate. A thorough understanding of the models and algorithms used by inference software is a critical prerequisite for successful hypothesis formulation and assessment. In particular, there is a need for more readily available resources aimed at helping interested scientists equip themselves with the skills to confidently use cutting-edge phylogenetic analysis software. These resources will also benefit researchers who do not have access to similar courses or training at their home institutions. Here, we introduce the “Taming the Beast” (https://taming-the-beast.github.io/) resource, which was developed as part of a workshop series bearing the same name, to facilitate the usage of the Bayesian phylogenetic software package BEAST 2.
Author summary: Bayesian phylogenetic inference methods have undergone considerable development in recent years, and joint modelling of rich evolutionary data, including genomes, phenotypes and fossil occurrences, is increasingly common. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing scientific software is increasingly crucial to advancement in many fields of biology. The challenges range from practical software development and engineering, distributed team coordination, conceptual development and statistical modelling, to validation and testing. BEAST 2 is one such computational software platform for phylogenetics, population genetics and phylodynamics, and was first announced over 4 years ago. Here we describe the full range of new tools and models available on the BEAST 2.5 platform, which expand joint evolutionary inference in many new directions, especially for joint inference over multiple data types, non-tree models and complex phylodynamics.
... LogAnalyser, LogCombiner, TreeAnnotator, DensiTree [3], as well as a package manager. Shortly after its release, a number of packages were added, such as MASTER for simulating stochastic population dynamics models [4], MultiTypeTree for inferring structured coalescent models [5], RBS for reversible jump across substitution models [6], SNAPP for multi species coalescent over SNP data [7], subst-bma for Bayesian model averaging over site models [8], and BDSKY for the birth-death skyline tree model [9]. All these package...
Heterogeneous populations can lead to important differences in birth and death rates across a phylogeny. Taking this heterogeneity into account is necessary to obtain accurate estimates of the underlying population dynamics. We present a new multitype birth–death model (MTBD) that can estimate lineage-specific birth and death rates. This corresponds to estimating lineage-dependent speciation and extinction rates for species phylogenies, and lineage-dependent transmission and recovery rates for pathogen transmission trees. In contrast with previous models, we do not presume to know the trait driving the rate differences, nor do we prohibit the same rates from appearing in different parts of the phylogeny. Using simulated data sets, we show that the MTBD model can reliably infer the presence of multiple evolutionary regimes, their positions in the tree, and the birth and death rates associated with each. We also present a reanalysis of two empirical data sets and compare the results obtained by MTBD and by the existing software BAMM. We compare two implementations of the model, one exact and one approximate (assuming that no rate changes occur in the extinct parts of the tree), and show that the approximation only slightly affects results. The MTBD model is implemented as a package in the Bayesian inference software BEAST 2 and allows joint inference of the phylogeny and the model parameters. [Birth–death; lineage-specific rates; multi-type model.]
Fossil information is essential for estimating species divergence times, and can be integrated into Bayesian phylogenetic inference using the fossilized birth–death (FBD) process. An important aspect of palaeontological data is the uncertainty surrounding specimen ages, which can be handled in different ways during inference. The most common approach is to fix fossil ages to a point estimate within the known age interval. Alternatively, age uncertainty can be incorporated by using priors, and fossil ages are then directly sampled as part of the inference. This study presents a comparison of alternative approaches for handling fossil age uncertainty in analyses using the FBD process. First, based on simulations, we find that fixing fossil ages to the midpoint or a random point drawn from within the stratigraphic age range leads to biases in divergence time estimates, while sampling fossil ages leads to estimates that are similar to inferences that employ the correct ages of fossils. Second, we present a comparison using an empirical dataset of extant and fossil cetaceans, which confirms that different methods of handling fossil age uncertainty lead to large differences in estimated node ages. Stratigraphic age uncertainty should thus not be ignored in divergence time estimation and instead should be incorporated explicitly.
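The three treatments of fossil age named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the interval values in the usage note are made up, and the uniform form of the age prior is an assumption (the abstract says only that priors are used). It is not the study's implementation.

```python
import math
import random


def midpoint_age(min_age: float, max_age: float) -> float:
    """Fix the fossil age to the midpoint of its stratigraphic interval."""
    return 0.5 * (min_age + max_age)


def random_fixed_age(min_age: float, max_age: float, rng: random.Random) -> float:
    """Fix the fossil age to a single random point drawn once from the interval."""
    return rng.uniform(min_age, max_age)


def uniform_age_log_prior(age: float, min_age: float, max_age: float) -> float:
    """Log density of an assumed uniform prior over the interval.

    When ages are sampled during inference, a proposed age outside the
    interval has zero prior density (log density of minus infinity).
    """
    if max_age <= min_age:
        raise ValueError("max_age must be greater than min_age")
    if min_age <= age <= max_age:
        return -math.log(max_age - min_age)
    return -math.inf
```

For a hypothetical fossil with interval 10 to 20 Ma, `midpoint_age(10.0, 20.0)` gives 15.0, whereas under the sampling approach every age in the interval remains a candidate and is weighted by `uniform_age_log_prior` together with the rest of the model.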
The multi-type birth-death model with sampling is a phylodynamic model which enables quantification of past population dynamics in structured populations, based on phylogenetic trees. The BEAST 2 package bdmm implements an algorithm for numerically computing the probability density of a phylogenetic tree given the population dynamic parameters under this model. In the initial release of bdmm, analyses were limited to trees consisting of up to approximately 250 genetic samples for numerical reasons. We implemented important algorithmic changes to bdmm which dramatically increase the number of genetic samples that can be analyzed, and improve the numerical robustness and efficiency of the calculations. Being able to use bigger datasets leads to improved precision of parameter estimates. Furthermore, we report on several model extensions to bdmm, inspired by properties common to empirical datasets. We apply this improved algorithm to two partly overlapping datasets of Influenza A virus HA sequences sampled around the world, one with 500 samples, the other with only 175, for comparison. We report and compare the global migration patterns and seasonal dynamics inferred from each dataset. Availability: The latest release with our updates, bdmm 0.3.5, is freely available as an open access package of BEAST 2. The source code can be accessed at https://github.com/denisekuehnert/bdmm.
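The abstract does not say what the "numerical reasons" were. One common cause of size limits in tree-density calculations generally is floating-point underflow, where multiplying many small factors rounds to zero. The sketch below illustrates only that generic effect, using hypothetical per-sample factors; it is an assumption made for illustration and not a description of bdmm's algorithm or of the changes made to it.

```python
import math


def naive_product(factors):
    """Multiply the factors directly; the result can underflow to 0.0."""
    result = 1.0
    for f in factors:
        result *= f
    return result


def log_space_sum(factors):
    """Accumulate the same quantity as a sum of logarithms instead."""
    return sum(math.log(f) for f in factors)


# Hypothetical example: 500 factors of 1e-3, one per sample.
factors = [1e-3] * 500
direct = naive_product(factors)      # too small for a double, becomes 0.0
log_value = log_space_sum(factors)   # finite, equals 500 * log(1e-3)
```

With these made-up numbers the direct product is indistinguishable from zero, while the log-space value stays finite and usable, which is why larger trees generally call for some form of rescaling or log-space arithmetic.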
Statistical phylogenetic methods are the foundation for a wide range of evolutionary and epidemiological studies. However, as these methods grow increasingly complex, users often encounter significant challenges with summarizing, visualizing and communicating their key results. We present RevGadgets, an R package for creating publication‐quality figures from the results of a large variety of phylogenetic analyses performed in RevBayes (and other phylogenetic software packages). We demonstrate how to use RevGadgets through a set of vignettes that cover the most common use cases that researchers will encounter. RevGadgets is an open‐source, extensible package that will continue to evolve in parallel with RevBayes, helping researchers to make sense of and communicate the results of a diverse array of analyses.