Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.
Phylodynamics, the field aiming to quantitatively integrate the ecological and evolutionary dynamics of rapidly evolving populations like those of RNA viruses, increasingly relies upon coalescent approaches to infer past population dynamics from reconstructed genealogies. As sequence data have become more abundant, these approaches are beginning to be used on populations undergoing rapid and rather complex dynamics. In such cases, the simple demographic models that current phylodynamic methods employ can be limiting. First, these models are not ideal for yielding biological insight into the processes that drive the dynamics of the populations of interest. Second, these models differ in form from the mechanistic and often stochastic population dynamic models that are currently widely used when fitting models to time series data. As such, their use does not allow genealogical data and time series data to be considered in tandem when conducting inference. Here, we present a flexible statistical framework for phylodynamic inference that goes beyond these current limitations. The framework we present employs a recently developed method known as particle MCMC to fit stochastic, nonlinear mechanistic models for complex population dynamics to gene genealogies and time series data in a Bayesian framework. We demonstrate our approach using a nonlinear Susceptible-Infected-Recovered (SIR) model for the transmission dynamics of an infectious disease and show through simulations that it provides accurate estimates of past disease dynamics and key epidemiological parameters from genealogies with or without accompanying time series data.
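As a rough illustration of the kind of stochastic, nonlinear SIR model this abstract refers to, the following is a minimal sketch of a Gillespie-style simulation. It is not the authors' implementation and does not include particle MCMC or any genealogy likelihood. The function name `simulate_sir`, its parameters, and the frequency-dependent transmission term `beta * S * I / N` are assumptions made for the example.

```python
import random


def simulate_sir(beta, gamma, s0, i0, recovered0=0, t_max=100.0, seed=0):
    """Simulate one stochastic SIR trajectory with the Gillespie algorithm.

    Assumed (hypothetical) parameterisation: infections occur at rate
    beta * S * I / N and recoveries at rate gamma * I.
    Returns a list of (time, S, I, R) tuples, one per event.
    """
    rng = random.Random(seed)
    s, i, r = s0, i0, recovered0
    n = s + i + r
    t = 0.0
    trajectory = [(t, s, i, r)]
    while i > 0 and t < t_max:
        infection_rate = beta * s * i / n
        recovery_rate = gamma * i
        total_rate = infection_rate + recovery_rate
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Choose which event happened in proportion to its rate.
        if rng.random() < infection_rate / total_rate:
            s -= 1
            i += 1
        else:
            i -= 1
            r += 1
        trajectory.append((t, s, i, r))
    return trajectory


if __name__ == "__main__":
    traj = simulate_sir(beta=0.5, gamma=0.25, s0=99, i0=1, seed=1)
    print(len(traj), "events; final state:", traj[-1])
```

In a particle filtering setting, many such trajectories would be simulated forward and weighted by how well they explain the data; that weighting step is deliberately left out here.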
Coalescent theory is routinely used to estimate past population dynamics and demographic parameters from genealogies. While early work in coalescent theory only considered simple demographic models, advances in theory have allowed for increasingly complex demographic scenarios to be considered. The success of this approach has led to coalescent-based inference methods being applied to populations with rapidly changing population dynamics, including pathogens like RNA viruses. However, fitting epidemiological models to genealogies via coalescent models remains a challenging task, because pathogen populations often exhibit complex, nonlinear dynamics and are structured by multiple factors. Moreover, it often becomes necessary to consider stochastic variation in population dynamics when fitting such complex models to real data. Using recently developed structured coalescent models that accommodate complex population dynamics and population structure, we develop a statistical framework for fitting stochastic epidemiological models to genealogies. By combining particle filtering methods with Bayesian Markov chain Monte Carlo methods, we are able to fit a wide class of stochastic, nonlinear epidemiological models with different forms of population structure to genealogies. We demonstrate our framework using two structured epidemiological models: a model with disease progression between multiple stages of infection and a two-population model reflecting spatial structure. We apply the multi-stage model to HIV genealogies and show that the proposed method can be used to estimate the stage-specific transmission rates and prevalence of HIV. Finally, using the two-population model we explore how much information about population structure is contained in genealogies and what sample sizes are necessary to reliably infer parameters like migration rates.
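For readers unfamiliar with the "simple demographic models" that early coalescent work considered, the sketch below shows the constant-size (Kingman) coalescent, which is far simpler than the structured models described in this abstract. It assumes a pairwise coalescence rate of `1 / ne`, so that `k` lineages coalesce at total rate `k * (k - 1) / (2 * ne)`. The function names and this time scaling are assumptions made for the example.

```python
import random


def expected_tmrca(n, ne):
    """Expected time to the most recent common ancestor of n lineages.

    With k lineages the expected waiting time is 2 * ne / (k * (k - 1));
    summing over k = 2..n gives 2 * ne * (1 - 1 / n).
    """
    return sum(2.0 * ne / (k * (k - 1)) for k in range(2, n + 1))


def simulate_coalescent_intervals(n, ne, seed=0):
    """Draw the n - 1 inter-coalescent waiting times for n sampled lineages."""
    rng = random.Random(seed)
    intervals = []
    for k in range(n, 1, -1):
        # Total coalescence rate while k lineages remain.
        rate = k * (k - 1) / (2.0 * ne)
        intervals.append(rng.expovariate(rate))
    return intervals


if __name__ == "__main__":
    print("expected TMRCA:", expected_tmrca(10, 100.0))
    print("one simulated TMRCA:", sum(simulate_coalescent_intervals(10, 100.0)))
```

Coalescent-based inference reverses this logic: given the observed intervals in a genealogy, it asks which population sizes (or epidemiological trajectories) make those intervals most plausible.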
A year of genomic surveillance reveals how the SARS-CoV-2 pandemic unfolded in Africa
Background and Methodology: The current Ebola virus epidemic in West Africa has been spreading at least since December 2013. The first confirmed case of Ebola virus in Sierra Leone was identified on May 25. Based on viral genetic sequencing data from 72 individuals in Sierra Leone collected between the end of May and mid June, we utilize a range of phylodynamic methods to estimate the basic reproductive number (R0). We additionally estimate the expected lengths of the incubation and infectious periods of the virus. Finally, we use phylogenetic trees to examine the role played by population structure in the epidemic. Results: The median estimates of R0 based on sequencing data alone range from 1.65 to 2.18, with the most plausible model yielding a median R0 of 2.18 (95% HPD 1.24-3.55). Importantly, our results indicate that, at least until mid June, relief efforts in Sierra Leone were ineffective at lowering the effective reproductive number of the virus. We estimate the expected length of the infectious period to be 2.58 days (median; 95% HPD 1.24-6.98). The dataset appears to be too small to estimate the incubation period with high certainty (median expected incubation period 4.92 days; 95% HPD 2.11-23.20). While our estimates of the duration of infection tend to be smaller than previously reported, phylodynamic analyses support a previous estimate that 70% of cases were observed and included in the present dataset. The dataset is too small to show a particular population structure with high significance; however, our preliminary analyses suggest that half the population is spreading the virus with an R0 well above 2, while the other half of the population is spreading with an R0 below 1. Conclusions: Overall, we show that sequencing data can robustly infer key epidemiological parameters. Such estimates inform public health officials and help to coordinate effective public health efforts. Thus, having more sequencing data available for the ongoing Ebola virus epidemic and at the start of new outbreaks will foster a quick understanding of the dynamics of the pathogen.
Petaloid organs are a major component of the floral diversity observed across nearly all major clades of angiosperms. The variable morphology and development of these organs has led to the hypothesis that they are not homologous but, rather, have evolved multiple times. A particularly notable example of petal diversity, and potential homoplasy, is found within the order Ranunculales, exemplified by families such as Ranunculaceae, Berberidaceae, and Papaveraceae. To investigate the molecular basis of petal identity in Ranunculales, we used a combination of molecular phylogenetics and gene expression analysis to characterize APETALA3 (AP3) and PISTILLATA (PI) homologs from a total of 13 representative genera of the order. One of the most striking results of this study is that expression of orthologs of a single AP3 lineage is consistently petal-specific across both Ranunculaceae and Berberidaceae. We conclude from this finding that these supposedly homoplastic petals in fact share a developmental genetic program that appears to have been present in the common ancestor of the two families. We discuss the implications of this type of molecular data for long-held typological definitions of petals and, more broadly, the evolution of petaloid organs across the angiosperms.
Phylogeographic methods can help reveal the movement of genes between populations of organisms. This has been widely done to quantify pathogen movement between different host populations, the migration history of humans, and the geographic spread of languages or gene flow between species using the location or state of samples alongside sequence data. Phylogenies therefore offer insights into migration processes not available from classic epidemiological or occurrence data alone. Phylogeographic methods, however, have several known shortcomings. In particular, one of the most widely used methods treats migration the same as mutation, and therefore does not incorporate information about population demography. This may lead to severe biases in estimated migration rates for data sets where sampling is biased across populations. The structured coalescent, on the other hand, allows us to coherently model the migration and coalescent process, but current implementations struggle with complex data sets due to the need to infer ancestral migration histories. Thus, approximations to the structured coalescent, which integrate over all ancestral migration histories, have been developed. However, the validity and robustness of these approximations remain unclear. We present an exact numerical solution to the structured coalescent that does not require the inference of migration histories. Although this solution is computationally infeasible for large data sets, it clarifies the assumptions of previously developed approximate methods and allows us to provide an improved approximation to the structured coalescent. We have implemented these methods in BEAST2, and we show how these methods compare under different scenarios.
Recent phylogenetic analyses indicate that RNA virus populations carry a significant deleterious mutation load. This mutation load has the potential to shape patterns of adaptive evolution via genetic linkage to beneficial mutations. Here, we examine the effect of deleterious mutations on patterns of influenza A subtype H3N2's antigenic evolution in humans. By first analyzing simple models of influenza that incorporate a mutation load, we show that deleterious mutations, as expected, act to slow the virus's rate of antigenic evolution, while making it more punctuated in nature. These models further predict three distinct molecular pathways by which antigenic cluster transitions occur, and we find phylogenetic patterns consistent with each of these pathways in influenza virus sequences. Simulations of a more complex phylodynamic model further indicate that antigenic mutations act in concert with deleterious mutations to reproduce influenza's spindly hemagglutinin phylogeny, co-circulation of antigenic variants, and high annual attack rates. DOI: http://dx.doi.org/10.7554/eLife.07361.001