The ongoing SARS-CoV-2 outbreak marks the first time that large amounts of genome sequence data have been generated and made publicly available in near real-time. Early analyses of these data revealed low sequence variation, a finding that is consistent with a recently emerging outbreak, but which raises the question of whether such data are sufficiently informative for phylogenetic inferences of evolutionary rates and time scales. The phylodynamic threshold is a key concept that refers to the point in time at which sufficient molecular evolutionary change has accumulated in available genome samples to obtain robust phylodynamic estimates. For example, before the phylodynamic threshold is reached, genomic variation is so low that even large numbers of genome sequences may be insufficient to estimate the virus’s evolutionary rate and the time scale of an outbreak. We collected genome sequences of SARS-CoV-2 from public databases at 8 different points in time and conducted a range of tests of temporal signal to determine if and when the phylodynamic threshold was reached, and the range of inferences that could be reliably drawn from these data. Our results indicate that by February 2nd 2020, estimates of evolutionary rates and time scales had become possible. Analyses of subsequent data sets, which included between 47 and 122 genomes, converged at an evolutionary rate of about 1.1 × 10⁻³ subs/site/year and a time of origin of around late November 2019. Our study provides guidelines to assess the phylodynamic threshold and demonstrates that establishing this threshold constitutes a fundamental step for understanding the power and limitations of early data in outbreak genome surveillance.
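To give a sense of scale for the low variation described in the abstract above, the following minimal sketch multiplies the reported rate by a genome length and an elapsed time. The rate and the two dates come from the abstract; the genome length (29,903 nt, the Wuhan-Hu-1 reference) and the exact day chosen for "late November 2019" are assumptions made here for illustration, and the function name is hypothetical.

```python
from datetime import date

# From the abstract: about 1.1e-3 substitutions per site per year.
RATE_SUBS_PER_SITE_PER_YEAR = 1.1e-3

# Assumption (not in the abstract): length of the Wuhan-Hu-1 reference genome.
GENOME_LENGTH_SITES = 29_903

# Assumption: 25 November stands in for "late November 2019".
ASSUMED_ORIGIN = date(2019, 11, 25)

# From the abstract: date by which estimates had become possible.
THRESHOLD_DATE = date(2020, 2, 2)


def expected_substitutions(rate, sites, start, end):
    """Expected substitutions along one lineage between two dates.

    This is the strict-clock expectation rate * sites * elapsed years;
    it ignores rate variation and sampling noise.
    """
    elapsed_years = (end - start).days / 365.25
    return rate * sites * elapsed_years


subs = expected_substitutions(
    RATE_SUBS_PER_SITE_PER_YEAR,
    GENOME_LENGTH_SITES,
    ASSUMED_ORIGIN,
    THRESHOLD_DATE,
)
print(f"Expected substitutions per lineage: {subs:.1f}")
```

Under these assumptions each lineage is expected to carry only about six substitutions across the whole genome by early February 2020, which is consistent with the abstract's point that early data sets contain little variation.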
The rapid sharing of sequence information as seen throughout the current SARS-CoV-2 epidemic represents an inflection point for genomic epidemiology. Here we describe aspects of coronavirus evolutionary genetics revealed from these data, and provide the first direct RNA sequence of SARS-CoV-2, detailing coronaviral subgenome-length mRNA architecture. The ongoing epidemic of 2019 novel coronavirus (now called SARS-CoV-2, causing the disease COVID-19), which originated in Wuhan, China, has been declared a public health emergency of international concern by the World Health Organisation (WHO) [1][2][3][4]. SARS-CoV-2 is a positive-sense single-stranded RNA ((+)ssRNA) virus of the Coronaviridae family, with related Betacoronaviruses capable of infecting mammalian and avian hosts, resulting in …
Phylodynamic models use pathogen genome sequence data to infer epidemiological dynamics. With the increasing genomic surveillance of pathogens, especially during the SARS‐CoV‐2 pandemic, new practical questions about their use are emerging. One such question focuses on the inclusion of un‐sequenced case occurrence data alongside sequenced data to improve phylodynamic analyses. This approach can be particularly valuable if sequencing efforts vary over time. Using simulations, we demonstrate that birth–death phylodynamic models can employ occurrence data to eliminate bias in estimates of the basic reproductive number due to misspecification of the sampling process. In contrast, the coalescent exponential model is robust to such sampling biases, but in the absence of a sampling model it cannot exploit occurrence data. Subsequent analysis of the SARS‐CoV‐2 epidemic in the northwest USA supports these results. We conclude that occurrence data are a valuable source of information in combination with birth–death models. These data should be used to bolster phylodynamic analyses of infectious diseases and other rapidly spreading species in the future.
Background: Many countries have attempted to mitigate and control COVID-19 through non-pharmaceutical interventions, particularly with the aim of reducing population movement and contact. However, it remains unclear how the different control strategies impacted the local phylodynamics of the causative SARS-CoV-2 virus. Aim: We aimed to assess the duration of chains of virus transmission within individual countries and the extent to which countries exported viruses to their geographical neighbours. Methods: We analysed complete SARS-CoV-2 genomes to infer the relative frequencies of virus importation and exportation, as well as virus transmission dynamics, in countries of northern Europe. We examined virus evolution and phylodynamics in Denmark, Finland, Iceland, Norway and Sweden during the first year of the COVID-19 pandemic. Results: The Nordic countries differed markedly in the invasiveness of control strategies, which we found reflected in transmission chain dynamics. For example, Sweden, which compared with the other Nordic countries relied more on recommendation-based rather than legislation-based mitigation interventions, had transmission chains that were more numerous and tended to have more cases. This trend increased over the first 8 months of 2020. Together with Denmark, Sweden was a net exporter of SARS-CoV-2. Norway and Finland implemented legislation-based interventions; their transmission chain dynamics were in stark contrast to their neighbouring country Sweden. Conclusion: Sweden constituted an epidemiological and evolutionary refugium that enabled the virus to maintain active transmission and spread to other geographical locations. Our analysis reveals the utility of genomic surveillance where monitoring of active transmission chains is a key metric.
Genomic surveillance is increasingly common for infectious pathogens. Phylodynamic models, such as those based on the exponential growth coalescent and the birth-death process, can take advantage of pathogen genome sequence data to infer epidemiological dynamics. Here we investigate the potential of including case notification data without associated genome sequences in such phylodynamic analyses. Using simulations, we demonstrate that birth-death phylodynamic models can capitalise on notification data to eliminate bias in estimates of the basic reproductive number, R0, particularly when the sampling rate varies over time. In addition, an analysis of data collected from the 2009 pandemic H1N1 influenza virus demonstrates that using only samples from the prevalence peak results in biased estimates of the reproductive number over time, whereas using case notification data achieves an accuracy comparable to that obtained when using genome samples throughout the duration of the pandemic.
Phylodynamics requires an interdisciplinary understanding of phylogenetics, epidemiology, and statistical inference. It has also experienced more intense application than ever before amid the SARS-CoV-2 pandemic. In light of this, we present a review of phylodynamic models beginning with foundational models and assumptions. Our target audience is public health researchers, epidemiologists, and biologists seeking a working knowledge of the links between epidemiology, evolutionary models, and resulting epidemiological inference. We discuss the assumptions linking evolutionary models of pathogen population size to epidemiological models of the infected population size. We then describe statistical inference for phylodynamic models and list how output parameters can be rearranged for epidemiological interpretation. We go on to cover more sophisticated models and finish by highlighting future directions.
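As an illustration of what rearranging output parameters for epidemiological interpretation can look like, the sketch below covers only the simplest constant-rate case and is not taken from the review itself. It assumes a birth-death model with a transmission (birth) rate and a becoming-uninfectious rate, and an exponential-growth coalescent with a growth rate; all function and parameter names are hypothetical, and the example numbers are made up for the demonstration.

```python
def birth_death_summary(birth_rate, uninfectious_rate):
    """Rearrange constant-rate birth-death parameters (both per year).

    Returns the basic reproductive number, the epidemic growth rate,
    and the mean duration of infectiousness (in years).
    """
    if birth_rate <= 0 or uninfectious_rate <= 0:
        raise ValueError("rates must be positive")
    return {
        "R0": birth_rate / uninfectious_rate,
        "growth_rate": birth_rate - uninfectious_rate,
        "infectious_duration": 1.0 / uninfectious_rate,
    }


def r0_from_growth_rate(growth_rate, infectious_duration):
    """R0 implied by an exponential growth rate.

    Assumes an exponentially distributed infectious period with the
    given mean, so that R0 = 1 + growth_rate * infectious_duration.
    The coalescent itself does not estimate the duration; it has to
    be supplied from other sources.
    """
    if infectious_duration <= 0:
        raise ValueError("infectious_duration must be positive")
    return 1.0 + growth_rate * infectious_duration


# Made-up example: infectious for 20 days on average, R0 of 2.
summary = birth_death_summary(birth_rate=36.5, uninfectious_rate=18.25)
implied_r0 = r0_from_growth_rate(
    summary["growth_rate"], summary["infectious_duration"]
)
print(summary)
print(implied_r0)
```

The two routes agree in this toy case because the growth rate equals the birth rate minus the becoming-uninfectious rate; in practice the models differ in what they can estimate from the data, which is the kind of distinction the abstracts above draw between coalescent and birth-death approaches.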
Plague has an enigmatic history as a zoonotic pathogen. This infectious disease will unexpectedly appear in human populations and disappear just as suddenly. As a result, a long-standing line of inquiry has been to estimate when and where plague appeared in the past. However, there have been significant disparities between phylogenetic studies of the causative bacterium, Yersinia pestis, regarding the timing and geographic origins of its reemergence. Here, we curate and contextualize an updated phylogeny of Y. pestis using 601 genome sequences sampled globally. Through a detailed Bayesian evaluation of temporal signal in subsets of these data we demonstrate that a Y. pestis-wide molecular clock is unstable. To resolve this, we developed a new approach in which each Y. pestis population was assessed independently, enabling us to recover substantial temporal signal in five populations, including the ancient pandemic lineages which we now estimate may have emerged decades, or even centuries, before a pandemic was historically documented from European sources. Despite this methodological advancement, we only obtain robust divergence dates from populations sampled over a period of at least 90 years, indicating that genetic evidence alone is insufficient for accurately reconstructing the timing and spread of short-term plague epidemics.
Plague has an enigmatic history as a zoonotic pathogen. This potentially devastating infectious disease will unexpectedly appear in human populations and disappear just as suddenly. As a result, a long-standing line of inquiry has been to estimate when and where plague appeared in the past. However, there have been significant disparities between phylogenetic studies of the causative bacterium, Yersinia pestis, regarding the timing and geographic origins of its reemergence. Here, we curate and contextualize an updated phylogeny of Y. pestis using 601 genome sequences sampled globally. We perform a detailed Bayesian evaluation of temporal signal in subsets of these data and demonstrate that a Y. pestis-wide molecular clock model is unstable. To resolve this, we devised a new approach in which each Y. pestis population was assessed independently. This enabled us to recover significant temporal signal in five populations, including the ancient pandemic lineages which we now estimate may have emerged decades, or even centuries, before a pandemic was historically documented from European sources. Despite this, we only obtain robust divergence dates from populations sampled over a period of at least 90 years, indicating that genetic evidence alone is insufficient for accurately reconstructing the timing and spread of short-term plague epidemics. Finally, we identify key historical data sets that can be used in future research, which will complement the strengths and mitigate the weaknesses of genomic data.