Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.
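As a rough illustration of why stochastic simulation can serve as ground truth, the sketch below simulates the time to the most recent common ancestor (TMRCA) under the standard Kingman coalescent and compares the simulated mean with the known analytical expectation, 2(1 - 1/n) in coalescent units. It uses only the Python standard library. It is not msprime's algorithm or API, and the function names `simulate_tmrca` and `mean_tmrca` are hypothetical, chosen for this example only.

```python
import random


def simulate_tmrca(n, rng):
    """Return one simulated TMRCA for n lineages under the standard
    Kingman coalescent, measured in coalescent time units."""
    t = 0.0
    k = n
    while k > 1:
        # With k lineages, each of the k*(k-1)/2 pairs coalesces at rate 1,
        # so the waiting time to the next coalescence is exponential.
        rate = k * (k - 1) / 2
        t += rng.expovariate(rate)
        k -= 1
    return t


def mean_tmrca(n, replicates, seed):
    """Average the simulated TMRCA over independent replicates."""
    rng = random.Random(seed)
    total = sum(simulate_tmrca(n, rng) for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    n = 10
    estimate = mean_tmrca(n, replicates=20000, seed=1)
    expected = 2 * (1 - 1 / n)  # analytical expectation for this toy model
    print(f"simulated mean TMRCA: {estimate:.3f}, expected: {expected:.3f}")
```

For this toy model the answer is available in closed form, so the simulation can be checked directly. The models the abstract refers to generally lack such closed forms, which is where dedicated simulators become necessary.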
The explosion in population genomic data demands ever more complex modes of analysis, and increasingly these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking to use simulations for power analyses of upcoming studies or sanity checks on existing genomic data. Population genetics, as a field, also lacks standard benchmarks by which new tools for inference might be measured. Here we describe a new resource, stdpopsim, that attempts to rectify this situation. Stdpopsim is a community-driven open source project, which provides easy access to a growing catalog of published simulation models from a range of organisms and supports multiple simulation engine backends. This resource is available as a well-documented Python library with a simple command-line interface. We share some examples demonstrating how stdpopsim can be used to systematically compare demographic inference methods, and we encourage a broader community of developers to contribute to this growing resource.
The Tasmanian tiger or thylacine (Thylacinus cynocephalus) was the largest carnivorous Australian marsupial to survive into the modern era. Despite last sharing a common ancestor with the eutherian canids ~160 million years ago, their phenotypic resemblance is considered the most striking example of convergent evolution in mammals. The last known thylacine died in captivity in 1936 and many aspects of the evolutionary history of this unique marsupial apex predator remain unknown. Here we have sequenced the genome of a preserved thylacine pouch young specimen to clarify the phylogenetic position of the thylacine within the carnivorous marsupials, reconstruct its historical demography and examine the genetic basis of its convergence with canids. Retroposon insertion patterns placed the thylacine as the basal lineage in Dasyuromorphia and suggest incomplete lineage sorting in early dasyuromorphs. Demographic analysis indicated a long-term decline in genetic diversity starting well before the arrival of humans in Australia. In spite of their extraordinary phenotypic convergence, comparative genomic analyses demonstrated that amino acid homoplasies between the thylacine and canids are largely consistent with neutral evolution. Furthermore, the genes and pathways targeted by positive selection differ markedly between these species. Together, these findings support models of adaptive convergence driven primarily by cis-regulatory evolution.