Selective sweeps can increase genetic differentiation among populations and cause allele frequency spectra to depart from the expectation under neutrality. We present a likelihood method for detecting selective sweeps that involves jointly modeling the multilocus allele frequency differentiation between two populations. We use Brownian motion to model genetic drift under neutrality, and a deterministic model to approximate the effect of a selective sweep on single nucleotide polymorphisms (SNPs) in the vicinity. We test the method with extensive simulated data, and demonstrate that in some scenarios the method provides higher power than previously reported approaches to detect selective sweeps, and can provide surprisingly good localization of the position of a selected allele. A strength of our technique is that it uses allele frequency differentiation between populations, which is much more robust to ascertainment bias in SNP discovery than methods based on the allele frequency spectrum. We apply this method to compare continentally diverse populations, as well as Northern and Southern Europeans. Our analysis identifies a list of loci as candidate targets of selection, including well-known selected loci and new regions that have not been highlighted by previous scans for selection.
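As a rough illustration of the neutral part of such a model, the sketch below scores allele frequency differentiation at a set of SNPs under a normal approximation to Brownian-motion drift. The abstract does not state the likelihood, so the specific form used here is an assumption: the frequency in one population is taken to be normally distributed around the frequency in the other, with variance w·p(1−p). The function names, the drift parameter `w`, and the independence of SNPs are likewise hypothetical simplifications, not the authors' method.

```python
import math


def neutral_log_density(p1, p2, w):
    """Log density of frequency p1 in one population given p2 in the other.

    Assumed model (not taken from the abstract): p1 ~ Normal(p2, w * p2 * (1 - p2)),
    a normal approximation to drift modeled as Brownian motion.
    """
    if not 0.0 < p2 < 1.0:
        raise ValueError("p2 must be strictly between 0 and 1")
    if w <= 0.0:
        raise ValueError("w must be positive")
    var = w * p2 * (1.0 - p2)
    return -0.5 * math.log(2.0 * math.pi * var) - (p1 - p2) ** 2 / (2.0 * var)


def region_log_likelihood(freq_pairs, w):
    """Sum per-SNP log densities, treating SNPs as independent (a simplification)."""
    return sum(neutral_log_density(p1, p2, w) for p1, p2 in freq_pairs)
```

Under this assumed form, a SNP with a large frequency difference between populations receives a lower neutral log density than one with matching frequencies, which is the kind of signal a sweep model would then try to explain.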
Modern humans have occupied almost all possible environments globally since exiting Africa about 100,000 years ago. Both behavioral and biological adaptations have contributed to their success in surviving the rigors of climatic extremes, including cold, strong ultraviolet radiation, and high altitude. Among these environmental stresses, high-altitude hypoxia is the only one whose effects traditional technology cannot mediate. Living on a plateau more than 3,000 m high, the Tibetan population provides a widely studied example of high-altitude adaptation. Yet the genetic mechanisms underpinning long-term survival in this environmental extreme remain unknown. We performed an analysis of genome-wide sequence variations in Tibetans. In combination with previously reported data, we identified strong signals of selective sweeps in two hypoxia-related genes, EPAS1 and EGLN1. For these two genes, Tibetans show unusually high divergence from non-Tibetan lowlanders (Han Chinese and Japanese) and possess high frequencies of many linked sequence variations, as reflected by the Tibetan-specific haplotypes. Further analysis in seven Tibetan populations (1,334 individuals) indicates that the selective sweep is prevalent across the Himalayan region. The observed indicators of natural selection on EPAS1 and EGLN1 suggest that during the long-term occupation of high-altitude areas, the functional sequence variations for acquiring biological adaptation to high-altitude hypoxia have been enriched in Tibetan populations.
Background: The coronavirus disease 2019 (COVID‐19) has been spreading rapidly in China and more than 30 other countries over the last two months. COVID‐19 has multiple characteristics distinct from other infectious diseases, including high infectivity during incubation, a time delay between the real dynamics and the daily observed number of confirmed cases, and the intervention effects of implemented quarantine and control measures. Methods: We develop a Susceptible, Un‐quarantined infected, Quarantined infected, Confirmed infected (SUQC) model to characterize the dynamics of COVID‐19 and explicitly parameterize the intervention effects of control measures, which is more suitable for analysis than other existing epidemic models. Results: The SUQC model is applied to the daily released data of the confirmed infections to analyze the outbreak of COVID‐19 in Wuhan, Hubei (excluding Wuhan), China (excluding Hubei) and four first‐tier cities of China. We found that, before January 30, 2020, all these regions except Beijing had a reproductive number R > 1, and after January 30, all regions had a reproductive number R < 1, indicating that the quarantine and control measures are effective in preventing the spread of COVID‐19. The confirmation rate of Wuhan estimated by our model is 0.0643, substantially lower than that of Hubei excluding Wuhan (0.1914), and that of China excluding Hubei (0.2189), but it jumps to 0.3229 after February 12, when clinical evidence was adopted in new diagnosis guidelines. The number of un‐quarantined infected cases in Wuhan on February 12, 2020 is estimated to be 3,509 and declines to 334 on February 21, 2020. After fitting the model with data as of February 21, 2020, we predict that the end time of COVID‐19 in Wuhan and Hubei is around late March, around mid March for China excluding Hubei, and before early March 2020 for the four tier‐one cities.
A total of 80,511 individuals are estimated to be infected in China, among which 49,510 are from Wuhan, 17,679 from Hubei (excluding Wuhan), and the remaining 13,322 from other regions of China (excluding Hubei). Note that the estimates are from a deterministic ODE model and should be interpreted with some uncertainty. Conclusions: We suggest that rigorous quarantine and control measures should be maintained until early March in Beijing, Shanghai, Guangzhou and Shenzhen, and until late March in Hubei. The model can also be useful to predict the trend of the epidemic and provide a quantitative guide for other countries at high risk of outbreak, such as South Korea, Japan, Italy and Iran.
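To make the compartment structure concrete, here is a minimal sketch of a deterministic four-compartment model with Susceptible, Un-quarantined infected, Quarantined infected and Confirmed compartments. The abstract names the compartments but does not give the rate equations, so the equations below are an assumption: new infections arise at rate alpha·U·S/N, un-quarantined cases are quarantined at rate gamma1, and quarantined cases are confirmed at rate beta. The parameter names, the forward-Euler integrator, and every numeric value in the usage example are hypothetical.

```python
def simulate_suqc(n, u0, alpha, gamma1, beta, days, steps_per_day=100):
    """Integrate an assumed S-U-Q-C compartment model with forward Euler steps.

    Assumed equations (not given in the abstract):
        dS/dt = -alpha * U * S / n
        dU/dt =  alpha * U * S / n - gamma1 * U
        dQ/dt =  gamma1 * U - beta * Q
        dC/dt =  beta * Q
    Returns one (S, U, Q, C) tuple per day, starting with day 0.
    """
    s, u, q, c = n - u0, float(u0), 0.0, 0.0
    dt = 1.0 / steps_per_day
    trajectory = [(s, u, q, c)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infected = alpha * u * s / n * dt
            new_quarantined = gamma1 * u * dt
            new_confirmed = beta * q * dt
            s -= new_infected
            u += new_infected - new_quarantined
            q += new_quarantined - new_confirmed
            c += new_confirmed
        trajectory.append((s, u, q, c))
    return trajectory


# Illustrative values only: alpha / gamma1 < 1, so un-quarantined cases decline.
controlled = simulate_suqc(n=1e7, u0=100, alpha=0.2, gamma1=0.4, beta=0.1, days=30)
```

Under these assumed equations, and while nearly everyone is still susceptible, the un-quarantined infected count grows when alpha/gamma1 exceeds 1 and shrinks when it is below 1, which mirrors the R > 1 versus R < 1 contrast reported in the abstract.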
Summary An adaptive variant of the human Ectodysplasin receptor, EDARV370A, is one of the strongest candidates of recent positive selection from genome-wide scans. We have modeled EDAR370A in mice and characterized its phenotype and evolutionary origins in humans. Our computational analysis suggests the allele arose in Central China approximately 30,000 years ago. Although EDAR370A has been associated with increased scalp hair thickness and changed tooth morphology in humans, its direct biological significance and potential adaptive role remain unclear. We generated a knock-in mouse model and find that, as in humans, hair thickness is increased in EDAR370A mice. We identify novel biological targets affected by the mutation, including mammary and eccrine glands. Building on these results, we find that EDAR370A is associated with an increased number of active eccrine glands in the Han Chinese. This interdisciplinary approach yields unique insight into the generation of adaptive variation among modern humans.
Familial Mediterranean fever (FMF) is an autoinflammatory disease caused by homozygous or compound heterozygous gain-of-function mutations in MEFV, encoding pyrin, an inflammasome protein. Heterozygous carrier frequencies for multiple MEFV mutations are high in several Mediterranean populations, suggesting that they confer selective advantage. Among 2,313 Turks, we found extended haplotype homozygosity flanking FMF-associated mutations, indicating evolutionarily recent positive selection of FMF-associated mutations. Two pathogenic pyrin variants independently arose >1,800 years ago. Mutant pyrin interacts less avidly with Yersinia pestis virulence factor YopM than wild type human pyrin, thereby attenuating YopM-induced IL-1β suppression. Relative to healthy controls, leukocytes from FMF patients harboring homozygous or compound heterozygous mutations and from asymptomatic heterozygous carriers released heightened IL-1β specifically in response to Y. pestis. Y. pestis-infected Mefv M680I/M680I FMF knock-in mice exhibited IL-1-dependent increased survival relative to wild-type knock-in mice. Thus, FMF mutations that were positively selected in Mediterranean populations confer heightened resistance to Y. pestis.
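For readers unfamiliar with extended haplotype homozygosity (EHH), the sketch below computes the standard statistic: among chromosomes carrying a given core allele, the probability that two chosen at random are identical across every site from the core to a flanking marker. The data layout (phased haplotypes as equal-length strings), the function name, and the toy data are assumptions for illustration; this is not the analysis pipeline used in the study.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, allele, marker):
    """EHH for chromosomes carrying `allele` at site index `core`.

    Haplotypes are equal-length strings of allele codes (an assumed layout).
    The statistic is sum_i C(n_i, 2) / C(n, 2), where n_i counts carriers that
    share the same alleles across all sites between `core` and `marker`.
    """
    carriers = [h for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    lo, hi = min(core, marker), max(core, marker)
    counts = Counter(h[lo:hi + 1] for h in carriers)
    return sum(comb(k, 2) for k in counts.values()) / comb(n, 2)


# Toy data: four carriers of allele "1" at site 0, one non-carrier.
toy = ["1010", "1010", "1011", "1110", "0000"]
print(ehh(toy, core=0, allele="1", marker=1))  # 0.5
```

EHH equals 1 at the core site and decays with distance as recombination and mutation break up the haplotype; unusually slow decay around a high-frequency allele is the pattern interpreted as recent positive selection.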
Summary Although southern African Khoisan populations are often assumed to have remained largely isolated during prehistory, there is growing evidence for a migration of pastoralists from eastern Africa some 2,000 years ago [1–5], prior to the arrival of Bantu-speaking populations in southern Africa. Eastern Africa harbors distinctive lactase persistence (LP) alleles [6–8], and therefore LP alleles in southern African populations may be derived from this eastern African pastoralist migration. We sequenced the lactase enhancer region in 457 individuals from 18 Khoisan and seven Bantu-speaking groups from Botswana, Namibia, and Zambia and additionally genotyped four short tandem repeat (STR) loci that flank the lactase enhancer region. We found nine single-nucleotide polymorphisms, of which the most frequent is −14010*C, which was previously found to be associated with LP in Kenya and Tanzania and to exhibit a strong signal of positive selection [8]. This allele occurs in significantly higher frequency in pastoralist groups and in Khoe-speaking groups in our study, supporting the hypothesis of a migration of eastern African pastoralists that was primarily associated with Khoe speakers [2]. Moreover, we find a signal of ongoing positive selection in all three pastoralist groups in our study, as well as (surprisingly) in two foraging groups.
A novel RNA virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, due to extensive sampling bias and the existence of infection clusters during the epidemic spread, direct applications of existing approaches can lead to biased parameter estimations and data misinterpretation. In this study, we first present a robust estimator for the time to the most recent common ancestor (TMRCA) and the mutation rate, and then apply the approach to analyze 12,909 genomic sequences of SARS-CoV-2. The mutation rate is inferred to be 8.69 × 10⁻⁴ per site per year with a 95% confidence interval (CI) of [8.61 × 10⁻⁴, 8.77 × 10⁻⁴], and the TMRCA of the samples is inferred to be Nov 28, 2019 with a 95% CI of [Oct 20, 2019, Dec 9, 2019]. The results indicate that COVID-19 might originate earlier than and outside of Wuhan Seafood Market. We further demonstrate that genetic polymorphism patterns, including the enrichment of specific haplotypes and the temporal allele frequency trajectories generated from infection clusters, are similar to those caused by evolutionary forces such as natural selection. Our results show that population genetic methods need to be developed to efficiently detangle the effects of sampling bias and infection clusters to gain insights into the evolutionary mechanism of SARS-CoV-2. Software for implementing VirusMuT can be downloaded at https://bigd.big.ac.cn/biocode/tools/BT007081.
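To show how a mutation rate and a TMRCA relate to dated sequences, the sketch below fits a simple molecular clock by ordinary least squares: mutation counts relative to a reference are regressed on sampling time, the slope divided by genome length gives a per-site rate, and the time at which the fitted line reaches zero mutations gives a TMRCA estimate. This is a generic textbook-style approach swapped in for illustration, not the robust estimator the abstract describes, and it is exactly the kind of direct method the abstract warns can be biased by sampling and infection clusters. The function name and the synthetic numbers in the test are hypothetical.

```python
def fit_clock(times, mutation_counts, genome_length):
    """Least-squares clock fit (illustrative, not the paper's estimator).

    times: sampling times in years, relative to any fixed origin.
    mutation_counts: mutations per sample relative to a reference sequence.
    Returns (rate per site per year, estimated TMRCA on the same time axis).
    """
    n = len(times)
    if n < 2 or n != len(mutation_counts):
        raise ValueError("need two or more samples and matching input lengths")
    mean_t = sum(times) / n
    mean_m = sum(mutation_counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("sampling times must not all be identical")
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, mutation_counts))
    slope = sxy / sxx  # mutations per genome per year
    if slope <= 0.0:
        raise ValueError("no positive temporal signal in the data")
    intercept = mean_m - slope * mean_t
    return slope / genome_length, -intercept / slope
```

The returned TMRCA is expressed on the same time axis as the inputs, so a negative value means the inferred ancestor predates the chosen origin.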