Thibaut Jombart scite author profile

Discriminant analysis of principal components: a new method for the analysis of genetically structured populations

Background: The dramatic progress in sequencing technologies offers unprecedented prospects for deciphering the organization of natural populations in space and time. However, the size of the datasets generated also poses some daunting challenges. In particular, Bayesian clustering algorithms based on pre-defined population genetics models, such as the STRUCTURE or BAPS software, may not be able to cope with this unprecedented amount of data. Thus, there is a need for less computer-intensive approaches. Multivariate analyses seem particularly appealing as they are specifically devoted to extracting information from large datasets. Unfortunately, currently available multivariate methods still lack some essential features needed to study the genetic structure of natural populations.

Results: We introduce the Discriminant Analysis of Principal Components (DAPC), a multivariate method designed to identify and describe clusters of genetically related individuals. When group priors are lacking, DAPC uses sequential K-means and model selection to infer genetic clusters. Our approach extracts rich information from genetic data, providing assignment of individuals to groups, a visual assessment of between-population differentiation, and the contribution of individual alleles to population structuring. We evaluate the performance of our method using simulated data, which were also analyzed using STRUCTURE as a benchmark. Additionally, we illustrate the method by analyzing microsatellite polymorphism in worldwide human populations and hemagglutinin gene sequence variation in seasonal influenza.

Conclusions: Analysis of simulated data revealed that our approach generally performs better than STRUCTURE at characterizing population subdivision. The tools implemented in DAPC for the identification of clusters and graphical representation of between-group structures make it possible to unravel complex population structures. Our approach is also faster than Bayesian clustering algorithms by several orders of magnitude, and may be applicable to a wider range of datasets.


adegenet 1.3-1: new tools for the analysis of genome-wide SNP data

Jombart, Ahmed. 2011

2,533

2,144


The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study

Prem, Liu, Russell et al. 2020. The Lancet Public Health

1,964

1,981


Background: In December, 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a novel coronavirus, emerged in Wuhan, China. Since then, the city of Wuhan has taken unprecedented measures in response to the outbreak, including extended school and workplace closures. We aimed to estimate the effects of physical distancing measures on the progression of the COVID-19 epidemic, hoping to provide some insights for the rest of the world.

Methods: To examine how changes in population mixing have affected outbreak progression in Wuhan, we used synthetic location-specific contact patterns in Wuhan and adapted these in the presence of school closures, extended workplace closures, and a reduction in mixing in the general community. Using these matrices and the latest estimates of the epidemiological parameters of the Wuhan outbreak, we simulated the ongoing trajectory of an outbreak in Wuhan using an age-structured susceptible-exposed-infected-removed (SEIR) model for several physical distancing measures. We fitted the latest estimates of epidemic parameters from a transmission model to data on local and internationally exported cases from Wuhan in an age-structured epidemic framework and investigated the age distribution of cases. We also simulated lifting of the control measures by allowing people to return to work in a phased-in way and looked at the effects of returning to work at different stages of the underlying outbreak (at the beginning of March or April).

Findings: Our projections show that physical distancing measures were most effective if the staggered return to work was at the beginning of April; this reduced the median number of infections by more than 92% (IQR 66-97) and 24% (13-90) in mid-2020 and end-2020, respectively. There are benefits to sustaining these measures until April in terms of delaying and reducing the height of the peak, median epidemic size at end-2020, and affording health-care systems more time to expand and respond. However, the modelled effects of physical distancing measures vary by the duration of infectiousness and the role school children have in the epidemic.

Interpretation: Restrictions on activities in Wuhan, if maintained until April, would probably help to delay the epidemic peak. Our projections suggest that premature and sudden lifting of interventions could lead to an earlier secondary peak, which could be flattened by relaxing the interventions gradually. However, there are limitations to our analysis, including large uncertainties around estimates of R0 and the duration of infectiousness.
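As a rough illustration of the kind of model this abstract describes, the following is a minimal sketch of a deterministic age-structured SEIR simulation in which a contact matrix is scaled down to mimic physical distancing. It is not the authors' model: the two age groups, the contact matrix, the population sizes and every rate (`beta`, `sigma`, `gamma`, `contact_scale`) are hypothetical values chosen only to make the example run.

```python
def simulate_seir(contact, pop, beta, sigma, gamma, days,
                  contact_scale=1.0, seed=10.0, dt=0.1):
    """Deterministic age-structured SEIR model integrated with forward Euler.

    contact[i][j] is the mean number of daily contacts a person in age group i
    has with people in age group j. contact_scale < 1 mimics distancing.
    Returns (total infectious per day, final (S, E, I, R) compartments).
    """
    n = len(pop)
    S = [p - seed for p in pop]   # susceptible
    E = [0.0] * n                 # exposed (infected, not yet infectious)
    I = [seed] * n                # infectious
    R = [0.0] * n                 # removed
    steps_per_day = round(1 / dt)
    infectious_by_day = []
    for _ in range(days):
        infectious_by_day.append(sum(I))
        for _ in range(steps_per_day):
            # Force of infection on each age group, from all groups' prevalence.
            lam = [beta * contact_scale *
                   sum(contact[i][j] * I[j] / pop[j] for j in range(n))
                   for i in range(n)]
            for i in range(n):
                new_exposed = lam[i] * S[i] * dt
                new_infectious = sigma * E[i] * dt
                new_removed = gamma * I[i] * dt
                S[i] -= new_exposed
                E[i] += new_exposed - new_infectious
                I[i] += new_infectious - new_removed
                R[i] += new_removed
    return infectious_by_day, (S, E, I, R)


# Hypothetical inputs: two age groups (children, adults).
pop = [300_000.0, 700_000.0]
contact = [[10.0, 3.5],
           [1.5, 5.0]]
params = dict(beta=0.05, sigma=1 / 5, gamma=1 / 5, days=365)

baseline, final_state = simulate_seir(contact, pop, **params)
distanced, _ = simulate_seir(contact, pop, contact_scale=0.5, **params)

print(f"peak infectious, no distancing:   {max(baseline):,.0f}")
print(f"peak infectious, contacts halved: {max(distanced):,.0f}")
```

Halving all contacts is a crude stand-in for the location-specific changes (school, workplace, community) described in the abstract; a closer sketch would scale separate contact matrices for each setting and vary the scaling over time.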


Pandemic Potential of a Strain of Influenza A (H1N1): Early Findings

Fraser, Donnelly, Cauchemez et al. 2009. Science

1,703

110

1,560


A novel influenza A (H1N1) virus has spread rapidly across the globe. Judging its pandemic potential is difficult with limited data, but nevertheless essential to inform appropriate health responses. By analyzing the outbreak in Mexico, early data on international spread, and viral genetic diversity, we make an early assessment of transmissibility and severity. Our estimates suggest that 23,000 (range 6000 to 32,000) individuals had been infected in Mexico by late April, giving an estimated case fatality ratio (CFR) of 0.4% (range: 0.3 to 1.8%) based on confirmed and suspected deaths reported to that time. In a community outbreak in the small community of La Gloria, Veracruz, no deaths were attributed to infection, giving an upper 95% bound on CFR of 0.6%. Thus, although substantial uncertainty remains, clinical severity appears less than that seen in the 1918 influenza pandemic but comparable with that seen in the 1957 pandemic. Clinical attack rates in children in La Gloria were twice that in adults (<15 years of age: 61%; ≥15 years: 29%). Three different epidemiological analyses gave basic reproduction number (R0) estimates in the range of 1.4 to 1.6, whereas a genetic analysis gave a central estimate of 1.2. This range of values is consistent with 14 to 73 generations of human-to-human transmission having occurred in Mexico to late April. Transmissibility is therefore substantially higher than that of seasonal flu, and comparable with lower estimates of R0 obtained from previous influenza pandemics.
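The upper bound on CFR from an outbreak with no attributed deaths can be illustrated with a standard zero-event binomial bound. This is a sketch under a simple assumption (independent cases with a common fatality probability) and is not necessarily the calculation the authors used. The abstract does not give the number of cases in La Gloria, so the case count of 500 below is hypothetical.

```python
def cfr_upper_bound_zero_deaths(n_cases, confidence=0.95):
    """One-sided upper confidence bound on the case fatality ratio
    when zero deaths are observed among n_cases cases.

    Solves (1 - p) ** n_cases = 1 - confidence for p.
    """
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    alpha = 1 - confidence
    return 1 - alpha ** (1 / n_cases)


# Hypothetical case count, for illustration only.
bound = cfr_upper_bound_zero_deaths(500)
print(f"upper 95% bound on CFR with 0 deaths in 500 cases: {bound:.2%}")
```

The bound tightens as the number of observed cases grows, which is why a death-free community outbreak can still constrain severity.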


Effects of non-pharmaceutical interventions on COVID-19 cases, deaths, and demand for hospital services in the UK: a modelling study

Davies, Kucharski, Eggo et al. 2020. The Lancet Public Health

807

865


Background: Non-pharmaceutical interventions have been implemented to reduce transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the UK. Projecting the size of an unmitigated epidemic and the potential effect of different control measures has been crucial to support evidence-based policy making during the early stages of the epidemic. This study assesses the potential impact of different control measures for mitigating the burden of COVID-19 in the UK.

Methods: We used a stochastic age-structured transmission model to explore a range of intervention scenarios, tracking 66.4 million people aggregated to 186 county-level administrative units in England, Wales, Scotland, and Northern Ireland. The four base interventions modelled were school closures, physical distancing, shielding of people aged 70 years or older, and self-isolation of symptomatic cases. We also modelled the combination of these interventions, as well as a programme of intensive interventions with phased lockdown-type restrictions that substantially limited contacts outside of the home for repeated periods. We simulated different triggers for the introduction of interventions, and estimated the impact of varying adherence to interventions across counties. For each scenario, we projected estimated new cases over time, patients requiring inpatient and critical care (ie, admission to the intensive care units [ICU]) treatment, and deaths, and compared the effect of each intervention on the basic reproduction number, R0.

Findings: We projected a median unmitigated burden of 23 million (95% prediction interval 13-30) clinical cases and 350 000 deaths (170 000-480 000) due to COVID-19 in the UK by December, 2021. We found that the four base interventions were each likely to decrease R0, but not sufficiently to prevent ICU demand from exceeding health service capacity. The combined intervention was more effective at reducing R0, but only lockdown periods were sufficient to bring R0 near or below 1; the most stringent lockdown scenario resulted in a projected 120 000 cases (46 000-700 000) and 50 000 deaths (9300-160 000). Intensive interventions with lockdown periods would need to be in place for a large proportion of the coming year to prevent health-care demand exceeding availability.

Interpretation: The characteristics of SARS-CoV-2 mean that extreme measures are probably required to bring the epidemic under control and to prevent very large numbers of deaths and an excess of demand on hospital beds, especially those in ICUs.

Funding: Medical Research Council.


Global, regional, and national estimates of the population at increased risk of severe COVID-19 due to underlying health conditions in 2020: a modelling study

Clark, Jit, Warren‐Gash et al. 2020. The Lancet Global Health

885

827


Background: The risk of severe COVID-19 if an individual becomes infected is known to be higher in older individuals and those with underlying health conditions. Understanding the number of individuals at increased risk of severe COVID-19 and how this varies between countries should inform the design of possible strategies to shield or vaccinate those at highest risk.

Methods: We estimated the number of individuals at increased risk of severe disease (defined as those with at least one condition listed as "at increased risk of severe COVID-19" in current guidelines) by age (5-year age groups), sex, and country for 188 countries using prevalence data from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017 and UN population estimates for 2020. The list of underlying conditions relevant to COVID-19 was determined by mapping the conditions listed in GBD 2017 to those listed in guidelines published by WHO and public health agencies in the UK and the USA. We analysed data from two large multimorbidity studies to determine appropriate adjustment factors for clustering and multimorbidity. To help interpretation of the degree of risk among those at increased risk, we also estimated the number of individuals at high risk (defined as those that would require hospital admission if infected) using age-specific infection-hospitalisation ratios for COVID-19 estimated for mainland China and making adjustments to reflect country-specific differences in the prevalence of underlying conditions and frailty. We assumed males were twice as likely as females to be at high risk. We also calculated the number of individuals without an underlying condition that could be considered at increased risk because of their age, using minimum ages from 50 to 70 years. We generated uncertainty intervals (UIs) for our estimates by running low and high scenarios using the lower and upper 95% confidence limits for country population size, disease prevalences, multimorbidity fractions, and infection-hospitalisation ratios, and plausible low and high estimates for the degree of clustering, informed by multimorbidity studies.

Findings: We estimated that 1.7 billion (UI 1.0-2.4) people, comprising 22% (UI 15-28) of the global population, have at least one underlying condition that puts them at increased risk of severe COVID-19 if infected (ranging from <5% of those younger than 20 years to >66% of those aged 70 years or older). We estimated that 349 million (186-787) people (4% [3-9] of the global population) are at high risk of severe COVID-19 and would require hospital admission if infected (ranging from <1% of those younger than 20 years to approximately 20% of those aged 70 years or older). We estimated 6% (3-12) of males to be at high risk compared with 3% (2-7) of females. The share of the population at increased risk was highest in countries with older populations, African countries with high HIV/AIDS prevalence, and small island nations with high diabetes prevalence. Estimates of the number of individuals at increased risk wer...


How to measure and test phylogenetic signal

et al. 2012


Summary

1. Phylogenetic signal is the tendency of related species to resemble each other more than species drawn at random from the same tree. This pattern is of considerable interest in a range of ecological and evolutionary research areas, and various indices have been proposed for quantifying it. Unfortunately, these indices often lead to contrasting results, and guidelines for choosing the most appropriate index are lacking.

2. Here, we compare the performance of four commonly used indices using simulated data. Data were generated with numerical simulations of trait evolution along phylogenetic trees under a variety of evolutionary models. We investigated the sensitivity of the approaches to the size of phylogenies, the resolution of tree structure and the availability of branch length information, examining both the response of the selected indices and the power of the associated statistical tests.

3. We found that under a Brownian motion (BM) model of trait evolution, Abouheif's Cmean and Pagel's λ performed well and substantially better than Moran's I and Blomberg's K. Pagel's λ provided a reliable effect size measure and performed better for discriminating between more complex models of trait evolution, but was computationally more demanding than Abouheif's Cmean. Blomberg's K was most suitable to capture the effects of changing evolutionary rates in simulation experiments.

4. Interestingly, sample size influenced not only the uncertainty but also the expected values of most indices, while polytomies and missing branch length information had only negligible impacts.

5. We propose guidelines for choosing among indices, depending on (a) their sensitivity to true underlying patterns of phylogenetic signal, (b) whether a test or a quantitative measure is required and (c) their sensitivities to different topologies of phylogenies.

6. These guidelines aim to better assess phylogenetic signal and distinguish it from random trait distributions. They were developed under the assumption of BM, and additional simulations with more complex trait evolution models show that they are to a certain degree generalizable. They are particularly useful in comparative analyses, when requiring a proxy for niche similarity, and in conservation studies that explore phylogenetic loss associated with extinction risks of specific clades.
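To make one of the compared indices concrete, here is a minimal sketch of Moran's I applied to trait values at the tips of a tree. The formula is the standard autocorrelation index; the proximity matrix is an assumption made for illustration (weight 1 for two tips in the same clade, 0 otherwise) and is not the weighting used in the paper, and the trait values and clade labels are invented.

```python
def morans_i(traits, weights):
    """Moran's I for trait values given a symmetric proximity matrix.

    I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the trait mean and S0 is the sum of all off-diagonal weights.
    """
    n = len(traits)
    mean = sum(traits) / n
    dev = [x - mean for x in traits]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical tree with two clades of three tips each.
clades = ["A", "A", "A", "B", "B", "B"]
weights = [[1.0 if i != j and clades[i] == clades[j] else 0.0
            for j in range(len(clades))]
           for i in range(len(clades))]

# Related tips share similar trait values: strong phylogenetic signal.
clustered = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
# The same values scattered across clades: no positive signal.
mixed = [1.0, 5.0, 1.2, 5.3, 0.9, 4.8]

print(f"Moran's I, clustered traits: {morans_i(clustered, weights):.3f}")
print(f"Moran's I, mixed traits:     {morans_i(mixed, weights):.3f}")
```

A significance test could be built on top of this by recomputing the index after randomly permuting trait values across tips, which corresponds to the "species drawn at random from the same tree" comparison in point 1.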


Thibaut Jombart

adegenet: a R package for the multivariate analysis of genetic markers

Discriminant analysis of principal components: a new method for the analysis of genetically structured populations
