Genome-wide association studies (GWAS) have become indispensable in human medicine and genomics, but very few have been carried out on bacteria. Here we introduce Scoary, an ultra-fast, easy-to-use, and widely applicable software tool that scores the components of the pan-genome for associations to observed phenotypic traits while accounting for population stratification, with minimal assumptions about evolutionary processes. We call our approach pan-GWAS to distinguish it from traditional, single nucleotide polymorphism (SNP)-based GWAS. Scoary is implemented in Python and is available under an open source GPLv3 license at https://github.com/AdmiralenOla/Scoary. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-1108-8) contains supplementary material, which is available to authorized users.
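To make the idea of scoring pan-genome components against a trait concrete, here is a minimal sketch. It is not Scoary's code: the abstract does not name the statistic, so a two-sided Fisher's exact test on a 2x2 gene-by-trait table is assumed, and the correction for population stratification that the abstract mentions is left out entirely. The function names, the input layout (dictionaries keyed by isolate ID), and the toy data are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table
    [[a, b], [c, d]], where a = gene present and trait positive,
    b = gene present and trait negative, c = gene absent and trait
    positive, d = gene absent and trait negative."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def score_genes(presence, trait):
    """Rank genes by naive association with a binary trait.
    presence: {gene: {isolate: 0 or 1}} (assumed layout)
    trait:    {isolate: 0 or 1}        (assumed layout)
    Returns (gene, p_value) pairs, most strongly associated first.
    No stratification correction is applied in this sketch."""
    results = []
    for gene, calls in presence.items():
        a = sum(1 for s, t in trait.items() if calls[s] and t)
        b = sum(1 for s, t in trait.items() if calls[s] and not t)
        c = sum(1 for s, t in trait.items() if not calls[s] and t)
        d = sum(1 for s, t in trait.items() if not calls[s] and not t)
        results.append((gene, fisher_exact_two_sided(a, b, c, d)))
    return sorted(results, key=lambda pair: pair[1])


# Hypothetical toy data: six isolates, two genes.
trait = {"s1": 1, "s2": 1, "s3": 1, "s4": 0, "s5": 0, "s6": 0}
presence = {
    "geneA": {"s1": 1, "s2": 1, "s3": 1, "s4": 0, "s5": 0, "s6": 0},
    "geneB": {"s1": 1, "s2": 0, "s3": 1, "s4": 1, "s5": 0, "s6": 1},
}
ranking = score_genes(presence, trait)
```

In this toy input, geneA tracks the trait perfectly and so ranks first. A naive test like this treats isolates as independent, which clonal bacterial populations violate; that is the stratification problem the abstract says the tool accounts for.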
Repeated emergence, not international dissemination, is behind the rise of multidrug-resistant lineage 4 tuberculosis.
RNA viruses are abundant infectious agents and present in all domains of life. Arthropods, including ticks, are well known as vectors of many viruses of concern for human and animal health. Despite their obvious importance, the extent and structure of viral diversity in ticks are still poorly understood, particularly in Europe. Using a bulk RNA-sequencing approach that captures the complete transcriptome, we analysed the virome of the most common tick in Europe – Ixodes ricinus. In total, RNA sequencing was performed on six libraries consisting of 33 I. ricinus nymphs and adults sampled in Norway. Despite the small number of animals surveyed, our virus identification pipeline revealed nine diverse and novel viral species, phylogenetically positioned within four different viral groups – bunyaviruses, luteoviruses, mononegavirales and partitiviruses – and sometimes characterized by extensive genetic diversity including a potentially novel genus of bunyaviruses. This work sheds new light on the virus diversity in I. ricinus, expands our knowledge of potential host/vector-associations and tick-transmitted viruses within several viral groups, and pushes the latitudinal limit at which tick-associated viruses are likely to be found. Notably, our phylogenetic analysis revealed the presence of tick-specific virus clades that span multiple continents, highlighting the role of ticks as important virus reservoirs.
Hospitals worldwide are facing an increasing incidence of hard-to-treat infections. Limiting infections and providing patients with optimal drug regimens require timely strain identification as well as virulence and drug-resistance profiling. Additionally, prophylactic interventions based on the identification of environmental sources of recurrent infections (e.g., contaminated sinks) and reconstruction of transmission chains (i.e., who infected whom) could help to reduce the incidence of nosocomial infections. Whole-genome sequencing (WGS) could hold the key to solving these issues. However, uptake in the clinic has been slow. Some major scientific and logistical challenges need to be solved before WGS fulfils its potential in clinical microbial diagnostics. In this review we identify major bottlenecks that need to be resolved for WGS to routinely inform clinical intervention and discuss possible solutions.
Background: The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. Results: We found that GC content in coding regions is significantly higher in core genomes than in accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. Conclusion: The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes are most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes a signature of selection or of selectively neutral processes. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3543-7) contains supplementary material, which is available to authorized users.
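The two quantities compared in this abstract can be sketched in a few lines. What follows is an illustration under stated assumptions, not the authors' pipeline: relative entropy is taken to be the Kullback-Leibler divergence (in bits) between observed trinucleotide frequencies and the frequencies expected from the product of mononucleotide frequencies, trinucleotides are counted in overlapping windows, and any window containing a base other than A, C, G, or T is skipped. The function names are hypothetical.

```python
from collections import Counter
from math import log2

BASES = set("ACGT")


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in BASES)
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def trinucleotide_relative_entropy(seq):
    """Kullback-Leibler divergence (bits) between observed
    trinucleotide frequencies and those expected if the three
    positions were independent draws from the mononucleotide
    frequencies of the same sequence (an assumed definition)."""
    seq = seq.upper()
    mono = Counter(base for base in seq if base in BASES)
    mono_total = sum(mono.values())
    # Overlapping windows; windows with ambiguous bases are skipped.
    tri = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= BASES
    )
    tri_total = sum(tri.values())
    if tri_total == 0:
        return 0.0
    entropy = 0.0
    for word, count in tri.items():
        observed = count / tri_total
        expected = 1.0
        for base in word:
            expected *= mono[base] / mono_total
        entropy += observed * log2(observed / expected)
    return entropy


# Hypothetical toy sequences, not real coding regions.
uniform = trinucleotide_relative_entropy("AAAAAA")
structured = trinucleotide_relative_entropy("ACGTACGTACGT")
```

A sequence whose trinucleotides are fully explained by its base composition scores zero, and any additional ordering of bases pushes the value above zero. Higher values in core genomes are what the abstract reads as consistent with stronger selective constraint.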
Renibacterium salmoninarum is the causative agent of bacterial kidney disease, a major pathogen of salmonid fish species worldwide. Very low levels of intra-species genetic diversity have hampered efforts to understand the transmission dynamics and recent evolutionary history of this Gram-positive bacterium. We exploited recent advances in next-generation sequencing technology to generate genome-wide single-nucleotide polymorphism (SNP) data from 68 diverse R. salmoninarum isolates representing broad geographical and temporal ranges and different host species. Phylogenetic analysis robustly delineated two lineages (lineage 1 and lineage 2); furthermore, dating analysis estimated that the time to the most recent ancestor of all the isolates is 1239 years ago (95% credible interval (CI) 444–2720 years ago). Our data reveal the intercontinental spread of lineage 1 over the last century, concurrent with anthropogenic movement of live fish, feed and ova for aquaculture purposes and stocking of recreational fisheries, whilst lineage 2 appears to have been endemic in wild Eastern Atlantic salmonid stocks before commercial activity. The high resolution of the SNP-based analyses allowed us to separate closely related isolates linked to neighboring fish farms, indicating that they formed part of single outbreaks. We were able to demonstrate that the main lineage 1 subgroup of R. salmoninarum isolated from Norway and the UK likely represents an introduction to these areas ∼40 years ago. This study demonstrates the promise of this technology for analysis of micro and medium scale evolutionary relationships in veterinary and environmental microorganisms, as well as human pathogens.
The "Beijing" Mycobacterium tuberculosis (Mtb) lineage 2 (L2) is spreading globally and has been associated with accelerated disease progression and increased antibiotic resistance. Here we performed a phylodynamic reconstruction of one of the L2 sublineages, the central Asian clade (CAC), which has recently spread to western Europe. We find that recent historical events have contributed to the evolution and dispersal of the CAC. Our timing estimates indicate that the clade was likely introduced to Afghanistan during the 1979-1989 Soviet-Afghan war and spread further after population displacement in the wake of the American invasion in 2001. We also find that drug resistance mutations accumulated on a massive scale in Mtb isolates from former Soviet republics after the fall of the Soviet Union, a pattern that was not observed in CAC isolates from Afghanistan. Our results underscore the detrimental effects of political instability and population displacement on tuberculosis control and demonstrate the power of phylodynamic methods in exploring bacterial evolution in space and time.