Enterococcus faecium is a gut commensal of humans and animals but is also listed on the WHO global priority list of multidrug-resistant pathogens. Many of its antibiotic resistance traits reside on plasmids and have the potential to be disseminated by horizontal gene transfer. Here, we present the first comprehensive population-wide analysis of the pan-plasmidome of a clinically important bacterium, by whole-genome sequence analysis of 1,644 isolates from hospital, commensal, and animal sources of E. faecium. Long-read sequencing on a selection of isolates resulted in the completion of 305 plasmids that exhibited high levels of sequence modularity. We further investigated the entirety of all plasmids of each isolate (plasmidome) using a combination of short-read sequencing and machine-learning classifiers. Clustering of the plasmid sequences unraveled different E. faecium populations with a clear association with hospitalized patient isolates, suggesting different optimal configurations of plasmids in the hospital environment. The characterization of these populations allowed us to identify common mechanisms of plasmid stabilization such as toxin-antitoxin systems and genes exclusively present in particular plasmidome populations exemplified by copper resistance, phosphotransferase systems, or bacteriocin genes potentially involved in niche adaptation. Based on the distribution of k-mer distances between isolates, we concluded that plasmidomes rather than chromosomes are most informative for source specificity of E. faecium. IMPORTANCE Enterococcus faecium is one of the most frequent nosocomial pathogens of hospital-acquired infections. E. faecium has gained resistance against most commonly available antibiotics, most notably, against ampicillin, gentamicin, and vancomycin, which renders infections difficult to treat. Many antibiotic resistance traits, in particular, vancomycin resistance, can be encoded in autonomous and extrachromosomal elements called plasmids. These sequences can be disseminated to other isolates by horizontal gene transfer and confer novel mechanisms to source specificity. In our study, we elucidated the total plasmid content, referred to as the plasmidome, of 1,644 E. faecium isolates by using short- and long-read whole-genome technologies with the combination of a machine-learning classifier. This was fundamental to investigate the full collection of plasmid sequences present in our collection (pan-plasmidome) and to observe the potential transfer of plasmid sequences between E. faecium hosts. We observed that E. faecium isolates from hospitalized patients carried a larger number of plasmid sequences compared to that from other sources, and they elucidated different configurations of plasmidome populations in the hospital environment. We assessed the contribution of different genomic components and observed that plasmid sequences have the highest contribution to source specificity. Our study suggests that E. faecium plasmids are regulated by complex ecological constraints rather than physical interaction between hosts.
Some of the most common infectious diseases are caused by bacteria that naturally colonise humans asymptomatically. Combating these opportunistic pathogens requires an understanding of the traits that differentiate infecting strains from harmless relatives. Staphylococcus epidermidis is carried asymptomatically on the skin and mucous membranes of virtually all humans but is a major cause of nosocomial infection associated with invasive procedures. Here we address the underlying evolutionary mechanisms of opportunistic pathogenicity by combining pangenome-wide association studies and laboratory microbiology to compare S. epidermidis from bloodstream and wound infections and asymptomatic carriage. We identify 61 genes containing infection-associated genetic elements (k-mers) that correlate with in vitro variation in known pathogenicity traits (biofilm formation, cell toxicity, interleukin-8 production, methicillin resistance). Horizontal gene transfer spreads these elements, allowing divergent clones to cause infection. Finally, Random Forest model prediction of disease status (carriage vs. infection) identifies pathogenicity elements in 415 S. epidermidis isolates with 80% accuracy, demonstrating the potential for identifying risk genotypes pre-operatively.
The sequences reported in this paper have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (see URLs) with the Bioproject accession number PRJNA434389 and the NCBI Gene Expression Omnibus (GEO) under accession number GSE113058. ETHICS STATEMENTAll mouse studies were performed in accordance with a protocol (AUP-0318-0016) approved by the Institutional Animal Care and Use Committee at Houston Methodist Research Institute. All studies with human blood and blood components were performed in accordance with a protocol (01-I-N055) approved by the Institutional Review Board for human subjects, National Institute of Allergy and Infectious Diseases. All study volunteers gave written informed consent.
Chickens are the most common birds on Earth and colibacillosis is among the most common diseases affecting them. This major threat to animal welfare and safe sustainable food production is difficult to combat because the etiological agent, avian pathogenic Escherichia coli (APEC), emerges from ubiquitous commensal gut bacteria, with no single virulence gene present in all disease-causing isolates. Here, we address the underlying evolutionary mechanisms of extraintestinal spread and systemic infection in poultry. Combining population scale comparative genomics and pangenome-wide association studies, we compare E. coli from commensal carriage and systemic infections. We identify phylogroup-specific and species-wide genetic elements that are enriched in APEC, including pathogenicity-associated variation in 143 genes that have diverse functions, including genes involved in metabolism, lipopolysaccharide synthesis, heat shock response, antimicrobial resistance and toxicity. We find that horizontal gene transfer spreads pathogenicity elements, allowing divergent clones to cause infection. Finally, a Random Forest model prediction of disease status (carriage vs. disease) identifies pathogenic strains in the emergent ST-117 poultry-associated lineage with 73% accuracy, demonstrating the potential for early identification of emergent APEC in healthy flocks.
Theory of graphical models has matured over more than three decades to provide the backbone for several classes of models that are used in a myriad of applications such as genetic mapping of diseases, credit risk evaluation, reliability and computer security, etc. Despite of their generic applicability and wide adoptance, the constraints imposed by undirected graphical models and Bayesian networks have also been recognized to be unnecessarily stringent under certain circumstances. This observation has led to the proposal of several generalizations that aim at more relaxed constraints by which the models can impose local or context-specific dependence structures. Here we consider an additional class of such models, termed as stratified graphical models. We develop a method for Bayesian learning of these models by deriving an analytical expression for the marginal likelihood of data under a specific subclass of decomposable stratified models. A non-reversible Markov chain Monte Carlo approach is further used to identify models that are highly supported by the posterior distribution over the model space. Our method is illustrated and compared with ordinary graphical models through application to several real and synthetic datasets.
Adaptive immune receptor repertoires (AIRR) are key targets for biomedical research as they record past and ongoing adaptive immune responses. The capacity of machine learning (ML) to identify complex discriminative sequence patterns renders it an ideal approach for AIRR-based diagnostic and therapeutic discovery. To date, widespread adoption of AIRR ML has been inhibited by a lack of reproducibility, transparency, and interoperability. immuneML ( immuneml.uio.no ) addresses these concerns by implementing each step of the AIRR ML process in an extensible, open-source software ecosystem that is based on fully specified and shareable workflows. To facilitate widespread user adoption, immuneML is available as a command-line tool and through an intuitive Galaxy web interface, and extensive documentation of workflows is provided. We demonstrate the broad applicability of immuneML by (i) reproducing a large-scale study on immune state prediction, (ii) developing, integrating, and applying a novel method for antigen specificity prediction, and (iii) showcasing streamlined interpretability-focused benchmarking of AIRR ML. 1.
The Genetics of Sexuality and Aggression (GSA) project was launched at the Abo Akademi University in Turku, Finland in 2005 and has so far undertaken two major population-based data collections involving twins and siblings of twins. To date, it consists of about 14,000 individuals (including 1,147 informative monozygotic twin pairs, 1,042 informative same-sex dizygotic twin pairs, 741 informative opposite-sex dizygotic twin pairs). Participants have been recruited through the Central Population Registry of Finland and were 18-49 years of age at the time of the data collections. Saliva samples for DNA genotyping (n = 4,278) and testosterone analyses (n = 1,168) were collected in 2006. The primary focus of the data collections has been on sexuality (both sexual functioning and sexual behavior) and aggressive behavior. This paper provides an overview of the data collections as well as an outline of the phenotypes and biological data assembled within the project. A detailed overview of publications can be found at the project's Web site: http://www.cebg.fi/.Keywords: aggression, environmental, genetic, population based, sexuality, twin studyThe Genetics of Sexuality and Aggression (GSA) project in Finland has completed two major data collections since its inception in 2005. The major research goals of the GSA project are to conduct investigations with genetically sensitive designs, in large, population-based samples, on phenotypes related to sexuality and aggression. What makes this data bank unique is its focus on sexuality and aggression-related phenotypes that have rarely been studied in population-based samples. For several of these phenotypes, the GSA samples are the largest genetically informative samples in the world. Most of our research has focused on sexuality and gender identity-related phenotypes, such as sexual function and dysfunction, sexual behavior and its variations, and body image. A number of relevant psychiatric phenotypes have also been measured (e.g., behaviors and attitudes related to eating, and psychopathology such as anxiety and depression). Furthermore, the research site has a long history of conducting aggression research with genetically informative designs (in both mice and humans), which is now continued within the framework of the GSA project. For example, a major research goal of the GSA project is to examine genetic effects on aggressive behavior, including identifying genes that moderate the association between alcohol use and aggressive behavior.The project is conducted at the Department of Psychology and Logopedics at the Abo Akademi University in Turku, Finland. From the second data collection (carried out in 2006) onwards, collaborations with the Department of Pharmacology at the Sahlgrenska Academy, University of Gothenburg, Sweden, have been undertaken in order to collect biological markers. For a subset of the second data collection, genotype data is available, as well as data on free testosterone (T) levels, both collected by means of non-invasive saliva sampling. For the ...
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.