Abstract: We report the frequent, convergent loss of two genes encoding the substrate-binding protein and the ATP-binding protein of an ATP-binding cassette (ABC) transporter from the genomes of unrelated Clostridioides difficile strains. This specific genomic deletion was strongly associated with the reduced uptake of tyrosine and phenylalanine and production of derived Stickland fermentation products, including p-cresol, suggesting that the affected ABC transporter had been responsible for the import of aromatic amino…
“…For Illumina sequencing, genomic DNA was extracted from bacterial isolates by using the DNeasy Blood and Tissue kit (Qiagen), and libraries were prepared as described previously [46] and sequenced on an Illumina NextSeq 500 machine using a Mid-Output kit (Illumina) with 300 cycles. For generating complete genome sequences, we applied SMRT long-read sequencing on an RSII instrument (Pacific Biosciences) in combination with Illumina sequencing as reported previously [46]. All genome sequencing data were submitted to the European Nucleotide Archive (www.ebi.ac.uk/ena) under study numbers PRJEB33768, PRJEB33779 and PRJEB33780.…”
Section: Methods (mentioning)
confidence: 99%
“…Sequencing reads were mapped to the reference genome sequence from C. difficile strain R20291 (sequence accession number FN545816) by using BWA-MEM and sequence variation was detected by applying VarScan2 as reported previously [46]. Sequence variation likely generated by recombination was detected through analysis with ClonalFrameML [47] and removed prior to determination of pairwise sequence distances [15] and to construction of maximum-likelihood phylogenetic trees with RAxML (version 8.2.9) [48].…”
Clostridioides difficile is the primary infectious cause of antibiotic-associated diarrhea. Local transmissions and international outbreaks of this pathogen have been previously elucidated by bacterial whole-genome sequencing, but comparative genomic analyses at the global scale were hampered by the lack of specific bioinformatic tools. Here we introduce a publicly accessible database within EnteroBase (http://enterobase.warwick.ac.uk) that automatically retrieves and assembles C. difficile short-reads from the public domain, and calls alleles for core-genome multilocus sequence typing (cgMLST). We demonstrate that comparable levels of resolution and precision are attained by EnteroBase cgMLST and single-nucleotide polymorphism analysis. EnteroBase currently contains 18 254 quality-controlled C. difficile genomes, which have been assigned to hierarchical sets of single-linkage clusters by cgMLST distances. This hierarchical clustering is used to identify and name populations of C. difficile at all epidemiological levels, from recent transmission chains through to epidemic and endemic strains. Moreover, it puts newly collected isolates into phylogenetic and epidemiological context by identifying related strains among all previously published genome data. For example, HC2 clusters (i.e. chains of genomes with pairwise distances of up to two cgMLST alleles) were statistically associated with specific hospitals (P<10⁻⁴) or single wards (P=0.01) within hospitals, indicating they represented local transmission clusters. We also detected several HC2 clusters spanning more than one hospital that by retrospective epidemiological analysis were confirmed to be associated with inter-hospital patient transfers. In contrast, clustering at level HC150 correlated with k-mer-based classification and was largely compatible with PCR ribotyping, thus enabling comparisons to earlier surveillance data. EnteroBase enables contextual interpretation of a growing collection of assembled, quality-controlled C. difficile genome sequences and their associated metadata. Hierarchical clustering rapidly identifies database entries that are related at multiple levels of genetic distance, facilitating communication among researchers, clinicians and public-health officials who are combatting disease caused by C. difficile.
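The abstract above describes HC2 clusters as chains of genomes linked by pairwise distances of up to two cgMLST alleles, which is single-linkage clustering at a fixed threshold. The following is a minimal sketch of that idea, not EnteroBase's implementation. The function names, the toy six-locus profiles and the choice to skip loci with missing calls are all assumptions made for illustration.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci whose allele calls differ.

    Loci with a missing call (None) in either profile are skipped;
    this treatment of missing data is an assumption of the sketch.
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage_clusters(profiles, threshold):
    """Group genomes so that any two members are connected by a chain
    of pairwise links, each with distance <= threshold."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if allele_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical allele profiles (one integer allele number per locus).
profiles = {
    "g1": [1, 1, 1, 1, 1, 1],
    "g2": [1, 1, 1, 1, 2, 2],      # 2 alleles from g1
    "g3": [1, 1, 2, 2, 2, 2],      # 2 alleles from g2, 4 from g1
    "g4": [3, 3, 3, 3, 3, None],   # 5 alleles from every other genome
}

hc2 = single_linkage_clusters(profiles, 2)
print(hc2)
```

In this toy data g1 and g3 differ at four loci, yet they fall into the same HC2 cluster because g2 links them. This chaining is what distinguishes single-linkage clusters from groups defined by a maximum pairwise distance. Re-running with a larger threshold gives a coarser level of the hierarchy.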
Cyanobacteria are dominant primary producers of various ecosystems and they colonize marine as well as freshwater and terrestrial habitats. On the basis of their oxygenic photosynthesis they are known to synthesize a high number of secondary metabolites, which makes them promising for biotechnological applications. State-of-the-art sequencing and analytical techniques and the availability of several axenic strains offer new opportunities for the understanding of the hidden metabolic potential of cyanobacteria beyond those of single model organisms. Here, we report comprehensive genomic and metabolic analyses of five non-marine cyanobacteria, that is, Nostoc sp. DSM 107007, Anabaena variabilis DSM 107003, Calothrix desertica DSM 106972, Chroococcidiopsis cubana DSM 107010, Chlorogloeopsis sp. PCC 6912, and the reference strain Synechocystis sp. PCC 6803. The five strains, which predominantly belong to the order Nostocales, represent the phylogenetic depth of clade B1, a morphologically highly diverse sister lineage of clade B2 that includes strain PCC 6803. Genome sequencing, light and scanning electron microscopy revealed the characteristics and axenicity of the analyzed strains. Phylogenetic comparisons showed the limits of the 16S rRNA gene for the classification of cyanobacteria, but documented the applicability of a multilocus sequence alignment analysis based on 43 conserved protein markers. The analysis of metabolites of the core carbon metabolism showed parts of highly conserved metabolic pathways as well as lineage-specific pathways such as the glyoxylate shunt, which was acquired by cyanobacteria at least twice via horizontal gene transfer. Major metabolic changes were observed when we compared day and night samples. Furthermore, our results showed the metabolic potential of cyanobacteria beyond the model organism Synechocystis sp. PCC 6803 and may encourage the cyanobacterial community to broaden their research to related organisms with higher metabolic activity in the desired pathways.
“…Recent studies also suggest that CD630_08760 may function as a tyrosine transporter per its homology to the CodY-regulated neighbor gene, CD630_08730 (36). Furthermore, Steglich et al. (37) observed decreases in tyrosine uptake and Stickland fermentation in clinical isolates lacking CD630_08760 and CD630_08780.…”
Section: Assignment Of Putative Functions To Genes In EGRIN Modules (mentioning)
Though Clostridioides difficile is among the most studied anaerobes, we know little about the systems level interplay of metabolism and regulation that underlies its ability to negotiate complex immune and commensal interactions while colonizing the human gut. We have compiled publicly available resources, generated through decades of work by the research community, into two models and a portal to support comprehensive systems analysis of C. difficile. First, by compiling a compendium of 148 transcriptomes from 11 studies we have generated an Environment and Gene Regulatory Influence Network (EGRIN) model that organizes 90% of all genes in the C. difficile genome into 297 high quality modules based on evidence for their conditional co-regulation by at least 120 transcription factors. EGRIN predictions, validated with independently-generated datasets, have recapitulated previously characterized C. difficile regulons of key transcriptional regulators, refined and extended membership of genes within regulons, and implicated new genes for sporulation, carbohydrate transport and metabolism. Findings further predict pathogen behaviors in in vivo colonization, and interactions with beneficial and detrimental commensals. Second, by advancing a constraints-based metabolic model, we have discovered that 15 amino acids, diverse carbohydrates, and 24 genes across glyoxylate, Wood-Ljungdahl, nucleotide, amino acid, and carbohydrate metabolism are essential to support growth of C. difficile within an intestinal environment. Models and supporting resources are accessible through an interactive web portal (http://networks.systemsbiology.net/cdiff-portal/) to support collaborative systems analyses of C. difficile.