Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.
Background: Non-ribosomal peptide synthetases (NRPSs) are large multimodular enzymes that synthesize a wide range of biologically active natural peptide compounds, of which many are pharmacologically important. Peptide bond formation is catalyzed by the Condensation (C) domain. Various functional subtypes of the C domain exist: an LCL domain catalyzes a peptide bond between two L-amino acids, a DCL domain links an L-amino acid to a growing peptide ending with a D-amino acid, a Starter C domain (first denominated and classified as a separate subtype here) acylates the first amino acid with a β-hydroxy-carboxylic acid (typically a β-hydroxyl fatty acid), and Heterocyclization (Cyc) domains catalyze both peptide bond formation and subsequent cyclization of cysteine, serine or threonine residues. The homologous Epimerization (E) domain flips the chirality of the last amino acid in the growing peptide; Dual E/C domains catalyze both epimerization and condensation.
Results: In this paper, we report on the reconstruction of the phylogenetic relationship of NRPS C domain subtypes and analyze in detail the sequence motifs of recently discovered subtypes (Dual E/C, DCL and Starter domains) and their characteristic sequence differences, mutually and in comparison with LCL domains. Based on their phylogeny and the comparison of their sequence motifs, LCL and Starter domains appear to be more closely related to each other than to other subtypes, though pronounced differences in some segments of the protein account for the unequal donor substrates (amino vs. β-hydroxy-carboxylic acid). Furthermore, on the basis of phylogeny and the comparison of sequence motifs, we conclude that Dual E/C and DCL domains share a common ancestor. In the same way, the evolutionary origin of a C domain of unknown function in glycopeptide (GP) NRPSs can be determined to be an LCL domain. In the case of two GP C domains which are most similar to DCL but which have LCL activity, we postulate convergent evolution.
Conclusion: We systematize all C domain subtypes including the novel Starter C domain. With our results, it will be easier to determine the subtype of unknown C domains, as we provide profile Hidden Markov Models (pHMMs) for the sequence motifs as well as for the entire sequences. The specificity-conferring positions we determined will be helpful for the mutation of one subtype into another, e.g. turning DCL into LCL, which can be a useful step for obtaining novel products.
We present a new support vector machine (SVM)-based approach to predict the substrate specificity of subtypes of a given protein sequence family. We demonstrate the usefulness of this method on the example of aryl acid-activating and amino acid-activating adenylation domains (A domains) of nonribosomal peptide synthetases (NRPS). The residues of gramicidin synthetase A that lie within 8 Å of the substrate amino acid, and the corresponding positions of other adenylation domain sequences with 397 known and unknown specificities, were extracted, and this physico-chemical fingerprint was encoded as normalized real-valued feature vectors based on the physico-chemical properties of the amino acids. The SVM software package SVMlight was used for training and classification, with transductive SVMs to take advantage of the information inherent in unlabeled data. Specificities for very similar substrates that frequently show cross-specificities were pooled into so-called composite specificities, and predictive models were built for them. The reliability of the models was confirmed in cross-validations and in comparison with a currently used sequence-comparison-based method. When comparing the predictions for 1230 NRPS A domains that are currently detectable in UniProt, the new method was able to give a specificity prediction in an additional 18% of the cases compared with the old method. Both methods agreed for 70% of the sequences and disagreed for <6%, mainly on low-confidence predictions by the existing method. Neither predictive method could infer any specificity for 2.4% of the sequences, suggesting completely new types of specificity.
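The encoding step described in this abstract can be illustrated with a minimal Python sketch. It covers only the conversion of extracted residues into a normalized real-valued vector, not the SVM training. The choice of the Kyte-Doolittle hydropathy scale as the single physico-chemical property, the rescaling to [-1, 1], the function name `encode_signature`, and the example residue string are all assumptions made for illustration; the abstract does not say which properties or normalization the authors used.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# ASSUMPTION: this scale is a stand-in; the abstract does not name the
# physico-chemical properties that were actually encoded.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

_LOW = min(KYTE_DOOLITTLE.values())
_HIGH = max(KYTE_DOOLITTLE.values())


def encode_signature(residues: str) -> list[float]:
    """Encode a string of extracted residues as a normalized feature vector.

    Each residue becomes one feature: its hydropathy value rescaled
    linearly to the range [-1, 1]. Gaps and non-standard residues are
    encoded as 0.0 (hypothetical convention for this sketch).
    """
    vector = []
    for residue in residues.upper():
        value = KYTE_DOOLITTLE.get(residue)
        if value is None:
            vector.append(0.0)
        else:
            vector.append(2.0 * (value - _LOW) / (_HIGH - _LOW) - 1.0)
    return vector


if __name__ == "__main__":
    # Illustrative residue string only, not taken from the paper.
    print(encode_signature("DAWTIAAICK"))
```

Vectors of this kind, one per A domain and all of equal length, are the sort of input an SVM package would accept. Using several property scales per residue would simply lengthen each vector.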
An ever-increasing demand for novel antimicrobials to treat life-threatening infections caused by the global spread of multidrug-resistant bacterial pathogens stands in stark contrast to the current level of investment in their development, particularly in the fields of natural-product-derived and synthetic small molecules. New agents displaying innovative chemistry and modes of action are desperately needed worldwide to tackle the public health menace posed by antimicrobial resistance. Here, our consortium presents a strategic blueprint to substantially improve our ability to discover and develop new antibiotics. We propose both short-term and long-term solutions to overcome the most urgent limitations in the various sectors of research and funding, aiming to bridge the gap between academic, industrial and political stakeholders, and to unite interdisciplinary expertise in order to efficiently fuel the translational pipeline for the benefit of future generations.
Summary: Streptomyces coelicolor GlnR is a global regulator that controls genes involved in nitrogen metabolism. By genomic screening, 10 new GlnR targets were identified, including enzymes for ammonium assimilation (glnII, gdhA), nitrite reduction (nirB), urea cleavage (ureA) and a number of biochemically uncharacterized proteins (SCO0255, SCO0888, SCO2195, SCO2400, SCO2404, SCO7155). For the GlnR regulon, a GlnR binding site comprising the sequence gTnAc-n6-GaAAc-n6-GtnAC-n6-GAAAc-n6 has been found. Reverse transcription analysis of S. coelicolor and the S. coelicolor glnR mutant revealed that GlnR activates or represses the expression of its target genes. Furthermore, glnR expression itself was shown to be nitrogen-dependent. Physiological studies of S. coelicolor and the S. coelicolor glnR mutant with ammonium and nitrate as the sole nitrogen source revealed that GlnR is not only involved in ammonium assimilation but also in ammonium supply. BLAST analysis demonstrated that GlnR-homologous proteins are present in different actinomycetes containing the glnA gene with the conserved GlnR binding site. By DNA binding studies, it was furthermore demonstrated that S. coelicolor GlnR is able to interact with these glnA upstream regions. We therefore suggest that GlnR-mediated regulation is not restricted to Streptomyces but constitutes a regulon conserved in many actinomycetes.
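A consensus such as gTnAc-n6-GaAAc-n6-GtnAC-n6-GAAAc-n6 can be turned into a simple pattern search. The Python sketch below is one strict reading of that notation, not the authors' screening method. It assumes that `n` means any base, that `n6` means a six-base spacer, and that every other letter is required regardless of case. Lowercase letters in such consensus strings usually mark weaker conservation, so a real screen would more likely use a position weight matrix. The function names are hypothetical.

```python
import re

# Consensus copied from the abstract above.
GLNR_MOTIF = "gTnAc-n6-GaAAc-n6-GtnAC-n6-GAAAc-n6"


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate the dash-separated consensus into a compiled regex.

    ASSUMPTIONS: 'n<k>' is a spacer of k arbitrary bases, a lone 'n' is
    one arbitrary base, and all other letters must match exactly
    (case in the consensus is ignored).
    """
    parts = []
    for token in motif.split("-"):
        spacer = re.fullmatch(r"[nN](\d+)", token)
        if spacer:
            parts.append("[ACGT]{%s}" % spacer.group(1))
        else:
            for base in token:
                parts.append("[ACGT]" if base in "nN" else base.upper())
    return re.compile("".join(parts))


def find_sites(sequence: str, motif: str = GLNR_MOTIF) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = motif_to_regex(motif).pattern
    # A lookahead lets finditer report overlapping occurrences.
    scanner = re.compile("(?=%s)" % pattern)
    return [match.start() for match in scanner.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic test sequence built to satisfy the consensus; not real genomic DNA.
    site = "GTAAC" + "AAAAAA" + "GAAAC" + "CCCCCC" + "GTTAC" + "GGGGGG" + "GAAAC" + "TTTTTT"
    print(find_sites("cc" + site))
```

This scans one strand only; searching the reverse complement as well would be needed for a genome-wide screen.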
Background: During the lifetime of a fermenter culture, the soil bacterium S. coelicolor undergoes a major metabolic switch from exponential growth to antibiotic production. We have studied gene expression patterns during this switch, using a specifically designed Affymetrix genechip and a high-resolution time-series of fermenter-grown samples.
Results: Surprisingly, we find that the metabolic switch actually consists of multiple finely orchestrated switching events. Strongly coherent clusters of genes show drastic changes in gene expression already many hours before the classically defined transition phase where the switch from primary to secondary metabolism was expected. The main switch in gene expression takes only 2 hours, and changes in antibiotic biosynthesis genes are delayed relative to the metabolic rearrangements. Furthermore, global variation in morphogenesis genes indicates an involvement of cell differentiation pathways in the decision phase leading up to the commitment to antibiotic biosynthesis.
Conclusions: Our study provides the first detailed insights into the complex sequence of early regulatory events during and preceding the major metabolic switch in S. coelicolor, which will form the starting point for future attempts at engineering antibiotic production in a biotechnological setting.
We performed molecular phylogenetic analyses of glutamine synthetase (GS) genes in order to investigate their evolutionary history. The analyses were done on 30 DNA sequences of the GS gene which included both prokaryotes and eukaryotes. Two types of GS genes are known at present: the GSI gene found so far only in prokaryotes and the GSII gene found in both prokaryotes and eukaryotes. Our study has shown that the two types of GS gene were produced by a gene duplication which preceded, perhaps by >1000 million years, the divergence of eukaryotes and prokaryotes. The results are consistent with the facts that (i) GS is a key enzyme of nitrogen metabolism found in all extant life forms and (ii) the oldest biological fossils date back 3800 million years. Thus, we suggest that GS genes are among the oldest existing and functioning genes in the history of gene evolution and that GSI genes should also exist in eukaryotes. Furthermore, our study may stimulate investigation into the evolution of "preprokaryotes," by which we mean the organisms that existed during the era between the origin of life and the divergence of prokaryotes and eukaryotes.
Glutamine synthetase (GS) is a key enzyme in nitrogen metabolism; it has dual functions in two essential biochemical reactions, ammonia assimilation and glutamine biosynthesis (1, 2). It is also one of the few amide synthetases found in organisms. Prokaryotes and eukaryotes were once thought to synthesize different GSs: GSI for the former and GSII for the latter. It is now known, however, that GSII is also present in bacteria belonging to Rhizobiaceae (3-6), Frankiaceae (7), and Streptomycetaceae (8, 9). GSI, by contrast, has not been found in any eukaryote.
Glutamine produced by GS is essential for protein synthesis, and its amide nitrogen is donated to synthesize many essential metabolites. It is thus natural to consider GS as present in, and probably indispensable to, all organisms. In view of the central roles played by GS, it is reasonable to believe that the GS gene is extremely old. From the sequence alignment of GSI from Salmonella typhimurium and GSII from alfalfa (10), we observed that the amino acid difference between them was 0.75 per site. This value is quite large compared with those for other proteins, suggesting also that the GSI and GSII genes share a very old common ancestor.
The aforementioned discovery of the GSII gene in plant symbiotic bacteria led to the suggestion that the gene had originated from host plants through lateral gene transfer (3). This was later questioned by the further findings of the GSII gene in plant nonsymbiotic actinomycetes (8, 9). Shatters and Kahn (6) have suggested that the common ancestor of the GSII genes in Rhizobiaceae and in the host plant must be older than the plant itself, and have argued against the gene transfer.
In this paper we have traced the evolutionary history of the GS genes, using our own nucleotide sequence data and others' data from prokaryotic and eukaryotic species in order to estimate the age of...
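The per-site amino acid difference quoted above for GSI and GSII (0.75) is the kind of quantity obtained by counting mismatched positions in a pairwise alignment. The Python sketch below computes an uncorrected proportion of differing sites (often called the p-distance). Whether the authors' value is this raw proportion or a corrected distance is not stated in the text, so this is an illustration of the idea only. The function name, the gap-skipping convention, and the toy sequences are assumptions.

```python
def p_distance(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Proportion of differing residues between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumed
    convention for this sketch). Raises ValueError if the sequences
    differ in length or share no comparable columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    differing = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if res_a != res_b:
            differing += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real GSI/GSII sequences.
    print(p_distance("MK-TA", "MRQSV"))  # 3 of 4 compared sites differ
```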