With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2,021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up to date and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.
Bacterial natural products and their analogues constitute more than half of the new small molecule drugs developed over the last several decades. Despite this success, interest in natural products from major pharmaceutical companies has decreased even as genomics has uncovered the large number of biosynthetic gene clusters (BGCs) that encode novel natural products. To date, however, there is still a lack of universal strategies and enabling technologies to discover natural products at scale and speed. This review highlights several of the opportunities provided by genome sequencing and bioinformatics, challenges associated with translating genomes into natural products, and examples of successful strain prioritization and BGC activation strategies that have been used in the genomic era for natural product discovery from cultivatable bacteria.
There is significant interest in diversifying the structures of polyketides to create new analogues of these bioactive molecules. This has traditionally been done by focusing on engineering the acyltransferase (AT) domains of polyketide synthases (PKSs) responsible for the incorporation of malonyl-CoA extender units. Non-natural extender units have been utilized by engineered PKSs previously; however, most of the work to date has been accomplished with ATs that are naturally promiscuous and/or located in terminal modules lacking downstream bottlenecks. These limitations have prevented the engineering of ATs with low native promiscuity and the study of any potential gatekeeping effects by domains downstream of an engineered AT. In an effort to address this gap in PKS engineering knowledge, the substrate preferences of the final two modules of the pikromycin PKS were compared across several non-natural extender units and examined through active site mutagenesis. This led to engineering of the methylmalonyl-CoA specificity of both modules and inversion of their selectivity to prefer consecutive non-natural derivatives.