Investigation of the Molecular Landscape of Bacterial Aromatic Polyketides by Global Analysis of Type II Polyketide Synthases

Chen, Shanchong; Zhang, Chi; Zhang, Lihan

doi:10.1002/ange.202202286

Cited by 3 publications

(4 citation statements)

References 46 publications

“…This approach differs from traditional unsupervised learning for clustering (44), as it strives to strike a balance between the sequence embeddings and the compound class labels to improve the model's accuracy. For example, T2PK AQ-256-8 consists of 8 building blocks, but its KSβ is confirmed as ancestral nonoxidative, which differs from other KSβs that involve the biosynthesis of T2PKs with 8 building blocks (6). Clearly, the state-of-the-art performance of the model trained with 9 refined class labels suggests that the classification effect is unsatisfactory when simply using five biosynthetic building blocks as labels.…”

Section: Discussion (mentioning)

confidence: 99%

“…T2 polyketide synthase is a family of single heterodimeric ketosynthases that iteratively catalyzes the elongation of the polyketide chain structure, leading to our inability to precisely predict T2PK structures. As introduced previously, despite the multiple sequence alignment approaches based on KSβ (5,6), incorporation of new sequences into the evolutionary model may alter the structure of the original phylogenetic tree and therefore compromise the accuracy of the predictions. To address Fig.…”

Section: Discussion (mentioning)

confidence: 99%

“…In contrast, KSβ sequences without corresponding chemical structures were classified as 'unlabeled'. To further curate the unlabeled KSβ, the 164 labeled KSα and KSβ sequences were used to obtain 20 kb putative T2PK minimum BGCs following the inhouse pipeline described by Chen et al (6). For the criterion of a reliable T2PK gene cluster, KSα and KSβ sequences should be identified in the same contig.…”

Section: Protein Sequence Data Preparation and Embedding Using Protei... (mentioning)

confidence: 99%

“…The core skeleton of T2PKs is highly correlated with the KSα/KSβ protein structure. Hillenmeyer et al observed correlations between KSβ protein phylogeny and the building blocks of T2PK skeletons (5), while Chen et al utilized KSβ as a biomarker to construct a coevolutionary statistical model (phylogenetic tree) to expand the T2PK biosynthetic landscape (6). However, the above and other (7)(8)(9) methods frequently rely on multiple sequence alignments, which are time-consuming and do not effectively represent protein structural information (10,11).…”

Section: Introduction (mentioning)

confidence: 99%

Protein language model-based end-to-end type II polyketide prediction without sequence alignment

Qin, Zhang, Huang, et al. 2023

Preprint

Natural products are important sources for drug development, and the precise prediction of their structures assembled by modular proteins is an area of great interest. In this study, we introduce DeepT2, an end-to-end, cost-effective, and accurate machine learning platform to accelerate the identification of type II polyketides (T2PKs), which represent a significant portion of the natural product world. Our algorithm is based on advanced natural language processing models and utilizes the core biosynthetic enzyme, chain length factor (CLF or KSbeta), as computing inputs. The process involves sequence embedding, data labeling, classifier development, and novelty detection, which enable precise classification and prediction directly from KSbeta without sequence alignments. Combined with metagenomics and metabolomics, we evaluated the ability of DeepT2 and found this model could easily detect and classify KSbeta either as a single sequence or a mixture of bacterial genomes, and subsequently identify the corresponding T2PKs in a labeled categorized class or as novel. Our work highlights deep learning as a promising framework for genome mining and therefore provides a meaningful platform for discovering medically important natural products.

Total Biosynthesis of Mutaxanthene Unveils a Flavoprotein Monooxygenase Catalyzing Xanthene Ring Formation

et al. 2023

Flavoprotein monooxygenases (FPMOs) play important roles in generating structural complexity and diversity in natural products biosynthesized by type II polyketide synthases (PKSs). In this study, we used genome mining to discover novel mutaxanthene analogues and investigated the biosynthesis of these aromatic polyketides and their unusual xanthene framework. We determined the complete biosynthetic pathway of mutaxanthene through in vivo gene deletion and in vitro biochemical experiments. We show that a multifunctional FPMO, MtxO4, catalyzes ring rearrangement and generates the required xanthene ring through a multistep transformation. In addition, we successfully obtained all necessary enzymes for in vitro reconstitution and completed the total biosynthesis of mutaxanthene in a stepwise manner. Our results revealed the formation of a rare xanthene ring in type II polyketide biosynthesis, and demonstrate the potential of using total biosynthesis for the discovery of natural products synthesized by type II PKSs.
