2021
DOI: 10.1038/s41467-021-25975-9

Mapping the glycosyltransferase fold landscape using interpretable deep learning

Abstract: Glycosyltransferases (GTs) play fundamental roles in nearly all cellular processes through the biosynthesis of complex carbohydrates and glycosylation of diverse protein and small molecule substrates. The extensive structural and functional diversification of GTs presents a major challenge in mapping the relationships connecting sequence, structure, fold and function using traditional bioinformatics approaches. Here, we present a convolutional neural network with attention (CNN-attention) based deep learning m…

Cited by 25 publications (27 citation statements)
References 61 publications (93 reference statements)
“…Hereafter we will discuss only molecule A because it contains a better-defined density for the ligands. In addition, Dm C1GalT1 also contained the four conserved landmark features among GT-A GTs [42]: the DxD motif for metal cation interactions (Asp181-X-Asp183), a “glycine-rich” loop facing the acceptor and donor sugar site located in Dm C1GalT1 at loop β5-β6, an “xED” motif at the beginning of α6 in Dm C1GalT1 harboring the catalytic base (Asp255, see further experiments below), and a “C-His” residue that coordinates with the metal ion (His324) (Fig. 3b and Supplementary Fig.…”
Section: Results (mentioning)
confidence: 99%
“…Moreover, this proposed model can produce interpretable activation maps, with the help of attention modules, to provide meaningful biological inferences, where the activation maps can maintain spatial information (i.e., features) to determine the key protein sequence regions. [119] It is apparent that more interpretable GAN-based algorithms are warranted, and hence, we could identify explainable features pinpointed by the GAN model.…”
Section: Limitations (mentioning)
confidence: 94%
“…[17] It is also worth mentioning that not only the GAN model but also other deep learning algorithms (e.g., convolutional neural network modules) have this issue. [17] For instance, Taujale et al [119] proposed an interpretable deep learning model to predict protein folds for glycosyltransferase protein families, where the model was based on a convolutional neural network module with the addition of attention modules. Moreover, this proposed model can produce interpretable activation maps, with the help of attention modules, to provide meaningful biological inferences, where the activation maps can maintain spatial information (i.e., features) to determine the key protein sequence regions.…”
Section: Limitations (mentioning)
confidence: 99%
“…In addition, the development of artificial intelligence provides a unique avenue for the identification of target enzymes through rapidly expanding sequence databases. Recent advances in deep learning models for feature extraction and pattern recognition for sequence classification and functional prediction of enzymes in large datasets have also facilitated the discovery of novel GTs [124, 125]. Yang et al.…”
Section: Mining GTs for Biosynthesis of Glycosylated PNPs (mentioning)
confidence: 99%