2021
DOI: 10.1038/s41467-021-22869-8

CopulaNet: Learning residue co-evolution directly from multiple sequence alignment for protein structure prediction

Abstract: Residue co-evolution has become the primary principle for estimating inter-residue distances of a protein, which are crucially important for predicting protein structure. Most existing approaches adopt an indirect strategy, i.e., inferring residue co-evolution based on some hand-crafted features, say, a covariance matrix, calculated from the multiple sequence alignment (MSA) of the target protein. This indirect strategy, however, cannot fully exploit the information carried by the MSA. Here, we report an end-to-end deep n…


Cited by 65 publications (54 citation statements)
References 30 publications
“…By learning from thousands of experimentally solved protein structures, ResNet greatly reduced the number of sequence homologs needed for satisfactory contact prediction, doubling or even tripling the precision over traditional methods on the CASP13 hard test proteins [ 45 ]. Recent studies have shown that ResNet was able to predict accurate contacts and correct folds for most proteins with more than 30 non-redundant sequence homologs [ 95 , 96 ]. One of the major differences between DBN and RaptorX’s ResNet is that the former predicts inter-residue contacts one by one while the latter predicts the whole contact matrix simultaneously.…”
Section: Neoantigen Identification
confidence: 99%
“…Most neural network models, including AlphaFold (AlQuraishi, 2019) and RaptorX (Xu, 2019), rely on this feature. However, due to the considerable information loss after transforming MSAs into hand-crafted features, supervised models, such as CopulaNet (Ju et al, 2021) and AlphaFold2 (Jumper et al, 2021), have been proposed to build directly on the raw MSA. The superior performance over the baselines demonstrates that residue co-evolution information can be mined from the raw sequences by the model.…”
Section: Related Work
confidence: 99%
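The hand-crafted covariance feature that these raw-MSA models bypass can be illustrated with a small sketch. Everything below (the toy alignment and function names) is a hypothetical illustration, not code from CopulaNet or any cited tool: the MSA is one-hot encoded and the Frobenius norm of each inter-column covariance block serves as a coupling score.

```python
import numpy as np

# Toy MSA: rows are homologous sequences, columns are alignment positions.
# A real pipeline would parse this from an .a3m/.fasta alignment file.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
msa = [
    "ACDEA",
    "ACDEA",
    "GCDKA",
    "GCDKA",
    "ACDEA",
]

def one_hot(msa, alphabet=ALPHABET):
    """One-hot encode an MSA into shape (n_seqs, n_cols, n_states)."""
    idx = {a: i for i, a in enumerate(alphabet)}
    n, L, q = len(msa), len(msa[0]), len(alphabet)
    X = np.zeros((n, L, q))
    for s, seq in enumerate(msa):
        for j, a in enumerate(seq):
            X[s, j, idx[a]] = 1.0
    return X

def coupling_matrix(msa):
    """Frobenius norm of the per-position-pair covariance blocks.

    This is the classic hand-crafted co-evolution feature: a large value
    suggests two columns vary together across the alignment."""
    X = one_hot(msa)
    n, L, q = X.shape
    flat = X.reshape(n, L * q)
    cov = np.cov(flat, rowvar=False)                # (L*q, L*q)
    blocks = cov.reshape(L, q, L, q)
    return np.sqrt((blocks ** 2).sum(axis=(1, 3)))  # (L, L)

C = coupling_matrix(msa)
# Columns 0 and 3 co-vary (A<->E vs G<->K), so their coupling should
# exceed that of column 0 with the invariant column 1.
print(C[0, 3] > C[0, 1])  # True
```

The information loss the quote refers to is visible here: only pairwise second-order statistics survive the transformation, which is precisely what motivates feeding the raw MSA to the network instead.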
“…For protein structure prediction, the key step is to predict inter-residue contacts/distances, while the shared cornerstone of prediction is performing evolutionary coupling analysis, i.e. residue co-evolution analysis, on the constructed MSA for a target protein (Ju et al, 2021). The underlying rationale is that two residues which are spatially close in the three-dimensional structure tend to co-evolve, which in turn can be exploited to estimate contacts/distances between residues (Seemayer et al, 2014; Jones & Kandathil, 2018).…”
Section: Introduction
confidence: 99%
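The co-evolution signal described in this passage can be made concrete with a toy sketch (the alignment and function below are illustrative, not taken from the cited papers): mutual information between two alignment columns measures how statistically coupled their residue identities are, which is the raw signal that coupling-analysis methods refine into contact predictions.

```python
import math
from collections import Counter

# Toy alignment: columns 0 and 2 co-evolve (A<->E vs G<->K),
# while column 1 varies independently of both.
msa = [
    "ACE",
    "ADE",
    "GCK",
    "GDK",
    "ACE",
    "GDK",
]

def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j.

    High MI means the residue identities at the two positions are
    statistically coupled across the protein family."""
    n = len(msa)
    pi = Counter(s[i] for s in msa)
    pj = Counter(s[j] for s in msa)
    pij = Counter((s[i], s[j]) for s in msa)
    mi = 0.0
    for (a, b), c in pij.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), with counts over n rows.
        mi += (c / n) * math.log(c * n / (pi[a] * pj[b]))
    return mi

# The perfectly coupled pair (0, 2) reaches log(2); the independent
# pair (0, 1) scores much lower.
print(mutual_information(msa, 0, 2) > mutual_information(msa, 0, 1))  # True
```

Methods such as direct coupling analysis go further by disentangling direct from transitive correlations, but the basic quantity being exploited is the same column-pair coupling shown here.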
“…These huge amounts … TABLE 1 Overview of X-to-end and end-to-X deep learning approaches for protein structure prediction.
End-to-end learning: AlphaFold2 [19], where the MSA, along with templates, is fed into a translation- and rotation-equivariant transformer architecture that outputs a 3D structural model; DMPfold2 (new) [35], where the MSA, along with the precision matrix, is fed into a GRU that outputs a 3D structure.
End-to-X learning: MSA Transformer [45], a transformer architecture; rawMSA [34], where the MSA is fed into a 2D CNN (the first convolutional layer creates an embedding) that outputs a contact map; CopulaNet [46], which extracts all sequence pairs from the MSA and feeds them to a dilated resCNN; TOWER…”
Section: The Importance Of Data and Data Representations
confidence: 99%
“…Stacking dilated convolutions with increasingly large d allows operating on exponentially large receptive fields, while retaining short backpropagations [110,111,7]. In CASP14, dilated convolutions were used by several groups, including ProSPr [22], DESTINI2 [26], CopulaNet [46], PrayogRealDistance [29,30], and also EMBER, TOWER, ICOS, and LAW/MASS. Another solution lies in the self-attention mechanism, where parametric filters capture high-order dependencies between the input observations at arbitrary range and with high precision (Fig.…”
Section: Volumetric Representations
confidence: 99%