2021
DOI: 10.1371/journal.pcbi.1008865

Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks

Abstract: The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is in its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and the…

Citations: cited by 65 publications (61 citation statements)
References: 47 publications
“…Although I-TASSER considerably refined the template quality by multiple fragment assembly simulations, the global fold was still incorrect; TM = 0.461 and root-mean-square deviation (RMSD) = 11.9Å. The six contact programs from C-I-TASSER (TripletRes, Li et al, 2021 ; ResTriplet, Li et al, 2019b ; ResPre, Li et al, 2019a ; ResPLM, Li et al, 2019b ; Zheng et al, 2019a ; and NeBconA and NeBconB, He et al, 2017 ) generated reasonable contact-map predictions, with a top L precision of 92.5%, 93.2%, 93.2%, 91.9%, 79.5%, and 85.1%, respectively, which resulted in an overall contact precision of 96.9% for the top L -ranked contacts after combining the maps. With the aid of this combined contact map, C-I-TASSER constructed a significantly improved model with TM = 0.746 and RMSD = 3.23Å.…”
Section: Results | Citation type: mentioning
confidence: 99%
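The "top L precision" quoted in the statement above is the fraction of the L highest-scoring predicted residue pairs that are true contacts, where L is the sequence length. The following is a minimal sketch of that metric, not code from any of the cited tools. The function name `top_l_precision`, the `min_sep` sequence-separation filter, and its default of 6 are assumptions made for illustration; the statement does not say which separation range was used.

```python
def top_l_precision(prob, true_contacts, min_sep=6, top_k=None):
    """Fraction of the top-ranked predicted pairs that are true contacts.

    prob          -- L x L nested list of predicted contact probabilities
    true_contacts -- L x L nested list of booleans (native contacts)
    min_sep       -- minimum sequence separation |i - j| (assumed default)
    top_k         -- number of pairs to evaluate; defaults to L
    """
    n = len(prob)
    if top_k is None:
        top_k = n  # "top L": as many pairs as there are residues

    # Collect each residue pair once (i < j) that satisfies the separation filter.
    pairs = [(prob[i][j], i, j)
             for i in range(n)
             for j in range(i + min_sep, n)]

    # Rank by predicted probability, highest first (stable for ties).
    pairs.sort(key=lambda t: -t[0])
    selected = pairs[:top_k]
    if not selected:
        return 0.0

    hits = sum(1 for _, i, j in selected if true_contacts[i][j])
    return hits / len(selected)


# Toy 4-residue example with invented numbers, for illustration only.
toy_prob = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.6],
    [0.8, 0.7, 0.0, 0.2],
    [0.1, 0.6, 0.2, 0.0],
]
toy_true = [
    [False, True, False, False],
    [True, False, True, False],
    [False, True, False, True],
    [False, False, True, False],
]
toy_precision = top_l_precision(toy_prob, toy_true, min_sep=1)  # 0.5
```

In the toy example the four highest-scoring pairs are (0,1), (0,2), (1,2), and (1,3), of which two are true contacts, so the result is 0.5.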
“…It is noted that because the work was completed, the field has witnessed considerable progress in deep-learning-based interresidue distance and torsion angle predictions ( Xu, 2019 ; Yang et al, 2020 ), as well as the most recent end-to-end model training ( Jumper et al, 2020 ), which demonstrated significant usefulness for improving 3D structure modeling accuracy. Nevertheless, given the dominantly important role of contact predictions ( Shrestha et al, 2019 ) and the fact that the most reliable distance predictions are for short distances ( Li et al, 2021 ), we believe it is still of significant importance to examine separately the impact of contact maps on ab initio structure prediction, especially in conjunction with the most advanced structure folding simulations that can help explore the maximum potential of contact-map predictions. Our study showed that optimized coupling of deep-learning-based spatial information with efficient structure assembly simulations is the key to improving the capability of distantly homologous protein folding.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…This method predicts CM with various distance thresholds of 6, 7.5, 8, 8.5, and 10 Å, and then refines them to leave with only 8 Å CM with an improved prediction rate [ 77 ]. TripletRes starts with the collection of MSAs through whole-genome and metagenome sequence databases and then constructs three complimentary co-evolutionary feature matrices (covariance matrix, precision matrix, and pseudolikelihood maximization) to create contact-map models through deep residual convolutional neural network training [ 78 ]. DeepContact is also a CNN-based approach that discovers co-evolutionary motifs and leverages these patterns to enable accurate inference of contact probabilities [ 79 ].…”
Section: Prediction of 1D and 2D Protein Structural Annotations | Citation type: mentioning
confidence: 99%
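Two of the three coevolutionary features named in the statement above, the covariance matrix and the precision matrix, can be illustrated with a short sketch. This is a generic textbook construction under stated assumptions, not the TripletRes implementation: the helper names, the 21-letter alphabet, the absence of sequence weighting, and the ridge term `ridge=0.1` used to make the covariance invertible are all choices made for illustration. The third feature, pseudolikelihood maximization, requires an iterative fit and is not shown.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus gap (assumed encoding)


def one_hot_msa(seqs):
    """Encode an aligned MSA as an (N, L * 21) one-hot matrix."""
    index = {aa: k for k, aa in enumerate(ALPHABET)}
    n, length, q = len(seqs), len(seqs[0]), len(ALPHABET)
    x = np.zeros((n, length * q))
    for s, seq in enumerate(seqs):
        for pos, aa in enumerate(seq):
            # Unknown characters are mapped to the gap state.
            x[s, pos * q + index.get(aa, q - 1)] = 1.0
    return x


def covariance_and_precision(seqs, ridge=0.1):
    """Return the covariance matrix and a ridge-regularized precision matrix."""
    x = one_hot_msa(seqs)
    # Columns are variables (position, residue type); normalize by N.
    cov = np.cov(x, rowvar=False, bias=True)
    # The raw covariance is singular, so add a ridge term before inverting.
    prec = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    return cov, prec


# Toy alignment with invented sequences, for illustration only.
toy_msa = ["ACD", "ACD", "AGD", "CGD"]
toy_cov, toy_prec = covariance_and_precision(toy_msa)
```

Both matrices have shape (21·L, 21·L); each 21 × 21 block describes how the residue types at one pair of positions co-vary, and such blocks are what a per-pair feature for a convolutional network would be drawn from.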
“…C-I-TASSER (contact-guided iterative threading assembly refinement) is an extended method from the original I-TASSER for high-accuracy protein structure and function predictions [ 102 ]. It generates inter-residue CMs using multiple deep neural-network predictors (such as NeBcon, ResPRE, and TripletRes) and identifies reliable structural templates from the PDB database by multiple threading approach (LOMETS) [ 78 , 103 , 104 , 105 ]. Then, the full-length atomic models are assembled by contact-map-guided replica-exchange Monte Carlo simulations.…”
Section: Prediction of Protein 3D Structures | Citation type: mentioning
confidence: 99%
“…Two types of strategies have been widely considered for protein 3D structure prediction (2): the first is template-based modeling (TBM), which constructs structural models using solved structures as templates, where its success requests for the availability of homologous templates in the Protein Data Bank (PDB); the second is template-free modeling (FM) approach (or ab initio modeling), which dedicates to model the "Hard" proteins that do not have close homologous structures in the PDB. Due to the lack of reliable physics-based force fields, the most efficient FM methods, including Rosetta (3), QUARK (4), and I-TASSER (5), rely on a prior spatial restraints derived, usually through deep neural-network learning (6,7), from the co-evolution information based on multiple sequence alignments (MSA) of homologous proteins (8). Hence, to model 3D structure of the "Hard" proteins, a sufficient number of homologous sequences is critical to ensure the accuracy of deep machine-learning models and the quality of subsequent 3D structure constructions (9).…”
Citation type: mentioning
confidence: 99%