2011
DOI: 10.1073/pnas.1111471108

Direct-coupling analysis of residue coevolution captures native contacts across many protein families

Abstract: The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced direct-coupling analysis (DCA). Here we develop…

Citations: cited by 1,310 publications (2,166 citation statements)
References: 51 publications (90 reference statements)
“…The approach is useful for predicting the native structure when sequence data are abundant but a structure has not been determined experimentally for a protein [31]. Even more interestingly, and going beyond earlier seminal findings [26], it was found that sequence information can reveal residue interactions that are not present in the PDB structure, including interactions between structural domains [31] as well as interactions involved in alternative conformational states with evolutionarily conserved functional significance [29]. Most recently, coevolutionary information of several protein families has been applied to determine a theoretical sequence-space…”
Section: Evolutionary Protein Biophysics: Evolutionary Information Be…
mentioning
confidence: 99%
“…Phylogenetic trees of drug-naive and drug-treated HIV-1-infected patients have been shown to exhibit star-like phylogenies (Keele et al 2008; Gupta and Adami 2016), and thus phylogenetic corrections are not needed. Further, phylogenetic corrections based on pairwise sequence similarity cut-offs of 40% of sequence length or more which are common in studies utilizing direct coupling analysis (DCA) (Weigt et al 2009; Morcos et al 2011, 2014) of protein families would drastically reduce the number of effective sequences in our MSA and would lead to mischaracterization of the true underlying mutational landscape. We note that Potts models of other HIV-1 protein sequences under immune pressure have been parameterized with no phylogenetic corrections (Mann et al 2014; Barton et al 2016b).…”
Section: Marginal Reweighting
mentioning
confidence: 99%
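
The similarity-based reweighting mentioned in the statement above can be illustrated with a minimal sketch. It assumes the common scheme in which each sequence is down-weighted by the number of alignment members whose fractional identity to it reaches a cutoff, and the effective number of sequences is the sum of the weights. The function name `sequence_weights`, the parameter `identity_cutoff`, and the toy alignment are hypothetical and are not taken from any of the cited papers.

```python
import numpy as np

def sequence_weights(msa, identity_cutoff):
    """Return one weight per aligned sequence (hypothetical helper).

    msa: list of equal-length aligned strings.
    identity_cutoff: sequences whose fractional identity is at or above
    this value are treated as redundant neighbours (assumed convention).
    """
    arr = np.array([list(s) for s in msa])
    # Fraction of identical columns for every pair of sequences.
    identity = (arr[:, None, :] == arr[None, :, :]).mean(axis=2)
    # Neighbour count includes the sequence itself, so it is at least 1.
    neighbours = (identity >= identity_cutoff).sum(axis=1)
    return 1.0 / neighbours

# Toy alignment, invented for illustration only.
msa = ["ACDEF", "ACDEY", "ACWWY", "WWWWW"]

# A strict cutoff merges only near-identical sequences.
m_eff_strict = sequence_weights(msa, identity_cutoff=0.7).sum()
# A loose cutoff merges more sequences and lowers the effective count.
m_eff_loose = sequence_weights(msa, identity_cutoff=0.3).sum()
```

In this toy case the looser cutoff gives a smaller effective number of sequences, which is the effect the citing authors point to when they argue against applying such corrections to their alignment.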
“…Recently, probabilistic models, called Potts models, have been used to assign scores to individual protein sequences which correlate with experimental measures of fitness (Haq et al 2012; Ferguson et al 2013; Mann et al 2014; Figliuzzi et al 2015; Hopf et al 2017). These advances build upon previous and ongoing work in which Potts models have been used to extract information from sequence data regarding tertiary and quaternary structure of protein families (Weigt et al 2009; Morcos et al 2011, 2014; Marks et al 2012; Sulkowska et al 2012; Sutto et al 2015; Barton et al 2016a; Haldane et al 2016; Jacquin et al 2016) and sequence-specific quantitative predictions of viral protein stability and fitness (Haq et al 2012; Shekhar et al 2013; Barton et al 2016b; Butler et al 2016).…”
Section: Introduction
mentioning
confidence: 99%
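
As a rough illustration of how a Potts model assigns a score to a single sequence, the sketch below evaluates the standard Potts statistical energy, E(s) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j). The array layout (`h` of shape L×q, `J` of shape L×L×q×q), the function name `potts_energy`, and all parameter values are assumptions made for this example. In practice the fields and couplings would be inferred from a multiple sequence alignment.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Potts statistical energy of one integer-encoded sequence.

    seq: sequence of state indices, length L.
    h:   array of shape (L, q), per-position fields.
    J:   array of shape (L, L, q, q); only entries with i < j are used.
    """
    L = len(seq)
    # Single-site field contribution.
    energy = -sum(h[i, seq[i]] for i in range(L))
    # Pairwise coupling contribution over each unordered pair once.
    for i in range(L):
        for j in range(i + 1, L):
            energy -= J[i, j, seq[i], seq[j]]
    return float(energy)

# Toy parameters (L = 3 positions, q = 2 states), invented for illustration.
h = np.array([[0.5, 0.0], [0.0, 0.2], [0.1, 0.0]])
J = np.zeros((3, 3, 2, 2))
J[0, 1, 0, 1] = 1.0
J[1, 2, 1, 0] = -0.3

e_a = potts_energy([0, 1, 0], h, J)
e_b = potts_energy([1, 0, 1], h, J)
```

Under this sign convention a lower energy corresponds to a more probable sequence under the model, so `e_a` would rank above `e_b` here.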
“…Therefore, computational methods that allow for protein structure reconstruction from sequence only are greatly desired. One of these is the recently developed direct coupling analysis (DCA) method [1,2] which achieves the best results in residue-residue contact prediction from multiple sequence alignments only. Predicted contacts are used as restraints in the reconstruction of the three-dimensional structure of a protein.…”
mentioning
confidence: 99%