2014
DOI: 10.1371/journal.pcbi.1003847

Improving Contact Prediction along Three Dimensions

Abstract: Correlation patterns in multiple sequence alignments of homologous proteins can be exploited to infer information on the three-dimensional structure of their members. The typical pipeline to address this task, which we in this paper refer to as the three dimensions of contact prediction, is to (i) filter and align the raw sequence data representing the evolutionarily related proteins; (ii) choose a predictive model to describe a sequence alignment; (iii) infer the model parameters and interpret them in terms o…

Cited by 75 publications (99 citation statements)
References 43 publications (95 reference statements)
“…The gradient of the log likelihood with respect to the couplings w_ij(a, b) can be written as: [Eq. (11)] where N_ij represents the number of sequences that do not contain a gap at positions i and j, q(x_i = a, x_j = b) represents the empirically observed pairwise amino acid frequencies that are normalized over a, b ∈ {1, ..., 20} and p(x_i = a, x_j = b | v, w) corresponds to the model probabilities of the MRF for observing an amino acid pair (a, b) at positions i and j. The empirical amino acid counts, given by N_ij q(x_i = a, x_j = b), are constant and need to be computed only once from the alignment.…”
Section: Divergence
mentioning
confidence: 99%
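
The excerpt above refers to Eq. (11) without reproducing it. As a hedged reconstruction from the symbols the statement defines, and assuming no regularization term (the citing paper may include one), the gradient of a Markov random field log likelihood with respect to a coupling usually takes the form of empirical pair counts minus expected pair counts under the model:

```latex
\frac{\partial \mathrm{LL}}{\partial w_{ij}(a,b)}
  = N_{ij}\, q(x_i = a, x_j = b)
  - N_{ij}\, p(x_i = a, x_j = b \mid v, w)
```

The first term depends only on the alignment, which is why the statement notes it can be computed once; the second term has to be re-estimated whenever v and w change.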
“…for residue contact prediction simply takes the L2 norm of the 20 × 20-dimensional vector w_ij with components w_ij(a, b) [3,10,11,31,50],…”
mentioning
confidence: 99%
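
The scoring rule quoted above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code of any cited method: the array layout `(L, L, 20, 20)`, the function names `contact_scores` and `top_pairs`, and the `min_sep` parameter are all assumptions made for the example, and no further correction of the scores is applied.

```python
import numpy as np


def contact_scores(w):
    """Return an (L, L) matrix whose entry (i, j) is the L2 norm of w[i, j].

    w is assumed to have shape (L, L, 20, 20), with w[i, j, a, b] holding
    the coupling w_ij(a, b) between amino acids a and b at positions i, j.
    """
    w = np.asarray(w, dtype=float)
    # L2 norm over the 20 x 20 amino acid block of every position pair.
    scores = np.sqrt((w ** 2).sum(axis=(2, 3)))
    # A position is not in contact with itself.
    np.fill_diagonal(scores, 0.0)
    return scores


def top_pairs(scores, k, min_sep=1):
    """Return the k highest-scoring pairs (i, j) with j - i >= min_sep.

    min_sep is a hypothetical convenience parameter for skipping pairs
    that are close in sequence.
    """
    length = scores.shape[0]
    i, j = np.triu_indices(length, k=min_sep)
    order = np.argsort(-scores[i, j])[:k]
    return [(int(i[n]), int(j[n])) for n in order]
```

Pairs with the largest norm are the ones predicted to be in contact, so `top_pairs(contact_scores(w), k)` gives a ranked shortlist for a coupling tensor `w` inferred elsewhere.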
“…Specifically, we first survey the coevolution-based profiling of functional domains for a previously-annotated set of about 800 multiple sequence alignments [43]. From this extensive survey we find that domains inferred from the sole sequence-based, coevolutionary analysis are compact in space and well consistent with the dynamical, or quasi-rigid domains inferred from the analysis of small and large-scale structural fluctuations.…”
Section: Introduction
mentioning
confidence: 99%
“…Correlated substitutions can help identify those sites that host co-evolving mutations and these, in turn, are an indicator of spatial proximity [38–46].…”
Section: Introduction
mentioning
confidence: 99%
“…Therefore, computational methods that allow for protein structure reconstruction from sequence only are greatly desired. One of these is the recently developed direct coupling analysis (DCA) method [1,2] which achieves the best results in residue-residue contact prediction from multiple sequence alignments only. Predicted contacts are used as restraints in the reconstruction of the three-dimensional structure of a protein.…”
mentioning
confidence: 99%