2023
DOI: 10.1098/rsif.2022.0707

Impact of phylogeny on structural contact inference from protein sequence data

Abstract: Local and global inference methods have been developed to infer structural contacts from multiple sequence alignments of homologous proteins. They rely on correlations in amino acid usage at contacting sites. Because homologous proteins share a common ancestry, their sequences also feature phylogenetic correlations, which can impair contact inference. We investigate this effect by generating controlled synthetic data from a minimal model where the importance of contacts and of phylogeny can be tuned. We demons…

Cited by 3 publications
(5 citation statements)
References 61 publications
“…Our results regarding synthetic data suggest that this could be due to the robustness of these methods to phylogenetic correlations, which are bound to exist in natural sequence data. Here, we showed that phylogenetic correlations make the inference of functional sectors challenging, very much like they obscure the inference of structural contacts [4, 36–40]. It is important to note that phylogenetic correlations are nevertheless interesting and provide useful signal e.g.…”
Section: Discussion (mentioning)
confidence: 89%
“…in the case where all mutations are neutral. Phylogenetic correlations impair the inference of structural contacts from sequences [4, 36–40], which has motivated empirical corrections aiming at reducing their impact [4, 7-9, 41-43, 43-45]. Disentangling them from collectively correlated groups of amino acids such as functional sectors is bound to be a significant challenge too, perhaps even more.…”
Section: Introduction (mentioning)
confidence: 99%
“…Their importance are controlled respectively by the parameters µ and κ: a smaller µ means that sequences are more closely related, yielding more phylogenetic correlations, while a larger κ means stronger selection, yielding more correlations arising from the sector. Note that this data generation process with phylogeny is close to the one we used previously [40,47], but that the Hamiltonian we use here (Eq 1) is specific to the sector model. B: To incorporate phylogenetic correlations, we start from one equilibrium sequence, which becomes the ancestor.…”
Section: Results (mentioning)
confidence: 97%
“…Our results regarding synthetic data suggest that this could be due to the robustness of these methods to phylogenetic correlations, which are bound to exist in natural sequence data. Here, we showed that phylogenetic correlations make the inference of functional sectors challenging, very much like they obscure the inference of structural contacts [4, 36–40]. It is important to note that phylogenetic correlations are nevertheless interesting and provide useful signal e.g.…”
Section: Discussion (mentioning)
confidence: 99%