2022
DOI: 10.1101/2022.08.24.505105
Preprint

Combining phylogeny and coevolution improves the inference of interaction partners among paralogous proteins

Abstract: Predicting protein-protein interactions from sequences is an important goal of computational biology. Various sources of information can be used to this end. Starting from the sequences of two interacting protein families, one can use phylogeny or residue coevolution to infer which paralogs are specific interaction partners within each species. We show that these two signals can be combined to improve the performance of the inference of interaction partners among paralogs. For this, we first align the sequence…

Citations: cited by 4 publications (13 citation statements)
References: 37 publications
“…The ability to partly disentangle these correlations is one of the reasons of the success of Potts models, as we showed here, and also of protein language models, as we showed in [43]. While phylogenetic correlations are an issue for structure prediction, they are nevertheless an important and helpful signal for the inference of interaction partners among the paralogs of two protein families [39, 44, 45]. They are thus a double-edged sword for inference from MSAs.…”
Section: Discussion | Type: mentioning
confidence: 82%
“…Because most cognate HK-RR pairs are encoded in the same operon, many interaction partners are known from genome proximity, which enables us to assess performance. In addition, earlier coevolution methods for paralog matching were tested on this dataset, allowing rigorous comparison [14, 47, 50]. Here, we focus on datasets comprising about 50 cognate HK-RR pairs.…”
Section: Results | Type: mentioning
confidence: 99%
“…1 shows that DiffPALM performs better than the chance expectation, obtained for random within-species matching. Moreover, it outperforms other coevolution-based methods, namely DCA-IPA [14], MI-IPA [47], which rely respectively on Potts models and on mutual information, and GA-IPA [50], which combines these coevolution measures with sequence similarity, a proxy for phylogeny. Importantly, these results are obtained without giving any paired sequences as input to the algorithm.…”
Section: Results | Type: mentioning
confidence: 99%
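The "chance expectation, obtained for random within-species matching" mentioned in the statement above can be illustrated with a short calculation. The sketch below is not taken from any of the cited papers. It assumes that each species has the same number of paralogs in both families and that the random matching is a uniformly random one-to-one pairing within each species. Under those assumptions, a species with n pairs contributes on average exactly one correct pair (the expected number of fixed points of a random permutation), so the expected fraction of correct pairs is the number of species divided by the total number of pairs. The function names and the species sizes in the example are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def mean_fixed_points(n):
    """Brute-force average number of correct pairs over all n! matchings.

    Position i is "correct" when the permutation maps i to itself.
    Only practical for small n; used here as a sanity check.
    """
    perms = list(permutations(range(n)))
    correct = sum(sum(i == p[i] for i in range(n)) for p in perms)
    return Fraction(correct, len(perms))


def chance_correct_fraction(pairs_per_species):
    """Expected fraction of correct pairs under random within-species matching.

    Assumption: one-to-one matching drawn uniformly at random in each
    species, so every non-empty species contributes one expected correct pair.
    """
    counts = [n for n in pairs_per_species if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one pair")
    return Fraction(len(counts), total)


# Hypothetical dataset: 50 pairs spread over 10 species.
example_sizes = [2, 3, 4, 5, 5, 5, 6, 6, 7, 7]
print(mean_fixed_points(4))                    # average correct pairs for n = 4
print(chance_correct_fraction(example_sizes))  # expected correct fraction
```

Under these assumptions the baseline depends only on how the pairs are distributed across species, which is why species with many paralogs are the hard cases for any matching method.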
“…The ability to partly disentangle these correlations is one of the reasons for the success of Potts models, as we showed here, and also of protein language models, as we showed in [43]. While phylogenetic correlations are an issue for structure prediction, they are nevertheless an important and helpful signal for the inference of interaction partners among the paralogs of two protein families [39, 44, 45]. They are thus a double-edged sword for inference from MSAs.…”
Section: Discussion | Type: mentioning
confidence: 99%