2021
DOI: 10.1101/2021.12.21.473437
Preprint

Protein Complex Structure Prediction Powered by Multiple Sequence Alignments of Interologs from Multiple Taxonomic Ranks and AlphaFold2

Abstract: AlphaFold2 is expected to be able to predict protein complex structures as long as a multiple sequence alignment (MSA) of the interologs of the target protein-protein interaction (PPI) can be provided. However, preparing the MSA of protein-protein interologs is a non-trivial task. In this study, a simplified phylogeny-based approach was applied to generate the MSA of interologs, which was then used as the input of AlphaFold2 for protein complex structure prediction. Extensively benchmarked this protocol on non…

Citations: cited by 3 publications (4 citation statements)
References: 33 publications (51 reference statements)
“…As it is trivial to build interologs for them, how to select high-quality MSA for homodimers is a more challenging yet important question. Previous work [39, 54] has an empirical insight that instead of using the full MSA searched from the protein sequence database, we can select a few high-quality MSA following some promisings, such as using the MSA maximizing the sequence diversity [39], or choosing the MSA owning the largest sequence similarity with the primary sequence [54]. To date, few efforts have systematically investigated the MSA-selection problem.…”
Section: Conclusion and Discussion
Citation type: mentioning
confidence: 99%
“…KM algorithm finds a global optimal solution. However, as suggested by [54], in each species, the sequence that is most similar to the query sequence may be more informative, while other sequences that are less similar may add noises. Thus we propose a greedy algorithm that focuses on pairs that have high similarity scores with the query sequence.…”
Section: Local Maximization Optimization (Interlocalcos)
Citation type: mentioning
confidence: 99%
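
The pairing idea in the statement above (within each species, prefer the sequence most similar to the query) can be illustrated with a minimal Python sketch. This is not the citing paper's algorithm or the preprint's protocol: the function names, the gap-ignoring identity score, and the one-pair-per-species rule are all assumptions chosen for illustration.

```python
def sequence_identity(a, b):
    """Fraction of columns with identical residues, skipping columns with a gap."""
    matches = compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def best_hit_per_species(query, hits):
    """Keep, for each species, the aligned sequence most similar to the query.

    hits is a list of (species, aligned_sequence) tuples (hypothetical format).
    """
    best = {}
    for species, seq in hits:
        score = sequence_identity(query, seq)
        if species not in best or score > best[species][0]:
            best[species] = (score, seq)
    return {species: seq for species, (score, seq) in best.items()}


def pair_msas(query_a, hits_a, query_b, hits_b):
    """Build a paired MSA: one concatenated row per species present in both MSAs."""
    best_a = best_hit_per_species(query_a, hits_a)
    best_b = best_hit_per_species(query_b, hits_b)
    paired = [query_a + query_b]  # the query pair is always the first row
    for species in sorted(best_a.keys() & best_b.keys()):
        paired.append(best_a[species] + best_b[species])
    return paired
```

Species found in only one of the two alignments are dropped here, which is the simplest choice; a real pipeline might pad them with gaps instead.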
“…For homomeric PPIs, the paired MSA is formed by concatenating two copies of the MSA. For heteromeric PPIs, the paired MSA is formed by pairing the MSAs through the phylogeny-based approach described in (https://github.com/ChengfeiYan/PPI_MSA-taxonomy_rank) 46 . We input the paired MSA into CCMpred 47 to get the evolutionary coupling matrix, and into alnstats 48 to get mutual information matrix, APC-corrected mutual information matrix and contact potential matrix.…”
Section: Methods
Citation type: mentioning
confidence: 99%
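
The two feature-preparation steps named in the statement above, concatenating two copies of an MSA for a homomeric PPI and computing an APC-corrected mutual information matrix, can be sketched in plain Python. This is an illustrative sketch only, not the code of alnstats or CCMpred: the function names are hypothetical, gaps are treated as an ordinary symbol, and no sequence weighting or pseudocounts are applied.

```python
import math
from collections import Counter


def pair_homomeric(msa):
    """Paired MSA for a homomer: each row is the sequence followed by a copy of itself."""
    return [seq + seq for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


def apc_corrected_mi(msa):
    """MI matrix with the average product correction: MI(i,j) - mean_i * mean_j / mean_all."""
    length = len(msa[0])
    mi = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i + 1, length):
            mi[i][j] = mi[j][i] = mutual_information(msa, i, j)
    # Row means over off-diagonal entries (the diagonal is left at zero).
    row_mean = [sum(row) / (length - 1) for row in mi]
    overall = sum(row_mean) / length
    corrected = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(length):
            if i != j and overall > 0:
                corrected[i][j] = mi[i][j] - row_mean[i] * row_mean[j] / overall
    return corrected
```

Calling `apc_corrected_mi(pair_homomeric(msa))` yields a 2L by 2L matrix for an alignment of length L, whose off-diagonal blocks correspond to inter-chain column pairs.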
“…The paired MSA for each homomeric PPI was formed by concatenating two copies of the MSA. For heteromeric PPIs, the paired MSA was formed by pairing the MSAs through the phylogeny-based approach described in 27 .…”
Section: The Preparation Of Input Features
Citation type: mentioning
confidence: 99%