2022
DOI: 10.1093/molbev/msac197

The Statistical Trends of Protein Evolution: A Lesson from AlphaFold Database

Abstract: The recent development of artificial intelligence provides us with new and powerful tools for studying the mysterious relationship between organism evolution and protein evolution. In this work, based on the AlphaFold Protein Structure Database (AlphaFold DB), we perform comparative analyses of the proteins of different organisms. The statistics of AlphaFold-predicted structures show that, for organisms with higher complexity, their constituent proteins will have larger radii of gyration, higher coil fractions…

Cited by 12 publications (7 citation statements)
References 107 publications
“…We also confirmed using shuffling methods that this complexity does not stem from content differences (e.g., the so-called C-value paradox or enigma [61]) but arises from internal sequential patterns. From the perspective of protein structure, studies have shown that species with higher complexity possess more proteins with larger radii of gyration (signifying increased flexibility) and a higher degree of modularity [62]. On the other hand, our analysis from the sequential perspective implies that the more complex a species is, the higher the tendency for sequence complexity.…”
Section: Statistical Observations On Sequential Orderliness and Compl...
Citation type: mentioning
confidence: 74%
“…On the other hand, our analysis from the sequential perspective implies that the more complex a species is, the higher the tendency for sequence complexity. (It is worth noting that although the definition of species complexity remains debated, in practice, biologists often employ varied metrics like the total cell types, genome size, or proteome size to gauge species complexity, whereas these metrics are often interrelated [62,63].) Collectively, these results hint at positive correlations between amino acid sequence complexity, protein structural modularity, and overall species complexity.…”
Section: Statistical Observations On Sequential Orderliness and Compl...
Citation type: mentioning
confidence: 99%
“…had higher prediction scores, directing that both AF2 and ESMF perform well on multi-domain proteins, making single- and multiple-domain molecules equally likely to have lower scores. Multi-domain molecules with longer sequence lengths tend to have larger radii of gyration, resulting in increased complexity [83,84]. Generally, a larger radius of gyration indicates a more extended or less compact structure; therefore, it might add up to the challenge of structure prediction for prediction tools to model a structure accurately, resulting in lower scores.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…It is well known that templates play a critical role in the protein structure modelling [4]. Meanwhile, the evolutionary relationship implicitly contained in templates are probably favorable for study of protein folding [6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%