Abstract: Protein structure fundamentally underpins the function and processes of numerous biological systems. Fold recognition algorithms offer a sensitive and robust tool to detect structural, and thereby functional, similarities between distantly related homologs. In the era of accurate structure prediction owing to advances in machine learning techniques, previously curated sequence databases have become a rich source of biological information. Here, we use bioinformatic fold recognition algorithms to scan the entir…
“…The here described similarities and differences between SUZ12 and MES-3 should facilitate further experiments to elucidate the specific mechanisms by which MES-3 acts in PRC2 in C. elegans. Our work joins a rapidly growing set of in silico predictions of previously undetected homologies made possible by unprecedented advances in deep-learning driven structure prediction (Bayly-Jones and Whisstock, 2021; Sanchez-Pulido and Ponting, 2021).…”
“…The extensive structural dataset not only offers the potential to answer the critical questions about life activities and human health, but also suggests a new paradigm for studying protein evolution. Instead of focusing only on specific families of proteins (16), we can now perform comparative analyses of proteins structures in different species. The statistics of the full-proteome protein structures may indicate the trend of species evolution.…”
Section: Introduction (mentioning)
confidence: 99%
“…It is difficult to construct a database for analyzing the evolution of all proteins in various species since it is costly and time-consuming to determine the structures of all proteins contained within the proteome of a species by experiments. Notably, the recent development of artificial intelligence (AI) provides us with new and powerful tools to help us elucidate the trends in protein evolution on the macroscopic scale. AlphaFold, developed by DeepMind, is an artificial intelligence system based on deep learning, which performs protein structure predictions with high accuracy [14,15] and finds many applications in medical and biological sciences [16][17][18]. Although some of the predictions were noted to have limitations [19,20], AlphaFold had already won an unprecedented and overwhelming success.…”
The relationship between species evolution and protein evolution has remained a mystery. The recent development of artificial intelligence provides us with new and powerful tools for studying the evolution of proteins and species. In this work, based on the AlphaFold Protein Structure Database (AlphaFold DB), we perform comparative analyses of the protein structures of different species. The statistics of AlphaFold-predicted structures show that, as species evolve from prokaryotes to eukaryotes, from unicellular to multicellular organisms, from invertebrates to vertebrates, and so on, the proteins within them evolve towards larger radii of gyration, higher coil fractions, higher modularity, and slower relaxations, indicating that the average flexibility of proteins gradually increases during species evolution. In the scaling analyses, the size dependence of proteins' shape, topology, and dynamics suggests a decreasing fractal dimension of the proteins in species evolution. This evolutionary trend is accompanied by increasing eigengaps in the vibration spectra of proteins, implying that the proteins are statistically evolving towards higher functional specificity and lower dimensionality in the equilibrium dynamics. Furthermore, we uncover the topology and sequence bases of this evolutionary trend. The residue contact networks of the proteins are evolving towards higher assortativity, and the hydrophobic and hydrophilic amino acid residues are becoming increasingly segregated in the sequences. This study provides new insights into how the diversity in the functionality of the proteins increases and their plasticity grows in evolution. The evolutionary laws implied by these statistical results may also shed light on the study of protein design.
“…The here described similarities and differences between SUZ12 and MES-3 should facilitate further experiments to elucidate the specific mechanisms by which MES-3 acts in PRC2 in C. elegans. Our work joins a rapidly growing set of in silico predictions of previously undetected homologies made possible by unprecedented advances in deep-learning driven structure prediction [22,30].…”
Section: Results (mentioning)
confidence: 98%
“…We capitalized on recent advantages on computational prediction approaches that enable to derive high-quality structures of protein monomers or multimers [23,24], which enables to study protein function and evolution at unprecedented scale [22,30]. We demonstrate that MES-3 is a diverged ortholog of SUZ12, and that MES-3 may associate with MES-2, MES-6, and LIN-53, similar to the orthologous proteins in human PRC2.…”
Polycomb Repressive Complex 2 (PRC2) catalyzes the mono-, di-, and trimethylation of histone protein H3 on lysine 27 (H3K27). Trimethylation of H3K27 is strongly associated with transcriptionally silent chromatin and plays an important role in the regulation of cell identity and developmental gene expression. The functional core of PRC2 is highly conserved in animals and consists of four subunits (Fig. 1a). Notably, one of these subunits, SUZ12, has not been identified in the genetic model Caenorhabditis elegans, whereas C. elegans PRC2 contains the lineage-specific protein MES-3 (Fig. 1a). Here, we demonstrate that MES-3 is in fact a highly divergent ortholog of SUZ12. Unbiased sensitive sequence similarity searches uncovered consistent but insignificant reciprocal best matches between MES-3 and SUZ12, suggesting that these proteins could share a common evolutionary history. We substantiate this hypothesis by directly comparing the predicted structures of SUZ12 and MES-3, which revealed shared protein folds and residues of key domains. Thus, in agreement with the observations in previous genetic and biochemical studies, we here provide evidence that C. elegans, like other animals, contains a diverged yet evolutionarily conserved core PRC2.