Protein structure directly affects protein function. Mutations in the primary sequence can cause structural changes that in turn modify functional properties. SARS-CoV-2 proteins were studied extensively during the pandemic, and the resulting wealth of sequence and structure data has enabled joint sequence-structure analysis. In this work, we focus on the SARS-CoV-2 Spike (S) protein and on the relation between sequence mutations and structural variations, in order to shed light on the structural changes that stem from the positions of mutated amino acid residues in three SARS-CoV-2 strains. We propose using the protein contact network (PCN) formalism to: (i) obtain a global metric space in which to compare different molecular entities, (ii) give a structural explanation of the observed phenotype, and (iii) provide context-dependent descriptors of single mutations. We used PCNs to compare the sequence and structure of the Alpha, Delta, and Omicron SARS-CoV-2 variants, and found that Omicron has a unique mutational pattern whose structural consequences differ from those of the mutations in the other strains. The non-random distribution of changes in network centrality along the chain sheds light on the structural (and functional) consequences of the mutations.
The structure and sequence of proteins strongly influence their biological functions. New models and algorithms can help researchers understand how the evolution of sequences and structures relates to changes in function. Recent studies of SARS-CoV-2 Spike (S) protein structures have aimed to predict receptor binding and infection activity in COVID-19, which has raised scientific interest in how virus mutations relate to sequence, structure, and vaccination. However, models and tools are still needed to study the links between the evolution of S protein sequence, structure, and function on the one hand, and virus transmissibility and the effects of vaccination on the other. Because studies on the S protein have generated a large amount of relevant information, we propose in this work to use Protein Contact Networks (PCNs) to relate protein structures to biological properties by means of network topology. Topological properties are used to compare structural changes with sequence changes. We find that both node centrality and community extraction analysis can be used to relate protein stability and functionality to sequence mutations. Starting from this, we compare structural evolution with sequence changes and study mutations from a temporal perspective, focusing on virus variants. Finally, by applying our model to the Omicron variant, we report a timeline correlation between Omicron and the vaccination campaign.
Protein sequence, structure, and function are related, so any change in the protein sequence may modify its structure and function. Thanks to the exponential growth in available data, many studies have addressed questions such as: (i) how structure evolves as the sequence changes, and (ii) how structure and function change over time. Computational experiments have contributed to the study of viral protein structures. For instance, the Spike (S) protein has been investigated for its role in receptor binding and infection activity in COVID-19, hence the interest of researchers in how virus mutations relate to sequence, structure, and vaccination. Protein Contact Networks (PCNs) can be used to investigate protein structures and to detect biological properties through network topology. We apply graph-theoretical topological analysis to PCNs to compare structural changes with sequence changes, and find that both node centrality and community extraction analysis play a relevant role in the changes in protein stability and functionality caused by mutations. We compare structural evolution with sequence changes and study mutations from a temporal perspective, focusing on virus variants. Finally, we highlight a timeline correlation between the identification of the Omicron variant and the vaccination campaign.
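As a rough illustration of the node centrality comparison described above, the following minimal sketch computes closeness centrality for two contact networks and reports the per-residue difference. The function names, the edge-list input, and the choice of closeness (rather than another centrality measure) are assumptions made for illustration; the abstract does not specify them. The sketch also assumes both networks share the same residue numbering, which would need an alignment step if a variant carries insertions or deletions.

```python
from collections import deque


def closeness_centrality(n_nodes, edges):
    """Closeness centrality of every node in an unweighted, undirected graph.

    Defined here as (number of reachable nodes) / (sum of shortest-path
    lengths to them); isolated nodes score 0. No extra scaling is applied
    for disconnected graphs.
    """
    neighbours = {i: set() for i in range(n_nodes)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = []
    for source in range(n_nodes):
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores.append((len(dist) - 1) / total if total else 0.0)
    return scores


def centrality_shift(n_nodes, edges_reference, edges_variant):
    """Per-residue change in closeness centrality (variant minus reference)."""
    ref = closeness_centrality(n_nodes, edges_reference)
    var = closeness_centrality(n_nodes, edges_variant)
    return [v - r for r, v in zip(ref, var)]


# Toy data (hypothetical): a 4-residue chain, and a "variant" in which
# residues 0 and 3 additionally come into contact.
reference_edges = [(0, 1), (1, 2), (2, 3)]
variant_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
shift = centrality_shift(4, reference_edges, variant_edges)
print(shift)
```

In this toy case only the two residues that gain the new contact change their centrality, which is the kind of position-dependent signal a per-residue comparison is meant to expose.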
Since December 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has affected almost all countries. The unprecedented spread of this virus has led to the emergence of many variants that affect protein sequence and structure, so the sequences require continuous monitoring and analysis to understand their genetic evolution and to prevent possible dangerous outcomes. Variants that modify the structure of proteins such as the Spike (S) protein need particular monitoring. Protein contact networks (PCNs) have recently been proposed as a modelling framework for protein structures. In this framework, the protein structure is represented as an unweighted graph whose nodes are the central atoms of the backbone (Cα) and whose edges connect pairs of atoms whose spatial distance is between 4 and 7 Å. A PCN can also be a data-rich representation, since biological and topological information can be attached to each node (atom). This formalism makes it possible to analyze the graph with algorithms from graph theory. In particular, we refer to graph embedding methods, which enable the analysis of such graphs with deep learning. In this work, we explore the possibility of embedding PCNs using Graph Neural Networks and then analyzing each residue in the embedded space to distinguish mutated residues from non-mutated ones. In particular, we analyzed the structure of the Spike protein of the coronavirus. First, we obtained the PCNs of the Spike protein for the wild type and three variants. Then we used the GraphSage embedding algorithm to obtain an unsupervised embedding. Finally, we analyzed the points of mutation in the embedded space. The results show the characteristics of the mutation points in the embedding space.
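The PCN construction described above (Cα atoms as nodes, an edge whenever two atoms lie between 4 and 7 Å apart) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the inclusive thresholds, and the toy coordinates are hypothetical, and the Cα coordinates are assumed to have been extracted from a structure file beforehand.

```python
import numpy as np


def build_pcn(ca_coords, d_min=4.0, d_max=7.0):
    """Return the boolean adjacency matrix of an unweighted contact network.

    ca_coords: sequence of (x, y, z) C-alpha coordinates in angstroms.
    Two residues are connected when their distance lies in [d_min, d_max]
    (treating both bounds as inclusive is an assumption of this sketch).
    """
    coords = np.asarray(ca_coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting: shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adjacency = (dist >= d_min) & (dist <= d_max)
    np.fill_diagonal(adjacency, False)  # no self-loops
    return adjacency


def node_degrees(adjacency):
    """Number of contacts per residue."""
    return adjacency.sum(axis=1)


# Toy coordinates (hypothetical, not taken from any real structure).
toy_coords = [
    (0.0, 0.0, 0.0),
    (5.0, 0.0, 0.0),
    (10.0, 0.0, 0.0),
    (5.0, 3.0, 0.0),   # 3 A from residue 1: closer than d_min, so no edge
    (20.0, 0.0, 0.0),  # farther than d_max from every other residue
]
pcn = build_pcn(toy_coords)
print(node_degrees(pcn))
```

The resulting adjacency matrix can then be passed to a graph library or to an embedding method; per-node attributes would be stored alongside it, indexed by residue.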
There is increasing evidence that many molecular processes differ with age and sex. Such differences also lead to differences in the onset and progression of many complex diseases. For instance, demographic data on the onset of comorbidities of diabetes mellitus, on the lethality of COVID-19, and on some cancers show differences between sex and age groups. The growing interest in this area therefore requires the management of the related data as well as the development of algorithms and tools for its analysis. The availability of omics data annotated with age and sex metadata is mandatory for building analysis pipelines. The number of databases containing data related to age and sex is hence growing. Here we present some databases and tools that store such data. Finally, future research directions are highlighted.