Bat coronaviruses related to SARS-CoV-2 and infectious for human cells
Sarah Temmam, Khamsing Vongphayloth, Eduard Baquero Salazar, Sandie Munier, Massimiliano Bonomi, Béatrice Regnault, Bounsavane Douangboubpha, Yasaman Karami, Delphine Chrétien, Daosavanh Sanamxay, Vilakhan Xayaphet, Phetphoumin Paphaphanh, Vincent Lacoste, Somphavanh Somlor, Khaithong Lakeomany, Nothasin Phommavanh,
The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This makes it possible to account for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it performs overall as well as or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.
The animal reservoir of SARS-CoV-2 is unknown despite reports of various SARS-CoV-2-related viruses in Asian Rhinolophus bats, including the closest virus from R. affinis, RaTG13. Several studies have suggested the involvement of pangolin coronaviruses in SARS-CoV-2 emergence. SARS-CoV-2 presents a mosaic genome, to which different progenitors contribute. The spike sequence determines the binding affinity and accessibility of its receptor-binding domain (RBD) to the cellular angiotensin-converting enzyme 2 (ACE2) receptor and is responsible for host range. SARS-CoV-2 progenitor bat viruses genetically close to SARS-CoV-2 and able to enter human cells through a human ACE2 pathway have not yet been identified, though they would be key in understanding the origin of the epidemics. Here we show that such viruses indeed circulate in cave bats living in the limestone karstic terrain in North Laos, within the Indochinese peninsula. We found that the RBDs of these viruses differ from that of SARS-CoV-2 by only one or two residues, bind as efficiently to the hACE2 protein as the SARS-CoV-2 Wuhan strain isolated in early human cases, and mediate hACE2-dependent entry into human cells, which is inhibited by antibodies neutralizing SARS-CoV-2. None of these bat viruses harbors a furin cleavage site in the spike. Our findings therefore indicate that bat-borne SARS-CoV-2-like viruses potentially infectious for humans circulate in Rhinolophus spp. in the Indochinese peninsula.
The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering and medicine. Recent progress has been achieved by leveraging on the increasing wealth of genomic data and by modeling inter-site dependencies within biological sequences. However, state-of-the-art methods require numerous highly variable sequences and remain time consuming. Here, we present GEMME (www.lcqb.upmc.fr/GEMME), a method that overcomes these limitations by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. Assessed against 41 experimental high-throughput mutational scans, GEMME overall performs similarly or better than existing methods and runs faster by several orders of magnitude. It greatly improves predictions for viral sequences and, more generally, for very conserved families. It uses only a few biologically meaningful and interpretable parameters, while existing methods work with hundreds of thousands of parameters.
Introduction
Understanding which and how genetic variations affect proteins and their biological functions is a central question for bioengineering, medicine and fundamental biology. In these fields, a fast and accurate assessment of the effects of every possible substitution at every position in a protein sequence (full single-site mutational landscape) or of combinations of mutations (pairs, triplets...) would allow to reach some level of control over proteins, needed to improve the treatment of diseases, the design of new proteins and the synthesis of molecular libraries. Deep mutational scans [1] or multiplexed assays for variant effects (MAVEs) [2] have enabled the full description of the mutational landscapes of a few tens of proteins (see [3] for a list of proteins and associated experiments). They have revealed that a protein contains a relatively small number of positions highly sensitive to mutations, where almost any substitution induces highly deleterious effects [4, 5]. Although these methods represent major biotechnological advances, they remain resource intensive and are limited in their scalability. Moreover, the measured phenotype and the way it is measured vary substantially from one experiment to another, making it difficult to compare different measurements and/or proteins [6]. These limitations call for the development of efficient and accurate computational methods for high-throughput mutational scans.
Many computational methods predicting mutational effects exploit information coming from protein sequences observed in nature [3, 7, 8, 9, 10, 11, 12, 13, 14, 15]. They rely on the assumption that rarely occurring mutations induce deleterious effects. Most of them start from a multiple sequence alignment and treat each position in the alignment independently from the others to compute frequencies of ...
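The independent-site idea described in the introduction above (rare substitutions in an alignment column are assumed to be deleterious) can be illustrated with a minimal Python sketch. This is not GEMME, which models evolutionary history across all positions; it is only the simpler per-column baseline the text contrasts it with. The function names, the pseudocount value, the toy alignment, and the choice of a log-ratio score are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(alignment, position, pseudocount=1.0):
    """Smoothed amino-acid frequencies in one alignment column (gaps ignored)."""
    counts = Counter(
        seq[position] for seq in alignment if seq[position] in AMINO_ACIDS
    )
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def independent_site_score(alignment, position, mutant):
    """Log-ratio of mutant to wild-type frequency at one position.

    Assumption: the first sequence of the alignment is the query (wild type).
    More negative scores mean the substitution is rarer in the column.
    """
    freqs = column_frequencies(alignment, position)
    wild_type = alignment[0][position]
    return math.log(freqs[mutant] / freqs[wild_type])


# Hypothetical toy alignment: column 2 contains V four times and I once.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKVL", "MKV-"]
score_seen = independent_site_score(toy_alignment, 2, "I")    # I is observed
score_unseen = independent_site_score(toy_alignment, 2, "W")  # W is never observed
```

Because each column is scored in isolation, a substitution that never appears in the column is penalized more than one that appears occasionally, regardless of what happens elsewhere in the sequence. That independence is the limitation that methods modeling inter-site dependencies aim to address.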
Background
Proteins adapt to environmental conditions by changing their shape and motions. Characterising protein conformational dynamics is increasingly recognised as necessary to understand how proteins function. Given a conformational ensemble, computational tools are needed to extract in a systematic way pertinent and comprehensive biological information.
Results
Here, we present a method, Communication Mapping (COMMA), to decipher the dynamical architecture of a protein. The method first extracts residue-based dynamic properties from all-atom molecular dynamics simulations. Then, it integrates them in a graph theoretic framework, where it identifies groups of residues or protein regions that mediate short- and long-range communication. COMMA introduces original concepts to contrast the different roles played by these regions, namely communication blocks and communicating segment pairs, and evaluates the connections and communication strengths between them. We show the utility and capabilities of COMMA by applying it to three archetypal proteins, namely protein A, the tyrosine kinase KIT and the tumour suppressor p53.
Conclusion
Our method permits to compare in a direct way the dynamical behaviour either of proteins with different characteristics or of the same protein in different conditions. It is useful to identify residues playing a key role in protein allosteric regulation and to explain the effects of deleterious mutations in a mechanistic way. COMMA is a fully automated tool with broad applicability. It is freely available to the community at www.lcqb.upmc.fr/COMMA.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0855-y) contains supplementary material, which is available to authorized users.
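As a rough illustration of what grouping residues in a graph-theoretic framework can look like, the Python sketch below thresholds pairwise residue "communication strengths" and extracts the connected components of the resulting graph. It is not COMMA's algorithm: the abstract does not specify how communication blocks are computed, and the function name, the strength values, and the threshold are hypothetical.

```python
from collections import defaultdict


def communication_groups(pair_strengths, threshold):
    """Group residues linked by pairs whose strength meets the threshold.

    pair_strengths maps (residue_a, residue_b) tuples to a number.
    Returns the connected components of the thresholded graph as sorted lists.
    """
    graph = defaultdict(set)
    for (res_a, res_b), strength in pair_strengths.items():
        if strength >= threshold:
            graph[res_a].add(res_b)
            graph[res_b].add(res_a)

    groups, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collecting every residue reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical strengths between residue indices (values are made up).
toy_strengths = {(1, 2): 0.9, (2, 3): 0.8, (3, 4): 0.2, (4, 5): 0.7, (6, 7): 0.1}
groups = communication_groups(toy_strengths, threshold=0.5)
```

With these made-up values, residues 1 to 3 form one group and residues 4 and 5 another, while the weak 3-4 and 6-7 pairs are dropped. In practice the pairwise quantities would be derived from a molecular dynamics trajectory.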
Loop regions in protein structures often have crucial roles, and they are much more variable in sequence and structure than other regions. In homology modeling, this leads to larger deviations from the homologous templates, and loop modeling of homology models remains an open problem. To address this issue, we have previously developed the DaReUS-Loop protocol, leading to significant improvement over existing methods. Here, a DaReUS-Loop web server is presented, providing an automated platform for modeling or remodeling loops in the context of homology models. This is the first web server accepting a protein with up to 20 loop regions, and modeling them all in parallel. It also provides a prediction confidence level that corresponds to the expected accuracy of the loops. DaReUS-Loop facilitates the analysis of the results through its interactive graphical interface and is freely available at http://bioserv.rpbs.univ-paris-diderot.fr/services/DaReUS-Loop/.
Despite efforts during the past decades, loop modeling remains a difficult part of protein structure modeling. Several approaches have been developed in the framework of crystal structures. However, for homology models, the modeling of loops is still far from being solved. We propose DaReUS-Loop, a data-based approach that identifies loop candidates mining the complete set of experimental structures available in the Protein Data Bank. Candidate filtering relies on local conformation profile-profile comparison, together with physico-chemical scoring. Applied to three different template-based test sets, DaReUS-Loop shows significant increase in the number of high-accuracy loops, and significant enhancement for modeling long loops. A special advantage is that our method proposes a prediction confidence score that correlates well with the expected accuracy of the loops. Strikingly, over 50% of successful loop models are derived from unrelated proteins, indicating that fragments under similar constraints tend to adopt similar structure, beyond mere homology.
Streptococcus pyogenes (Group A streptococcus; GAS) is an important human pathogen responsible for mild to severe, life-threatening infections. GAS expresses a wide range of virulence factors, including the M family proteins. The M proteins allow the bacteria to evade parts of the human immune defenses by triggering the formation of a dense coat of plasma proteins surrounding the bacteria, including IgGs. However, the molecular level details of the M1-IgG interaction have remained unclear. Here, we characterized the structure and dynamics of this interaction interface in human plasma on the surface of live bacteria using integrative structural biology, combining cross-linking mass spectrometry and molecular dynamics (MD) simulations. We show that the primary interaction is formed between the S-domain of M1 and the conserved IgG Fc-domain. In addition, we show evidence for a so far uncharacterized interaction between the A-domain and the IgG Fc-domain. Both these interactions mimic the protein G-IgG interface of group C and G streptococcus. These findings underline a conserved scavenging mechanism used by GAS surface proteins that block the IgG-receptor (FcγR) to inhibit phagocytic killing. We additionally show that we can capture Fab-bound IgGs in a complex background and identify XLs between the constant region of the Fab-domain and certain regions of the M1 protein engaged in the Fab-mediated binding. Our results elucidate the M1-IgG interaction network involved in inhibition of phagocytosis and reveal important M1 peptides that can be further investigated as future vaccine targets.