Complex microbial communities shape the dynamics of various environments, ranging from the mammalian gastrointestinal tract to the soil. Advances in DNA sequencing technologies and data analysis have provided drastic improvements in microbiome analyses, for example, in taxonomic resolution, false discovery rate control and other properties, over earlier methods. In this Review, we discuss the best practices for performing a microbiome study, including experimental design, choice of molecular analysis technology, methods for data analysis and the integration of multiple omics data sets. We focus on recent findings that suggest that operational taxonomic unit-based analyses should be replaced with new methods that are based on exact sequence variants, methods for integrating metagenomic and metabolomic data, and issues surrounding compositional data analysis, where advances have been particularly rapid. We note that although some of these approaches are new, it is important to keep sight of the classic issues that arise during experimental design and relate to research reproducibility. We describe how keeping these issues in mind allows researchers to obtain more insight from their microbiome data sets.
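As a concrete illustration of the compositional data issue mentioned above, the following minimal sketch applies a centered log-ratio (CLR) transform to one sample of taxon counts. The CLR is one common choice and is not necessarily the method the Review recommends; the function name `clr`, the pseudocount of 0.5 and the example counts are assumptions made for illustration.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an illustrative assumption used to avoid log(0);
    other zero-handling strategies exist.
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    # Convert to proportions, then take logs.
    logs = [math.log(x / total) for x in shifted]
    # Subtract the mean log (the log of the geometric mean).
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical counts for four taxa in a single sample.
sample = [120, 30, 0, 850]
transformed = clr(sample)
```

CLR values sum to zero within a sample and, with no pseudocount, do not change when all counts are multiplied by a constant, which is why the transform is insensitive to sequencing depth.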
The ongoing COVID-19 pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has resulted in more than 28,000,000 infections and 900,000 deaths worldwide to date. Antibody development efforts mainly revolve around the extensively glycosylated SARS-CoV-2 spike (S) protein, which mediates host cell entry by binding to the angiotensin-converting enzyme 2 (ACE2). Similar to many other viral fusion proteins, the SARS-CoV-2 spike utilizes a glycan shield to thwart the host immune response. Here, we built a full-length model of the glycosylated SARS-CoV-2 S protein, both in the open and closed states, augmenting the available structural and biological data. Multiple microsecond-long, all-atom molecular dynamics simulations were used to provide an atomistic perspective on the roles of glycans and on the protein structure and dynamics. We reveal an essential structural role of N-glycans at sites N165 and N234 in modulating the conformational dynamics of the spike’s receptor binding domain (RBD), which is responsible for ACE2 recognition. This finding is corroborated by biolayer interferometry experiments, which show that deletion of these glycans through N165A and N234A mutations significantly reduces binding to ACE2 as a result of the RBD conformational shift toward the “down” state. Additionally, end-to-end accessibility analyses outline a complete overview of the vulnerabilities of the glycan shield of the SARS-CoV-2 S protein, which may be exploited in the therapeutic efforts targeting this molecular machine. Overall, this work presents hitherto unseen functional and structural insights into the SARS-CoV-2 S protein and its glycan coat, providing a strategy to control the conformational plasticity of the RBD that could be harnessed for vaccine development.
Highlights
• Mice harboring human ASD, but not TD, microbiomes exhibit ASD-like behaviors
• ASD and TD microbiota produce differential metabolome profiles in mice
• Extensive alternative splicing of risk genes in brains of mice with ASD microbiota
• BTBR mice treated with 5AV or taurine improved repetitive and social behaviors
The ongoing COVID-19 pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has resulted in more than 7,000,000 infections and 400,000 deaths worldwide to date. Antibody development efforts mainly revolve around the extensively glycosylated SARS-CoV-2 spike (S) protein, which mediates host cell entry by binding to the angiotensin-converting enzyme 2 (ACE2). In the context of vaccine design, similar to many other viruses, the SARS-CoV-2 spike utilizes a glycan shield to thwart the host immune response. Here, we built a full-length model of glycosylated SARS-CoV-2 S protein, both in the open and closed states, augmenting the available structural and biological data. Multiple microsecond-long, all-atom molecular dynamics simulations were used to provide an atomistic perspective on the glycan shield and the protein structure, stability, and dynamics. End-to-end accessibility analyses outline a complete overview of the vulnerabilities of the glycan shield of SARS-CoV-2 S protein, which can be harnessed for vaccine development. In addition, a dynamic analysis of the main antibody epitopes is provided. Finally, beyond shielding, a possible structural role of N-glycans at N165 and N234 is hypothesized to modulate and stabilize the conformational dynamics of the spike's receptor binding domain, which is responsible for ACE2 recognition. Overall, this work presents hitherto unseen functional and structural insights into the SARS-CoV-2 S protein and its glycan coat, which may be exploited by therapeutic efforts targeting this essential molecular machine.

Glycan Shield of the Receptor Binding Domain

As discussed in the previous section, the glycan shield plays a critical role in hiding the S protein surface from molecular recognition. However, to function effectively, the spike needs to recognize and bind to ACE2 receptors as the primary host cell infection route. For this reason, the receptor binding motif (RBM) must become fully exposed and accessible [48]. In this scenario, the glycan shield works in concert with a large conformational change that allows the RBD to emerge above the N-glycan coverage. Here, we quantify the accessible surface area (ASA) of the RBM within RBD-A, corresponding to the RBD/ACE2-interacting region (residues 400-508), at various probe radii in both the Open and Closed systems (Figures 3A and 3D, full data in Tables S4-S6). As expected, the ASA plots show a significant difference between the "down" (Closed) and "up" (Open) RBD conformations, with the RBM area covered by glycans being remarkably larger in the former. When RBD-A is in the "up" conformation, its RBM shows an average (across all radii) of only ~9% surface area covered by glycans, compared with ~35% in the Closed system (Figures 3A and 3D). This difference is further amplified when considering a larger probe radius of 15 Å, with a maximum of 11% and 46% for Open and Closed, respectively. Interestingly, for smaller probes (1.4-3 Å) the shielding becomes weak in both systems, with an average of 6% and 16% for Open and Closed, respectively. Note that the RBD regio...
The rapid increase in the number of proteins in sequence databases and the diversity of their functions challenge computational approaches for automated function prediction. Here, we introduce DeepFRI, a Graph Convolutional Network for predicting protein functions by leveraging sequence features extracted from a protein language model and protein structures. It outperforms current leading methods and sequence-based Convolutional Neural Networks and scales to the size of current sequence repositories. Augmenting the training set of experimental structures with homology models allows us to significantly expand the number of predictable functions. DeepFRI has significant de-noising capability, with only a minor drop in performance when experimental structures are replaced by protein models. Class activation mapping allows function predictions at an unprecedented resolution, allowing site-specific annotations at the residue-level in an automated manner. We show the utility and high performance of our method by annotating structures from the PDB and SWISS-MODEL, making several new confident function predictions. DeepFRI is available as a webserver at https://beta.deepfri.flatironinstitute.org/.
Recent massive increases in the number of sequences available in public databases challenge current experimental approaches to determining protein function. These methods are limited by both the large scale of these sequence databases and the diversity of protein functions. We present a deep learning Graph Convolutional Network (GCN) trained on sequence and structural data and evaluate it on ~40k proteins with known structures and functions from the Protein Data Bank (PDB). Our GCN predicts functions more accurately than Convolutional Neural Networks trained on sequence data alone and competing methods. Feature extraction via a language model removes the need for constructing multiple sequence alignments or feature engineering. Our model learns general structure-function relationships by robustly predicting functions of proteins with ≤ 30% sequence identity to the training set. Using class activation mapping, we can automatically identify structural regions at the residue level that lead to each function prediction for every protein confidently predicted, advancing site-specific function prediction. De-noising inherent in the trained model allows only a minor drop in performance when structure predictions are used, including multiple de novo protocols. We use our method to annotate all proteins in the PDB, making several new confident function predictions spanning both fold and function trees.
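To make the idea of a graph convolution over a protein structure more tangible, here is a minimal sketch of a single graph convolutional layer in the widely used symmetric-normalization form, ReLU(D^-1/2 (A + I) D^-1/2 H W). This is a generic formulation and is not claimed to be the layer used in the model described above. Treating a residue contact map as the adjacency matrix, and all names and numbers in the example (`gcn_layer`, `contacts`, `features`, `weights`), are assumptions for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weights)
    # ReLU non-linearity.
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical 3-residue chain: residues 0-1 and 1-2 are in contact.
contacts = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
# Hypothetical 2-dimensional per-residue features (stand-ins for learned embeddings).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity weights keep the example easy to follow.
weights = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(contacts, features, weights)
```

Each output row mixes a residue's own features with those of its contacting neighbors, which is how structural context can enter a per-residue representation.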
Lifestyle factors, such as diet, strongly influence the structure, diversity, and composition of the microbiome. While we have witnessed over the last several years a resurgence of interest in fermented foods, no study has specifically explored the effects of their consumption on gut microbiota in large cohorts. To assess whether the consumption of fermented foods is associated with a systematic signal in the gut microbiome and metabolome, we used a multi-omic approach (16S rRNA amplicon sequencing, metagenomic sequencing, and untargeted mass spectrometry) to analyze stool samples from 6,811 individuals from the American Gut Project, including 115 individuals specifically recruited for their frequency of fermented food consumption for a targeted 4-week longitudinal study. We observed subtle but statistically significant differences between consumers and nonconsumers in beta diversity as well as differential taxa between the two groups. We found that the metabolome of fermented food consumers was enriched with conjugated linoleic acid (CLA), a putatively health-promoting molecule. Cross-omic analyses between metagenomic sequencing and mass spectrometry suggest that CLA may be driven by taxa associated with fermented food consumers. Collectively, we found modest yet persistent signatures associated with fermented food consumption that appear present in multiple omic types, which motivates further investigation of how different types of fermented food impact the gut microbiome and overall health.

IMPORTANCE Public interest in the effects of fermented food on the human gut microbiome is high, but few studies have explored the association between fermented food consumption and the gut microbiome in large cohorts. Here, we used a combination of omics-based analyses to study the relationship between the microbiome and fermented food consumption in thousands of people using both cross-sectional and longitudinal data. We found that fermented food consumers have subtle differences in their gut microbiota structure, which is enriched in conjugated linoleic acid, thought to be beneficial. The results suggest that further studies of specific kinds of fermented food and their impacts on the microbiome and health will be useful.
To advance the mission of in silico cell biology, modeling the interactions of large and complex biological systems becomes increasingly relevant. The combination of molecular dynamics (MD) simulations and Markov state models (MSMs) has enabled the construction of simplified models of molecular kinetics on long timescales. Despite its success, this approach is inherently limited by the size of the molecular system. With increasing size of macromolecular complexes, the number of independent or weakly coupled subsystems increases, and the number of global system states increases exponentially, making the sampling of all distinct global states unfeasible. In this work, we present a technique called independent Markov decomposition (IMD) that leverages weak coupling between subsystems to compute a global kinetic model without requiring the sampling of all combinatorial states of subsystems. We give a theoretical basis for IMD and propose an approach for finding and validating such a decomposition. Using empirical few-state MSMs of ion channel models that are well established in electrophysiology, we demonstrate that IMD models can reproduce experimental conductance measurements with a major reduction in sampling compared with a standard MSM approach. We further show how to find the optimal partition of all-atom protein simulations into weakly coupled subunits.
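The benefit of decomposing a system into weakly coupled parts can be illustrated in the idealized limit of fully independent subsystems, where the global transition matrix is the Kronecker product of the subsystem transition matrices. The sketch below shows only this limiting case and is not the authors' implementation; the helper `kron` and the two 2-state transition matrices are made-up illustrative values.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]

# Hypothetical row-stochastic transition matrices of two independent 2-state subsystems.
t_sub1 = [[0.9, 0.1], [0.2, 0.8]]
t_sub2 = [[0.7, 0.3], [0.4, 0.6]]

# Global model over the 2 x 2 = 4 joint states, built without sampling them jointly.
t_global = kron(t_sub1, t_sub2)
```

With k independent two-state subsystems the global matrix has 2^k states, yet only k small matrices need to be estimated, which is the source of the sampling savings in this idealized setting.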