Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.
In this work, we hypothesized that shifts in the food microbiome can be used as an indicator of unexpected contaminants or environmental changes. To test this hypothesis, we sequenced the total RNA of 31 high protein powder (HPP) samples of poultry meal pet food ingredients. We developed a microbiome analysis pipeline employing a key eukaryotic matrix filtering step that improved microbe detection specificity to >99.96% during in silico validation. The pipeline identified 119 microbial genera per HPP sample on average with 65 genera present in all samples. The most abundant of these were Bacteroides, Clostridium, Lactococcus, Aeromonas, and Citrobacter. We also observed shifts in the microbial community corresponding to ingredient composition differences. When comparing culture-based results for Salmonella with total RNA sequencing, we found that Salmonella growth did not correlate with multiple sequence analyses. We conclude that microbiome sequencing is useful to characterize complex food microbial communities, while additional work is required for predicting specific species’ viability from total RNA sequencing.
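The specificity figure quoted above can be read as the fraction of known non-microbial (matrix) reads that the pipeline correctly declines to call microbial. The sketch below only illustrates that standard definition, TN / (TN + FP). The function name and the read counts are hypothetical and are not taken from the study.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Return TN / (TN + FP), the standard definition of specificity."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("at least one negative-class read is required")
    return true_negatives / total


# Hypothetical in silico test set of simulated host (matrix) reads only.
matrix_reads_rejected = 999_700        # correctly not called microbial (TN)
matrix_reads_called_microbial = 300    # wrongly called microbial (FP)

spec = specificity(matrix_reads_rejected, matrix_reads_called_microbial)
print(f"specificity = {spec:.4%}")
```

A filtering step that removes matrix reads before classification would lower the false-positive count in this ratio, which is one way a specificity gain of the kind described could arise.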
Despite advances in sequencing, lack of standardization makes comparisons across studies challenging and hampers insights into the structure and function of microbial communities across multiple habitats on a planetary scale. Here we present a multi-omics analysis of a diverse set of 880 microbial community samples collected for the Earth Microbiome Project. We include amplicon (16S, 18S, ITS) and shotgun metagenomic sequence data, and untargeted metabolomics data (liquid chromatography-tandem mass spectrometry and gas chromatography mass spectrometry). We used standardized protocols and analytical methods to characterize microbial communities, focusing on relationships and co-occurrences of microbially related metabolites and microbial taxa across environments, thus allowing us to explore diversity at extraordinary scale. In addition to a reference database for metagenomic and metabolomic data, we provide a framework for incorporating additional studies, enabling the expansion of existing knowledge in the form of an evolving community resource. We demonstrate the utility of this database by testing the hypothesis that every microbe and metabolite is everywhere but the environment selects. Our results show that metabolite diversity exhibits turnover and nestedness related to both microbial communities and the environment, whereas the relative abundances of microbially related metabolites vary and co-occur with specific microbial consortia in a habitat-specific manner. We additionally show the power of certain chemistry, in particular terpenoids, in distinguishing Earth’s environments (for example, terrestrial plant surfaces and soils, freshwater and marine animal stool), as well as that of certain microbes including Conexibacter woesei (terrestrial soils), Haloquadratum walsbyi (marine deposits) and Pantoea dispersa (terrestrial plant detritus). 
This Resource provides insight into the taxa and metabolites within microbial communities from diverse habitats across Earth, informing both microbial and chemical ecology, and provides a foundation and methods for multi-omics microbiome studies of hosts and the environment.
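Turnover and nestedness, as mentioned in the abstract above, are commonly separated by partitioning pairwise Sørensen dissimilarity into a replacement component and a nestedness-resultant component. The sketch below assumes that widely used partition (Sørensen = Simpson turnover + nestedness); the abstract does not state which method the authors applied, and the metabolite identifiers and sample names are hypothetical.

```python
def partition_sorensen(sample_a: set, sample_b: set) -> tuple:
    """Split Sorensen dissimilarity into turnover and nestedness parts.

    Returns (beta_sor, beta_sim, beta_sne) where beta_sor = beta_sim + beta_sne.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must contain at least one feature")
    a = len(sample_a & sample_b)   # features shared by both samples
    b = len(sample_a - sample_b)   # features unique to sample_a
    c = len(sample_b - sample_a)   # features unique to sample_b
    beta_sor = (b + c) / (2 * a + b + c)      # total dissimilarity
    beta_sim = min(b, c) / (a + min(b, c))    # turnover (replacement)
    beta_sne = beta_sor - beta_sim            # nestedness-resultant part
    return beta_sor, beta_sim, beta_sne


# Hypothetical metabolite sets: the leaf sample is a strict subset of soil.
soil = {"m1", "m2", "m3", "m4"}
leaf = {"m1", "m2"}
nested = partition_sorensen(soil, leaf)

# Hypothetical sets with equal richness but mostly different members.
marine = {"m1", "m5", "m6"}
freshwater = {"m1", "m7", "m8"}
replaced = partition_sorensen(marine, freshwater)

print("nested pair:  ", nested)
print("replaced pair:", replaced)
```

In the first pair all dissimilarity falls into the nestedness component; in the second it falls entirely into turnover.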
Tracking the bacterial communities present in our food has the potential to inform food safety and product origin. To do so, the entire genetic material present in a sample is extracted using chemical methods or commercially available kits and sequenced using next-generation platforms to provide a snapshot of the microbial composition.
Microbes produce an array of secondary metabolites that perform diverse functions from communication to defense. These metabolites have been used to benefit human health and sustainability. In their analysis of the Genomes from Earth's Microbiomes (GEM) catalog, Nayfach and co-authors observed that, whereas genes coding for certain classes of secondary metabolites are limited or enriched in certain microbial taxa, "specific chemistry is not limited or amplified by the environment, and that most classes of secondary metabolites can be found nearly anywhere". Although metagenome mining is a powerful way to annotate biosynthetic gene clusters (BGCs), chemical evidence is required to confirm the presence of metabolites and comprehensively address this fundamental hypothesis, as metagenomic data only identify metabolic potential. To describe the Earth's metabolome, we use an integrated omics approach: the direct survey of metabolites associated with microbial communities spanning diverse environments using untargeted metabolomics coupled with metagenome analysis. We show, in contrast to Nayfach and co-authors, that the presence of certain classes of secondary metabolites can be limited or amplified by the environment. Importantly, our data indicate that considering the relative abundances of secondary metabolites (i.e., rather than only presence/absence) strengthens differences in metabolite profiles across environments, and that their richness and composition in any given sample do not directly reflect those of co-occurring microbial communities, but rather vary with the environment.
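The point about relative abundance versus presence/absence can be illustrated with two standard dissimilarity measures: Jaccard distance, which uses only presence/absence, and Bray-Curtis dissimilarity, which uses abundances. This is a minimal sketch with hypothetical metabolite classes and made-up relative abundances; it is not the authors' analysis, and the abstract does not say which metrics they used.

```python
def jaccard_distance(x: dict, y: dict) -> float:
    """Presence/absence dissimilarity: 1 - |shared| / |union|."""
    present_x = {k for k, v in x.items() if v > 0}
    present_y = {k for k, v in y.items() if v > 0}
    union = present_x | present_y
    if not union:
        raise ValueError("both profiles are empty")
    return 1 - len(present_x & present_y) / len(union)


def bray_curtis(x: dict, y: dict) -> float:
    """Abundance-based dissimilarity: sum|x - y| / sum(x + y)."""
    keys = set(x) | set(y)
    total = sum(x.get(k, 0) + y.get(k, 0) for k in keys)
    if total == 0:
        raise ValueError("both profiles are empty")
    return sum(abs(x.get(k, 0) - y.get(k, 0)) for k in keys) / total


# Hypothetical profiles: the same three classes occur in both environments,
# but their relative abundances differ strongly.
soil = {"terpenoid": 0.7, "alkaloid": 0.2, "polyketide": 0.1}
marine = {"terpenoid": 0.1, "alkaloid": 0.2, "polyketide": 0.7}

d_presence = jaccard_distance(soil, marine)
d_abundance = bray_curtis(soil, marine)
print(f"Jaccard (presence/absence): {d_presence:.2f}")
print(f"Bray-Curtis (abundance):    {d_abundance:.2f}")
```

Here the two environments look identical under presence/absence yet clearly different once abundances are considered, which is the kind of contrast the abstract describes.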
In this work, we hypothesized that shifts in the food microbiome can be used as an indicator of unexpected contaminants or environmental changes. To test this hypothesis, we sequenced total RNA of 31 high protein powder (HPP) samples of poultry meal pet food ingredients. We developed a microbiome analysis pipeline employing a key eukaryotic matrix filtering step that improved microbe detection specificity to >99.96% during in silico validation. The pipeline identified 119 microbial genera per HPP sample on average with 65 genera present in all samples. The most abundant of these were Bacteroides, Clostridium, Lactococcus, Aeromonas, and Citrobacter. We also observed shifts in the microbial community corresponding to ingredient composition differences. When comparing culture-based results for Salmonella with total RNA sequencing, we found that Salmonella growth did not correlate with multiple sequence analyses. We conclude that microbiome sequencing is useful to characterize complex food microbial communities, while additional work is required for predicting specific species' viability from total RNA sequencing. KEYWORDS: microbiome, food safety, bioinformatics, shotgun sequencing, microbial ecology, pathogens
SARS-CoV-2 genomic sequencing efforts have scaled dramatically to address the current global pandemic and aid public health. In this work, we analyzed a corpus of 66,000 SARS-CoV-2 genome sequences. We developed a novel semi-supervised pipeline for automated gene, protein, and functional domain annotation of SARS-CoV-2 genomes that differentiates itself by not relying on use of a single reference genome and by overcoming atypical genome traits. Using this method, we identified the comprehensive set of known proteins with 98.5% set membership accuracy and 99.1% accuracy in length prediction compared to proteome references including Replicase polyprotein 1ab (with its transcriptional slippage site). Compared to other published tools such as Prokka (base) and VAPiD, our method yielded 6.4- and 1.8-fold increases in protein annotations, respectively. Our method generated 13,000,000 molecular target sequences, some conserved across time and geography and others representing emerging variants. We observed 3,362 non-redundant sequences per protein on average within this corpus and describe key D614G and N501Y variants spatiotemporally. For spike glycoprotein domains, we achieved greater than 97.9% sequence identity to references and characterized Receptor Binding Domain variants. Here, we comprehensively present the molecular targets to refine biomedical interventions for SARS-CoV-2 with a scalable high-accuracy method to analyze newly sequenced infections.