Background: A major bottleneck in the use of metagenome sequencing for human gut microbiome studies has been the lack of a comprehensive genome collection to be used as a reference database. Several recent efforts have been made to reconstruct genomes from human gut metagenome data, resulting in a huge increase in the number of relevant genomes. In this work, we aimed to create a collection of the most prevalent healthy human gut prokaryotic genomes, to be used as a reference database, including both metagenome-assembled genomes (MAGs) from the human gut and ordinary RefSeq genomes. Results: We screened > 5,700 healthy human gut metagenomes for the containment of > 490,000 publicly available prokaryotic genomes sourced from RefSeq and the recently announced UHGG collection. This resulted in a pool of > 381,000 genomes that were subsequently scored and ranked based on their prevalence in the healthy human metagenomes. The genomes were then clustered at a 97.5% sequence identity resolution, and cluster representatives (30,691 in total) were retained to comprise the HumGut collection. Using the Kraken2 software for classification, we find superior performance in the assignment of metagenomic reads, classifying on average 94.5% of the reads in a metagenome, as opposed to 86% with UHGG and 44% when using the standard Kraken2 database. A coarser HumGut collection, consisting of genomes dereplicated at 95% sequence identity (similar to UHGG), classified 88.25% of the reads. HumGut, half the size of the standard Kraken2 database and directly comparable to the UHGG size, outperforms them both. Conclusions: The HumGut collection contains > 30,000 genomes clustered at a 97.5% sequence identity resolution and ranked by human gut prevalence. We demonstrate that metagenomes from IBD patients map equally well to this collection, indicating that this reference is also relevant for studies well outside the metagenome reference set used to obtain HumGut.
All data and metadata, as well as helpful code, are available at http://arken.nmbu.no/~larssn/humgut/.
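The rank-then-cluster step described in the abstract above can be illustrated with a small Python sketch. The abstract does not state the clustering algorithm, so the greedy scheme here is an assumption: genomes are visited in order of prevalence and each one either joins the first existing representative it matches at the identity threshold or becomes a new representative. The function names, the toy containment table, and the `toy_identity` lookup are all hypothetical stand-ins for real containment screening and genome-to-genome identity estimates.

```python
from typing import Callable, Dict, List

def rank_by_prevalence(containment: Dict[str, List[bool]]) -> List[str]:
    """Return genome IDs found in at least one metagenome, most prevalent first.

    containment[g][i] is True when genome g was judged to be contained in
    metagenome i (how containment is decided is outside this sketch).
    """
    def prevalence(gid: str) -> float:
        hits = containment[gid]
        return sum(hits) / len(hits)

    found = [g for g, hits in containment.items() if any(hits)]
    # Ties are broken by genome ID so the ordering is deterministic.
    return sorted(found, key=lambda g: (-prevalence(g), g))

def greedy_cluster(
    ranked: List[str],
    identity: Callable[[str, str], float],
    threshold: float = 0.975,
) -> Dict[str, List[str]]:
    """Greedy clustering: the highest-ranked unassigned genome becomes a representative."""
    clusters: Dict[str, List[str]] = {}
    for gid in ranked:
        for rep in clusters:
            if identity(gid, rep) >= threshold:
                clusters[rep].append(gid)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[gid] = [gid]
    return clusters

# Hypothetical toy data: four genomes screened against four metagenomes.
containment = {
    "A": [True, True, True, False],
    "B": [True, True, False, False],
    "C": [True, False, False, False],
    "D": [False, False, False, False],  # never observed, so it is dropped
}

# Hypothetical pairwise identities (symmetric lookup).
_pairs = {
    frozenset(("A", "B")): 0.98,
    frozenset(("A", "C")): 0.96,
    frozenset(("B", "C")): 0.96,
}

def toy_identity(x: str, y: str) -> float:
    return _pairs[frozenset((x, y))]

ranked = rank_by_prevalence(containment)
print(greedy_cluster(ranked, toy_identity, threshold=0.975))
print(greedy_cluster(ranked, toy_identity, threshold=0.95))
```

With this toy input, the 97.5% threshold keeps genome C as its own representative while the 95% threshold merges it into the cluster led by A, which mirrors why the finer resolution retains more representatives than the coarser one.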
A major challenge with human gut microbiome studies is the lack of a publicly accessible human gut genome collection that is verifiably complete. We aimed to create HumGut, a comprehensive collection of healthy human gut prokaryotic genomes, to be used as a reference for worldwide human gut microbiome studies. We screened >2,300 healthy human gut metagenomes for the containment of >486,000 publicly available prokaryotic genomes. The contained genomes were then scored, ranked, and clustered based on their sequence identity, and only representative genomes per cluster were kept, thus resulting in the creation of HumGut. Superior performance in the taxonomic assignment of metagenomic reads, classifying 97% of reads on average, is a benchmark advantage of HumGut. Re-analyses of healthy gut samples using HumGut revealed that >90% contained a core set of 129 bacterial species and that, on average, the guts of healthy people contain around 1,000 bacterial species. The HumGut collection will continuously be updated as the list of publicly available genomes and metagenomes expands. Our approach can also be extended to disease-associated genomes and metagenomes, in addition to other species. The comprehensive, yet slim HumGut database streamlines analyses while significantly improving taxonomic assignments in a field in dire need of method standardization and effectiveness.
Butyrate and propionate represent two of three main short-chain fatty acids produced by the intestinal microbiota. In healthy populations, their levels are reportedly equimolar, whereas a deviation in their ratio has been observed in various diseased cohorts. Monitoring such a ratio represents a valuable metric; however, it remains a challenge to adopt short-chain fatty acid detection techniques in clinical settings because of the volatile nature of these acids. Here we aimed to estimate short-chain fatty acid information indirectly through a novel, simple quantitative PCR-compatible assay (liquid array diagnostics) targeting a limited number of microbiome 16S markers. Utilizing 15 liquid array diagnostics probes to target microbiome markers selected by a model that combines partial least squares and linear discriminant analysis, the classes (normal vs high propionate-to-butyrate ratio) separated at a threshold of 2.6 with a prediction accuracy of 96%.
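As a rough illustration of the two-class setup in the abstract above, the Python sketch below labels samples by their propionate-to-butyrate ratio and computes a prediction accuracy. It assumes that the threshold of 2.6 applies to the ratio itself and that values strictly above it count as "high"; the abstract does not spell out either point. It does not reproduce the partial least squares and linear discriminant analysis model or the probe signals, and all function names and sample values are hypothetical.

```python
from typing import List

def classify_ratio(propionate: float, butyrate: float, threshold: float = 2.6) -> str:
    """Label a sample 'high' when propionate/butyrate exceeds the threshold, else 'normal'."""
    if butyrate <= 0:
        raise ValueError("butyrate concentration must be positive")
    return "high" if propionate / butyrate > threshold else "normal"

def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of samples whose predicted class matches the reference class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)

# Hypothetical measurements as (propionate, butyrate) pairs in the same unit.
samples = [(30.0, 10.0), (10.0, 10.0), (26.0, 10.0), (40.0, 8.0)]
reference = [classify_ratio(p, b) for p, b in samples]

# Hypothetical class calls from a marker-based model, compared with the reference.
model_calls = ["high", "normal", "high", "high"]
print(reference, accuracy(model_calls, reference))
```

In a marker-based assay, the reference labels would come from measured short-chain fatty acid levels and the predicted labels from the model, so the accuracy function compares the two independently derived lists.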
We present a novel liquid array diagnostics (LAD) method, which enables rapid and inexpensive detection of microbial markers in a single-tube multiplex reaction. We evaluated LAD both on pure cultures and on infant gut microbiota for a 15-plex reaction. LAD showed more than 80% accuracy of classification and a detection limit lower than 2% of the Illumina reads per sample. The results on the clinical dataset showed a rapid decrease of staphylococci from 10-day-old to 4-month-old children, a peak of bifidobacteria at 4 months, and a peak of Bacteroides in 2-year-old children, which is in accordance with findings described in the literature. Being able to detect up to 50 biomarkers, LAD is a suitable method for assays where high throughput is essential.