MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data

Plaza Oñate, Florian; Le Chatelier, Emmanuelle; Almeida, Mathieu; Cervino, Alessandra; Gauthier, Franck; Magoulès, Frédéric; Ehrlich, S. Dusko; Pichaud, Matthieu

doi:10.1093/bioinformatics/bty830

Cited by 96 publications

(90 citation statements)

References 43 publications

“…Divisive and threshold-free agglomerative approaches achieve finer taxonomic resolutions than the threshold-based similarity approach. Using WGS in the ecosystems where a bacterial gene catalog is available, such as the human gut or the pig gut (Xiao et al, 2016), the standard approach consists in mapping the reads against the catalog and then clustering the bacterial genes based on their abundance profiles to produce metagenomic species (MGS) (Nielsen et al, 2014) or clusters of coabundant genes to reconstruct microbial pan-genomes (MSP) (Plaza Oñate et al, 2018). We will refer to taxa, noting that the term can designate OTUs, ASVs, oligotypes, MGSs, MSPs and generally any feature found in abundance tables (obtained by counting the number of copies of each feature in each sample).…”

Section: Introduction (mentioning)

confidence: 99%

Incorporating Phylogenetic Information in Microbiome Differential Abundance Studies Has No Effect on Detection Power and FDR Control

et al. 2020

We consider the problem of incorporating evolutionary information (e.g., taxonomic or phylogenetic trees) in the context of metagenomics differential analysis. Recent results published in the literature propose different ways to leverage the tree structure to increase the detection rate of differentially abundant taxa. Here, we propose instead to use a different hierarchical structure, in the form of a correlation-based tree, as it may capture the structure of the data better than the phylogeny. We first show that the correlation tree and the phylogeny are significantly different before turning to the impact of tree choice on detection rates. Using synthetic data, we show that the tree does have an impact: smoothing p-values according to the phylogeny leads to rates equal or inferior to those of smoothing according to the correlation tree. However, both trees are outperformed by the classical, non-hierarchical, Benjamini-Hochberg (BH) procedure in terms of detection rates. Other procedures may use the hierarchical structure with profit but do not control the False Discovery Rate (FDR) a priori and remain inferior to a classical Benjamini-Hochberg procedure with the same nominal FDR. On real datasets, no hierarchical procedure had a significantly higher detection rate than BH. Intuition advocates that the use of hierarchical structures should increase the detection rate of differentially abundant taxa in microbiome studies. However, our results suggest that current hierarchical procedures are still inferior to standard methods and more effective procedures remain to be invented.
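The abstract above uses the classical Benjamini-Hochberg (BH) procedure as its baseline. As a rough illustration only, the textbook step-up rule can be sketched in plain Python as follows. The function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch; they are not taken from the cited paper, and this is not the authors' code.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Textbook BH step-up rule (illustrative sketch, hypothetical helper).

    Returns a list of booleans, True where the corresponding hypothesis
    is rejected at nominal FDR level `alpha`.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p-values, one per taxon, purely for illustration.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is step-up, a taxon whose p-value exceeds its own rank threshold is still rejected when a larger p-value further down the sorted list meets its threshold. In the made-up example only the two smallest p-values are rejected.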

“…Next, whole-metagenomic sequencing was performed on 138 individuals (102 IBS patients and 36 healthy subjects) (Table 1). Metagenomic reads (with an average of 14 million reads per sample) were mapped onto a catalog of Metagenomic Species Pangenomes (MSPs) [27], yielding a total of 1,661 MSPs. On the basis of per-individual genetic content, 166 of them were further divided into 523 subspecies, corresponding to a mean of 75.3% of the metagenome read mass.…”

Section: Results (mentioning)

confidence: 99%

“…Metagenomics species pangenomes (MSPs) are co-abundant gene groups that can be considered part of complete microbial species pangenomes. MSP gene content was extracted from a previous publication by Plaza-Onate et al. [27]. MSP gene content was subdivided into core and accessory genes.…”

Section: Methods (mentioning)

confidence: 99%

Gut microbiome, diet and symptom interactions in irritable bowel syndrome

Tap, Störsrud, Nevé et al. 2020

Preprint

While several studies have documented associations between dietary habits and microbiota composition and function in healthy subjects, no study has explored these associations in patients with irritable bowel syndrome (IBS), especially in relation to symptoms. Here, we used a novel approach that combined data from a 4-day food diary, integrated into a food tree, with gut microbiota (shotgun metagenomic) data for IBS patients (N=149) and healthy subjects (N=52). Paired microbiota and food-based trees allowed the detection of new associations between subspecies and diet. Combining co-inertia analysis and linear regression models, exhaled gas levels and symptom severity could be predicted from metagenomic and dietary data. IBS patients with severe symptoms had a diet enriched in food items of poorer quality and a high abundance of gut microbial enzymes involved in hydrogen metabolism, in correlation with animal carbohydrate (mucin/meat-derived) metabolism. Our study provides unprecedented resolution of diet-microbiota-symptom interactions and ultimately paves the way for personalized nutritional recommendations.

“…Finally, the Zeller MSP data originates from the same study as the Zeller data (Zeller et al, 2014). It was created from the shotgun data by reconstructing a Metagenomics Species Pan-genomes (MSPs) abundance count table, as reported in Plaza Oñate et al (2018). Briefly, reads were quality-filtered and unique reads were mapped against the 9.9 million Integrated Gene Catalog (Li et al, 2014) using BBmap (Bushnell, 2014).…”

Section: Methods (mentioning)

confidence: 99%

“…Divisive and threshold-free agglomerative approaches achieve finer taxonomic resolutions than the threshold-based similarity approach. Using WGS in the ecosystems where a bacterial gene catalog is available, such as the human gut (Li et al, 2014) or the pig gut (Xiao et al, 2016), the standard approach consists in mapping the reads against the catalog and then clustering the bacterial genes based on their abundance profiles to produce metagenomic species (MGS) (Nielsen et al, 2014) or clusters of co-abundant genes to reconstruct microbial pan-genomes (MSP) (Plaza Oñate et al, 2018). We will refer to taxa, noting that the term can designate OTUs, ASVs, oligotypes, MGSs, MSPs and generally any feature found in abundance tables.…”

Section: Introduction (mentioning)

confidence: 99%

Incorporating phylogenetic information in microbiome abundance studies has no effect on detection power and FDR control

Bichat, Ambroise, Plassais et al. 2020

Preprint

We consider the problem of incorporating evolutionary information (e.g. taxonomic or phylogenetic trees) in the context of metagenomics differential analysis. Recent results published in the literature propose different ways to leverage the tree structure to increase the detection rate of differentially abundant taxa. Here, we propose instead to use a different hierarchical structure, in the form of a correlation-based tree, as it may capture the structure of the data better than the phylogeny. We first show that the correlation tree and the phylogeny are significantly different before turning to the impact of tree choice on detection rates. Using synthetic data, we show that the tree does have an impact: smoothing p-values according to the phylogeny leads to rates equal or inferior to those of smoothing according to the correlation tree. However, both trees are outperformed by the classical, non-hierarchical, Benjamini-Hochberg (BH) procedure in terms of detection rates. Other procedures may use the hierarchical structure with profit but do not control the False Discovery Rate (FDR) a priori and remain inferior to a classical Benjamini-Hochberg procedure with the same nominal FDR. On real datasets, no hierarchical procedure had a significantly higher detection rate than BH. Although intuition advocates the use of a hierarchical structure, be it the phylogeny or the correlation tree, to increase the detection rate in microbiome studies, current hierarchical procedures are still inferior to non-hierarchical ones and effective procedures remain to be invented.
