2022
DOI: 10.1101/2022.07.26.501561
Preprint

Robust Harmonization of Microbiome Studies by Phylogenetic Scaffolding with MaLiAmPi

Abstract: Microbiome science is difficult to translate back to patients due to an inability to harmonize 16S rRNA gene-based microbiome data, as differences in the technique will result in different amplicon sequence variants (ASV) from the same microbe. Here we demonstrate that placement of ASV onto a common phylogenetic tree of full-length 16S rRNA alleles can harmonize microbiome studies. Using in silico data approximating 100 healthy human stool microbiomes we demonstrated that phylogenetic placement of ASV can reca…

Cited by 9 publications (11 citation statements)
References 31 publications
“…Thus, we first focused on harmonizing the microbiome data from the nine studies that comprised our training set into a common set of features that were not reliant upon taxonomy, but instead based on phylogenetic placement of the ASVs onto a common de novo maximum likelihood phylogenetic tree comprised of full-length 16S rRNA alleles. This approach is fully described and validated elsewhere, and was implemented as a Nextflow-based workflow called MaLiAmPi 39. After processing with MaLiAmPi, we were able to overcome most of the technique-based noise and successfully harmonize the data into one cohesive feature set.…”
Section: Results (mentioning)
confidence: 99%
“…We applied MaLiAmPi 39 to both training and test data to process and aggregate the datasets. Standardized processed data format facilitates running Docker containers, as we had participants use in our Challenge, and choosing feature sets for permutation.…”
Section: Methods (mentioning)
confidence: 99%
“…Stabl was also tested on a dataset where previous models did not perform as well (AUROC < 0.7). The Microbiome Preterm Birth DREAM challenge aimed to classify pre-term (PT) and term (T) labor pregnancies using nine publicly available vaginal microbiome (phylotypic and taxonomic) datasets 48,49. The top 20 models submitted by 318 participating analysis teams achieved AUROC scores between 0.59 and 0.69 for the task of predicting PT delivery.…”
Section: Article (mentioning)
confidence: 99%
“…The MaLiAmPi pipeline was used to process all the data 48,49. Essentially, DADA2 was used to assemble each project's raw reads into approximate sequence variants (ASVs).…”
Section: Dream Challenge Dataset (mentioning)
confidence: 99%
“…Sequence variants were generated using DADA2 (ver 1.12.0) 35, and then phylogenetically mapped to a custom reference set generated from full length 16S rRNA gene sequences from Ribosomal Database Project (Taxonomy 16) 36, previously validated 37. Phylotypes were formed by grouping amplicon sequence variants into clusters based on phylogenetic distance 38.…”
Section: Microbiome (mentioning)
confidence: 99%
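
The last citation statement describes forming phylotypes by grouping ASVs into clusters based on phylogenetic distance. The sketch below illustrates that general idea only. It is not MaLiAmPi's implementation: the function name `cluster_by_distance`, the single-linkage rule, the threshold of 0.1, and the ASV names and distances are all assumptions made for illustration, and the source does not state which linkage or cutoff was used.

```python
from itertools import combinations


def cluster_by_distance(names, dist, threshold):
    """Group items whose pairwise distance is <= threshold (single linkage).

    names: list of ASV identifiers (hypothetical).
    dist: dict mapping frozenset({a, b}) to a phylogenetic distance.
    Returns a sorted list of sorted groups ("phylotypes").
    """
    # Union-find: every ASV starts as its own group.
    parent = {n: n for n in names}

    def find(x):
        # Walk to the group root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of any two ASVs that sit close together on the tree.
    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical example: two ASVs from different 16S regions of the same
# microbe land close together after placement; a third ASV is far away.
names = ["asv_v4_a", "asv_v1v3_a", "asv_v4_b"]
dist = {
    frozenset(("asv_v4_a", "asv_v1v3_a")): 0.02,
    frozenset(("asv_v4_a", "asv_v4_b")): 0.40,
    frozenset(("asv_v1v3_a", "asv_v4_b")): 0.38,
}
phylotypes = cluster_by_distance(names, dist, threshold=0.1)
print(phylotypes)
```

In this toy input the two ASVs from the same microbe fall into one group even though they come from different amplicon regions, which is the harmonization effect the abstract describes. In practice the distances would come from placements on the shared reference tree rather than from a hand-written table.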