2020
DOI: 10.1101/2020.05.09.085993
Preprint

Meta-Signer: Metagenomic Signature Identifier based on Rank Aggregation of Features

Abstract: Background: The advance of metagenomic studies provides the opportunity to identify microbial taxa that are associated with human diseases. Multiple methods exist for the association analysis. However, the results could be inconsistent, presenting challenges in interpreting the host-microbiome interactions. To address this issue, we introduce Meta-Signer, a novel Metagenomic Signature Identifier tool based on rank aggregation of features identified from multiple machine learning models including Random Forest, S…
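The rank-aggregation idea in the abstract can be illustrated with a minimal sketch. This is not Meta-Signer's actual algorithm, which the truncated abstract does not specify; it swaps in a simple Borda-style mean-position aggregation as an assumption. The function name `aggregate_ranks`, the model labels, and the example rankings are all hypothetical; only the taxon names are borrowed from the citation statements on this page.

```python
# Illustrative Borda-style rank aggregation (an assumption, not Meta-Signer's method).
# Each model contributes a ranked list of features, most important first.

def aggregate_ranks(rankings):
    """Order features by their mean position across several ranked lists.

    A feature missing from a list is assigned that list's length,
    i.e. one position worse than its last-ranked feature.
    Ties are broken alphabetically so the output is deterministic.
    """
    features = sorted({f for ranking in rankings for f in ranking})
    mean_position = {}
    for feature in features:
        total = 0
        for ranking in rankings:
            total += ranking.index(feature) if feature in ranking else len(ranking)
        mean_position[feature] = total / len(rankings)
    return sorted(features, key=lambda f: (mean_position[f], f))


# Hypothetical rankings from three unnamed models.
model_a = ["Bacteroides", "Sutterella", "Lactococcus"]
model_b = ["Sutterella", "Bacteroides", "Acinetobacter"]
model_c = ["Sutterella", "Lactococcus", "Bacteroides"]

consensus = aggregate_ranks([model_a, model_b, model_c])
print(consensus)
```

A consensus ranking of this kind can damp the disagreement between individual models that the abstract describes, since a feature must rank well in several lists to reach the top of the aggregate.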

Citations: cited by 4 publications (17 citation statements)
References: 55 publications (63 reference statements)
“…Meta-Signer, a tool that produces a ranked list of the taxa based on machine-learning methods, was used to identify the most discriminative taxa between the 2 groups [13,14]. The top 10 microbes identified by Meta-Signer could discriminate between the G15/19 and non-G15/19 groups, as supported by PERMANOVA P = 0.006 (Figure 2A), where Anaeroplasma, Candidatus arthromitus, Sutterella, and the Lachnospiraceae family showed higher average abundance at G15/G19 and Bacteroides, Corynebacterium, Acinetobacter, and Lactococcus showed a lower average abundance (Figure 2B). In addition, these taxa were found with a high level of co-occurrence based on Spearman's rank correlation coefficients (P < 0.05) (Figure 2C) that was consistent across mouse strains.…”
Section: Strain-specific Gm Composition Changes and Phenotypic Associ... (mentioning)
confidence: 79%
“…The data for the PRISM and external IBD datasets as formatted for use with Meta-Signer can be found at Zenodo: derekreiman/Meta-Signer: Original Release. http://doi.org/10.5281/zenodo.4077403 [36].…”
Section: Data Availability (mentioning)
confidence: 99%
“…[1,2] Machine learning (ML) and deep learning (DL) methods facilitate metagenomics-based disease prediction and the discovery of consistent, replicable, and cross-cohort microbial biomarkers [3,4,5,6,7,8,9]. However, metagenomic data of individual clinical investigations are typical of low sample sizes (dozens-to-hundreds of samples) [3,4,10], high dimensionality (hundreds-to-thousands of microbes) [3,4,10], sparsity (sparsely distributed across taxonomic hierarchies), and high variations (biological and environmental) [11]. These problems confound statistical inference and learning outcomes to random chances and false discoveries [12] and mask the identification of genuine biomarkers.…”
Section: Introduction (mentioning)
confidence: 99%
“…[12,13] DL outcomes are difficult to interpret, particularly in microbiome-wide association studies [9,14]. Instead of the end-to-end DL methods, ML methods with a feature selection strategy have been practically used for metagenomic investigations of low sample sizes [3,4,15]. For example, “Meta-Signer” ranks the microbial features based on the aggregation of identified features from multiple ML models [9], while the novel “predomics” tool employs a genetic algorithm to find the best number of features for simple condition models, leading to better accuracy and interpretability than the previous state-of-the-art (SOTA) ML models using fewer features.…”
Section: Introduction (mentioning)
confidence: 99%