2020
DOI: 10.21203/rs.3.rs-15502/v1
Preprint

Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data

Abstract: Background: The advent of metagenomic sequencing provides microbial abundance patterns that can be leveraged for sample origin prediction. Supervised machine learning classification approaches have been reported to predict sample origin accurately when the origin has been previously sampled. Using metagenomic datasets provided by the 2019 CAMDA challenge, we evaluated the influence of technical, analytical and machine learning approaches for result interpretation and source prediction of new origins. Results: Com…

Cited by 4 publications (4 citation statements)
References 35 publications
“…Cancer genomic studies have seen an exponential growth in the last decade [57][58][59][60], and a massive amount of data has now become openly accessible. Analyses of this information can effectively direct studies addressing the activation of specific gene network and their prognostic significance in cancer pathogenesis [61][62][63].…”
Section: Results (mentioning)
confidence: 99%
“…Artificial intelligence might just help with this issue. The possibility to develop executable cancer models would allow scientists to search among multiple datasets for the discovery of signatures at every level (genomic, transcriptomic, proteomic, clinical data) to detect key features of the biological behaviour of interest [20,69,[165][166][167]. Biological systems can be treated by AI as networks of information that can be programmed, i.e., reconstructed as a matrix of data [168,169].…”
Section: Executable Cancer Models: Successes and Challenges (mentioning)
confidence: 99%
“…In the pathogenesis of prostate cancer, the initial pathogenetic event might be considered the dependency from the androgen stimulation [46][47][48][49][50], but many more molecular alterations have been found associated to its progression [51][52][53][54][55] as well as to the development of specific clinical features of the disease such as bone metastasis [56] or studied as potential therapeutic targets [57]. The microbiome has also been shown to impact on the pathogenesis of this kind of cancer [58][59][60][61][62], also due to development of algorithms allowing to interpret its role in the context of a disease [63][64][65][66][67]. Anyhow, specific attention is focused on epigenetic remodeling.…”
Section: Serine Metabolism In Prostate Cancer (mentioning)
confidence: 99%