2020
DOI: 10.1101/2020.10.29.360297
Preprint

Towards end-to-end disease prediction from raw metagenomic data

Abstract: Analysis of the human microbiome using metagenomic sequencing data has demonstrated a high ability to discriminate between various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from fragmented DNA and are stored as fastq files. Conventional processing pipelines consist of multiple steps, including quality control, filtering, alignment of sequences against genomi…

Citations: cited by 9 publications (20 citation statements)
References: 40 publications (68 reference statements)

“…It aggregates these k-mer embeddings to get read embeddings. Using the same idea, Metagenome2Vec ([100]) avoids the solution of simply aggregating data, which would lead to losing precision, by using fastDNA ([99]).…”
Section: Sequence-based Approaches (mentioning)
confidence: 99%
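
To make the "aggregate k-mer embeddings into a read embedding" idea in the statement above concrete, here is a minimal sketch of plain mean aggregation, the simple baseline the statement says loses precision. It is not the Metagenome2Vec or fastDNA method. The function names, the toy 2-dimensional embedding table, and the choice of k = 3 are all assumptions made for illustration.

```python
def kmers(read, k):
    """Return all overlapping k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def read_embedding(read, table, k):
    """Average the embeddings of a read's k-mers (plain mean aggregation)."""
    # Skip k-mers that have no entry in the (hypothetical) embedding table.
    vectors = [table[km] for km in kmers(read, k) if km in table]
    if not vectors:
        raise ValueError("no known k-mers in read")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Toy embedding table: made-up 2-dimensional vectors for three 3-mers.
table = {"ACG": [1.0, 0.0], "CGT": [0.0, 1.0], "GTA": [1.0, 1.0]}
emb = read_embedding("ACGTA", table, k=3)
```

Averaging discards the order and relative weight of the k-mers within a read, which is one way to see why a learned read-level embedding might retain more information.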
“…Metagenome2Vec ([100]), IDMIL ([132]) and the method described in [130] use a particular DL paradigm called Multiple Instance Learning. Multiple Instance Learning (MIL) is a supervised learning paradigm that consists of learning from labeled sets of instances, known as 'bags', instead of learning from individually labeled instances.…”
Section: Sequence-based Approaches (mentioning)
confidence: 99%
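
The bag structure described in the statement above can be sketched as follows. This is only an illustration of the data layout under stated assumptions: the `Bag` class, the `pool` helper, and the use of mean pooling are hypothetical choices, and the cited methods may combine instances quite differently. In a metagenomic setting one might think of a bag as one sample and its instances as read embeddings, but that mapping is also an assumption here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bag:
    instances: List[List[float]]  # one feature vector per instance
    label: int                    # a single label for the whole bag


def pool(bag):
    """Mean-pool the instance vectors into one bag-level vector."""
    n = len(bag.instances)
    dim = len(bag.instances[0])
    return [sum(x[d] for x in bag.instances) / n for d in range(dim)]


# Two toy bags: labels are attached to bags, never to individual instances.
bags = [
    Bag(instances=[[0.0, 1.0], [2.0, 3.0]], label=1),
    Bag(instances=[[4.0, 4.0]], label=0),
]
X = [pool(b) for b in bags]   # bag-level features
y = [b.label for b in bags]   # bag-level labels
```

After pooling, `X` and `y` have one row per bag, so any standard supervised classifier could be trained on them. Bags may hold different numbers of instances, which is why a pooling step (or a learned alternative) is needed.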
“…Several machine learning approaches based on k-mers have been proposed in the literature for classification and clustering tasks [33,35,38,39]. More specifically, there are many classical algorithms for sequence classification [40,41].…”
Section: Related Work (mentioning)
confidence: 99%
“…Several machine learning approaches based on k-mers have been proposed in the literature for classification and clustering tasks [31,33,36,37]. More specifically, there are a lot of classical algorithms for sequence classification [38,39].…”
Section: Related Work (mentioning)
confidence: 99%