2018
DOI: 10.1093/bioinformatics/bty296

MicroPheno: predicting environments and host phenotypes from 16S rRNA gene sequencing using a k-mer based representation of shallow sub-samples

Abstract: Motivation: Microbial communities play important roles in the function and maintenance of various biosystems, ranging from the human body to the environment. A major challenge in microbiome research is the classification of microbial communities of different environments or host phenotypes. The most common and cost-effective approach for such studies to date is 16S rRNA gene sequencing. Recent falls in sequencing costs have increased the demand for simple, efficient and accurate methods for rapid detection or di…

Cited by 73 publications (59 citation statements)
References 71 publications
“…Asgari et al (2018) used shallow subsample representation based on k-mer and deep learning, random forests, and SVMs to predict environmental and host phenotypes from 16S rRNA gene sequencing using the MicroPheno system. They found that the shallow subsample representation based on k-mer is superior to OTU in terms of body location recognition and Crohn’s disease prediction.…”
Section: Classification and Prediction In Microbiology (mentioning)
confidence: 99%
“…ASVs are commonly generated using the Divisive Amplicon Denoising Algorithm 2 (DADA2), and the resultant ASVs represent true biological sequences obtained from reads (Callahan et al, 2016). In addition, there have been recent efforts to use the occurrence of short-chain k-mer (15-30mer) (Molik et al, 2020), and very short-chain k-mers (<10mer) (Asgari et al, 2018, 2019), within reads that offer a unique reference-free and alignment-free approach to provide a data representation upon which a phenotype prediction model is built. We have included both of these k-mer approaches in our review to compare them directly with the OTU/ASV assignment methods.…”
Section: Introduction (mentioning)
confidence: 99%
“…Segmentation of biological sequences into bag-of- or sequence-of-overlapping fixed length k-mers is one of the most favorable representations in bioinformatics research. K-mer representations are widely used in the areas of proteomics (Grabherr et al, 2011; Asgari and Mofrad, 2015), genomics (Jolma et al, 2013; Alipanahi et al, 2015), epigenomics (Awazu, 2016; Giancarlo et al, 2015), and metagenomics (Wood and Salzberg, 2014; Asgari et al, 2018).…”
Section: K-mer Representation (mentioning)
confidence: 99%
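
The citation statements above describe a bag of overlapping fixed-length k-mers computed over a shallow sub-sample of reads. A minimal Python sketch of that idea follows. It is an illustration only, not the MicroPheno implementation: the function names `kmer_counts` and `shallow_kmer_profile`, the parameters `n_reads` and `seed`, and the choice to ignore k-mers containing characters outside A, C, G, T are all assumptions made for this example.

```python
import random
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count overlapping k-mers (stride 1) in a single sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def shallow_kmer_profile(reads, k, n_reads, seed=0):
    """Normalized k-mer frequency vector from a random sub-sample of reads.

    Hypothetical helper: draws at most n_reads reads without replacement,
    pools their k-mer counts, and normalizes over the 4**k DNA k-mers.
    """
    rng = random.Random(seed)
    sample = rng.sample(reads, min(n_reads, len(reads)))

    pooled = Counter()
    for read in sample:
        pooled.update(kmer_counts(read, k))

    # Fixed vocabulary in lexicographic order: AA, AC, AG, AT, CA, ...
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers with ambiguous bases (e.g. N) are not in vocab and are ignored.
    total = sum(pooled[kmer] for kmer in vocab)
    return [pooled[kmer] / total if total else 0.0 for kmer in vocab]


if __name__ == "__main__":
    reads = ["ACGTAC", "GGGG", "TTGACA"]
    profile = shallow_kmer_profile(reads, k=2, n_reads=2)
    print(len(profile), sum(profile))
```

For the sequence `"ACGTAC"` and `k=2`, `kmer_counts` finds five overlapping 2-mers, with `AC` counted twice. The resulting fixed-length vector could then be passed to any standard classifier, such as the random forests or SVMs mentioned in the statements above.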