2022
DOI: 10.1186/s13073-022-01047-5

Ontology-aware deep learning enables ultrafast and interpretable source tracking among sub-million microbial community samples from hundreds of niches

Abstract: The taxonomic structure of a microbial community sample is highly habitat-specific, making source tracking possible and allowing identification of the niches where samples originate. However, current methods face challenges when source tracking is scaled up. Here, we introduce a deep learning method based on the Ontology-aware Neural Network approach, ONN4MST, for large-scale source tracking. ONN4MST outperformed other methods with near-optimal accuracy when source tracking among 125,823 samples from 114 niches. ON…

Cited by 12 publications (11 citation statements)
References 45 publications
“…However, when the number of samples and biomes increases, running time increases rapidly, preventing large-scale source tracking. This problem could be solved by deep learning solutions by utilizing model-based methods such as neural networks that would enable improvements in both speed and accuracy during source tracking [61], [62].…”
Section: The Dilemma Of Traditional Methods Could Be Solved By Deep L... (mentioning)
confidence: 99%
“…The random forest approach is more widely used to identify microbial community sources via application toward the prediction of locations and times for forensic studies [7], [57], [58], in addition to application in predicting sources of contamination [59], [60]. ONN4MST [61] is a deep learning method which employs a neural network model to source track microbial communities at high efficiency and accuracy without any prior knowledge about the microbial communities to be estimated. Its pre-built biome ontology includes 60 environmental biomes, 25 host-associated biomes, and 10 engineered biomes, which represent the most comprehensive potential sources utilized for source tracking.…”
Section: Dark Matter In the Microbiome And The Computational Mining T... (mentioning)
confidence: 99%
“…Until now, finding suitable fingerprint factors to determine the main dispersal sources of sediment has been a formidable challenge (Vercruysse et al, 2017). Fortunately, some research shows that microorganisms can be used as a special fingerprint factor to trace their host origin due to their genetic diversity, biochemical characteristics and ability to produce a variety of specific metabolites in the environment (Zha et al, 2022). More recently, Zhang et al (2019) concluded that microorganisms can act as helpful indications for sediment sources, based on the indivisible relationship between sediment and microorganisms (Droppo, 2001).…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, performing microbial community classification among thousands of samples within hundreds of biomes may take hours for these methods [19]. Recently, ONN4MST was proposed to solve the irreconcilable contradiction between efficiency and accuracy [20]. ONN4MST is a supervised learning method based on ontology-aware neural network (ONN) model, which contains multiple output layers fitting with a general biome ontology.…”
Section: Introduction (mentioning)
confidence: 99%