2019
DOI: 10.1186/s13059-019-1819-8

SEPATH: benchmarking the search for pathogens in human tissue whole genome sequence data leads to template pipelines

Abstract: Background: Human tissue is increasingly being whole genome sequenced as we transition into an era of genomic medicine. With this arises the potential to detect sequences originating from microorganisms, including pathogens, amid the plethora of human sequencing reads. In cancer research, the tumorigenic ability of pathogens is being recognized, for example, Helicobacter pylori and human papillomavirus in the cases of gastric non-cardia and cervical carcinomas, respectively. As of yet, no benchmark has been carri…

citations
Cited by 14 publications
(15 citation statements)
references
References 54 publications
“…We took into account several considerations to control for the potential biases in our analysis. Whereas Kraken is computationally expensive, its performance has been demonstrated in several studies to rank among the best up to date [25][26][27]. Secondly, since distinguishing between closely related species may be difficult, to be conservative, we performed the taxonomic filter at the genus level instead of species.…”
Section: Discussion (mentioning)
confidence: 99%
“…Each test data set consisted of simulated reads (using ART ( 23 )) from the selected viral genomes, the human genome (GRCh38 assembly), and contaminant genomes (bacterial and fungal) ( 19 ) as follows:…”
Section: Methods (mentioning)
confidence: 99%
“…human). Unfortunately, in realistic host-viral settings, the sampled data often includes bacterial and fungal sequences ( 19 , 20 ), which are not known a priori and are therefore not included in the training set. We tested DeepVirFinder in these ‘open-set’ scenarios and observed a significant degradation in performance (Results).…”
Section: Introduction (mentioning)
confidence: 99%
“…DNA sequencing and other genetic testing approaches have clear utility in identifying the causes underlying congenital abnormalities and infections, but their clinical use in perinatal autopsy is currently limited, and generally restricted to hypothesis-based tests. Genome sequencing (GS) however, offers an unbiased and capture-free approach to sequence all DNA, human and microbial, present in a sample 11. Genomic analysis has revolutionised the identification of genetic variation causative of monogenic disorders 12, and while the efficacy of metagenomic analysis for the identification of pathogens underlying infectious disease is comparatively less well established, it has been demonstrated in a number of recent reports 9.…”
Section: Full Text (mentioning)
confidence: 99%