2018
DOI: 10.1101/462788
Preprint

Exploring neighborhoods in large metagenome assembly graphs reveals hidden sequence diversity

Abstract: Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software …

Citations: Cited by 8 publications (9 citation statements)
References: References 51 publications
“…We have modified the analysis protocol by replacing the complete metagenomic assembly with a local assembly using MetaCherchant algorithm. Also, there are new additional methods to restore the hidden gene diversity of MAGs (Brown et al, 2019).…”
Section: Results
type: mentioning
confidence: 99%
“…While sequence assembly has been an active area of research 61, this has not been the case for gene prediction methods 61, which are becoming outdated 62 and cannot cope with the current amount of data. Alternatives like protein-level assembly 63 combined with exploring the assembly graphs’ neighborhoods 64 become very attractive for our purposes. In any case, we still face the challenge of discriminating between real and artifactual singletons 65.…”
Section: Discussion
type: mentioning
confidence: 99%
“…This likely indicates that a P. ruminis strain that has not been sequenced before is in our sample, and that it shares more k-mers of size 31 in common with one strain than the other. (See Brown CT et al 23 for further analysis of this strain.)…”
Section: Use Cases
type: mentioning
confidence: 99%