2021
DOI: 10.1101/2021.02.24.429166
Preprint

Automated strain separation in low-complexity metagenomes using long reads

Abstract: High-throughput short-read metagenomics has enabled large-scale species-level analysis and functional characterization of microbial communities. Microbiomes often contain multiple strains of the same species, and different strains have been shown to have important differences in their functional roles. Despite this, strain-level resolution from metagenomic sequencing remains challenging. Recent advances on long-read based methods enabled accurate assembly of bacterial genomes from complex microbiomes and an as…


Cited by 3 publications (5 citation statements)
References 55 publications (70 reference statements)
“…Even though bins are often generalized to represent distinct microbial taxonomic units in a sample, they are rarely assumed to accurately represent true, genetically distinct microbial populations in a sample. This problem has been addressed by multiple studies 11,12 , and precise definitions for individual, highly resolved MAGs remain contextual to each study. Similar to one of these studies 11 , we focus on generating separate representative reference genomes for distinct microbial lineages within an individual metagenome, which we define as "lineage-resolved MAGs" 1,13 .…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Furthermore, these workflows are designed primarily to identify strain lineages from alignments of short-read data and do not capture variant linkage data from longer read datasets. A recent attempt to adapt uncorrected long reads to this purpose requires the use of manual curation and a priori estimates of strain numbers in order to achieve optimal results 12 . An intuitive and automated method to generate lineage-resolved complete MAGs is needed for analysis of more complex metagenome communities in order to reduce the time required to validate results.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…We recall that Strainline is unique insofar as it is the first approach to determine the haplotype/strain-specific genomes of viruses from long reads de novo. For the sake of a meaningful comparison, we chose long read de novo assemblers that are designed to deal with mixed samples (in other words, designed for metagenome assembly), such as Canu Koren et al (2017) and metaFlye Vicedomini et al (2021), on the one hand, and generic (consensus) de novo assemblers, such as wtdbg2 Ruan and Li (2020) and Shasta Shafin et al (2020) on the other hand. Of those, we subsequently excluded metaFlye, because it failed to perform the assemblies.…”
Section: Benchmarking: Alternative Approaches; citation type: mentioning
confidence: 99%
“…In addition, metaFlye, originally designed to perform assembly of metagenomes, operates at the level of species Vicedomini et al (2021), so neglects to resolve individual genomes at the level of strains.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Despite impressive accomplishments, the MAG approach still harbours many challenges and limitations. By nature, short read metagenome assemblies remain highly fractionated, resulting from the limited ability of short read sequencing to accurately capture complex repeat regions (Chen et al, 2020) and the difficulties encountered in reconstructing sequence from closely related strains or sub–species (Bertrand et al, 2019; Quince et al, 2020; Vicedomini et al, 2021; Quince et al, 2017a). In practice a draft genome obtained from these methods would contain at best, tens and, more typically, hundreds, of distinct contigs, and so there are inherent difficulties in accurately determining the degree of genome completeness and the extent of contamination from non-cognate genomes (Chen et al, 2020), and in identifying the presence of horizontally transferred sequence (Douglas and Langille, 2019).…”
Section: Introduction; citation type: mentioning
confidence: 99%