2008
DOI: 10.1101/gr.7337908

ALLPATHS: De novo assembly of whole-genome shotgun microreads

Abstract: New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun “microreads.” For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80× coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads. The bacterial genomes of Campylobacter jejuni and Esc…

Citations: cited by 747 publications (643 citation statements)
References: 10 publications (11 reference statements)
“…Paired reads from long inserts (500-1000bp) also offer long range connectivity, similar to 454 reads. Some assemblers, such as ALLPATHS, require at least two libraries with different insert sizes, for this reason 8 .…”
Section: Library Construction (mentioning)
confidence: 99%
“…The Rnnotator 16 , Multiple-k 21 , and Trans-ABySS 19 assemblers follow the same strategy; they assemble the dataset multiple times using a De Bruijn graph-based approach [6][7][8]58 to reconstruct transcripts from a broad range of expression levels, and then post-process the assembly to merge contigs and remove redundancy (Figure 2b). By contrast, other assemblers (Trinity 59 , and Oases 20 ) traverse the De Bruijn graph directly to assemble each isoform.…”
Section: De Novo Strategy (mentioning)
confidence: 99%
“…The third category of software based on de Bruijn graph approaches 40 are widely used in assembling data from the Solexa and SOLiD platforms. The tools in this category (such as ABySS, 51 ALLPATHS, 52 EULER-SR, 53 SOAPdenovo 54 and Velvet 55 ) have applied certain heuristic strategies to reduce the complexity of the de Bruijn graphs, which trivialize assembly problem by finding the path that would traverse each edge of the graph exactly once. EULER-SR 52 mitigates error sequencing impact by constructing different K-mer sizes De Bruijn graphs and reduces graph complexity by applying low-quality read ends and PE constraints.…”
Section: Assembly Strategies (mentioning)
confidence: 99%
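The statement above describes assembly as finding a path that traverses each edge of the de Bruijn graph exactly once, that is, an Eulerian path. A minimal Python sketch of that idea follows, using Hierholzer's algorithm on a toy graph. The function names `eulerian_path` and `spell` and the toy data are hypothetical, the code assumes an Eulerian path exists, and it is not the method of ALLPATHS or of any other assembler named here.

```python
from collections import defaultdict


def eulerian_path(graph):
    """Return a node sequence that uses every edge exactly once.

    `graph` maps a node to a list of successor nodes (one entry per edge).
    Assumption: an Eulerian path exists; this sketch does not verify it.
    """
    adj = {u: list(vs) for u, vs in graph.items()}  # copy, edges get consumed
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for u, vs in adj.items():
        for v in vs:
            balance[u] += 1
            balance[v] -= 1
    # Start at the node with one surplus outgoing edge, else at any node.
    start = next((n for n, b in balance.items() if b == 1), next(iter(adj)))

    stack, path = [start], []
    while stack:
        u = stack[-1]
        if adj.get(u):
            stack.append(adj[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is final in the path
    return path[::-1]


def spell(path):
    """Join overlapping (k-1)-mers along a path into one sequence."""
    return path[0] + "".join(node[-1] for node in path[1:])


# Toy graph whose nodes are 2-mers (hypothetical data, k = 3).
toy_graph = {"AC": ["CG"], "CG": ["GT"], "GT": ["TC"], "TC": ["CA"]}
toy_contig = spell(eulerian_path(toy_graph))
```

On real data the graph is rarely Eulerian because of sequencing errors and repeats, which is why the tools listed in the statement apply heuristics to simplify the graph first.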
“…This approach is not unique to ABySS [4,25,6]. Although some groups present static graphs of real genome assembly data for publication purposes [5], no interactive analysis tools exist for this data type.…”
Section: Contig Connectivity As Graphs (mentioning)
confidence: 99%
“…Standard overlap search algorithms are not optimized for this very large number of short sequences, and thus a new generation of assembly algorithms has emerged. Our in-house assembly algorithm, ABySS [24], addresses the overlap search problem by representing DNA sequences as a de Bruijn graph, a notion pioneered by Pevzner et al [21] and employed in other recently published genome assembly tools [4,25,6]. A de Bruijn graph is a directed graph that compactly represents a uniform overlap between sequences.…”
Section: Assembling a Genome (mentioning)
confidence: 99%
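The last statement defines a de Bruijn graph as a directed graph that compactly represents a uniform overlap between sequences. As a rough illustration, the Python sketch below builds such a graph from short reads: each distinct k-mer becomes one edge from its (k-1)-base prefix to its (k-1)-base suffix. The function name `build_de_bruijn` and the toy reads are hypothetical, and the sketch ignores reverse complements and sequencing errors, so it is not how ABySS or ALLPATHS actually store their graphs.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the sorted (k-1)-mers that follow it.

    Every distinct k-mer found in the reads contributes exactly one edge,
    so reads that overlap share edges instead of duplicating them.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])

    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted for a deterministic edge order
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Two hypothetical overlapping reads; with k = 3 they share CGT and GTC.
toy_reads = ["ACGTC", "CGTCA"]
toy_de_bruijn = build_de_bruijn(toy_reads, k=3)
```

Because overlaps are implied by shared k-mers, no pairwise comparison between reads is needed, which is the overlap-search saving the statement refers to.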