2008
DOI: 10.1093/bioinformatics/btn548

Aggressive assembly of pyrosequencing reads with mates

Abstract: Motivation: DNA sequence reads from Sanger and pyrosequencing platforms differ in cost, accuracy, typical coverage, average read length and the variety of available paired-end protocols. Both read types can complement one another in a ‘hybrid’ approach to whole-genome shotgun sequencing projects, but assembly software must be modified to accommodate their different characteristics. This is true even of pyrosequencing mated and unmated read combinations. Without special modifications, assemblers tuned for homog…

citations
Cited by 500 publications
(475 citation statements)
references
References 29 publications
(35 reference statements)
“…For each species, a single paired-end library was sequenced to 9-39× coverage on the Illumina HiSeq 2000 platform (Supplementary Table 2 and Supplementary Note). The genome sequences were assembled with the Celera Assembler 26 , resulting in N50 scaffold sizes between 3.2 and 71 kb (Supplementary Table 3 and Supplementary Note). For most species, we recovered more than 75% of the conserved eukaryotic genes included in the CEGMA 27 analysis, and on average we recovered 74% of the conserved genes of the Actinopterygii data set included in the BUSCO 28 analysis.…”
Section: Sequencing and Draft Assembly Of 66 Teleost Genomes (mentioning)
confidence: 99%
“…It provides better and more preferable performance in terms of speed and output quality 44 compared with the other tools mentioned above. The second category of software that includes CABOG, 45 Edena, 46 Newbler 47 and Shorty 48 are based on overlap-layout-consensus. This strategy involves three main steps.…”
Section: Assembly Strategies (mentioning)
confidence: 99%
“…Newbler, among the overlap-layout-consensus-based software, was specifically designed to handle the ambiguity in the length of 454's homopolymer runs, whereas the other widely used programs (distributed by Illumina/Solexa), including Shorty, can also be applied to 49,50 whose main procedure could be summarized as 'unitig-contig-scaffolds' , for base call correction. 45 Shorty innovatively estimates the intercontig distances from the mate pairs using a few seeds of 300-500 bp length. The third category of software based on de Bruijn graph approaches 40 are widely used in assembling data from the Solexa and SOLiD platforms.…”
Section: Assembly Strategies (mentioning)
confidence: 99%
“…As an example, Figure 3 shows the Hawkeye LaunchPad for an assembly of the 2.9-Mb Staphylococcus aureus genome available on the AMOS website (http://sourceforge.net/projects/amos/files/sample_data/). The genome was assembled using the pre-assembly error correction program Quake [25] and the Celera Assembler [3] from ~25× coverage of 100-bp reads sequenced at the Broad Institute using an Illumina Genome Analyzer II (SRA study SRP001086). These mated reads were sequenced using a combination of ~20× coverage of a 165-bp fragment library, and ~5× coverage of a 3.5-kb jumping library.…”
Section: Technologies and Features Visual Assembly Analytics With Haw (mentioning)
confidence: 99%
“…Over the years, numerous genome assemblers have been developed, among which we highlight just a few of the most memorable: phrap-one of the most widely used assemblers of the first generation sequencing era; Celera Assembler [2,3]-the software originally developed at Celera Genomics and used to assemble the human genome through a whole-genome shotgun sequencing approach (as opposed to the BAC-by-BAC approach employed by the Human Genome Consortium); Velvet [4]-one of the first among a number of assemblers developed specifically for second generation sequencing data; and ALLPATHS-LG [5]-perhaps the most effective genome assembler for second generation sequencing data today. These, and the many other assemblers used in the community, have contributed to the reconstruction of tens of thousands of viruses and bacteria, as well as hundreds of eukaryotes, including the genomes of many mammals (human [6], mouse [7], cow [8], panda [9], etc.…”
Section: Introduction and Context (mentioning)
confidence: 99%