2017
DOI: 10.1093/gigascience/gix085

De novo PacBio long-read and phased avian genome assemblies correct and add to reference genes generated with intermediate and short reads

Abstract: Reference-quality genomes are expected to provide a resource for studying gene structure, function, and evolution. However, often genes of interest are not completely or accurately assembled, leading to unknown errors in analyses or additional cloning efforts for the correct sequences. A promising solution is long-read sequencing. Here we tested PacBio-based long-read sequencing and diploid assembly for potential improvements to the Sanger-based intermediate-read zebra finch reference and Illumina-based short-…

Cited by 195 publications (194 citation statements)
References 58 publications (79 reference statements)
“…Our great ape genome assemblies improved sequence contiguity by orders of magnitude (20, 60), leading to a more comprehensive understanding of the evolution of structural variation. Coupling this effort with full-length cDNA sequencing improved gene annotation, especially for the discovery of new transcripts and isoforms that have recently diverged between closely related species.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…Scaffolds can contain errors in contig order (a 'translocation') or orientation (an 'inversion'). Examples of such errors can be found in the best available reference genomes for many species (Robert B. Norgren 2013;Shearer et al 2014;Tang et al 2014;Chen et al 2015;Davey et al 2016;Utsunomiya et al 2016;Schneider et al 2017;Korlach et al 2017). Consequently, inexpensive methods for identifying and correcting assembly errors are crucial for the generation of accurate assemblies (Salzberg and Yorke 2005;Phillippy, Schatz, and Pop 2008;Gnerre et al 2009;Tsai, Otto, and Berriman 2010;Salzberg et al 2012;Hunt et al 2013;Gurevich et al 2013;Bradnam et al 2013;Simão et al 2015;Fierst 2015;Muggli et al 2015;Yuan et al 2017;Harewood et al 2017).…”
Citation type: mentioning
confidence: 99%
“…Much effort has been put into reducing these errors, including the introduction of diploid assembly methods [80], which reduce the rate of indel errors in the assembly caused by collapsed heterozygosity. However, our analysis of recent diploid assemblies produced by the PacBio tool Falcon Unzip [80] suggests that indel errors are still problematic and inflate the number of frame-shifting indels, with around 1% of transcripts frame-shifted in a diploid Zebrafinch assembly [81] compared to the Illumina based assembly of the same species. Additionally, these errors appear in assemblies of haploid cell lines, suggesting that heterozygosity is not the only cause.…”
Section: Base Level Accuracy; citation type: mentioning
confidence: 99%
“…both Pacific Biosciences [80,144,81] and 10x Genomics [145] provide tools to construct phased, diploid assemblies. Annotating diploid assemblies provides a window into haplotype-specific structural variation that may affect gene expression.…”
Section: Annotation Of Personal Human Diploid Assemblies; citation type: mentioning
confidence: 99%