2018
DOI: 10.1111/1755-0998.12933

How complete are “complete” genome assemblies?—An avian perspective

Abstract: The genomics revolution has led to the sequencing of a large variety of nonmodel organisms often referred to as "whole" or "complete" genome assemblies. But how complete are these, really? Here, we use birds as an example for nonmodel vertebrates and find that, although suitable in principle for genomic studies, the current standard of short-read assemblies misses a significant proportion of the expected genome size (7% to 42%; mean 20 ± 9%). In particular, regions with strongly deviating nucleotide compositio…

Citations: cited by 115 publications (143 citation statements)
References: 49 publications
“…This assembly consists of 62,122 scaffolds with a N50 of 52,818 bp (Table ) and was used to resolve the barn owl's position in the bird tree of life (Jarvis et al, ; Prum et al, ) and to search for genes associated with low‐light vision (Hanna et al, ; Hoglund et al, ; Le Duc et al, ; Wu et al, ). However, the use of draft genome entails problems such as noncontiguous assembly and missing genes, especially in GC‐rich portions of bird genomes (Peona, Weissensteiner, & Suh, ). As shown by Warren et al (), adding long reads such as those obtained from single‐molecule real‐time (SMRT, Pacific Biosciences, thereafter called PacBio) improves genome completeness and does not suffer from PCR amplification bias for the sequencing at GC or AT genome‐rich region.…”
Section: Introduction
mentioning
confidence: 99%
“…Thus, due to technical difficulty many genes remain non- or partially sequenced in birds (Botero-Castro et al, 2017). A recent study estimated the proportion of missing genome in typical bird assemblies at ~20% (Peona et al, 2018). However, in the European barn owl genome we retrieved genes missing in chicken or other [Figure 6. Avian phylogenetic trees based on the American and European barn owl proteins predicted with the American and European barn owl annotations.]…”
mentioning
confidence: 99%
“…Furthermore, the linked-read technology is still very new with ongoing developments and improvements of analytical tools and algorithms constantly being made. For instance, after the initial release of the Supernova assembly algorithm by 10X Genomics (v.1.1, Weisenfeld et al 2017 -used Together in combination with other sequencing technologies such as Hi-C or long-reads (Peona et al 2018), there is certainly room for further improvements towards a reference genome. Nevertheless, the current draft assembly ZSil_MB_1.0 marks an essential progress towards unraveling the genomic basis of diversification in a 'great speciator' system.…”
Section: Results
mentioning
confidence: 99%
“…Third, analyses based upon genome assemblies will depend strongly on the quality of the assembly. Even for genomes of reasonably high quality, protein coding genes may be missing, either through problems with the annotation process or due to the fact that genes fall into assembly gaps (Peona, Weissensteiner, & Suh, 2018). In other words, genome assembly is not necessarily a panacea for all problems related to expression analyses.…”
Section: Conclusion
mentioning
confidence: 99%