2014
DOI: 10.1038/srep04532

Non-random DNA fragmentation in next-generation sequencing

Abstract: Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of …
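The abstract's two quantitative ideas, sequencing every region several dozen times and assuming breaks are random, can be illustrated with a small simulation. The following is only a sketch under the uniform-break assumption that the paper questions, not the authors' method: read start positions are drawn uniformly at random, and the genome length, read length, and read count are hypothetical toy values chosen for illustration.

```python
import random


def simulate_coverage(genome_length, read_length, n_reads, seed=0):
    """Per-base depth when read starts are uniform (the random-break assumption)."""
    rng = random.Random(seed)
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for _ in range(n_reads):
        start = rng.randrange(genome_length - read_length + 1)
        diff[start] += 1
        diff[start + read_length] -= 1
    # A running sum of the difference array gives the depth at each base.
    coverage = []
    depth = 0
    for i in range(genome_length):
        depth += diff[i]
        coverage.append(depth)
    return coverage


# Hypothetical toy values, not taken from the paper.
genome_length = 10_000  # bases
read_length = 100       # bases per read
n_reads = 3_000         # number of reads

coverage = simulate_coverage(genome_length, read_length, n_reads)
expected_depth = n_reads * read_length / genome_length  # 30x
mean_depth = sum(coverage) / genome_length

print(f"expected depth: {expected_depth:.1f}x, simulated mean: {mean_depth:.1f}x")
print(f"min depth: {min(coverage)}, max depth: {max(coverage)}")
```

Even with perfectly uniform breaks, per-base depth fluctuates around the mean, so some positions end up well below the target. Sequence-dependent fragmentation of the kind the paper reports would add systematic deviations on top of this sampling noise, which this sketch does not model.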

Cited by 108 publications (85 citation statements)
References 40 publications
“…DNA fragmentation (Poptsova et al, 2014) and PCR biases (Benjamini and Speed, 2012) introduced during library preparation result in a non-uniform sampling of possible sequencing reads and an under-representation of DNA with certain sequence features. Benjamini and Speed found that genomic fragments with high and low GC content are under-represented in Illumina libraries (Benjamini and Speed, 2012), and Manor and Borenstein found that intra-metagenome differences in coverage of different universal, single-copy genes can be explained by their GC content (Manor and Borenstein, 2015).…”
Section: Experimental Protocols Affect Results and Should Be Tracked (citation type: mentioning)
confidence: 99%
“…and the methods used for data analysis (e.g., peak calling). Third, ChIP-seq data contain numerous technical biases (Kidder et al, 2011), including formaldehyde crosslinking bias (Solomon and Varshavsky, 1985; Lu et al, 2010; Gavrilov et al, 2015), antibody specificity and variability problems (Parseghian, 2013; Schonbrunn, 2014; Wardle and Tan, 2015) (Figure S15), technical artifacts due to highly expressed regions of the genome (which are not corrected by regular input controls) (Teytelman et al, 2013; Park et al, 2013; Jain et al, 2015), bias due to genome fragmentation and PCR amplification (Bardet et al, 2011; Poptsova et al, 2014), etc. These biases can lead to false-positive and false-negative peaks, and they also significantly affect any quantitative estimates of in vivo TF binding levels derived from ChIP-seq data, in ways that we do not understand well enough to correct (Gavrilov et al, 2015).…”
Section: Results (citation type: mentioning)
confidence: 99%
“…As there are known or suspected biases in next-generation sequencing (NGS) library construction (see [38] for example), it is possible that a small portion of the genome is missing from the sequencing libraries constructed. Published genome size estimates, however, range from 0.83–1.37 Gb [26], implying scaffold coverage could range anywhere from 64–100% of the genome.…”
Section: Results (citation type: mentioning)
confidence: 99%