2019
DOI: 10.1038/s41598-019-45832-6

A Characterization of the DNA Data Storage Channel

Abstract: Owing to its longevity and enormous information density, DNA, the molecule encoding biological information, has emerged as a promising archival storage medium. However, due to technological constraints, data can only be written onto many short DNA molecules that are stored in an unordered way, and can only be read by sampling from this DNA pool. Moreover, imperfections in writing (synthesis), reading (sequencing), storage, and handling of the DNA, in particular amplification via PCR, lead to a loss of DNA mole…

Cited by 213 publications (210 citation statements)
References 40 publications
“…On average, most of the experiments had total error rate around 1.3% (substitution: 0.4%, deletion: 0.85%, insertion: 0.05%). Based on the typical error rates for Illumina sequencing and experiments on paired-end data as in [9], the substitution errors are primarily caused by the sequencing and the insertion and deletion errors are primarily from the synthesis. Figure 10 shows the histogram of the coverage of the oligonucleotides for experiments #2 and #5 when the mean coverage was set to 5 by subsampling.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
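
The error rates quoted in the statement above can be made concrete with a small simulation. This is a minimal sketch, not the method of either paper: it assumes each base is hit independently, uses the quoted averages (substitution 0.4%, deletion 0.85%, insertion 0.05%) as per-base probabilities, and the function names `noisy_copy` and `prob_read_has_indel` are hypothetical.

```python
import random

# Per-base rates taken from the quoted averages; treating them as
# independent, identically distributed events is an assumption of this sketch.
P_SUB, P_DEL, P_INS = 0.004, 0.0085, 0.0005

def noisy_copy(oligo, p_sub=P_SUB, p_del=P_DEL, p_ins=P_INS, rng=None):
    """Return one noisy read of `oligo` under an i.i.d. per-base error model."""
    rng = rng or random.Random()
    out = []
    for base in oligo:
        r = rng.random()
        if r < p_del:
            continue  # deletion: the base is dropped
        if r < p_del + p_sub:
            # substitution: replace with one of the three other bases
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
        if rng.random() < p_ins:
            out.append(rng.choice("ACGT"))  # insertion after this position
    return "".join(out)

def prob_read_has_indel(length, p_del=P_DEL, p_ins=P_INS):
    """Probability that a read of `length` bases has at least one indel."""
    return 1.0 - ((1.0 - p_del) * (1.0 - p_ins)) ** length
```

The three rates sum to the quoted total of about 1.3%. Because the per-base rates compound over the length of an oligo, even sub-percent indel rates affect a large share of reads under this simplified model.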
“…Since the RS code is applied on very short sequences, the scheme suffers from limitations of short block length codes [13]. The scheme also ignores reads with insertions and deletions, which usually comprise around 30-50% of the reads [9].…”
Section: Previous Work
Citation type: mentioning
confidence: 99%
“…1 shows a typical DNA storage system. Binary data is encoded into short DNA sequences (oligonucleotides, or oligos for short) with lengths limited to around 150 bases/nucleotides for current scalable synthesis technologies [6]. Note that each DNA nucleotide belongs to the set {A, C, G, T}.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
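
The encoding step described in the statement above can be illustrated as follows. This is a minimal sketch under stated assumptions: the 2-bits-per-base mapping (00→A, 01→C, 10→G, 11→T) and the function names `bytes_to_oligos` and `oligos_to_bytes` are hypothetical choices for illustration, and the sketch omits indexing and error correction.

```python
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T

def bytes_to_oligos(data: bytes, oligo_len: int = 150) -> list[str]:
    """Map each byte to four bases, then cut the result into short oligos."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            bases.append(BASES[(byte >> shift) & 0b11])
    seq = "".join(bases)
    return [seq[i:i + oligo_len] for i in range(0, len(seq), oligo_len)]

def oligos_to_bytes(oligos: list[str]) -> bytes:
    """Invert bytes_to_oligos, assuming the oligos arrive in order and error-free."""
    seq = "".join(oligos)
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

The decoder relies on the oligos being in their original order. Since the abstract notes that the molecules are stored in an unordered way, a practical scheme would need some form of addressing on each oligo, which this sketch leaves out.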
“…We do not discuss the details on experimental implementations of DNA storage systems, which involve other important aspects such as data encoding, error correction, and system integration and automation. [11][12][13][14] In design A (Figure 1), the pool of data oligos is virtually organized in the form blocks, tables, rows, and columns. Each data-encoding strand is composed of several domains, including a data payload block surrounded by three address blocks (i.e., PCR primer targets) on both sides.…”
Citation type: mentioning
confidence: 99%