Zi-Yi Gong scite author profile

Over the past decades, remarkable progress has been achieved in phosphoramidite chemistry-based large-scale de novo oligonucleotide synthesis, enabling numerous novel and exciting applications. Among them, de novo genome synthesis and DNA data storage are the most striking. However, to make these two applications more practical, synthesis length, speed, cost, and throughput require vast improvements, which is a difficult challenge for phosphoramidite chemistry. Harnessing the power of enzymes, the recently emerged enzymatic methods provide a competitive route to overcome this challenge. In this review, we first summarize the status of large-scale oligonucleotide synthesis technologies, including the basic methodology and large-scale synthesis approaches, with a special focus on the emerging enzymatic methods. Afterward, we discuss the opportunities and challenges of large-scale oligonucleotide synthesis for de novo genome synthesis and DNA data storage, respectively.


Robust data storage in DNA by de Bruijn graph-based de novo strand assembly

Song, Geng, Gong, et al. 2022

Nat Commun

DNA data storage is a rapidly developing technology with great potential due to its high density, long-term durability, and low maintenance cost. The major technical challenges include various errors, such as strand breaks, rearrangements, and indels that frequently arise during DNA synthesis, amplification, sequencing, and preservation. In this study, a de novo strand assembly algorithm (DBGPS) is developed using de Bruijn graph and greedy path search to meet these challenges. DBGPS shows substantial advantages in handling DNA breaks, rearrangements, and indels. The robustness of DBGPS is demonstrated by accelerated aging, multiple independent data retrievals, deep error-prone PCR, and large-scale simulations. Remarkably, 6.8 MB of data is accurately recovered from a severely corrupted sample that has been treated at 70 °C for 70 days. With DBGPS, we are able to achieve a logical density of 1.30 bits/cycle and a physical density of 295 PB/g.


Robust retrieval of data stored in DNA by de Bruijn graph-based de novo strand assembly

Song, Geng, Gong, et al. 2020

Preprint

High density and long-term durability make DNA a promising data storage medium. However, the DNA data channel is unique in that it has unavoidable ‘data repetitions’ in the form of multiple error-rich strand copies. This multi-copy feature cannot be well harnessed by available codec systems optimized for single-copy media. Furthermore, lacking an effective mechanism to handle base shift issues, these systems perform poorly with indels. Here, we report the efficient reconstruction of DNA strands directly from multiple error-rich sequences, utilizing a De Bruijn Graph-based Greedy Path Search (DBG-GPS) algorithm. DBG-GPS can take advantage of the multi-copy feature for efficient correction of indels as well as substitutions. Error rates as high as 10% can be accurately corrected at a high coding rate of 96.8%. Accurate data recovery from low-quality, deep error-prone PCR products demonstrated the high robustness of DBG-GPS (314 Kb, 12K oligos). Furthermore, DBG-GPS is 50 times faster than the reported clustering and multiple alignment-based methods. The revealed linear decoding complexity makes DBG-GPS a suitable solution for large-scale data storage. DBG-GPS’s capacity with large data was verified by large-scale simulations (300 MB). A Python implementation of DBG-GPS is available at https://switch-codes.coding.net/public/switch-codes/DNA-Fountain-De-Bruijn-Decoding/git/files.
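The general idea of rebuilding one strand from many error-rich copies with a de Bruijn graph and a greedy path search can be illustrated with a toy sketch. This is not the authors' DBG-GPS implementation, and it omits whatever indexing and path verification the published method uses. The function names, the `start` anchor, and the parameters `k` and `min_count` are assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer observed across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_assemble(reads, start, length, k=6, min_count=2):
    """Greedily extend `start` one base at a time through frequent k-mers.

    Assumptions (hypothetical, for illustration): `start` is a known
    anchor of at least k-1 bases and `length` is the expected strand length.
    """
    counts = count_kmers(reads, k)
    # Rare k-mers are most likely caused by errors, so drop them.
    solid = {kmer: c for kmer, c in counts.items() if c >= min_count}
    strand = start
    while len(strand) < length:
        prefix = strand[-(k - 1):]
        # Score each possible next base by the count of the k-mer it forms.
        best_count, best_base = max(
            (solid.get(prefix + base, 0), base) for base in "ACGT"
        )
        if best_count == 0:
            break  # dead end: no solid k-mer extends the current path
        strand += best_base
    return strand


# Toy data: two clean copies plus one substitution, one deletion, one insertion.
truth = "ACGTTGCATCGGATACCTAG"
reads = [
    truth,
    truth,
    "ACGTTGCAACGGATACCTAG",   # substitution
    "ACGTTGCATCGGTACCTAG",    # deletion
    "ACGTGTGCATCGGATACCTAG",  # insertion
]
recovered = greedy_assemble(reads, start=truth[:5], length=len(truth))
```

Because the assembly works on k-mers rather than aligned positions, an insertion or deletion in one copy only produces a few rare k-mers that are filtered out, instead of shifting every downstream base. A greedy walk like this one can still go wrong when a strand contains repeated (k-1)-mers, which is one reason a real decoder needs more than this sketch.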


Engineering DNA Materials for Sustainable Data Storage Using a DNA Movable-Type System

et al. 2023



Zi-Yi Gong

Large-Scale de novo Oligonucleotide Synthesis for Whole-Genome Synthesis and Data Storage: Challenges and Opportunities
