Metagenomic studies suggest that only a small fraction of viruses existing in nature have been identified and studied. Characterization of unknown viral genomes is hindered by the many nonspecific genomes populating any virus sample. Here, we report a new platform integrating drop-based microfluidics and computational analysis that enables purification of any single viral species from a complex, mixed virus sample and the retrieval of complete genome sequences. Using this platform, we retrieve the genome sequence of a 5243 bp dsDNA virus that was spiked into wastewater with >96% sequence coverage and >99.8% identity. This platform holds great potential for virus discovery as it allows enrichment and sequencing of previously undescribed viruses as well as known viruses.
Keywords: Microfluidics; Microemulsions; Viruses; Genome sequencing; High throughput screening
* Corresponding author: Department of Physics, School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA, weitz@seas.harvard.edu.
HHS Public Access
Author Manuscript
Viruses are the most abundant biological entities on Earth and significantly impact living organisms by causing diseases and shaping their immune systems. Despite their ubiquity and influence, fewer than 0.01% of viruses have been sequenced [1]. Establishment of an extensive virus database is crucial to identify potential emerging infectious diseases [2] and to improve our understanding of virus diversity, ecology, adaptation and evolution. The major roadblock to characterizing unknown viral genomes is the lack of technologies enabling efficient enrichment of various types of viruses. Enrichment of a target viral species is required for the most common virus samples, such as environmental samples, which generally harbor diverse viral populations [3], or clinical samples, where the number of viral genomes is often lower than the number of host genomes and the virions are localized to a small subset of cells in the tissue. An enrichment step is particularly crucial for viral genome sequencing because other abundant DNA in the sample, such as genomic fragments of host DNA, is often much larger than viral genomes and dominates the sequence space even with a small number of copies. Traditional enrichment methods for viruses include cell culture [4], immunoscreening [5] followed by sequence-independent PCR [6], and differential hybridization [7]. All of these methods are labor-intensive, inefficient and, more importantly, only applicable to a limited subset of viruses. Recently, a flow cytometric method was developed to disperse single virions into microwells and obtain their individual genome sequences [8]. However, this method does not employ a selection strategy. A selection strategy allows efficient usage of sequencing power and enables rare virus sequencing with a reasonable sequencing cost and time.
In this paper, we report the development of a platform to isolate and sequence any single viral species from a large ge...