PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single DNA molecule in real time without amplification. Its sequencing technology is novel in that it enables the direct observation of DNA synthesis by DNA polymerase. PacBio RS II offers four major advantages over other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and the simultaneous capability of epigenetic characterization. These advantages overcome the difficulty of sequencing genomic regions with high or low G+C content, tandem repeats, and interspersed repeats. Moreover, PacBio RS II is well suited to whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetic characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is an effective tool not only in the basic biological sciences but also in the medical/clinical setting.
In our previous work, detailed deletion mapping of ovarian cancers indicated that a 300-kb region of chromosome 6q27 was likely to contain one or more putative tumor suppressor genes associated with development of this type of cancer. DNA sequencing in the region disclosed the presence of AF-6, a gene that had been identified as the ALL-1 fusion partner involved in acute myeloid leukemias with t(6;11)(q27;q23) translocations. In the work reported here, we determined the complete genomic sequence of the AF-6 gene, including exon-intron boundaries, and found six DNA polymorphisms. One of them, an insertion/deletion polymorphism, determined the presence or absence of seven amino acids in the AF-6 product. We also identified two alternatively spliced forms of the gene; the two novel transcripts would encode additional C-terminal peptides in comparison to the reported protein. Sequencing of seven cosmid clones that covered the entire gene revealed 32 exons (not including one exon involved in the insertion/deletion polymorphism), spanning approximately 140 kb of genomic DNA. These results may contribute to an understanding of the mechanism causing chromosomal translocations in leukemic cells.
We previously determined the nucleotide sequence of, and characterized, the 685-kb proximal half of CEPH YAC936c1, which corresponds to a portion of human chromosome 3p21.3. In the study reported here, we characterized the remaining 515 kb of this YAC clone, corresponding to the telomeric half of its human insert. The newly sequenced region contained a total of ten genes, including six reported previously: phospholipase C delta 1 (PLCD1), human activin receptor type IIB (hActR-IIB), organic cation transporter-like 1 (OCTL1), organic cation transporter-like 2 (OCTL2), oxidative stress response 1 (OSR1), and human xylulokinase-like protein (XYLB). The remaining four genes present in the telomeric region included two known genes, MyD88 and ACAA, and two novel genes. One of the novel sequences (designated ENGL) was found to encode an amino-acid sequence homologous to the family of DNA/RNA endonucleases, especially endonuclease G. The other gene, F56, showed no significant homology to any known genes. These results provide complete physical and transcriptional maps of the 1200-kb region of 3p present in YAC 936c1.
The first complete genome sequence of Lactobacillus curvatus was determined by PacBio RS II. The single circular chromosome (1,848,756 bp, G+C content of 42.1%) of L. curvatus FBA2, isolated from fermented vegetables, contained low G+C regions (26.9% minimum) and 43 sets of >1,000-bp identical sequence pairs. No plasmids were detected.
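The G+C figures quoted above (a genome-wide percentage and a regional minimum) can be illustrated with a minimal Python sketch. The function names, the 1,000-bp window, and the 500-bp step are assumptions chosen for illustration; the abstract does not state how the low G+C regions were computed, and this sketch treats the sequence as linear rather than circular.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of seq, counting only A, C, G and T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Avoid division by zero for empty or fully ambiguous sequences.
    return 100.0 * gc / total if total else 0.0


def min_window_gc(seq: str, window: int = 1000, step: int = 500):
    """Return (start, gc_percent) for the window with the lowest G+C.

    window and step are hypothetical defaults, not values from the study.
    Windows do not wrap around the end of the sequence.
    """
    if len(seq) <= window:
        return (0, gc_content(seq))
    best = None
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_content(seq[start:start + window])
        if best is None or gc < best[1]:
            best = (start, gc)
    return best


if __name__ == "__main__":
    # Toy sequence: an A-only stretch sits between two G/C-rich blocks.
    toy = "GGGG" + "AAAA" + "GGCC"
    print(gc_content(toy))
    print(min_window_gc(toy, window=4, step=4))
```

On a real assembly, the sequence string would be read from a FASTA file first; that step is omitted here to keep the sketch self-contained.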