Jan‐Fang Cheng scite author profile

Insights into the phylogeny and coding potential of microbial dark matter

Genome sequencing enhances our understanding of the biological world by providing blueprints for the evolutionary and functional diversity that shapes the biosphere. However, microbial genomes that are currently available are of limited phylogenetic breadth, owing to our historical inability to cultivate most microorganisms in the laboratory. We apply single-cell genomics to target and sequence 201 uncultivated archaeal and bacterial cells from nine diverse habitats belonging to 29 major mostly uncharted branches of the tree of life, so-called 'microbial dark matter'. With this additional genomic information, we are able to resolve many intra- and inter-phylum-level relationships and to propose two new superphyla. We uncover unexpected metabolic features that extend our understanding of biology and challenge established boundaries between the three domains of life. These include a novel amino acid use for the opal stop codon, an archaeal-type purine synthesis in Bacteria and complete sigma factors in Archaea similar to those in Bacteria. The single-cell genomes also served to phylogenetically anchor up to 20% of metagenomic reads in some habitats, facilitating organism-level interpretation of ecosystem function. This study greatly expands the genomic representation of the tree of life and provides a systematic step towards a better understanding of biological evolution on our planet.

Microorganisms are the most diverse and abundant cellular life forms on Earth, occupying every possible metabolic niche. The large majority of these organisms have not been obtained in pure culture and we have only recently become aware of their presence mainly through cultivation-independent molecular surveys based on conserved marker genes (chiefly small subunit ribosomal RNA; SSU rRNA) or through shotgun sequencing (metagenomics) 1,2.
As an increasing number of environments are deeply sequenced using next-generation technologies, diversity estimates for Bacteria and Archaea continue to rise, with the number of microbial 'species' predicted to reach well into the millions 3. According to SSU rRNA-based phylogeny, these fall into at least 60 major lines of descent (phyla or divisions) within the bacterial and archaeal domains 4

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

Pattyn

et al. 2011

We present the 207 Mb genome sequence of the outcrosser Arabidopsis lyrata, which diverged from the self-fertilizing species A. thaliana about 10 million years ago. It is generally assumed that the much smaller A. thaliana genome, which is only 125 Mb, constitutes the derived state for the family. Apparent genome reduction in this genus can be partially attributed to the loss of DNA from large-scale rearrangements, but the main cause lies in the hundreds of thousands of small deletions found throughout the genome. These occurred primarily in non-coding DNA and transposons, but protein-coding multi-gene families are smaller in A. thaliana as well. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome.

A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

Hugenholtz

Mavromatis

et al. 2009

Nature

Sequencing of bacterial and archaeal genomes has revolutionized our understanding of the many roles played by microorganisms1. There are now nearly 1,000 completed bacterial and archaeal genomes available2, most of which were chosen for sequencing on the basis of their physiology. As a result, the perspective provided by the currently available genomes is limited by a highly biased phylogenetic distribution3–5. To explore the value added by choosing microbial genomes for sequencing on the basis of their evolutionary relationships, we have sequenced and analysed the genomes of 56 culturable species of Bacteria and Archaea selected to maximize phylogenetic coverage. Analysis of these genomes demonstrated pronounced benefits (compared to an equivalent set of genomes randomly selected from the existing database) in diverse areas including the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. Our results strongly support the need for systematic ‘phylogenomic’ efforts to compile a phylogeny-driven ‘Genomic Encyclopedia of Bacteria and Archaea’ in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.

Dicer, Drosha, and Outcomes in Patients with Ovarian Cancer

et al. 2008

Loss of silent-chromatin looping and impaired imprinting of DLX5 in Rett syndrome

et al. 2004

Mutations in MECP2 are associated with Rett syndrome, an X-linked neurodevelopmental disorder. To identify genes targeted by Mecp2, we sequenced 100 in vivo Mecp2-binding sites in mouse brain. Several sequences mapped to an imprinted gene cluster on chromosome 6, including Dlx5 and Dlx6, whose transcription was roughly two times greater in brains of Mecp2-null mice compared with those of wild-type mice. The maternally expressed gene DLX5 showed a loss of imprinting in lymphoblastoid cells from individuals with Rett syndrome. Because Dlx5 regulates production of enzymes that synthesize gamma-aminobutyric acid (GABA), loss of imprinting of Dlx5 may alter GABAergic neuron activity in individuals with Rett syndrome. In mouse brain, Dlx5 imprinting was relaxed, yet Mecp2-mediated silent-chromatin structure existed at the Dlx5-Dlx6 locus in brains of wild-type, but not Mecp2-null, mice. Mecp2 targeted histone deacetylase 1 to a sharply defined, approximately 1-kb region at the Dlx5-Dlx6 locus and promoted repressive histone methylation at Lys9 at this site. Chromatin immunoprecipitation-combined loop assays showed that Mecp2 mediated the silent chromatin-derived 11-kb chromatin loop at the Dlx5-Dlx6 locus. This loop was absent in chromatin of brains of Mecp2-null mice, and Dlx5-Dlx6 interacted with far distant sequences, forming distinct active chromatin-associated loops. These results show that formation of a silent-chromatin loop is a new mechanism underlying gene regulation by Mecp2.

Assembling the Marine Metagenome, One Cell at a Time

et al. 2009

The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants and were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation. 
We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured taxa from a complex microbial community of marine bacterioplankton. A combination of single cell genomics and metagenomics enabled us to analyze the genome content, metabolic adaptations, and biogeography of these taxa.

Cryptic inoviruses revealed as pervasive in bacteria and archaea across Earth’s biomes

et al. 2019

Bacteriophages from the Inoviridae family (inoviruses) are characterized by their unique morphology, genome content and infection cycle. One of the most striking features of inoviruses is their ability to establish a chronic infection whereby the viral genome resides within the cell in either an exclusively episomal state or integrated into the host chromosome and virions are continuously released without killing the host. To date, a relatively small number of inovirus isolates have been extensively studied, either for biotechnological applications, such as phage display, or because of their effect on the toxicity of known bacterial pathogens including Vibrio cholerae and Neisseria meningitidis. Here, we show that the current 56 members of the Inoviridae family represent a minute fraction of a highly diverse group of inoviruses. Using a machine learning approach leveraging a combination of marker gene and genome features, we identified 10,295 inovirus-like sequences from microbial genomes and metagenomes. Collectively, our results call for reclassification of the current Inoviridae family into a viral order including six distinct proposed families associated with nearly all bacterial phyla across virtually every ecosystem. Putative inoviruses were also detected in several archaeal genomes, suggesting that, collectively, members of this supergroup infect hosts across the domains Bacteria and Archaea. Finally, we identified an expansive diversity of inovirus-encoded toxin–antitoxin and gene expression modulation systems, alongside evidence of both synergistic (CRISPR evasion) and antagonistic (superinfection exclusion) interactions with co-infecting viruses, which we experimentally validated in a Pseudomonas model. Capturing this previously obscured component of the global virosphere may spark new avenues for microbial manipulation approaches and innovative biotechnological applications.
