Hi-C contact maps are valuable for genome assembly (Lieberman-Aiden, van Berkum et al. 2009; Burton et al. 2013; Dudchenko et al. 2017). Recently, we developed Juicebox, a system for the visual exploration of Hi-C data (Durand, Robinson et al. 2016), and 3D-DNA, an automated pipeline for using Hi-C data to assemble genomes (Dudchenko et al. 2017). Here, we introduce "Assembly Tools," a new module for Juicebox that provides a point-and-click interface for using Hi-C heatmaps to identify and correct errors in a genome assembly. Together, 3D-DNA and the Juicebox Assembly Tools greatly reduce the cost of accurately assembling complex eukaryotic genomes. To illustrate, we generated de novo assemblies with chromosome-length scaffolds for three mammals: the wombat, Vombatus ursinus (3.3 Gb); the Virginia opossum, Didelphis virginiana (3.3 Gb); and the raccoon, Procyon lotor (2.5 Gb). The only inputs for each assembly were Illumina reads from a short-insert DNA-Seq library (300 million reads, maximum read length 2x150 bases) and an in situ Hi-C library (100 million reads, maximum read length 2x150 bases), which cost less than $1000.
We investigated genome folding across the eukaryotic tree of life. We find two types of three-dimensional (3D) genome architectures at the chromosome scale. Each type appears and disappears repeatedly during eukaryotic evolution. The type of genome architecture that an organism exhibits correlates with the absence of condensin II subunits. Moreover, condensin II depletion converts the architecture of the human genome to a state resembling that seen in organisms such as fungi or mosquitoes. In this state, centromeres cluster together at nucleoli, and heterochromatin domains merge. We propose a physical model in which lengthwise compaction of chromosomes by condensin II during mitosis determines chromosome-scale genome architecture, with effects that are retained during the subsequent interphase. This mechanism likely has been conserved since the last common ancestor of all eukaryotes.