Summary
Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association.
We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
Tyrosine kinase domain mutations are a common cause of acquired clinical resistance to tyrosine kinase inhibitors (TKIs) used to treat cancer, including the FLT3 inhibitor quizartinib. Mutations of kinase “gatekeeper” residues, which control access to an allosteric pocket adjacent to the ATP-binding site, have been frequently implicated in TKI resistance. The molecular underpinnings of gatekeeper mutation-mediated resistance are incompletely understood. We report the first co-crystal structure of FLT3 with the TKI quizartinib, which demonstrates that quizartinib binding relies on essential edge-to-face aromatic interactions with the gatekeeper F691 residue and with F830 within the highly conserved DFG motif in the activation loop. This reliance makes quizartinib critically vulnerable to gatekeeper and activation loop substitutions while minimizing the impact of mutations elsewhere. Moreover, we identify PLX3397, a novel FLT3 inhibitor that retains activity against the F691L mutant due to a binding mode that depends less vitally on specific interactions with the gatekeeper position.
Higher-order chromatin structure arises from the combinatorial physical interactions of many genomic loci. To investigate this aspect of genome architecture we developed Pore-C, which couples chromatin conformation capture with Oxford Nanopore Technologies (ONT) long reads to directly sequence multi-way chromatin contacts without amplification. In GM12878, we demonstrate that the pairwise interaction patterns implicit in Pore-C multi-way contacts are consistent with gold-standard Hi-C pairwise contact maps at the compartment, TAD, and loop scales. In addition, Pore-C detects higher-order chromatin structure at 18.5-fold higher efficiency and greater fidelity than SPRITE, a previously published higher-order chromatin profiling technology. We demonstrate Pore-C's ability to detect and visualize multi-locus hubs associated with histone locus bodies and active/inactive nuclear compartments in GM12878. In the breast cancer cell line HCC1954, Pore-C contacts enable the reconstruction of complex and aneuploid rearranged alleles spanning multiple megabases and chromosomes. Finally, we apply Pore-C to generate a chromosome-scale de novo assembly of the HG002 genome. Our results establish Pore-C as the simplest and most scalable assay for the genome-wide assessment of combinatorial chromatin interactions, with additional applications for cancer rearrangement reconstruction and de novo genome assembly.
Chromatin structure | Structural variation | Long read sequencing | De novo genome assembly | Cancer genomics
Correspondence: mski@mskilab.org