Genome-wide association studies suggest that common genetic variants explain only a small fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability1,2. While DNA sequencing costs have fallen dramatically3, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions ('exomes'), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of twelve humans. These include eight HapMap individuals representing three populations4, and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS)5. We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases (Mb) of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for monogenic disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of nonsynonymous variants by predicted functional impact.
We demonstrate the first successful application of exome sequencing to discover the gene for a rare, Mendelian disorder of unknown cause, Miller syndrome (OMIM %263750). For four affected individuals in three independent kindreds, we captured and sequenced coding regions to a mean coverage of 40X, and sufficient depth to call variants at ~97% of each targeted exome. Filtering against public SNP databases and a small number of HapMap exomes for genes with two novel variants in each of the four cases identified a single candidate gene, DHODH, which encodes a key enzyme in the pyrimidine de novo biosynthesis pathway. Sanger sequencing confirmed the presence of DHODH mutations in three additional families with Miller syndrome. Exome sequencing of a small number of unrelated, affected individuals is a powerful, efficient strategy for identifying the genes underlying rare Mendelian disorders and will likely transform the genetic analysis of monogenic traits.
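The filtering step described above (discard variants already present in public databases or control exomes, then keep genes with at least two novel variants in every affected individual) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the function name `candidate_genes`, the input layout, and the toy gene and variant identifiers are all hypothetical.

```python
from collections import defaultdict


def candidate_genes(case_variants, known_variants, min_novel_per_case=2):
    """Return genes carrying >= min_novel_per_case novel variants in EVERY case.

    case_variants: dict mapping case id -> list of (gene, variant_id) tuples.
    known_variants: set of variant ids seen in databases or control exomes.
    (Both layouts are assumptions made for this sketch.)
    """
    genes_per_case = []
    for variants in case_variants.values():
        novel_by_gene = defaultdict(set)
        for gene, variant_id in variants:
            if variant_id not in known_variants:  # drop previously seen variants
                novel_by_gene[gene].add(variant_id)
        genes_per_case.append(
            {g for g, ids in novel_by_gene.items() if len(ids) >= min_novel_per_case}
        )
    if not genes_per_case:
        return set()
    # A candidate must pass the filter in all affected individuals.
    return set.intersection(*genes_per_case)


# Toy data with made-up identifiers, for illustration only.
toy_known = {"v_common_1", "v_common_2"}
toy_cases = {
    "case1": [("GENE_A", "a1"), ("GENE_A", "a2"), ("GENE_B", "v_common_1")],
    "case2": [("GENE_A", "a3"), ("GENE_A", "a4"), ("GENE_B", "b1"), ("GENE_B", "b2")],
}
toy_result = candidate_genes(toy_cases, toy_known)
```

Requiring two novel variants per gene reflects a recessive model; under a dominant model one would presumably set `min_novel_per_case=1`.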
Exome sequencing - the targeted sequencing of the subset of the human genome that is protein coding - is a powerful and cost-effective new tool for dissecting the genetic basis of diseases and traits that have proved to be intractable to conventional gene-discovery strategies. Over the past 2 years, experimental and analytical approaches relating to exome sequencing have established a rich framework for discovering the genes underlying unsolved Mendelian disorders. Additionally, exome sequencing is being adapted to explore the extent to which rare alleles explain the heritability of complex diseases and health-related traits. These advances also set the stage for applying exome and whole-genome sequencing to facilitate clinical diagnosis and personalized disease-risk profiling.
Evidence for the etiology of autism spectrum disorders (ASD) has consistently pointed to a strong genetic component complicated by substantial locus heterogeneity1,2. We sequenced the exomes of 20 sporadic cases of ASD and their parents, reasoning that these families would be enriched for de novo mutations of major effect. We identified 21 de novo mutations, of which 11 were protein-altering. Protein-altering mutations were significantly enriched for changes at highly conserved residues. We identified potentially causative de novo events in 4/20 probands, particularly among more severely affected individuals, in FOXP1, GRIN2B, SCN1A, and LAMC3. In the FOXP1 mutation carrier, we also observed a rare inherited CNTNAP2 mutation and provide functional support for a multihit model for disease risk3. Our results demonstrate that trio-based exome sequencing is a powerful approach for identifying novel candidate genes for ASD and suggest that de novo mutations may contribute substantially to the genetic risk for ASD.
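The core idea of trio-based de novo discovery can be illustrated with a minimal sketch: a variant is a candidate de novo event when the child is heterozygous and neither parent carries the alternate allele. The function name, the dictionary layout, and the toy calls below are assumptions for illustration; a real analysis would also need depth and quality filters and independent validation, which are omitted here.

```python
def candidate_de_novo(trio_calls):
    """Return variant ids where the child is heterozygous and both parents are homozygous reference.

    trio_calls: iterable of dicts with keys 'variant', 'child', 'mother', 'father';
    genotype values are alternate-allele counts (0, 1 or 2). This layout is hypothetical.
    """
    return [
        call["variant"]
        for call in trio_calls
        if call["child"] == 1 and call["mother"] == 0 and call["father"] == 0
    ]


# Toy genotype calls with made-up variant ids.
toy_trio = [
    {"variant": "var1", "child": 1, "mother": 0, "father": 0},  # candidate de novo
    {"variant": "var2", "child": 1, "mother": 1, "father": 0},  # inherited from mother
    {"variant": "var3", "child": 0, "mother": 0, "father": 0},  # absent in child
]
toy_de_novo = candidate_de_novo(toy_trio)
```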
We demonstrate the successful application of exome sequencing1–3 to discover a gene for an autosomal dominant disorder, Kabuki syndrome (OMIM %147920). The exomes of ten unrelated probands were subjected to massively parallel sequencing. After filtering against SNP databases, there was no compelling candidate gene containing novel variants in all affected individuals. Less stringent filtering criteria permitted modest genetic heterogeneity or missing data, but identified multiple candidate genes. However, genotypic and phenotypic stratification highlighted MLL2, a Trithorax-group histone methyltransferase4, in which seven probands had novel nonsense or frameshift mutations. Follow-up Sanger sequencing detected MLL2 mutations in two of the three remaining cases, and in 26 of 43 additional cases. In families where parental DNA was available, the mutation was confirmed to be de novo (n = 12) or transmitted (n = 2) in concordance with phenotype. Our results strongly suggest that mutations in MLL2 are a major cause of Kabuki syndrome.
Haplotype information is essential to the complete description and interpretation of genomes1, genetic diversity2 and genetic ancestry3. Although individual human genome sequencing is increasingly routine4, nearly all such genomes are unresolved with respect to haplotype. Here we combine the throughput of massively parallel sequencing5 with the contiguity information provided by large-insert cloning6 to experimentally determine the haplotype-resolved genome of a South Asian individual. A single fosmid library was split into a modest number of pools, each providing ~3% physical coverage of the diploid genome. Sequencing of each pool yielded reads overwhelmingly derived from only one homologous chromosome at any given location. These data were combined with whole-genome shotgun sequence to directly phase 94% of ascertained heterozygous single nucleotide polymorphisms (SNPs) into long haplotype blocks (N50 of 386 kilobases (kbp)). This method also facilitates the analysis of structural variation, for example, to anchor novel insertions7,8 to specific locations and haplotypes.
The high quality of the human reference genome derives from the hierarchical sequencing of large-insert clones, such that the assembly corresponding to each clone represents a single haplotype9. One of the first 'personal genomes' exploited clone-based mate pairing and long, accurate Sanger reads to resolve variants into haplotype blocks (N50 of 350 kbp; that is, 50% of resolved sequence is within blocks of at least 350 kbp)1. Although new technologies5 have subsequently enabled >1,000-fold reduction in genome sequencing costs, the short read-lengths and paucity of contiguity information are such that it remains challenging to determine haplotypes at a genome-wide scale. Genomic phase, the assignment of alleles to homologous chromosomes, was determined for SNPs using mate-
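The N50 statistic quoted for haplotype blocks in the passage above (half of the resolved sequence lies in blocks at least that long) can be computed as in the following sketch. The function name `n50` and the example block lengths are illustrative assumptions, not values from the study.

```python
def n50(block_lengths):
    """Return the largest length L such that blocks of length >= L
    together cover at least half of the total resolved sequence."""
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("block_lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length  # accumulate from the longest block downward
        if running >= half_total:
            return length


# Made-up block lengths (in kbp) for illustration.
toy_n50 = n50([100, 200, 300, 400])
```

With these toy lengths the total is 1,000 kbp; the 400 and 300 kbp blocks together reach 700 kbp, which passes the halfway point, so the N50 is 300 kbp.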
To catalog protein-altering mutations that may drive the development of prostate cancers and their progression to metastatic disease systematically, we performed whole-exome sequencing of 23 prostate cancers derived from 16 different lethal metastatic tumors and three high-grade primary carcinomas. All tumors were propagated in mice as xenografts, designated the LuCaP series, to model phenotypic variation, such as responses to cancer-directed therapeutics. Although corresponding normal tissue was not available for most tumors, we were able to take advantage of increasingly deep catalogs of human genetic variation to remove most germline variants. On average, each tumor genome contained ∼200 novel nonsynonymous variants, of which the vast majority were specific to individual carcinomas. A subset of genes was recurrently altered across tumors derived from different individuals, including TP53, DLK2, GPC6, and SDF4. Unexpectedly, three prostate cancer genomes exhibited substantially higher mutation frequencies, with 2,000-4,000 novel coding variants per exome. A comparison of castration-resistant and castration-sensitive pairs of tumor lines derived from the same prostate cancer highlights mutations in the Wnt pathway as potentially contributing to the development of castration resistance. Collectively, our results indicate that point mutations arising in coding regions of advanced prostate cancers are common but, with notable exceptions, very few genes are mutated in a substantial fraction of tumors. We also report a previously undescribed subtype of prostate cancers exhibiting "hypermutated" genomes, with potential implications for resistance to cancer therapeutics. Our results also suggest that increasingly deep catalogs of human germline variation may challenge the necessity of sequencing matched tumor-normal pairs.