Genome-wide association studies suggest that common genetic variants explain only a small fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability [1,2]. While DNA sequencing costs have fallen dramatically [3], they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions ('exomes'), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of twelve humans. These include eight HapMap individuals representing three populations [4], and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS) [5]. We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases (Mb) of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for monogenic disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of nonsynonymous variants by predicted functional impact.
The incorporation of genomics into medicine is stimulating interest in the return of incidental findings (IFs) from exome and genome sequencing. However, no large-scale study has yet estimated the number of expected actionable findings per individual; therefore, we classified actionable pathogenic single-nucleotide variants in 500 European- and 500 African-descent participants randomly selected from the National Heart, Lung, and Blood Institute Exome Sequencing Project. The 1,000 individuals were screened for variants in 114 genes selected by an expert panel for their association with medically actionable genetic conditions possibly undiagnosed in adults. Among the 1,000 participants, 585 instances of 239 unique variants were identified as disease causing in the Human Gene Mutation Database (HGMD). The primary literature supporting the variants' pathogenicity was reviewed. Of the identified IFs, only 16 unique autosomal-dominant variants in 17 individuals were assessed to be pathogenic or likely pathogenic, and one participant had two pathogenic variants for an autosomal-recessive disease. Furthermore, one pathogenic and four likely pathogenic variants not listed as disease causing in HGMD were identified. These data can provide an estimate of the frequency (∼3.4% for European descent and ∼1.2% for African descent) of the high-penetrance actionable pathogenic or likely pathogenic variants in adults. The 23 participants with pathogenic or likely pathogenic variants were disproportionately of European (17) versus African (6) descent. The process of classifying these variants underscores the need for a more comprehensive and diverse centralized resource to provide curated information on pathogenicity for clinical use to minimize health disparities in genomic medicine.
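The frequency estimates quoted in this abstract follow directly from the reported counts (17 of 500 European-descent and 6 of 500 African-descent participants). The short sketch below reproduces that arithmetic. The Wilson score interval is an illustrative addition that the abstract does not report, and the function names are hypothetical.

```python
from math import sqrt


def carrier_frequency(carriers: int, total: int) -> float:
    """Fraction of participants carrying a pathogenic or likely pathogenic variant."""
    if total <= 0:
        raise ValueError("total must be positive")
    return carriers / total


def wilson_interval(carriers: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval (an assumption; not reported in the abstract)."""
    p = carrier_frequency(carriers, total)
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Counts as stated in the abstract: 23 participants in total, 17 European and 6 African descent.
counts = {"European descent": (17, 500), "African descent": (6, 500)}

for group, (carriers, total) in counts.items():
    freq = carrier_frequency(carriers, total)
    low, high = wilson_interval(carriers, total)
    print(f"{group}: {freq:.1%} (Wilson 95% interval {low:.1%} to {high:.1%})")
```

With only 500 participants per group, the intervals are wide, which is one reason to read the two percentages as rough estimates rather than precise population frequencies.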
Recommendations for laboratories to report incidental findings from genomic tests have stimulated interest in such results. In order to investigate the criteria and processes for assigning the pathogenicity of specific variants and to estimate the frequency of such incidental findings in patients of European and African ancestry, we classified potentially actionable pathogenic single-nucleotide variants (SNVs) in all 4300 European- and 2203 African-ancestry participants sequenced by the NHLBI Exome Sequencing Project (ESP). We considered 112 gene-disease pairs selected by an expert panel as associated with medically actionable genetic disorders that may be undiagnosed in adults. The resulting classifications were compared to classifications from other clinical and research genetic testing laboratories, as well as with in silico pathogenicity scores. Among European-ancestry participants, 30 of 4300 (0.7%) had a pathogenic SNV and six (0.1%) had a disruptive variant that was expected to be pathogenic, whereas 52 (1.2%) had likely pathogenic SNVs. For African-ancestry participants, six of 2203 (0.3%) had a pathogenic SNV and six (0.3%) had an expected pathogenic disruptive variant, whereas 13 (0.6%) had likely pathogenic SNVs. Genomic Evolutionary Rate Profiling mammalian conservation score and the Combined Annotation Dependent Depletion summary score of conservation, substitution, regulation, and other evidence were compared across pathogenicity assignments and appear to have utility in variant classification. This work provides a refined estimate of the burden of adult onset, medically actionable incidental findings expected from exome sequencing, highlights challenges in variant classification, and demonstrates the need for a better curated variant interpretation knowledge base.
Freeman-Sheldon syndrome, or distal arthrogryposis type 2A (DA2A), is an autosomal-dominant condition caused by mutations in MYH3 and characterized by multiple congenital contractures of the face and limbs and normal cognitive development. We identified a subset of five individuals who had been putatively diagnosed with "DA2A with severe neurological abnormalities" and for whom congenital contractures of the limbs and face, hypotonia, and global developmental delay had resulted in early death in three cases; this is a unique condition that we now refer to as CLIFAHDD syndrome. Exome sequencing identified missense mutations in the sodium leak channel, non-selective (NALCN) in four families affected by CLIFAHDD syndrome. We used molecular-inversion probes to screen for NALCN in a cohort of 202 distal arthrogryposis (DA)-affected individuals as well as concurrent exome sequencing of six other DA-affected individuals, thus revealing NALCN mutations in ten additional families with "atypical" forms of DA. All 14 mutations were missense variants predicted to alter amino acid residues in or near the S5 and S6 pore-forming segments of NALCN, highlighting the functional importance of these segments. In vitro functional studies demonstrated that NALCN alterations nearly abolished the expression of wild-type NALCN, suggesting that alterations that cause CLIFAHDD syndrome have a dominant-negative effect. In contrast, homozygosity for mutations in other regions of NALCN has been reported in three families affected by an autosomal-recessive condition characterized mainly by hypotonia and severe intellectual disability. Accordingly, mutations in NALCN can cause either a recessive or dominant condition characterized by varied though overlapping phenotypic features, perhaps based on the type of mutation and affected protein domain(s).
The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47-90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use.
Background: Human exome sequencing is a recently developed tool to aid in the discovery of novel coding variants. Now broadly applied, exome sequencing datasets provide a novel opportunity to evaluate the allele frequencies of previously published pathogenic rare variants. Methods and Results: We examined the exome dataset from the NHLBI Exome Sequencing Project (ESP) and compared this dataset with a catalog of 197 previously published rare variants reported as causative of dilated cardiomyopathy (DCM) from familial and sporadic cases. Of these 197, 33 (16.8%) were also present in the ESP database, raising the question of whether they were uncommon polymorphisms. Supporting functional data have been published for 14 of the 33 (42%), suggesting they are unlikely to be false positives. The frequencies of these functional variants in the ESP dataset ranged from 0.02% to 1.33% (median 0.04%), which, when applied as a cut-off to filter variants in a DCM pedigree, identified an additional DCM candidate gene. A greater proportion of sporadic DCM cases had variants that were present in the ESP dataset vs novel variants (i.e., not in ESP; 44% vs 21%, p=0.002), suggesting some of the variants identified as disease causing in sporadic DCM are either false positives or low penetrance alleles in human populations. Conclusions: Rare nonsynonymous variants identified in DCM subjects that are also present at very low frequencies in public databases are likely relevant for DCM. Allele frequencies >0.04% are of less certain pathogenicity, especially if identified in sporadic cases, although this cut-off should be viewed as preliminary.
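The allele-frequency cut-off described in this abstract can be pictured as a simple filter over candidate variants. The sketch below is a minimal illustration under stated assumptions: the `Variant` record, the category labels, and the example variants are hypothetical, and only the 0.04% threshold comes from the abstract, which itself calls that cut-off preliminary.

```python
from dataclasses import dataclass
from typing import Optional

# Median ESP frequency of the functionally supported variants, in percent (from the abstract).
ESP_CUTOFF_PERCENT = 0.04


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for a candidate variant; not a real database schema."""
    name: str
    esp_freq_percent: Optional[float]  # None means the variant is absent from ESP


def classify(variant: Variant, cutoff: float = ESP_CUTOFF_PERCENT) -> str:
    """Bucket a variant by its ESP allele frequency relative to the cut-off."""
    if variant.esp_freq_percent is None:
        return "novel"          # not observed in ESP
    if variant.esp_freq_percent <= cutoff:
        return "rare_in_esp"    # present, but at or below the cut-off
    return "less_certain"       # above the cut-off: pathogenicity less certain


# Made-up example variants, for illustration only.
candidates = [
    Variant("variant_A", None),
    Variant("variant_B", 0.02),
    Variant("variant_C", 1.33),
]

for v in candidates:
    print(v.name, classify(v))
```

Treating the threshold as a parameter rather than a constant reflects the abstract's caution that the cut-off may change as larger reference datasets become available.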
Mitochondrial fatty acid synthesis (mtFAS) is an evolutionarily conserved pathway essential for the function of the respiratory chain and several mitochondrial enzyme complexes. We report here a unique neurometabolic human disorder caused by defective mtFAS. Seven individuals from five unrelated families presented with childhood-onset dystonia, optic atrophy, and basal ganglia signal abnormalities on MRI. All affected individuals were found to harbor recessive mutations in MECR encoding the mitochondrial trans-2-enoyl-coenzyme A-reductase involved in human mtFAS. All six mutations are extremely rare in the general population, segregate with the disease in the families, and are predicted to be deleterious. The nonsense c.855T>G (p.Tyr285*), c.247_250del (p.Asn83Hisfs*4), and splice site c.830+2_830+3insT mutations lead to C-terminal truncation variants of MECR. The missense c.695G>A (p.Gly232Glu), c.854A>G (p.Tyr285Cys), and c.772C>T (p.Arg258Trp) mutations involve conserved amino acid residues, are located within the cofactor binding domain, and are predicted by structural analysis to have a destabilizing effect. Yeast modeling and complementation studies validated the pathogenicity of the MECR mutations. Fibroblast cell lines from affected individuals displayed reduced levels of both MECR and lipoylated proteins as well as defective respiration. These results suggest that mutations in MECR cause a distinct human disorder of the mtFAS pathway. The observation of decreased lipoylation raises the possibility of a therapeutic strategy.
Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine.