Summary
Background: Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount.
Methods: The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team.
Findings: Around 80 000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation.
Interpretation: Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene–phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial.
Funding: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health.
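The variant-prioritisation idea in this abstract (keep only rare, protein-altering variants in known developmental disorder genes, then use parental data to retain de novo calls) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Variant` fields, the two-gene list, and the simplified de novo rule are hypothetical, and the sketch does not reproduce the DDD pipeline. Segregating inherited variants, which the study also considered, are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    child_alt: int      # alternate-allele count in the child (0, 1 or 2)
    mother_alt: int     # alternate-allele count in the mother
    father_alt: int     # alternate-allele count in the father
    rare: bool          # passes a population-frequency cut-off (assumed precomputed)
    protein_altering: bool


# Hypothetical stand-in for a curated gene-phenotype knowledge base.
KNOWN_DD_GENES = {"ARID1B", "SCN2A"}


def is_de_novo(v: Variant) -> bool:
    """Simplified rule: present in the child, absent from both parents."""
    return v.child_alt > 0 and v.mother_alt == 0 and v.father_alt == 0


def candidate_variants(variants, known_genes=KNOWN_DD_GENES):
    """Return rare, protein-altering de novo variants in known genes."""
    return [
        v for v in variants
        if v.rare and v.protein_altering and v.gene in known_genes and is_de_novo(v)
    ]


example = [
    Variant("ARID1B", 1, 0, 0, True, True),   # kept: de novo in a known gene
    Variant("ARID1B", 1, 1, 0, True, True),   # dropped: inherited from mother
    Variant("TTN", 1, 0, 0, True, True),      # dropped: gene not in the list
    Variant("SCN2A", 1, 0, 0, False, True),   # dropped: not rare
]
print(len(candidate_variants(example)))
```

Sequencing the parents is what makes the `is_de_novo` check possible, which is one way to read the reported reduction in variants needing clinical evaluation when trios are sequenced.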
The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online.
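Because the HPO is organised as classes linked by subclass relations, a common operation is collecting every ancestor of a term, for example to find the phenotypic categories two terms share. The sketch below assumes a toy ontology with placeholder identifiers (the `TOY:` names are invented and are not real HPO IDs) and does not use any HPO file format or API.

```python
from collections import deque

# Toy subclass relations (term -> direct parents). Placeholder names only.
IS_A = {
    "TOY:seizure": ["TOY:nervous_system_physiology"],
    "TOY:nervous_system_physiology": ["TOY:nervous_system"],
    "TOY:nervous_system": ["TOY:phenotypic_abnormality"],
    "TOY:microcephaly": ["TOY:nervous_system", "TOY:head"],
    "TOY:head": ["TOY:phenotypic_abnormality"],
    "TOY:phenotypic_abnormality": [],
}


def ancestors(term, is_a=IS_A):
    """Return all ancestors of `term` by breadth-first traversal of is_a links."""
    seen = set()
    queue = deque(is_a.get(term, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(is_a.get(parent, []))
    return seen


def shared_ancestors(a, b, is_a=IS_A):
    """Return terms that subsume both `a` and `b` (each term counts as its own subsumer)."""
    return (ancestors(a, is_a) | {a}) & (ancestors(b, is_a) | {b})


print(sorted(shared_ancestors("TOY:seizure", "TOY:microcephaly")))
```

A term can have more than one parent (as `TOY:microcephaly` does here), so the structure is a directed acyclic graph, not a tree, and the traversal tracks visited terms to avoid repeating work.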
Incorrect folding of proteins, leading to aggregation and amyloid formation, is associated with a group of highly debilitating medical conditions including Alzheimer's disease and late-onset diabetes. The issue of how unwanted protein association is normally avoided in a living system is particularly significant in the context of the evolution of multidomain proteins, which account for over 70% of all eukaryotic proteins, where the effective local protein concentration in the vicinity of each domain is very high. Here we describe the aggregation kinetics of multidomain protein constructs of immunoglobulin domains and the ability of different homologous domains to aggregate together. We show that aggregation of these proteins is a specific process and that the efficiency of coaggregation between different domains decreases markedly with decreasing sequence identity. Thus, whereas immunoglobulin domains with more than about 70% identity are highly prone to coaggregation, those with less than 30-40% sequence identity do not detectably interact. A bioinformatics analysis of consecutive homologous domains in large multidomain proteins shows that such domains almost exclusively have sequence identities of less than 40%, in other words below the level at which coaggregation is likely to be efficient. We propose that such low sequence identities could have a crucial and general role in safeguarding proteins against misfolding and aggregation.
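The thresholds quoted in this abstract can be made concrete with a short sketch that computes percent identity for two pre-aligned sequences and bins the result. The identity definition used here (matches divided by columns where neither sequence has a gap) is one of several in common use and is an assumption, as are the exact cut-offs of 70% and 40% and the category labels; the abstract itself gives "about 70%" and "30-40%".

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def coaggregation_category(identity: float) -> str:
    """Bin an identity value using assumed cut-offs taken loosely from the abstract."""
    if identity >= 70.0:
        return "coaggregation-prone"
    if identity < 40.0:
        return "unlikely to coaggregate"
    return "intermediate"


identity = percent_identity("ACDE", "ACDF")
print(identity, coaggregation_category(identity))
```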
Purpose: Given the rapid pace of discovery in rare disease genomics, it is likely that improvements in diagnostic yield can be made by systematically reanalysing previously generated genomic sequence data in light of new knowledge.
Methods: We tested this hypothesis in the UK-wide Deciphering Developmental Disorders Study, where in 2014 we reported a diagnostic yield of 27% through whole exome sequencing of 1133 children with severe developmental disorders and their parents. We reanalysed existing data using improved variant calling methodologies, novel variant detection algorithms, updated variant annotation, evidence-based filtering strategies, and newly discovered disease-associated genes.
Results: We are now able to diagnose an additional 182 individuals, taking our overall diagnostic yield to 454/1133 (40%), and another 43 (4%) have a finding of uncertain clinical significance. The majority of these new diagnoses are due to novel developmental disorder-associated genes discovered since our original publication.
Conclusion: This study highlights the importance of coupling large-scale research with clinical practice, and of discussing the possibility of iterative reanalysis and recontact with patients and health professionals at an early stage. We estimate that implementing parent-offspring whole exome sequencing as a first line diagnostic test for developmental disorders would diagnose >50% of patients.
We previously estimated that 42% of patients with severe developmental disorders carry pathogenic de novo mutations in coding sequences. The role of de novo mutations in regulatory elements affecting genes associated with developmental disorders, or other genes, has been essentially unexplored. We identified de novo mutations in three classes of putative regulatory elements in almost 8,000 patients with developmental disorders. Here we show that de novo mutations in highly evolutionarily conserved fetal brain-active elements are significantly and specifically enriched in neurodevelopmental disorders. We identified a significant twofold enrichment of recurrently mutated elements. We estimate that, genome-wide, 1-3% of patients without a diagnostic coding variant carry pathogenic de novo mutations in fetal brain-active regulatory elements and that only 0.15% of all possible mutations within highly conserved fetal brain-active elements cause neurodevelopmental disorders with a dominant mechanism. Our findings represent a robust estimate of the contribution of de novo mutations in regulatory elements to this genetically heterogeneous set of disorders, and emphasize the importance of combining functional and evolutionary evidence to identify regulatory causes of genetic disorders.
There are thousands of rare human disorders caused by a single deleterious, protein-coding genetic variant [1]. However, patients with the same genetic defect can have different clinical presentations [2–4], and some individuals carrying known disease-causing variants can appear unaffected [5]. What explains these differences? Here, we study a cohort of 6,987 children assessed by clinical geneticists to have severe neurodevelopmental disorders, such as global developmental delay and autism, often with abnormalities of other organ systems. While the genetic causes of these neurodevelopmental disorders are expected to be almost entirely monogenic, we show that 7.7% of variance in risk is attributable to inherited common genetic variation. We replicated this genome-wide common variant burden by showing that it is over-transmitted from parents to children with neurodevelopmental disorders in an independent sample of 728 trios from the same cohort. Our common variant signal is significantly positively correlated with genetic predisposition to fewer years of schooling, decreased intelligence, and risk of schizophrenia. We found that common variant risk was not significantly different between individuals with and without a known protein-coding diagnostic variant, suggesting that common variant risk is not confined to patients without a monogenic diagnosis. In addition, previously published common variant scores for autism, height, birth weight, and intracranial volume were all correlated with those traits within our cohort, suggesting that phenotypic expression in individuals with monogenic disorders is affected by the same variants as the general population. Our results demonstrate that common genetic variation affects both overall risk and clinical presentation in neurodevelopmental disorders typically considered to be monogenic.
More than 100,000 genetic variants are classified as disease causing in public databases. However, the true penetrance of many of these rare alleles is uncertain and might be over-estimated by clinical ascertainment. Here, we use data from 379,768 UK Biobank (UKB) participants of European ancestry to assess the pathogenicity and penetrance of putatively clinically important rare variants. Although rare variants are harder to genotype accurately than common variants, we were able to classify as high quality 1,244 of 4,585 (27%) putatively clinically relevant rare (MAF < 1%) variants genotyped on the UKB microarray. We defined as “clinically relevant” variants that were classified as either pathogenic or likely pathogenic in ClinVar or are in genes known to cause two specific monogenic diseases: maturity-onset diabetes of the young (MODY) and severe developmental disorders (DDs). We assessed the penetrance and pathogenicity of these high-quality variants by testing their association with 401 clinically relevant traits. Twenty-seven of the variants were associated with a UKB trait, and we were able to refine the penetrance estimate for some of the variants. For example, the HNF4A c.340C>T (p.Arg114Trp) (GenBank: NM_175914.4) variant associated with diabetes is <10% penetrant by the time an individual is 40 years old. We also observed associations with relevant traits for heterozygous carriers of some rare recessive conditions, e.g., heterozygous carriers of the ERCC4 c.2395C>T (p.Arg799Trp) variant that causes xeroderma pigmentosum were more susceptible to sunburn. Finally, we refute the previous disease association of RNF135 in developmental disorders. In conclusion, this study shows that very large population-based studies will help refine our understanding of the pathogenicity of rare genetic variants.
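Penetrance in a population cohort can be thought of as the fraction of variant carriers who show the trait, and with few carriers the uncertainty around that fraction matters. The sketch below is an illustration only: it uses a Wilson score interval as one reasonable choice (the abstract does not say which method the authors used), and the carrier counts are invented, not taken from the study.

```python
import math


def penetrance(affected_carriers: int, carriers: int) -> float:
    """Point estimate: proportion of carriers who are affected."""
    if carriers <= 0:
        raise ValueError("carriers must be positive")
    return affected_carriers / carriers


def wilson_interval(k: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion k/n (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: 5 affected among 100 carriers.
estimate = penetrance(5, 100)
low, high = wilson_interval(5, 100)
print(f"penetrance={estimate:.2f}, interval=({low:.3f}, {high:.3f})")
```

Population cohorts avoid the upward bias of clinically ascertained families, but the interval shows how wide the plausible range stays when a variant is rare.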
Genome-wide sequencing in a research setting has the potential to reveal health-related information of personal or clinical utility for the study participant. There is increasing pressure to return research findings to participants that may not be related to the project aims, particularly when these could be used to prevent disease. Such secondary, unsolicited or 'incidental findings' (IFs) may be discovered unintentionally when interpreting sequence data, or as the result of a deliberate opportunistic screen. This cross-sectional, web-based survey investigated attitudes of 6944 individuals from 75 countries towards returning IFs from genome research. Participants included four relevant stakeholder groups: 4961 members of the public, 533 genetic health professionals, 843 non-genetic health professionals and 607 genomic researchers who were invited via traditional media, social media and professional e-mail list-serve. Treatability and perceived utility of incidental results were deemed important with 98% of stakeholders personally interested in learning about preventable life-threatening conditions. Although there was a generic interest in receiving genomic information, stakeholders did not expect researchers to opportunistically screen for IFs in a research setting. On many items, genetic health professionals had significantly more conservative views compared with other stakeholders. This finding demonstrates a disconnect between the views of those handling the findings of research and those participating in research. Exploring, evaluating and ultimately addressing this disconnect should form a priority for researchers and clinicians alike. This social sciences study offers the largest dataset, published to date, of attitudes towards issues surrounding the return of IFs from sequencing research.