The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases, and other information for chordate, selected model organism and disease vector genomes. As of release 51 (November 2008), Ensembl fully supports 45 species, and three additional species have preliminary support. New species in the past year include orangutan and six additional low coverage mammalian genomes. Major additions and improvements to Ensembl since our previous report include a major redesign of our website; generation of multiple genome alignments and ancestral sequences using the new Enredo-Pecan-Ortheus pipeline and development of our software infrastructure, particularly to support the Ensembl Genomes project (http://www.ensemblgenomes.org/).
Summary
Background: Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount.
Methods: The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team.
Findings: Around 80 000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation.
Interpretation: Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene–phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial.
Funding: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health.
Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.
Genome-wide sequencing in a research setting has the potential to reveal health-related information of personal or clinical utility for the study participant. There is increasing pressure to return research findings to participants that may not be related to the project aims, particularly when these could be used to prevent disease. Such secondary, unsolicited or 'incidental findings' (IFs) may be discovered unintentionally when interpreting sequence data, or as the result of a deliberate opportunistic screen. This cross-sectional, web-based survey investigated attitudes of 6944 individuals from 75 countries towards returning IFs from genome research. Participants included four relevant stakeholder groups: 4961 members of the public, 533 genetic health professionals, 843 non-genetic health professionals and 607 genomic researchers who were invited via traditional media, social media and professional e-mail list-serve. Treatability and perceived utility of incidental results were deemed important with 98% of stakeholders personally interested in learning about preventable life-threatening conditions. Although there was a generic interest in receiving genomic information, stakeholders did not expect researchers to opportunistically screen for IFs in a research setting. On many items, genetic health professionals had significantly more conservative views compared with other stakeholders. This finding demonstrates a disconnect between the views of those handling the findings of research and those participating in research. Exploring, evaluating and ultimately addressing this disconnect should form a priority for researchers and clinicians alike. This social sciences study offers the largest dataset, published to date, of attitudes towards issues surrounding the return of IFs from sequencing research.
The DECIPHER database (https://decipher.sanger.ac.uk/) is an accessible online repository of genetic variation with associated phenotypes that facilitates the identification and interpretation of pathogenic genetic variation in patients with rare disorders. Contributing to DECIPHER is an international consortium of >200 academic clinical centres of genetic medicine and ≥1600 clinical geneticists and diagnostic laboratory scientists. Information integrated from a variety of bioinformatics resources, coupled with visualization tools, provides a comprehensive set of tools to identify other patients with similar genotype–phenotype characteristics and highlights potentially pathogenic genes. In a significant development, we have extended DECIPHER from a database of just copy-number variants to allow upload, annotation and analysis of sequence variants such as single nucleotide variants (SNVs) and InDels. Other notable developments in DECIPHER include a purpose-built, customizable and interactive genome browser to aid combined visualization and interpretation of sequence and copy-number variation against informative datasets of pathogenic and population variation. We have also introduced several new features to our deposition and analysis interface. This article provides an update to the DECIPHER database, an earlier instance of which has been described elsewhere [Swaminathan et al. (2012) DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders. Hum. Mol. Genet., 21, R37–R44].
Patients with developmental disorders often harbour sub-microscopic deletions or duplications that lead to a disruption of normal gene expression or perturbation in the copy number of dosage-sensitive genes. Clinical interpretation for such patients in isolation is hindered by the rarity and novelty of such disorders. The DECIPHER project (https://decipher.sanger.ac.uk) was established in 2004 as an accessible online repository of genomic and associated phenotypic data with the primary goal of aiding the clinical interpretation of rare copy-number variants (CNVs). DECIPHER integrates information from a variety of bioinformatics resources and uses visualization tools to identify potential disease genes within a CNV. A two-tier access system permits clinicians and clinical scientists to maintain confidential linked anonymous records of phenotypes and CNVs for their patients that, with informed consent, can subsequently be shared with the wider clinical genetics and research communities. Advances in next-generation sequencing technologies are making it practical and affordable to sequence the whole exome/genome of patients who display features suggestive of a genetic disorder. This approach enables the identification of smaller intragenic mutations including single-nucleotide variants that are not accessible even with high-resolution genomic array analysis. This article briefly summarizes the current status and achievements of the DECIPHER project and looks ahead to the opportunities and challenges of jointly analysing structural and sequence variation in the human genome.
Health-related results that are discovered in the process of genomic research should only be returned to research participants after being clinically validated and then delivered and followed up within a health service. Returning such results may be difficult for genomic researchers who are limited by resources or unable to access appropriate clinicians. Raw sequence data could, in theory, be returned instead. This might appear nonsensical as, on its own, it is a meaningless code with no clinical value. Yet, as and when direct-to-consumer genomics services become more widely available (and can be endorsed by independent health professionals and genomic researchers alike), the return of such data could become a realistic proposition. We explore attitudes from <7000 members of the public, genomic researchers, genetic health professionals and non-genetic health professionals and ask participants to suggest what they would do with a raw sequence, if offered it. Results show that 62% of participants were interested in using it to seek out their own clinical interpretation. Whilst we do not propose that raw sequence data should be returned at the moment, we suggest that, should this become feasible in the future, participants of sequencing studies may support this.
Highlights: We created a novel, online survey including 10 short films. The extensive survey validation process involved 19 iterations before the final survey was ready. Focussing on the survey design paid dividends in high response rate and low drop out rate. Complex subject matter was no barrier to participant involvement. Using a film-survey combination was a successful strategy in terms of recruitment.