The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of annotated genomic, transcript and protein sequence records derived from data in public sequence archives and from computation, curation and collaboration (http://www.ncbi.nlm.nih.gov/refseq/). We report here on growth of the mammalian and human subsets, changes to NCBI’s eukaryotic annotation pipeline and modifications affecting transcript and protein records. Recent changes to NCBI’s eukaryotic genome annotation pipeline provide higher throughput, and the addition of RNAseq data to the pipeline results in a significant expansion of the number of transcripts and novel exons annotated on mammalian RefSeq genomes. Recent annotation changes include reporting supporting evidence for transcript records, modification of exon feature annotation and the addition of a structured report of gene and sequence attributes of biological interest. We also describe a revised protein annotation policy for alternatively spliced transcripts with more divergent predicted proteins and we summarize the current status of the RefSeqGene project.
Developmental and epileptic encephalopathy (DEE) is a group of conditions characterized by the co-occurrence of epilepsy and intellectual disability (ID), typically with developmental plateauing or regression associated with frequent epileptiform activity. The cause of DEE remains unknown in the majority of cases. We performed whole-genome sequencing (WGS) in 197 individuals with unexplained DEE and pharmaco-resistant seizures and in their unaffected parents. We focused our attention on de novo mutations (DNMs) and identified candidate genes containing such variants. We sought to identify additional subjects with DNMs in these genes by performing targeted sequencing in another series of individuals with DEE and by mining various sequencing datasets. We also performed meta-analyses to document enrichment of DNMs in candidate genes by leveraging our WGS dataset with those of several DEE and ID series. By combining these strategies, we were able to provide a causal link between DEE and the following genes: NTRK2, GABRB2, CLTC, DHDDS, NUS1, RAB11A, GABBR2, and SNAP25. Overall, we established a molecular diagnosis in 63/197 (32%) individuals in our WGS series. The main cause of DEE in these individuals was de novo point mutations (53/63 solved cases), followed by inherited mutations (6/63 solved cases) and de novo CNVs (4/63 solved cases). De novo missense variants explained a larger proportion of individuals in our series than in other series that were primarily ascertained because of ID. Moreover, these DNMs were more frequently recurrent than those identified in ID series. These observations indicate that the genetic landscape of DEE might be different from that of ID without epilepsy.
Background: Developmental disabilities have diverse genetic causes that must be identified to facilitate precise diagnoses. We describe genomic data from 371 affected individuals, 309 of which were sequenced as proband-parent trios. Methods: Whole-exome sequences (WES) were generated for 365 individuals (127 affected) and whole-genome sequences (WGS) were generated for 612 individuals (244 affected). Results: Pathogenic or likely pathogenic variants were found in 100 individuals (27%), with variants of uncertain significance in an additional 42 (11.3%). We found that a family history of neurological disease, especially the presence of an affected first-degree relative, reduces the pathogenic/likely pathogenic variant identification rate, reflecting both the disease relevance and ease of interpretation of de novo variants. We also found that improvements to genetic knowledge facilitated interpretation changes in many cases. Through systematic reanalyses, we have thus far reclassified 15 variants, with 11.3% of families who initially were found to harbor a VUS and 4.7% of families with a negative result eventually found to harbor a pathogenic or likely pathogenic variant. To further such progress, the data described here are being shared through ClinVar, GeneMatcher, and dbGaP. Conclusions: Our data strongly support the value of large-scale sequencing, especially WGS within proband-parent trios, as both an effective first-choice diagnostic tool and means to advance clinical and research progress related to pediatric neurological disease. Electronic supplementary material: The online version of this article (doi:10.1186/s13073-017-0433-1) contains supplementary material, which is available to authorized users.
The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.
Objective: To assess the prevalence of somatic MTOR mutations in focal cortical dysplasia (FCD) and of germline MTOR mutations in a broad range of epilepsies. Methods: We collected 20 blood-brain paired samples from patients with FCD and searched for somatic variants using deep-targeted gene panel sequencing. Germline mutations in MTOR were assessed in a French research cohort of 93 probands with focal epilepsies and in a diagnostic Danish cohort of 245 patients with a broad range of epilepsies. Data sharing among collaborators allowed us to ascertain additional germline variants in MTOR. Results: We detected recurrent somatic variants (p.Ser2215Phe, p.Ser2215Tyr, and p.Leu1460Pro) in the MTOR gene in 37% of participants with FCD II and showed histologic evidence for activation of the mTORC1 signaling cascade in brain tissue. We further identified 5 novel de novo germline missense MTOR variants in 6 individuals with a variable phenotype from focal, and less frequently generalized, epilepsies without brain malformations, to macrocephaly, with or without moderate intellectual disability. In addition, an inherited variant was found in a mother–daughter pair with nonlesional autosomal dominant nocturnal frontal lobe epilepsy. Conclusions: Our data illustrate the increasingly important role of somatic mutations of the MTOR gene in FCD and germline mutations in the pathogenesis of focal epilepsy syndromes with and without brain malformation or macrocephaly.
From a GeneMatcher-enabled international collaboration, we identified ten individuals affected by intellectual disability, speech delay, ataxia, and facial dysmorphism and carrying a deleterious EBF3 variant detected by whole-exome sequencing. One 9-bp duplication and one splice-site, five missense, and two nonsense variants in EBF3 were found; the mutations occurred de novo in eight individuals, and the missense variant c.625C>T (p.Arg209Trp) was inherited by two affected siblings from their healthy mother, who is mosaic. EBF3 belongs to the early B cell factor family (also known as Olf, COE, or O/E) and is a transcription factor involved in neuronal differentiation and maturation. Structural assessment predicted that the five amino acid substitutions have damaging effects on DNA binding of EBF3. Transient expression of EBF3 mutant proteins in HEK293T cells revealed mislocalization of all but one mutant in the cytoplasm, as well as nuclear localization. By transactivation assays, all EBF3 mutants showed significantly reduced or no ability to activate transcription of the reporter gene CDKN1A, and in situ subcellular fractionation experiments demonstrated that EBF3 mutant proteins were less tightly associated with chromatin. Finally, in RNA-seq and ChIP-seq experiments, EBF3 acted as a transcriptional regulator, and mutant EBF3 had reduced genome-wide DNA binding and gene-regulatory activity. Our findings demonstrate that variants disrupting EBF3-mediated transcriptional regulation cause intellectual disability and developmental delay and are present in ~0.1% of individuals with unexplained neurodevelopmental disorders.
Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine.
The bimolecular fluorescence complementation (BiFC) assay is a powerful tool for visualizing and identifying protein interactions in living cells. This assay is based on the principle of protein-fragment complementation, using two nonfluorescent fragments derived from fluorescent proteins. When two fragments are brought together in living cells by tethering each to one of a pair of interacting proteins, fluorescence is restored. Here, we provide a protocol for a Venus-based BiFC assay to visualize protein interactions in the living nematode, Caenorhabditis elegans. We discuss how to design appropriate C. elegans BiFC cloning vectors to enable visualization of protein interactions using either inducible heat shock promoters or native promoters; transform the constructs into worms by microinjection; and analyze and interpret the resulting data. When expression of BiFC fusion proteins is induced by heat shock, the fluorescent signals can be visualized as early as 30 min after induction and last for 24 h in transgenic animals. The entire procedure takes 2-3 weeks to complete.