The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high-quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference-quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature, to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl BioMart, and the Ensembl Perl and REST APIs, as well as https://www.gencodegenes.org.
By analyzing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we have demonstrated a polygenic burden primarily arising from rare (<1/10,000), disruptive mutations distributed across many genes. Especially enriched genesets included the voltage-gated calcium ion channel and the signaling complex formed by the activity-regulated cytoskeleton-associated (ARC) scaffold protein of the postsynaptic density (PSD), sets previously implicated by genome-wide association studies (GWAS) and copy-number variation (CNV) studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) were enriched for case mutations. No individual gene-based test achieved significance after correction for multiple testing and we did not detect any alleles of moderately low frequency (~0.5-1%) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene mapping paradigms in neuropsychiatric disease.
N-methyl-d-aspartate receptors (NMDAR) mediate long-lasting changes in synapse strength via downstream signaling pathways. We report proteomic characterization with mass spectrometry and immunoblotting of NMDAR multiprotein complexes (NRC) isolated from mouse brain. The NRC comprised 77 proteins organized into receptor, adaptor, signaling, cytoskeletal and novel proteins, of which 30 are implicated from binding studies and another 19 participate in NMDAR signaling. NMDAR and metabotropic glutamate receptor subtypes were linked to cadherins and L1 cell-adhesion molecules in complexes lacking AMPA receptors. These neurotransmitter-adhesion receptor complexes were bound to kinases, phosphatases, GTPase-activating proteins and Ras with effectors including MAPK pathway components. Several proteins were encoded by activity-dependent genes. Genetic or pharmacological interference with 15 NRC proteins impairs learning and with 22 proteins alters synaptic plasticity in rodents. Mutations in three human genes (NF1, Rsk-2, L1) are associated with learning impairments, indicating the NRC also participates in human cognition.
A small number of rare, recurrent genomic copy number variants (CNVs) are known to substantially increase susceptibility to schizophrenia. As a consequence of the low fecundity in people with schizophrenia and other neurodevelopmental phenotypes to which these CNVs contribute, CNVs with large effects on risk are likely to be rapidly removed from the population by natural selection. Accordingly, such CNVs must frequently occur as recurrent de novo mutations. In a sample of 662 schizophrenia proband–parent trios, we found that rare de novo CNV mutations were significantly more frequent in cases (5.1% all cases, 5.5% family history negative) compared with 2.2% among 2623 controls, confirming the involvement of de novo CNVs in the pathogenesis of schizophrenia. Eight de novo CNVs occurred at four known schizophrenia loci (3q29, 15q11.2, 15q13.3 and 16p11.2). De novo CNVs of known pathogenic significance in other genomic disorders were also observed, including deletion at the TAR (thrombocytopenia absent radius) region on 1q21.1 and duplication at the WBS (Williams–Beuren syndrome) region at 7q11.23. Multiple de novos spanned genes encoding members of the DLG (discs large) family of membrane-associated guanylate kinases (MAGUKs) that are components of the postsynaptic density (PSD). Two de novos also affected EHMT1, a histone methyl transferase known to directly regulate DLG family members. Using a systems biology approach and merging novel CNV and proteomics data sets, systematic analysis of synaptic protein complexes showed that, compared with control CNVs, case de novos were significantly enriched for the PSD proteome (P=1.72 × 10⁻⁶). This was largely explained by enrichment for members of the N-methyl-D-aspartate receptor (NMDAR) (P=4.24 × 10⁻⁶) and neuronal activity-regulated cytoskeleton-associated protein (ARC) (P=3.78 × 10⁻⁸) postsynaptic signalling complexes.
In an analysis of 18 492 subjects (7907 cases and 10 585 controls), case CNVs were enriched for members of the NMDAR complex (P=0.0015) but not ARC (P=0.14). Our data indicate that defects in NMDAR postsynaptic signalling and, possibly, ARC complexes, which are known to be important in synaptic plasticity and cognition, play a significant role in the pathogenesis of schizophrenia.
The postsynaptic density from human neocortex (hPSD) was isolated and 1461 proteins identified. hPSD mutations cause 133 neurological and psychiatric diseases and show enrichment in cognitive, affective and motor phenotypes underpinned by sets of genes. Strong protein sequence conservation within mammalian lineages, particularly in hub proteins, indicates conserved function and organisation in primate and rodent models. The hPSD is a key structure for nervous system disease and behaviour.
Characterization of the composition of the postsynaptic proteome (PSP) provides a framework for understanding the overall organization and function of the synapse in normal and pathological conditions. We have identified 698 proteins from the postsynaptic terminal of mouse CNS synapses using a series of purification strategies and analysis by liquid chromatography tandem mass spectrometry and large-scale immunoblotting. Some 620 proteins were found in purified postsynaptic densities (PSDs), nine in AMPA-receptor immuno-purifications, 100 in isolates using an antibody against the NMDA receptor subunit NR1, and 170 by peptide-affinity purification of complexes with the C-terminus of NR2B. Together, the NR1 and NR2B complexes contain 186 proteins, collectively referred to as membrane-associated guanylate kinase-associated signalling complexes. We extracted data from six other synapse proteome experiments and combined these with our data to provide a consensus on the composition of the PSP. In total, 1124 proteins are present in the PSP, of which 466 were validated by their detection in two or more studies, forming what we have designated the Consensus PSD. These synapse proteome data sets offer a basis for future research in synaptic biology and will provide useful information in brain disease and mental disorder studies.
Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons—estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: ‘enrichment’ in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.
The transcription factor Oct4 is key in embryonic stem cell identity and reprogramming. Insight into its partners should illuminate how the pluripotent state is established and regulated. Here, we identify a considerably expanded set of Oct4-binding proteins in mouse embryonic stem cells. We find that Oct4 associates with a varied set of proteins including regulators of gene expression and modulators of Oct4 function. Half of its partners are transcriptionally regulated by Oct4 itself or other stem cell transcription factors, whereas one-third display a significant change in expression upon cell differentiation. The majority of Oct4-associated proteins studied to date show an early lethal phenotype when mutated. A fraction of the human orthologs is associated with inherited developmental disorders or causative of cancer. The Oct4 interactome provides a resource for dissecting mechanisms of Oct4 function, enlightening the basis of pluripotency and development, and identifying potential additional reprogramming factors.