Summary: Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant inter-connectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help “connect the dots” of the genomic revolution.
Summary: De novo mutation plays an important role in Autism Spectrum Disorders (ASDs). Notably, pathogenic copy number variants (CNVs) are characterized by high mutation rates. We hypothesize that hypermutability is a property of ASD genes, and may also include nucleotide-substitution hotspots. We investigated global patterns of germline mutation by whole genome sequencing of monozygotic twins concordant for ASD and their parents. Mutation rates varied widely throughout the genome (by 100-fold) and could be explained by intrinsic characteristics of DNA sequence and chromatin structure. Dense clusters of mutations within individual genomes were attributable to compound mutation or gene conversion. Hypermutability was a characteristic of genes involved in ASD and other diseases. In addition, genes impacted by mutations in this study were associated with ASD in independent exome-sequencing datasets. Our findings suggest that regional hypermutation is a significant factor shaping patterns of genetic variation and disease risk in humans.
Identifying pathogenic variants and underlying functional alterations is challenging. To this end, we introduce MutPred2, a tool that improves the prioritization of pathogenic amino acid substitutions over existing methods, generates molecular mechanisms potentially causative of disease, and returns interpretable pathogenicity score distributions on individual genomes. Whilst its prioritization performance is state-of-the-art, a distinguishing feature of MutPred2 is the probabilistic modeling of variant impact on specific aspects of protein structure and function that can serve to guide experimental studies of phenotype-altering variants. We demonstrate the utility of MutPred2 in the identification of the structural and functional mutational signatures relevant to Mendelian disorders and the prioritization of de novo mutations associated with complex neurodevelopmental disorders. We then experimentally validate the functional impact of several variants identified in patients with such disorders. We argue that mechanism-driven studies of human inherited disease have the potential to significantly accelerate the discovery of clinically actionable variants.
We have sequenced five distinct mitochondrial genomes in maize: two fertile cytotypes (NA and the previously reported NB) and three cytoplasmic-male-sterile cytotypes (CMS-C, CMS-S, and CMS-T). Their genome sizes range from 535,825 bp in CMS-T to 739,719 bp in CMS-C. Large duplications (0.5-120 kb) account for most of the size increases. Plastid DNA accounts for 2.3-4.6% of each mitochondrial genome. The genomes share a minimum set of 51 genes for 33 conserved proteins, three ribosomal RNAs, and 15 transfer RNAs. Numbers of duplicate genes and plastid-derived tRNAs vary among cytotypes. A high level of sequence conservation exists both within and outside of genes (1.65-7.04 substitutions/10 kb in pairwise comparisons). However, sequence losses and gains are common: integrated plastid and plasmid sequences, as well as noncoding "native" mitochondrial sequences, can be lost with no phenotypic consequence. The organization of the different maize mitochondrial genomes varies dramatically; even between the two fertile cytotypes, there are 16 rearrangements. Comparing the finished shotgun sequences of multiple mitochondrial genomes from the same species suggests which genes and open reading frames are potentially functional, including which chimeric ORFs are candidate genes for cytoplasmic male sterility. This method identified the known CMS-associated ORFs in CMS-S and CMS-T, but not in CMS-C.
Genome Biology 2008, Volume 9, Suppl 1, Article S2. Peña-Castillo et al. The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2008/9/S1/S2. Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated.
Summary: The psychiatric disorders autism and schizophrenia have a strong genetic component, and copy number variants (CNVs) are firmly implicated. Recurrent deletions and duplications of chromosome 16p11.2 confer high risk for both diseases, but the pathways disrupted by this CNV are poorly defined. Here we investigate the dynamics of the 16p11.2 network by integrating physical interactions of 16p11.2 proteins with spatio-temporal gene expression from the developing human brain. We observe profound changes in protein interaction networks throughout different stages of brain development and/or in different brain regions. We identify the late mid-fetal period of cortical development as most critical for establishing connectivity of 16p11.2 proteins with their co-expressed partners. Furthermore, our results suggest that the regulation of the KCTD13-Cul3-RhoA pathway in layer four of the inner cortical plate is crucial for controlling brain size and connectivity, and its dysregulation by de novo mutations may be a potential determinant of 16p11.2 CNV deletion and duplication phenotypes.
Increased risk for autism spectrum disorders (ASD) is attributed to hundreds of genetic loci. The convergence of ASD variants has been investigated using various approaches, including protein interactions extracted from the published literature. However, these datasets are frequently incomplete, carry biases, and are limited to interactions of a single splicing isoform, which may not be expressed in the disease-relevant tissue. Here we introduce a new interactome mapping approach by experimentally identifying interactions between brain-expressed alternatively spliced variants of ASD risk factors. The Autism Spliceform Interaction Network reveals that almost half of the detected interactions and about 30% of the newly identified interacting partners represent contributions from splicing variants, emphasizing the importance of isoform networks. Isoform interactions greatly contribute to establishing direct physical connections between proteins from the de novo autism CNVs. Our findings demonstrate the critical role of spliceform networks for translating genetic knowledge into a better understanding of human diseases.
Background: With the expanding applications of mass cytometry in medical research, a wide variety of clustering methods, both semi-supervised and unsupervised, have been developed for data analysis. Selecting the optimal clustering method can accelerate the identification of meaningful cell populations. Results: To address this issue, we compared three classes of performance measures, “precision” as external evaluation, “coherence” as internal evaluation, and stability, of nine methods based on six independent benchmark datasets. Seven unsupervised methods (Accense, Xshift, PhenoGraph, FlowSOM, flowMeans, DEPECHE, and kmeans) and two semi-supervised methods (Automated Cell-type Discovery and Classification and linear discriminant analysis (LDA)) are tested on six mass cytometry datasets. We compute and compare all defined performance measures against random subsampling, varying sample sizes, and the number of clusters for each method. LDA reproduces the manual labels most precisely but does not rank top in internal evaluation. PhenoGraph and FlowSOM perform better than other unsupervised tools in precision, coherence, and stability. PhenoGraph and Xshift are more robust when detecting refined sub-clusters, whereas DEPECHE and FlowSOM tend to group similar clusters into meta-clusters. The performances of PhenoGraph, Xshift, and flowMeans are impacted by increased sample size, but FlowSOM is relatively stable as sample size increases. Conclusion: All the evaluations, including precision, coherence, stability, and clustering resolution, should be considered together when choosing an appropriate tool for cytometry data analysis. Thus, we provide decision guidelines based on these characteristics for the general reader to more easily choose the most suitable clustering tools.
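To make the idea of an external evaluation concrete, here is a minimal sketch of one way to score a clustering against manual labels. It is not the measure used in the abstract above, whose exact definition of "precision" is not given here. The function name `pairwise_precision_recall` and the toy labels are hypothetical, chosen only for illustration. Pairwise precision is the fraction of cell pairs placed in the same cluster that also share a manual label; pairwise recall is the fraction of pairs sharing a manual label that end up in the same cluster.

```python
from collections import Counter
from math import comb


def pairwise_precision_recall(manual_labels, cluster_labels):
    """Compare a clustering to manual labels by counting co-assigned pairs.

    Illustrative only: this is a generic pair-counting score, assumed here
    as a stand-in for an external evaluation measure.
    """
    if len(manual_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")

    # Pairs of cells that share both the manual label and the cluster.
    joint = Counter(zip(manual_labels, cluster_labels))
    same_both = sum(comb(n, 2) for n in joint.values())

    # Pairs of cells that share a cluster, and pairs that share a manual label.
    same_cluster = sum(comb(n, 2) for n in Counter(cluster_labels).values())
    same_manual = sum(comb(n, 2) for n in Counter(manual_labels).values())

    precision = same_both / same_cluster if same_cluster else 0.0
    recall = same_both / same_manual if same_manual else 0.0
    return precision, recall


# Toy example: four cells, two manual populations, one cell mis-clustered.
manual = ["T", "T", "B", "B"]
clusters = [0, 0, 0, 1]
precision, recall = pairwise_precision_recall(manual, clusters)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A pair-counting score like this does not require matching cluster identifiers to population names, which is convenient when an unsupervised method returns arbitrary cluster numbers.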