We present Interactome INSIDER, a tool to link genomic variant information with
structural protein-protein interactomes. Underlying this tool is the application of
machine learning to predict protein interaction interfaces for 185,957 protein
interactions with previously unresolved interfaces, in human and 7 model organisms,
including the entire experimentally determined human binary interactome. Predicted
interfaces exhibit functional properties similar to those of known interfaces, including enrichment
for disease mutations and recurrent cancer mutations. Through 2,164 de
novo mutagenesis experiments, we show that mutations of predicted and known
interface residues disrupt interactions at a similar rate, and much more frequently than
mutations outside of predicted interfaces. To spur functional genomic studies, Interactome
INSIDER (http://interactomeinsider.yulab.org) enables users to identify whether
variants or disease mutations are enriched in known and predicted interaction interfaces
at various resolutions. Users may explore known population variants, disease mutations,
and somatic cancer mutations, or upload their own set of mutations for this purpose.
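As a rough illustration of what "enriched in interaction interfaces" can mean in practice, the sketch below computes an odds ratio and a one-sided hypergeometric p-value for mutated residues falling inside an interface. The abstract does not describe the statistical test used by Interactome INSIDER, so the function name `interface_enrichment`, its parameters, and the choice of test are all assumptions made for illustration.

```python
from math import comb


def interface_enrichment(mut_in, mut_total, res_in, res_total):
    """Hypothetical helper, not the Interactome INSIDER implementation.

    mut_in    -- mutated residues that lie in the interface
    mut_total -- all mutated residues in the protein
    res_in    -- all residues in the interface
    res_total -- all residues in the protein
    Returns (odds_ratio, p_value), where p_value is the one-sided
    hypergeometric probability of seeing >= mut_in interface mutations.
    """
    # 2x2 contingency table: mutated/unmutated vs. interface/non-interface
    a = mut_in
    b = mut_total - mut_in
    c = res_in - mut_in
    d = res_total - res_in - b
    odds_ratio = float("inf") if b * c == 0 else (a * d) / (b * c)

    # Upper tail of the hypergeometric distribution, summed with exact integers
    upper = min(res_in, mut_total)
    tail = sum(
        comb(res_in, k) * comb(res_total - res_in, mut_total - k)
        for k in range(mut_in, upper + 1)
    )
    p_value = tail / comb(res_total, mut_total)
    return odds_ratio, p_value


if __name__ == "__main__":
    # Made-up counts: 5 of 10 mutated residues fall in a 20-residue interface
    # of a 100-residue protein.
    print(interface_enrichment(5, 10, 20, 100))
```

An odds ratio above 1 with a small p-value would suggest that the mutations concentrate in the interface more than expected by chance under this simplified model.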
A new algorithm and web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D web interface. On a large scale, we show that clustering with mutation3D can separate functional from non-functional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D web interface is freely available and supports all major browsers.
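The idea of clustering amino acid substitutions in 3D can be sketched as follows. The abstract does not specify the clustering algorithm, so this example assumes complete-linkage agglomerative clustering with a maximum cluster diameter; the function name, the `max_diameter` parameter, and the coordinates are hypothetical, and the significance testing a real analysis would need is omitted.

```python
from itertools import combinations
from math import dist


def complete_linkage_clusters(coords, max_diameter):
    """Illustrative sketch only, not the mutation3D implementation.

    coords       -- dict mapping residue position -> (x, y, z) coordinate
    max_diameter -- largest allowed pairwise distance within a cluster
    Returns a list of clusters (sorted residue positions) with >= 2 members.
    """
    clusters = [[pos] for pos in sorted(coords)]

    def linkage(a, b):
        # Complete linkage: the farthest pair of residues across two clusters
        return max(dist(coords[i], coords[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if d > max_diameter:
            break  # no remaining merge keeps the cluster compact enough
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected

    return [c for c in clusters if len(c) >= 2]


if __name__ == "__main__":
    # Made-up coordinates for four mutated residues
    mutated = {10: (0, 0, 0), 12: (3, 0, 0), 45: (0, 4, 0), 200: (50, 50, 50)}
    print(complete_linkage_clusters(mutated, max_diameter=6))
```

Note that residues 10 and 45 are far apart in sequence but can still cluster in space, which is the kind of signal a purely sequence-based hotspot search would miss.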
Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which these variants exert their influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at >1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual’s genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations.
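One simple way to picture "shared interaction perturbations" is to compare the sets of interactions each variant disrupts. The sketch below ranks candidate variants by Jaccard similarity to known disease mutations. The abstract does not state which similarity measure the authors used, so the measure, the function names, and the variant and partner labels are all assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(candidates, known):
    """Illustrative sketch only, not the authors' method.

    candidates -- dict: variant id -> set of disrupted interaction partners
    known      -- dict: disease mutation id -> set of disrupted partners
                  (assumed non-empty)
    Returns (variant, best-matching disease mutation, similarity) tuples,
    sorted from most to least similar.
    """
    ranked = []
    for variant, disrupted in candidates.items():
        best_known, best_score = max(
            ((k, jaccard(disrupted, d)) for k, d in known.items()),
            key=lambda pair: pair[1],
        )
        ranked.append((variant, best_known, best_score))
    return sorted(ranked, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical disruption profiles
    known = {"dis1": {"P1", "P2"}, "dis2": {"P3"}}
    candidates = {"v1": {"P1", "P2"}, "v2": {"P3", "P4"}, "v3": set()}
    for row in rank_candidates(candidates, known):
        print(row)
```

A candidate whose disruption profile closely matches that of a known disease mutation would be a natural one to prioritize for follow-up.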
Studies of the proteome would benefit greatly from methods to directly sequence and digitally quantify proteins and detect posttranslational modifications with single-molecule sensitivity. Here, we demonstrate single-molecule protein sequencing using a dynamic approach in which single peptides are probed in real time by a mixture of dye-labeled N-terminal amino acid recognizers and simultaneously cleaved by aminopeptidases. We annotate amino acids and identify the peptide sequence by measuring fluorescence intensity, lifetime, and binding kinetics on an integrated semiconductor chip. Our results demonstrate the kinetic principles that allow recognizers to identify multiple amino acids in an information-rich manner that enables discrimination of single amino acid substitutions and posttranslational modifications. With further development, we anticipate that this approach will offer a sensitive, scalable, and accessible platform for single-molecule proteomic studies and applications.