Motivation: Protein–protein interactions are vital for protein function, with the average protein having between three and ten interacting partners. Knowledge of precise protein–protein interfaces comes from crystal structures deposited in the Protein Data Bank (PDB), but only 50% of structures in the PDB are complexes. There is therefore a need to predict protein–protein interfaces in silico, and various methods exist for this purpose. Here we explore the use of a predictor that is based on structural features and exploits random forest machine learning, comparing its performance with a number of popular established methods. Results: On an independent test set of obligate and transient complexes, our IntPred predictor performs well (MCC = 0.370, ACC = 0.811, SPEC = 0.916, SENS = 0.411) and compares favourably with other methods. Overall, IntPred ranks second of six methods tested, with SPPIDER having slightly better overall performance (MCC = 0.410, ACC = 0.759, SPEC = 0.783, SENS = 0.676) but considerably worse specificity than IntPred. As with SPPIDER, using an independent test set of obligate complexes enhanced performance (MCC = 0.381), while performance is somewhat reduced on a dataset of transient complexes (MCC = 0.303). The trade-off between sensitivity and specificity compared with SPPIDER suggests that the choice of the appropriate tool is application-dependent. Availability and implementation: IntPred is implemented in Perl and may be downloaded for local use or run via a web server at www.bioinf.org.uk/intpred/. Supplementary information: Supplementary data are available at Bioinformatics online.
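The four scores quoted above (MCC, ACC, SPEC, SENS) are standard measures derived from a binary confusion matrix, here interface versus non-interface residues. The following is a minimal sketch of how such scores are conventionally computed. It is not taken from IntPred; the function name `binary_metrics` and the counts in the usage line are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return ACC, SENS, SPEC and MCC for a binary confusion matrix.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives. Any ratio whose denominator
    is zero is reported as 0.0 (an assumption of this sketch).
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0
    sens = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    spec = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SENS": sens, "SPEC": spec, "MCC": mcc}


# Hypothetical counts, not data from the study.
scores = binary_metrics(tp=40, fp=10, tn=90, fn=60)
print(scores)
```

Because MCC uses all four cells of the confusion matrix, it is less affected by class imbalance than accuracy, which may be why it is the headline figure when two predictors trade sensitivity against specificity.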
Pathogenic deviations (PDs) in humans are disease-causing missense mutations. However, in some cases, these disease-associated residues occur as the wild-type residues in functionally equivalent proteins in other species and these cases are termed 'Compensated Pathogenic Deviations' (CPDs). The lack of pathogenicity in a non-human protein is presumed to be explained in most cases by the presence of compensatory mutations, most commonly within the same protein. Identifying structural features of CPDs, and detecting specific compensatory events, will help us to understand traversal along fitness landscape valleys in protein evolution. We divided mutations listed in the OMIM database into PD and CPD datasets and performed two independent analyses: (i) we searched for potential compensatory mutations spatially close to the CPDs and (ii) using our SAAPdb database, we examined likely structural effects to try to explain why mutations are pathogenic, comparing PDs and CPDs. Our datasets were obtained from a set of 245 human proteins of known structure and contained a total of 2328 mutations of which 453 (from 85 structures) were seen to be compensated in at least one functionally equivalent protein in another (non-human) species. Structural analysis results confirm previous findings that CPDs are, on average, 'milder' in their likely structural effects than uncompensated PDs and tend to be on the protein surface. We also showed that the residues surrounding the CPD residue in the folded protein are more often mutated than the residues surrounding an uncompensated mutation, supporting the hypothesis that compensation is largely a result of structurally local mutations.
Background: Protein Kinases are a superfamily of proteins involved in crucial cellular processes such as cell cycle regulation and signal transduction. Accordingly, they play an important role in cancer biology. To contribute to the study of the relation between kinases and disease, we compared pathogenic mutations to neutral mutations as an extension to our previous analysis of cancer somatic mutations. First, we analyzed native and mutant proteins in terms of amino acid composition. Secondly, mutations were characterized according to their potential structural effects. Finally, we assessed the location of the different classes of polymorphisms with respect to kinase-relevant positions in terms of subfamily specificity, conservation, accessibility and functional sites. Results: Pathogenic Protein Kinase mutations perturb essential aspects of protein function, including disruption of substrate binding and/or effector recognition at family-specific positions. Interestingly, these mutations in Protein Kinases display a tendency to avoid structurally relevant positions, which represents a significant difference from the average distribution of pathogenic mutations in other protein families. Conclusions: Disease-associated mutations display clear differences with respect to neutral mutations: several amino acids are specific to each mutation type, different structural properties characterize each class, and the distribution of pathogenic mutations within the consensus structure of the Protein Kinase domain is substantially different to that for non-pathogenic mutations. This preferential distribution confirms previous observations about the functional and structural distribution of the controversial cancer driver and passenger somatic mutations and their use as a proxy for the study of the involvement of somatic mutations in cancer development.