Cellular functions are governed by proteins, and, while some proteins work independently, most work by interacting with other proteins. As a result it is crucially important to know the interaction sites that facilitate the interactions between the proteins. Since the experimental methods are costly and time consuming, it is essential to develop effective computational methods. We present PITHIA, a sequence-based deep learning model for protein interaction site prediction that exploits the combination of multiple sequence alignments and learning attention. We demonstrate that our new model clearly outperforms the state-of-the-art models on a wide range of metrics. In order to provide meaningful comparison, we update existing test datasets with new information regarding interaction site, as well as
Proteins accomplish cellular functions by interacting with each other, which makes the prediction of interaction sites a fundamental problem. Computational prediction of the interaction sites has been studied extensively, with the structure-based programs being the most accurate, while the sequence-based ones being much more widely applicable, as the sequences available outnumber the structures by two orders of magnitude. We provide here the first solution that achieves both goals. Our new sequence-based program, Seq-InSite, greatly surpasses the performance of sequence-based models, matching the quality of state-of-the-art structure-based predictors, thus effectively superseding the need for models requiring structure. Seq-InSite is illustrated using an analysis of four protein sequences. Seq-InSite is freely available as a web server at seq-insite.csd.uwo.ca and as free source code, including trained models and all datasets used for training and testing, at github.com/lucian-ilie/seq-insite.