Motivation: Accurate and rapid prediction of protein-ligand binding affinity is a major challenge in drug discovery. Recent advances show that deep learning-based computational approaches are a promising alternative for accurately quantifying binding affinity. The structural complementarity between the protein-binding pocket and the ligand strongly affects the binding strength between a protein and a ligand, but most existing deep learning approaches extract pocket and ligand features with two separate modules. Results: In this work, a new deep learning approach based on the cross-attention mechanism, named CAPLA, was developed for improved prediction of protein-ligand binding affinity by learning features from sequence-level information of both protein and ligand. Specifically, CAPLA employs the cross-attention mechanism to capture the mutual effect of the protein-binding pocket and the ligand. We evaluated CAPLA in comprehensive benchmarking experiments on binding affinity prediction, which demonstrated its superior performance over state-of-the-art baseline approaches. Moreover, we provided interpretability for CAPLA by analyzing the attention scores generated by the cross-attention mechanism to uncover the critical functional residues that contribute most to the binding affinity. These results indicate that CAPLA is an effective approach for binding affinity prediction and may be useful for downstream applications. Availability: The source code of the method and the trained models are freely available at https://github.com/lennylv/CAPLA. Supplementary information: Supplementary data are available at Bioinformatics online.
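The cross-attention idea mentioned in this abstract can be illustrated with a minimal sketch. This is generic scaled dot-product cross-attention in plain Python, not CAPLA's implementation: the toy feature vectors, the absence of learned projection matrices, and the names `softmax` and `cross_attention` are assumptions made for the example. Pocket positions act as queries and ligand positions act as keys and values, so each pocket position receives a weighted summary of the ligand.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention with queries from one sequence
    and keys/values from another (no learned projections here)."""
    scale = math.sqrt(len(keys[0]))
    weights = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights.append(softmax(scores))
    width = len(values[0])
    # Each output row is the attention-weighted average of the values.
    attended = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
        for row in weights
    ]
    return attended, weights


# Hypothetical toy features: 3 pocket positions, 2 ligand positions.
pocket = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ligand = [[1.0, 0.0], [0.0, 1.0]]
attended, weights = cross_attention(pocket, ligand, ligand)
```

Each row of `weights` sums to one and indicates how strongly a pocket position attends to each ligand position. Per-position scores of this general kind are what an attention-based interpretability analysis would inspect.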
Background: Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. Results: We designed four different deep learning architectures to predict protein torsion angles. The architectures are a deep neural network (DNN), a deep restricted Boltzmann machine (DRBM), and, since protein torsion angle prediction is a sequence-related problem, a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM). In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict the phi and psi angles of the protein backbone. The mean absolute error (MAE) of the phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20–21° and 29–30°, respectively, on an independent dataset. The MAE of the phi angle is comparable to that of existing methods, but the MAE of the psi angle is 29°, 2° lower than that of existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Conclusions: Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy. Electronic supplementary material: The online version of this article (10.1186/s12859-017-1834-2) contains supplementary material, which is available to authorized users.
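Because torsion angles are periodic, an MAE over angles is usually computed on the shorter arc between prediction and observation. The abstract does not state how the authors handle wrap-around, so the sketch below is an assumption rather than their procedure; the function name `angular_mae` and the sample values are hypothetical.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between angles in degrees, taking the
    shorter way around the circle (so 179 and -179 differ by 2)."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted)


# Hypothetical predicted and observed angles, in degrees.
example_mae = angular_mae([179.0, -60.0], [-179.0, -50.0])
```

Without the wrap-around step, the first pair would contribute an error of 358° instead of 2°, which would inflate the reported MAE.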
Nearly one-third of non-synonymous single-nucleotide polymorphisms (nsSNPs) are deleterious to human health, but recognition of disease-associated mutations remains a significant unsolved problem. We proposed a new algorithm, DAMpred, to identify disease-causing nsSNPs by coupling evolutionary profiles with structure predictions of proteins and protein-protein interactions. The pipeline was trained by a novel Bayes-guided artificial neural network algorithm that incorporates posterior probabilities of distinct feature classifiers into the network training process. DAMpred was tested on a large-scale dataset involving 10,635 nsSNPs from 2,154 ORFs in the human genome and recognized disease-associated nsSNPs with an accuracy of 0.80 and a Matthews correlation coefficient (MCC) of 0.601, which is 9.1% higher than the best of the other state-of-the-art methods. In the blind test on the TP53 gene, DAMpred correctly recognized the mutations causative of Li-Fraumeni-like syndrome with an MCC that is 27% higher than that of the control methods. The study demonstrates an efficient avenue to quantitatively model the association of nsSNPs with human diseases from low-resolution protein structure prediction, which should prove useful in the diagnosis and treatment of genetic diseases.
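For readers unfamiliar with the MCC reported above, the standard definition for a binary classifier can be written as a short function. The confusion-matrix counts in the example are hypothetical and are not taken from the DAMpred evaluation; returning 0.0 when the denominator vanishes is a common convention, adopted here as an assumption.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts:
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only.
example_mcc = matthews_corrcoef(tp=6, tn=3, fp=1, fn=2)
```

MCC ranges from -1 (every prediction wrong) to 1 (every prediction right) and, unlike accuracy, stays informative when the two classes are imbalanced.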
Early recognition of citrus diseases is important for preventing crop losses and applying timely disease control measures on farms. Employing machine learning-based approaches, such as deep learning, for accurate detection of multiple citrus diseases is challenging due to the limited availability of labeled diseased samples. Further, a lightweight architecture with low computational complexity is required to perform citrus disease classification on resource-constrained devices, such as mobile phones. This makes the architecture practical for farmers, who can monitor diseases effectively on the farm using their own mobile devices. Hence, we propose a lightweight, fast, and accurate deep metric learning-based architecture for citrus disease detection from sparse data. In particular, we propose a patch-based classification network that comprises an embedding module, a cluster prototype module, and a simple neural network classifier to detect citrus diseases accurately. Evaluation of our proposed approach on a publicly available citrus fruits and leaves dataset reveals its efficiency in accurately detecting the various diseases from leaf images. Further, the generalization capability of our approach is demonstrated using another dataset, namely the tea leaves dataset. Comparison of our approach with existing state-of-the-art algorithms demonstrates its superiority in terms of detection accuracy (95.04%), the number of parameters required for tuning (less than 2.3 M), and the time needed to detect citrus diseases using the trained model (less than 10 ms). Moreover, the ability to learn with fewer resources without compromising accuracy makes the proposed scheme practical on resource-constrained devices, such as mobile phones.
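The role of class prototypes in metric learning can be illustrated with a small sketch. It assumes embeddings have already been computed and replaces the paper's neural network classifier with a plain nearest-prototype rule, so it is a simplification and not the authors' architecture. The function names, the two-dimensional embeddings, and the class labels are hypothetical.

```python
import math


def class_prototypes(embeddings, labels):
    """Average the embeddings of each class to obtain one prototype per class."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def nearest_prototype(embedding, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda lab: math.dist(embedding, prototypes[lab]))


# Hypothetical 2-D patch embeddings with two illustrative classes.
train_embeddings = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
train_labels = ["healthy", "healthy", "canker", "canker"]
prototypes = class_prototypes(train_embeddings, train_labels)
predicted = nearest_prototype([1.0, 1.0], prototypes)
```

Since a prototype is just an average of a few embeddings, this kind of classifier needs only a handful of labeled samples per class, which is one reason prototype-based methods are attractive when data are sparse.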