Identifying new therapeutic indications for existing drugs is a major challenge in drug repositioning. Most computational drug repositioning methods focus on known targets. Analyzing multiple aspects of protein associations provides an opportunity to discover underlying drug-associated proteins that can improve the performance of drug repositioning approaches. In this study, machine learning models were developed based on the similarities of diversified biological features, including protein interaction, topological network, sequence alignment, and biological function, to predict protein pairs associated with the same drugs. The crucial set of features was identified, and protein pair prediction achieved high performance, with an area under the curve (AUC) of more than 93%. Based on drug chemical structures, the drug similarity levels of the promising protein pairs were used to quantify the inferred drug-associated proteins. Furthermore, these proteins were employed to establish an augmented drug-protein matrix to enhance the efficiency of three existing drug repositioning techniques: a similarity constrained matrix factorization for the drug-disease associations (SCMFDD), an ensemble meta-paths and singular value decomposition (EMP-SVD) model, and a topology similarity and singular value decomposition (TS-SVD) technique. The results showed that, compared with the original matrix, the augmented matrix improved performance by up to 4% for SCMFDD and EMP-SVD, and by about 1% for TS-SVD. In summary, inferring new protein pairs related to the same drugs increases the opportunity to reveal missing drug-associated proteins that are important for drug development via drug repositioning.
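The augmentation step described above can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption made for illustration: the function name `augment_drug_protein`, the representation of the drug-protein matrix as a mapping from drug to protein set, the single `score` per predicted protein pair, and the fixed `threshold`. The idea shown is only that a protein predicted to pair with a known drug-associated protein is added to that drug's row.

```python
from typing import Dict, Iterable, Set, Tuple


def augment_drug_protein(
    known: Dict[str, Set[str]],
    predicted_pairs: Iterable[Tuple[str, str, float]],
    threshold: float = 0.5,
) -> Dict[str, Set[str]]:
    """Return a copy of `known` with inferred drug-protein links added.

    known:           drug -> set of proteins already associated with it
    predicted_pairs: (protein_a, protein_b, score) triples; `score` is a
                     hypothetical confidence that both proteins are
                     associated with the same drugs
    threshold:       hypothetical cut-off below which a pair is ignored
    """
    # Copy every row so the original matrix is left untouched.
    augmented = {drug: set(proteins) for drug, proteins in known.items()}
    for protein_a, protein_b, score in predicted_pairs:
        if score < threshold:
            continue
        # Look up the ORIGINAL associations only, so that inferred links
        # are not chained into further inferences.
        for drug, proteins in known.items():
            if protein_a in proteins:
                augmented[drug].add(protein_b)
            if protein_b in proteins:
                augmented[drug].add(protein_a)
    return augmented


# Toy data, invented for this example.
known = {"drugA": {"P1"}, "drugB": {"P2", "P3"}}
pairs = [("P1", "P4", 0.9), ("P2", "P5", 0.2), ("P3", "P1", 0.7)]
augmented = augment_drug_protein(known, pairs, threshold=0.5)
```

In this toy run, `P4` is added to `drugA` through its high-scoring pair with `P1`, while `P5` is left out because its pair falls below the threshold. The augmented mapping could then be converted to a binary matrix and passed to a repositioning method in place of the original one.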
Drug repositioning has been proposed to develop drugs for diseases. However, the similarity in a single aspect may not be sufficient to reveal hidden information. Therefore, we established protein–protein similarity vectors (PPSVs) based on potential similarities in various types of biological information associated with proteins, including their network topology, proteomic data, functional analysis, and druggable property. Based on the proposed PPSVs, a separate drug–disease matrix was constructed for each individual disease to prevent disease-specific characteristics from being obscured. A classification technique was employed for prediction. The results showed that more than half of the tested disease models exhibited high performance, with overall F1 scores of more than 80%. Furthermore, when all diseases were compared in a single run using traditional methods, we obtained an area under the curve (AUC) of 98.9%. All candidate drugs were then tested in clinical trials (p-value < 2.2 × 10⁻¹⁶) and were known drugs based on their functions (p-value < 0.05). An analysis revealed that, in the functional aspect, the confidence value of an interaction in the protein–protein interaction network and the functional pathway score were the best descriptors for prediction. Based on the learning processes of PPSVs with an isolated disease, the classifier exhibited high performance in predicting and identifying new potential drugs for that disease.
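To make the idea of a protein–protein similarity vector concrete, the following is a minimal sketch, not the authors' method. It assumes each protein is annotated with a set of items per aspect (for example network neighbors or pathways) and uses Jaccard similarity for every aspect. The abstract does not state the similarity measures actually used, and the names `jaccard`, `ppsv`, and the toy annotations are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def ppsv(
    protein_a: str,
    protein_b: str,
    annotations: Dict[str, Dict[str, Set[str]]],
    aspects: List[str],
) -> List[float]:
    """Build one similarity vector for a protein pair.

    annotations: aspect -> protein -> set of annotation items
    aspects:     ordered aspect names; one vector entry per aspect
    """
    return [
        jaccard(
            annotations[aspect].get(protein_a, set()),
            annotations[aspect].get(protein_b, set()),
        )
        for aspect in aspects
    ]


# Toy annotations, invented for this example.
annotations = {
    "neighbors": {"P1": {"P2", "P3"}, "P2": {"P1", "P3"}},
    "pathways": {"P1": {"pw1", "pw2"}, "P2": {"pw2"}},
}
vector = ppsv("P1", "P2", annotations, ["neighbors", "pathways"])
```

Each entry of `vector` is the similarity of the pair in one aspect, so a classifier trained on such vectors can weigh the aspects against each other rather than relying on a single similarity measure.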