Drug response prediction is a well-studied problem in which the molecular profile of a given sample is used to predict the effect of a given drug on that sample. Effective solutions to this problem hold the key for precision medicine. In cancer research, genomic data from cell lines are often utilized as features to develop machine learning models predictive of drug response. Molecular networks provide a functional context for the integration of genomic features, thereby resulting in robust and reproducible predictive models. However, inclusion of network data increases dimensionality and poses additional challenges for common machine learning tasks. To overcome these challenges, we here formulate drug response prediction as a link prediction problem. For this purpose, we represent drug response data for a large cohort of cell lines as a heterogeneous network. Using this network, we compute “network profiles” for cell lines and drugs. We then use the associations between these profiles to predict links between drugs and cell lines. Through leave-one-out cross validation and cross-classification on independent datasets, we show that this approach leads to accurate and reproducible classification of sensitive and resistant cell line-drug pairs, with 85% accuracy. We also examine the biological relevance of the network profiles.
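The link-prediction framing described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how network profiles are computed or how their association is measured, so the random walk with restart and the Pearson correlation used here are stand-ins. The toy cell lines (`C1` to `C3`), drugs (`D1`, `D2`), edge list, and all function names are hypothetical.

```python
from math import sqrt

def build_network(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def rwr_profile(adj, seed, restart=0.5, iters=200):
    """ASSUMPTION: a 'network profile' is taken to be the random-walk-with-restart
    distribution seeded at `seed`. Returns probabilities in sorted node order."""
    nodes = sorted(adj)
    p = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(iters):
        # Restart mass goes back to the seed; the rest spreads along edges.
        nxt = {v: (restart if v == seed else 0.0) for v in nodes}
        for u in nodes:
            if adj[u]:
                share = (1.0 - restart) * p[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
        p = nxt
    return [p[v] for v in nodes]

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def link_score(adj, cell_line, drug):
    """ASSUMPTION: association between profiles is measured by correlation."""
    return pearson(rwr_profile(adj, cell_line), rwr_profile(adj, drug))

# Hypothetical toy network: cell line-drug edges mark known sensitivity,
# and the C1-C2 edge marks similarity between two cell lines.
edges = [("C1", "D1"), ("C2", "D1"), ("C2", "D2"), ("C3", "D2"), ("C1", "C2")]
network = build_network(edges)

# Score an unobserved cell line-drug pair; a classifier would threshold this.
score = link_score(network, "C3", "D1")
print("C3-D1 link score:", round(score, 4))
```

In a setting like the one the abstract describes, such scores would presumably be computed for sensitive and resistant links separately and fed to a classifier; the single score here only illustrates the idea of comparing profiles.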
The last decade has seen significant advances in high-throughput molecular profiling technologies [1,2]. As a result, precision medicine has gained much attention and extensive -omic datasets have been produced. This places bioinformatics at the forefront of personalized medicine research due to a need for techniques that can integrate and analyze such data in a way that can improve understanding of disease and potential treatment options [1,3,4]. One such area of interest is the problem of drug response prediction. This involves using a patient's molecular profile to estimate the effectiveness of a given drug on the patient.
To address this problem, extensive patient drug screening projects need to be performed in order to discover significant patterns of response. However, it is not feasible to treat large populations of cancer patients with numerous drugs. To circumvent this issue in the context of cancer, two large drug screening projects have been carried out using cancer cell lines in place of patient samples. These are the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Cell Line Encyclopedia (CCLE) projects [5,6]. Both projects performed molecular profiling (somatic mutation, copy number variation (CNV), and gene expression screening) for hundreds of cancer cell lines and treated them with multiple established compounds. Data from these two projects are publicly available and have been studied extensively to develop and test methods of drug response prediction.
One of the largest works in drug response prediction, the NCI-DREAM Drug Sensitivity Prediction Challenge, obtained data on breast cancer cell lines and compiled 44 drug response prediction algorithms from participati...
Network proximity is at the heart of a large class of network analytics and information retrieval techniques, including node/edge rankings, network alignment, and random-walk-based proximity queries, among many others. Owing to its importance, significant effort has been devoted to accelerating the iterative processes underlying network proximity computations. These techniques rely on numerical properties of power iterations, as well as structural properties of the networks, to reduce the runtime of iterative algorithms. In this paper, we present an alternative approach to accelerating network proximity queries using Chebyshev polynomials. We show that our approach, called Chopper, yields asymptotically faster convergence in theory, and significantly reduced convergence times in practice. We also show that other existing acceleration techniques can be used in conjunction with Chopper to further reduce runtime. Using a number of large real-world networks, and top-k proximity queries as the benchmark problem, we show that Chopper outperforms existing methods for wide ranges of parameter values.
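To make the contrast between plain power iteration and Chebyshev acceleration concrete, the sketch below computes a random-walk-with-restart proximity vector both ways. It uses the textbook Chebyshev semi-iterative recurrence and is not necessarily Chopper's exact formulation. It assumes an undirected graph (so the transition matrix has real eigenvalues) and takes the damping factor `alpha` as the spectral bound; the function names and the six-node ring are hypothetical.

```python
def transition_apply(adj, x):
    """Multiply the column-stochastic transition matrix P = A D^-1 by x.
    adj is a list of neighbour lists for an undirected graph."""
    y = [0.0] * len(x)
    for u, nbrs in enumerate(adj):
        if nbrs:
            share = x[u] / len(nbrs)
            for v in nbrs:
                y[v] += share
    return y

def _step(adj, x, r, alpha):
    """One fixed-point step: alpha * P x + (1 - alpha) * r."""
    px = transition_apply(adj, x)
    return [alpha * px[i] + (1.0 - alpha) * r[i] for i in range(len(x))]

def rwr_power(adj, seed, alpha=0.85, tol=1e-10, max_iter=10000):
    """Baseline: plain power iteration. Returns (vector, iterations used)."""
    r = [0.0] * len(adj)
    r[seed] = 1.0
    x = r[:]
    for k in range(1, max_iter + 1):
        new = _step(adj, x, r, alpha)
        diff = sum(abs(a - b) for a, b in zip(new, x))
        x = new
        if diff < tol:
            return x, k
    return x, max_iter

def rwr_chebyshev(adj, seed, alpha=0.85, tol=1e-10, max_iter=10000):
    """Chebyshev semi-iterative acceleration of the same fixed-point iteration.
    ASSUMPTION: eigenvalues of alpha * P are real and lie in [-alpha, alpha]."""
    r = [0.0] * len(adj)
    r[seed] = 1.0
    prev = r[:]
    cur = _step(adj, prev, r, alpha)  # first step uses omega_1 = 1
    omega = 1.0
    for k in range(2, max_iter + 1):
        if k == 2:
            omega = 1.0 / (1.0 - alpha * alpha / 2.0)
        else:
            omega = 1.0 / (1.0 - alpha * alpha * omega / 4.0)
        gx = _step(adj, cur, r, alpha)
        # Extrapolate using the two most recent iterates.
        new = [omega * (gx[i] - prev[i]) + prev[i] for i in range(len(cur))]
        diff = sum(abs(a - b) for a, b in zip(new, cur))
        prev, cur = cur, new
        if diff < tol:
            return cur, k
    return cur, max_iter

# Hypothetical toy graph: a ring of six nodes.
ring = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
x_power, n_power = rwr_power(ring, seed=0)
x_cheb, n_cheb = rwr_chebyshev(ring, seed=0)
print("power iterations:", n_power, "| Chebyshev iterations:", n_cheb)
```

The only extra cost per iteration is keeping one additional vector, which is why this kind of acceleration can in principle be combined with other speed-ups of the matrix-vector product itself.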
Background: Link prediction is an important and well-studied problem in network biology. Recently, graph representation learning methods, including Graph Convolutional Network (GCN)-based node embedding, have drawn increasing attention in link prediction.
Motivation: An important component of GCN-based network embedding is the convolution matrix, which is used to propagate features across the network. Existing algorithms use the degree-normalized adjacency matrix for this purpose, as this matrix is closely related to the graph Laplacian, capturing the spectral properties of the network. In parallel, it has been shown that GCNs with a single layer can generate more robust embeddings by reducing the number of parameters. Laplacian-based convolution is not well suited to single-layer GCNs, as it limits the propagation of information to immediate neighbors of a node.
Results: Capitalizing on the rich literature on unsupervised link prediction, we propose using node similarity-based convolution matrices in GCNs to compute node embeddings for link prediction. We consider eight representative node similarity measures (Common Neighbors, Jaccard Index, Adamic-Adar, Resource Allocation, Hub Depressed Index, Hub Promoted Index, Sørensen Index, Salton Index) for this purpose. We systematically compare the performance of the resulting algorithms against GCNs that use the degree-normalized adjacency matrix for convolution, as well as other link prediction algorithms. In our experiments, we use three link prediction tasks involving biomedical networks: drug-disease association (DDA) prediction, drug–drug interaction (DDI) prediction, and protein–protein interaction (PPI) prediction. Our results show that node similarity-based convolution matrices significantly improve the link prediction performance of GCN-based embeddings.
Conclusion: As sophisticated machine learning frameworks are increasingly employed in biological applications, historically well-established methods can provide a useful head start.
Availability: Our method, SiGraC, is implemented as a Python library and is freely available at https://github.com/mustafaCoskunAgu/SiGraC
Supplementary information: Supplementary data are available at Bioinformatics online.
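The idea of replacing the degree-normalized adjacency matrix with a node similarity matrix can be illustrated with a short sketch. It uses the standard textbook definitions of the eight measures named in the abstract and a single propagation step of the form ReLU(S X W). It does not use or reproduce the SiGraC API: the function names, the row normalization, the toy graph, and the fixed weight matrix `W` are all assumptions made for illustration.

```python
from math import log, sqrt

def similarity(adj, u, v, measure):
    """Neighbourhood-based similarity of u and v; adj maps node -> set of neighbours."""
    nu, nv = adj[u], adj[v]
    common = nu & nv
    if not common:
        return 0.0
    cn, ku, kv = float(len(common)), len(nu), len(nv)
    if measure == "common_neighbors":
        return cn
    if measure == "jaccard":
        return cn / len(nu | nv)
    if measure == "adamic_adar":
        # Degree-1 nodes are skipped to avoid dividing by log(1) = 0.
        return sum(1.0 / log(len(adj[z])) for z in common if len(adj[z]) > 1)
    if measure == "resource_allocation":
        return sum(1.0 / len(adj[z]) for z in common)
    if measure == "hub_depressed":
        return cn / max(ku, kv)
    if measure == "hub_promoted":
        return cn / min(ku, kv)
    if measure == "sorensen":
        return 2.0 * cn / (ku + kv)
    if measure == "salton":
        return cn / sqrt(ku * kv)
    raise ValueError("unknown measure: " + measure)

def similarity_matrix(adj, measure):
    """Dense similarity matrix over sorted nodes, with a zero diagonal."""
    nodes = sorted(adj)
    return [[0.0 if u == v else similarity(adj, u, v, measure) for v in nodes]
            for u in nodes]

def row_normalize(S):
    """ASSUMPTION: rows are scaled to sum to 1; all-zero rows are left unchanged."""
    out = []
    for row in S:
        total = sum(row)
        out.append([x / total for x in row] if total else row[:])
    return out

def matmul(A, B):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gcn_layer(S, X, W):
    """One propagation step: ReLU(S X W), with S as the convolution matrix."""
    return [[max(0.0, h) for h in row] for row in matmul(matmul(S, X), W)]

# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
S = row_normalize(similarity_matrix(adj, "jaccard"))
X = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # one-hot features
W = [[0.5, -0.5], [0.25, 0.75], [-1.0, 0.5], [0.1, 0.2]]  # fixed, untrained weights
H = gcn_layer(S, X, W)
print(H)
```

Note that nodes 0 and 3 are not adjacent but share neighbour 2, so a similarity-based matrix links them in a single step, which is the kind of beyond-immediate-neighbour propagation the abstract argues a single-layer model needs.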