Cancer research has advanced considerably over the past few years. With high-throughput technology and advances in artificial intelligence, it is now possible to improve cancer diagnosis and targeted therapy by integrating the investigation and analysis of clinical and omics profiles. The high dimensionality and class imbalance of most available data sets pose a serious challenge to the development of computational methods and tools for cancer diagnosis and biomarker discovery. Taking multi-omics data into account complicates the undertaking further. In this paper, we describe a five-step integrative architecture that addresses these three problems by incorporating proteomics data, protein-protein interaction networks, and signaling pathways in order to identify protein biomarkers directly associated with cancer patients' overall survival (OS) and progression-free interval (PFI). The core parts of this architecture are a cluster-based grey wolf optimization algorithm (CB-GWO) for feature selection and a deep stacked canonical correlation autoencoder (DSCC-AE) for clinical endpoint prediction. A thorough experimental study was carried out on breast, lung, colon, and rectum cancers to evaluate the performance of the proposed optimization algorithm for feature selection, as well as the performance of the deep learning model in terms of the Matthews correlation coefficient (MCC) and the area under the curve (AUC). The results were compared with other methods in the literature. They are very promising and show that the proposed framework is effective and outperforms the other algorithms and models in terms of AUC (0.91) and MCC (0.64).
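To make the two evaluation metrics named in the abstract above concrete, the following is a minimal sketch of how the Matthews correlation coefficient (MCC) and the area under the ROC curve (AUC) can be computed for a binary, class-imbalanced problem. The function names, the toy labels and scores, and the 0.5 decision threshold are illustrative assumptions; they are not taken from the paper and do not reproduce its pipeline or its results.

```python
from math import sqrt

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def auc(y_true, scores):
    """ROC AUC as the probability that a random positive sample
    scores higher than a random negative one (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 2 positives, 8 negatives (imbalanced on purpose).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.35, 0.05, 0.15, 0.25]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.3f}  MCC={mcc(y_true, y_pred):.3f}  AUC={auc(y_true, scores):.4f}")
```

On this toy data the accuracy is 0.8 while the MCC is only 0.375, which illustrates why MCC is often preferred over accuracy when classes are imbalanced: it uses all four cells of the confusion matrix. AUC is threshold-free and depends only on how the scores rank positives against negatives.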
Drug development is the most difficult phase for the pharmaceutical industry because it is extremely costly and time-consuming. However, given the growing demand to produce safe and innovative medicines faster and more cost-effectively, the scientific community has shifted its objective toward enhancing lead identification and lead optimization in the early discovery phase. This can be achieved with recent intelligent technologies that enable virtual screening as well as quantitative structure-activity relationship (QSAR) modeling, which defines the possible relationships between chemical compounds and biological activities. Among these technologies, artificial intelligence (AI) has been introduced as a powerful solution to problems in drug discovery and development. In particular, machine learning (ML) has been instrumental in the production of new drug candidates. In this work, we review the fundamental principles of machine learning algorithms and discuss their applications and current issues in drug development.