The remarkable growth of multi-platform genomic profiles has led to the multiomics data integration challenge. In this study, we present a novel network-based integration method of multiomics data as well as a clustering technique founded on the Wasserstein (Earth Mover's) distance from the theory of optimal mass transport. We applied our proposed method of aggregating multiomics and Wasserstein distance clustering (aWCluster) to invasive breast carcinoma from The Cancer Genome Atlas (TCGA) project. The subtypes were characterized by the concordant effect of mRNA expression, DNA copy number alteration, and DNA methylation as well as the interaction network connectivity of the gene products. aWCluster successfully clusters the breast cancer TCGA data into classes with significantly different survival rates. A gene ontology enrichment analysis of significant genes in the low survival subgroup leads to the well-known phenomenon of tumor hypoxia and the transcription factor ETS1 whose expression is induced by hypoxia. In addition, immune subtype analysis in our clustering via aWCluster recovers the inflammatory immune subtype in a group demonstrating improved prognosis. Consequently, we believe aWCluster has the potential to discover novel subtypes and biomarkers by accentuating the genes that have concordant multiomics measurements in their interaction network, which are challenging to find without the network inference or with single omics analysis.
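The clustering step described above can be illustrated with a minimal sketch. It assumes each sample has already been reduced to one nonnegative score per gene (the multiomics integration itself is not shown), and it places the genes on a hypothetical path graph with unit-length edges, where the Wasserstein-1 distance has a simple closed form. The toy scores, the function names, and the use of average-linkage clustering are illustrative assumptions, not details taken from the abstract.

```python
from itertools import combinations


def normalize(scores):
    """Scale nonnegative per-gene scores so they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def w1_on_path(p, q):
    """Wasserstein-1 distance on a path graph with unit-length edges.

    The mass that must cross the edge after node k is the running sum of
    (p - q) up to node k, so the distance is the sum of the absolute
    running sums. The last running sum is zero for normalized inputs.
    """
    carried, dist = 0.0, 0.0
    for pk, qk in zip(p, q):
        carried += pk - qk
        dist += abs(carried)
    return dist


def average_linkage(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            d = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical toy data: 4 samples, 5 genes ordered along a path.
samples = [
    [5, 3, 1, 1, 0],
    [4, 4, 1, 1, 0],
    [0, 1, 1, 3, 5],
    [0, 1, 2, 3, 4],
]
profiles = [normalize(s) for s in samples]
dist = [[w1_on_path(p, q) for q in profiles] for p in profiles]
clusters = average_linkage(dist, n_clusters=2)
print(clusters)  # → [[0, 1], [2, 3]]
```

On a general interaction network the closed form no longer applies and the distance has to be obtained from an optimal transport solver with graph distances as the ground cost; the path graph is used here only to keep the sketch short.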
The emerging field of radiomics, which consists of transforming standard-of-care images to quantifiable scalar statistics, endeavors to reveal the information hidden in these macroscopic images. This field of research has found different applications ranging from phenotyping and tumor classification to outcome prediction and treatment planning. Texture analysis, which often consists of reducing spatial texture matrices to summary scalar features, has been shown to be important in many of the latter applications. However, as pointed out in many studies, some of the derived texture statistics are strongly correlated and tend to contribute redundant information, and they are also sensitive to the parameters used in their computation, e.g., the number of gray intensity levels. In the present study, we propose first to consider texture matrices, with an emphasis on the gray-level co-occurrence matrix (GLCM), as non-parametric multivariate objects. The proposed modeling approach avoids evaluating redundant and strongly correlated features and also removes the need for feature processing steps. Then, via the Wasserstein distance from optimal mass transport theory, we propose to compare these spatial objects to identify computerized tomography slices with dental artifacts in head and neck cancer. We demonstrate the robustness of the proposed classification approach with respect to the GLCM extraction parameters and the size of the training set. Comparisons with the random forest classifier, which is constructed on scalar texture features, demonstrate the efficiency and robustness of the proposed algorithm.
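As a rough illustration of comparing GLCMs directly rather than through scalar features, the sketch below builds co-occurrence count matrices for tiny images and computes a Wasserstein distance between them. Several choices are assumptions made for the example and are not stated in the abstract: the single horizontal offset, the Manhattan ground cost between GLCM bins, the requirement that both images yield the same number of pixel pairs, and the integer min-cost-flow solver used to obtain the exact transport cost.

```python
from math import inf


def glcm(image, levels, offset=(0, 1)):
    """Count co-occurrences of gray levels (i, j) at a (row, col) offset."""
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    for r, row in enumerate(image):
        for c, i in enumerate(row):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(image) and 0 <= cc < len(row):
                counts[i][image[rr][cc]] += 1
    return counts


def _add_edge(graph, u, v, cap, cost):
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph[u].append([v, cap, cost, len(graph[v])])
    graph[v].append([u, 0, -cost, len(graph[u]) - 1])


def _min_cost_flow(graph, source, sink, flow):
    """Successive shortest paths with Bellman-Ford (integer costs)."""
    total_cost = 0
    while flow > 0:
        dist = [inf] * len(graph)
        prev = [None] * len(graph)
        dist[source] = 0
        changed = True
        while changed:
            changed = False
            for u in range(len(graph)):
                if dist[u] == inf:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], prev[v], changed = dist[u] + cost, (u, k), True
        push, v = flow, sink
        while v != source:  # bottleneck capacity along the path
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:  # update residual capacities
            u, k = prev[v]
            edge = graph[u][k]
            edge[1] -= push
            graph[v][edge[3]][1] += push
            v = u
        flow -= push
        total_cost += push * dist[sink]
    return total_cost


def glcm_wasserstein(a, b):
    """Wasserstein-1 distance between two GLCMs with equal total counts."""
    levels = len(a)
    bins = [(i, j) for i in range(levels) for j in range(levels)]
    total = sum(map(sum, a))
    if total == 0 or total != sum(map(sum, b)):
        raise ValueError("GLCMs must have equal, nonzero total counts")
    m = len(bins)
    source, sink = 2 * m, 2 * m + 1
    graph = [[] for _ in range(2 * m + 2)]
    for s, (i, j) in enumerate(bins):
        _add_edge(graph, source, s, a[i][j], 0)
        _add_edge(graph, m + s, sink, b[i][j], 0)
        for t, (k, l) in enumerate(bins):
            # Assumed ground cost: Manhattan distance between bins.
            _add_edge(graph, s, m + t, total, abs(i - k) + abs(j - l))
    return _min_cost_flow(graph, source, sink, total) / total


# Hypothetical 2-level toy images.
flat_dark = [[0, 0, 0], [0, 0, 0]]
flat_bright = [[1, 1, 1], [1, 1, 1]]
checker = [[0, 1, 0], [1, 0, 1]]

g_dark, g_bright, g_checker = (glcm(im, 2) for im in (flat_dark, flat_bright, checker))
print(glcm_wasserstein(g_dark, g_bright))   # → 2.0
print(glcm_wasserstein(g_dark, g_checker))  # → 1.0
```

A classifier could then be built on these pairwise distances, for instance by assigning a slice the label of its nearest training slices; that step is left out here because the abstract does not specify it.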
The study of large-scale pharmacogenomics provides an unprecedented opportunity to develop computational models that can accurately predict drug response across large cohorts of cell lines and drugs. In this work, we present a novel method for predicting drug sensitivity in cancer cell lines which considers both cell line genomic features and drug chemical features. Our network-based approach combines the theory of optimal mass transport (OMT) with machine learning techniques. It starts with unsupervised clustering of both cell line and drug data, followed by the prediction of drug sensitivity in the paired cluster of cell lines and drugs. We show that prior clustering of the heterogeneous cell lines and structurally diverse drugs significantly improves the accuracy of the prediction. In addition, it facilitates the interpretability of the results and the identification of molecular biomarkers which are significant for both clustering of the cell lines and predicting the drug response.
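The two-stage structure (cluster first, then predict within each paired cluster) can be sketched as follows. The cluster assignments are taken as given, and the per-pair "model" is just the mean training response, a deliberately trivial stand-in for whatever learner is actually used. All cell line names, drug names, and response values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_pairwise_means(records, cell_cluster, drug_cluster):
    """Fit one placeholder model per (cell-line cluster, drug cluster) pair.

    records: iterable of (cell_line, drug, response) training triples.
    """
    grouped = defaultdict(list)
    for cell, drug, response in records:
        grouped[(cell_cluster[cell], drug_cluster[drug])].append(response)
    return {pair: mean(values) for pair, values in grouped.items()}


def predict(model, cell, drug, cell_cluster, drug_cluster):
    """Look up the model of the cluster pair that (cell, drug) falls into."""
    return model[(cell_cluster[cell], drug_cluster[drug])]


# Hypothetical cluster assignments from an earlier unsupervised step.
cell_cluster = {"cellA": 0, "cellB": 0, "cellC": 1}
drug_cluster = {"drugX": 0, "drugY": 1}

# Hypothetical training responses (e.g. a normalized sensitivity score).
train = [
    ("cellA", "drugX", 0.2),
    ("cellB", "drugX", 0.4),
    ("cellC", "drugX", 0.9),
    ("cellA", "drugY", 0.7),
]

model = fit_pairwise_means(train, cell_cluster, drug_cluster)
print(predict(model, "cellB", "drugY", cell_cluster, drug_cluster))
```

Here `cellB` was never tested with `drugY`, but its cluster mate `cellA` was, so the prediction comes from the (cluster 0, cluster 1) pair. A cluster pair with no training data raises `KeyError`, which a fuller implementation would need to handle.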