Most non-small cell lung cancer (NSCLC) prognosis prediction approaches use a single data type and do not take advantage of the large amount of multimodal data available. To evaluate the benefits of multimodal data integration, we present a combined feature selection and denoising autoencoder pipeline for NSCLC survival prediction and survival subtype identification using microRNA (miRNA), mRNA, DNA methylation, long non-coding RNA (lncRNA) and clinical data. Survival prediction performance for both lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) patients was compared across modality combinations, the timing of data integration, and training data types. Multimodal data combinations outperformed single data modalities: early integration of all data modalities achieved concordance indexes (C-indexes) of 0.67 (±0.04) for LUAD and 0.63 (±0.02) for LUSC, versus 0.64 (±0.02) and 0.59 (±0.03), respectively, for the best single modality (clinical). Notably, combining just lncRNA and clinical data gave effective survival discrimination, with C-indexes of 0.69 (±0.03) for LUAD and 0.62 (±0.03) for LUSC. Overall, higher performance was achieved by using a single denoising autoencoder for all biological data (early integration) and by training on LUSC and LUAD patient data together. Two survival subtypes (log-rank test p-value = 1e-9) were identified, with 991 differentially expressed transcripts in the poorer survival group. Our analysis shows the value of multimodal data integration for predicting NSCLC progression, with especially good performance from the combination of lncRNA and clinical data. Early integration of biological data, with an initial linear feature selection technique and a denoising autoencoder for dimensionality reduction, showed effective survival prediction performance and survival subtype identification.
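For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a concordance index can be computed from right-censored survival data. It assumes Harrell's estimator; the abstract does not state which C-index variant was used, and the function name `concordance_index` and the toy values are hypothetical illustrations, not the authors' code or data.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (assumed variant, not confirmed by the abstract).

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event rather than a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk assigned to the earlier death
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one of them censored.
toy_times = [5, 10, 12, 3]
toy_events = [1, 0, 1, 1]
toy_risks = [0.8, 0.3, 0.2, 0.9]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

On this scale, 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the context for reading the reported values of 0.59 to 0.69.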
Further research is underway to extend the analysis to other cancer types and data modalities and to extract more biological interpretability from autoencoder models.

Citation Format: Jacob G. Ellen, Etai Jacob, Nikos Nikolaou, Natasha Markuzon. Autoencoder-based multimodal prediction of survival for non-small cell lung cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5373.