This study aims to create a tumor heterogeneity-based model for predicting the best features of lung adenocarcinoma (LUAD) in multiple cancer subtypes using the Least Absolute Shrinkage and Selection Operator (LASSO). The RNA-Seq raw count data of 533 LUAD samples and 59 normal samples were downloaded from the TCGA data portal. Based on the consensus clustering method, samples were divided into two subtypes, and clusters were validated using silhouette width. Furthermore, we estimated subtypes for the abundance of immune and non-immune
This study aims to establish a Least Absolute Shrinkage and Selection Operator (LASSO) model based on tumor heterogeneity to predict the best features of lung squamous cell carcinoma (LUSC) in various cancer subtypes. The RNA-Seq data of 505 LUSC samples were downloaded from the TCGA database. After the identification of differentially expressed genes (DEGs), the samples were divided into two subtypes using the consensus clustering method. The abundance of tissue-infiltrating immune and non-immune stromal cell populations was then estimated for each subtype. A LASSO model was established to predict each subtype's best genes, and enrichment pathway analysis was carried out. Finally, survival analysis was used to establish the validity of the LUSC model for identifying features. In total, 240 and 262 samples were clustered into Subtype-1 and Subtype-2, respectively. DEG analysis was performed on each subtype. After a standard cutoff was applied, 4586 genes were upregulated and 1495 were downregulated in Subtype-1, and 5016 genes were upregulated and 3224 were downregulated in Subtype-2. The LASSO model selected the 49 and 34 most relevant genes in Subtype-1 and Subtype-2, respectively. Analysis of the abundance of tissue infiltrates distinguished the subtypes based on the expression pattern of immune infiltrates. Survival analysis showed that this model could effectively predict the best and distinct features in cancer subtypes. This study suggests that unsupervised clustering and LASSO model-based feature selection can be used effectively to predict relevant genes that might play an important role in cancer diagnosis.
This study aims to create a tumor heterogeneity-based model for predicting the best features of lung adenocarcinoma (LUAD) in multiple cancer subtypes using the Least Absolute Shrinkage and Selection Operator (LASSO). The RNA-Seq data of 533 LUAD samples were downloaded from the TCGA database. After the identification of differentially expressed genes (DEGs), the samples were divided into two subtypes using the consensus clustering method. The abundance of tissue-infiltrating immune and non-immune stromal cell populations was then estimated for each subtype. A LASSO model was established to predict each subtype's best genes, and enrichment pathway analysis was carried out. Finally, survival analysis was used to establish the validity of the LUAD model for identifying features. In total, 89 and 444 samples were clustered into Subtype-1 and Subtype-2, respectively. DEG analysis was performed on each subtype. After a standard cutoff was applied, 2033 genes were upregulated and 505 were downregulated in Subtype-1, and 5039 genes were upregulated and 1219 were downregulated in Subtype-2. The LASSO model selected the 40 and 43 most relevant genes in Subtype-1 and Subtype-2, respectively. Analysis of the abundance of tissue infiltrates distinguished the subtypes based on the expression pattern of immune infiltrates. Survival analysis showed that this model could effectively predict the best and distinct features in cancer subtypes. The study suggests that unsupervised clustering and machine learning methods such as LASSO model-based feature selection can be used effectively to predict relevant genes that might play an essential role in cancer diagnosis.
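The abstracts do not say which software or penalty strength was used for the LASSO step, so the following is only a minimal sketch of how LASSO-based feature selection works in general, not the authors' pipeline. It uses a synthetic "expression" matrix in place of TCGA data; the function names, the penalty `alpha=0.3`, and the toy data generator are all assumptions made for illustration. The L1 penalty drives the coefficients of uninformative genes to exactly zero, and the genes with non-zero coefficients are the selected features.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; this is the core LASSO operator."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero feature can never be selected
            # Covariance of feature j with the residual that excludes w[j].
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def make_toy_expression(n_samples=80, n_genes=30, seed=0):
    """Hypothetical data: only genes 0, 1 and 2 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [3.0 * row[0] - 2.0 * row[1] + 1.5 * row[2] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_toy_expression()
weights = lasso_coordinate_descent(X, y, alpha=0.3)  # alpha is an assumed value
selected_genes = [j for j, w_j in enumerate(weights) if w_j != 0.0]
print("selected gene indices:", selected_genes)
```

In practice the penalty strength is usually chosen by cross-validation rather than fixed by hand, and a classification setting (such as subtype membership) would typically use an L1-penalised logistic model instead of the squared-error objective shown here.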