Lung cancer is one of the deadliest cancers in the world. Two of the most common subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), have drastically different biological signatures, yet they are often treated similarly and classified together as non-small cell lung cancer (NSCLC). LUAD and LUSC biomarkers are scarce, and their distinct biological mechanisms have yet to be elucidated. To detect biologically relevant markers, many studies have attempted to improve traditional machine learning algorithms or develop novel algorithms for biomarker discovery. However, few have used overlapping machine learning or feature selection methods for cancer classification, biomarker identification, or gene expression analysis. This study proposes using overlapping traditional feature selection or feature reduction techniques for cancer classification and biomarker discovery. The genes selected by the overlapping method were verified using random forest, and the classification statistics of the overlapping method were compared with those of the traditional feature selection methods. The identified biomarkers were validated in an external dataset using receiver operating characteristic (ROC) analysis and the area under the curve (AUC). Gene expression analysis was then performed to further investigate biological differences between LUAD and LUSC. Overall, our method achieved classification results comparable to, if not better than, those of the traditional algorithms. It also identified multiple known biomarkers and five potentially novel biomarkers with high discriminating values between LUAD and LUSC. Many of the biomarkers also exhibit significant prognostic potential, particularly in LUAD. Our study also revealed distinct biological pathways between LUAD and LUSC.
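The idea of keeping only the features that several selection methods agree on can be sketched as follows. This is a minimal illustration, not the study's pipeline: the abstract does not name the feature selection methods that were overlapped, so the two scorers here (a Welch-style t statistic and a pairwise AUC) are stand-ins, and the gene names and expression values are hypothetical toy data. The random forest verification step is not shown.

```python
from statistics import mean, variance

def t_like_score(a, b):
    """Absolute Welch-style t statistic between two groups (stand-in scorer)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0

def auc_score(a, b):
    """Class separation as max(AUC, 1 - AUC), with AUC from pairwise comparisons."""
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    auc = wins / (len(a) * len(b))
    return max(auc, 1.0 - auc)

def top_k(scores, k):
    """Names of the k highest-scoring features."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def overlapping_features(expr, labels, k, scorers=(t_like_score, auc_score)):
    """Keep only the features ranked in the top k by every scorer."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    selected = None
    for scorer in scorers:
        scores = {}
        for gene, values in expr.items():
            a = [v for v, y in zip(values, labels) if y == classes[0]]
            b = [v for v, y in zip(values, labels) if y == classes[1]]
            scores[gene] = scorer(a, b)
        picks = top_k(scores, k)
        selected = picks if selected is None else selected & picks
    return selected

# Hypothetical toy data: four samples per subtype, four made-up genes.
labels = ["LUAD"] * 4 + ["LUSC"] * 4
expr = {
    "gene_A": [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1],  # strongly separated
    "gene_B": [2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2],  # clearly separated
    "gene_C": [1.0, 5.0, 1.1, 4.9, 1.2, 5.1, 0.9, 5.2],  # noisy, not separated
    "gene_D": [3.0, 3.1, 2.9, 3.0, 3.0, 3.1, 2.9, 3.0],  # identical in both
}

overlap = overlapping_features(expr, labels, k=2)
print(sorted(overlap))  # ['gene_A', 'gene_B']
```

Taking the intersection makes the selection conservative: a gene survives only if every method ranks it highly, which is one plausible reading of "overlapping" in the abstract.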
Introduction: Cancer consistently remains one of the top causes of death in the United States every year, and many cancer deaths are preventable if the disease is detected early. Circulating serum miRNAs are a promising, minimally invasive supplement or even an alternative to many current screening procedures. Many studies have shown that different serum miRNAs can discriminate healthy individuals from those with certain types of cancer. Although many of those miRNAs are often reported to be significant in one cancer type, they are also altered in other cancer types. Currently, very few studies have investigated serum miRNA biomarkers across multiple cancer types for general cancer screening purposes.
Method: To identify serum miRNAs that would be useful in screening multiple types of cancer, microarray cancer datasets were curated, yielding 13 different types of cancer with a total of 3352 cancer samples and 2809 non-cancer samples. The samples were divided into training and validation sets. One hundred random forest models were built using the training set to select candidate miRNAs. The selected miRNAs were then applied to the validation set to assess how well they differentiate cancer from normal samples in an independent dataset. Furthermore, the interactions between these miRNAs and their target mRNAs were investigated.
Result: The random forest models achieved an average accuracy of 97% in the training set, with a 95% bootstrap confidence interval of 0.9544 to 0.9778. The selected miRNAs were hsa-miR-663a, hsa-miR-6802-5p, hsa-miR-6784-5p, hsa-miR-3184-5p, and hsa-miR-8073. Each miRNA exhibited a high area under the curve (AUC) value in receiver operating characteristic analysis. Moreover, the combination of four out of five miRNAs achieved the highest AUC value of 0.9815 with a high sensitivity of 0.9773, indicating that these miRNAs have a high potential for cancer screening. miRNA-mRNA and protein-protein interaction analysis provided insights into how these miRNAs play a role in cancer.
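A bootstrap confidence interval for the average accuracy of a set of models, as reported in the result above, can be sketched like this. The abstract does not say which bootstrap variant was used, so the percentile method here is an assumption, and the 100 accuracy values are synthetic placeholders that do not reproduce the study's numbers.

```python
import random

def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[round(alpha / 2 * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Synthetic placeholder accuracies for 100 hypothetical models.
data_rng = random.Random(42)
accuracies = [0.90 + 0.10 * data_rng.random() for _ in range(100)]

mean_accuracy = sum(accuracies) / len(accuracies)
ci_low, ci_high = bootstrap_ci(accuracies)
print(f"mean accuracy {mean_accuracy:.4f}, 95% CI {ci_low:.4f} to {ci_high:.4f}")
```

Fixing the seed keeps the interval reproducible between runs; a different seed or more resamples would shift the bounds slightly.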
The Ames dwarf (df/df) mouse is a well-established model for delayed aging. MicroRNAs (miRNAs), the most studied small noncoding RNAs (sncRNAs), may regulate ovarian aging to maintain a younger ovarian phenotype in df/df mice. In this study, we profile other types of ovarian sncRNAs, PIWI-interacting RNAs (piRNAs) and piRNA-like RNAs (piLRNAs), in young and aged df/df and normal mice. Half of the piRNAs derive from transfer RNA fragments (tRF-piRNAs). Aging and dwarfism alter the ovarian expression of these novel sncRNAs. Specific tRF-piRNAs that increased with age might target and decrease the expression of the breast cancer antiestrogen resistance protein 3 (BCAR3) gene in the ovaries of old df/df mice. A set of piLRNAs that decreased with age map to D10Wsu102e mRNA and may be involved in trans-regulatory functions. Other piLRNAs that decreased with age potentially target and may de-repress transposable elements (TEs), leading to a beneficial impact on ovarian aging in df/df mice. These results identify unique responses in ovarian tissues with regard to aging and dwarfism. Overall, our findings highlight the complexity of the effects of aging on gene expression and suggest that, in addition to miRNAs, piRNAs, piLRNAs, tRF-piRNAs, and their potential targets can be central players in the maintenance of a younger ovarian phenotype in df/df mice.
Lung cancer is one of the deadliest cancers in the world. Two of the most common subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), have drastically different biological signatures, yet they are often treated similarly and classified together as non-small cell lung cancer (NSCLC). LUAD and LUSC biomarkers are scarce, and their distinct biological mechanisms have yet to be elucidated. Many studies have attempted to improve traditional machine learning algorithms or develop novel algorithms to identify biomarkers, but few have used overlapping machine learning or feature selection methods for cancer classification, biomarker identification, or pathway analysis. This study proposes selecting overlapping features as a way to differentiate between cancer subtypes, especially between LUAD and LUSC. Overall, this method achieved classification results comparable to, if not better than, those of the traditional algorithms. It also identified multiple known biomarkers and five potentially novel biomarkers with high discriminating values between the two subtypes. Many of the biomarkers also exhibit significant prognostic potential, particularly in LUAD. Our study also revealed distinct biological pathways between LUAD and LUSC.