2021
DOI: 10.1109/access.2021.3070428

A Novel Hybrid Feature Selection and Ensemble Learning Framework for Unbalanced Cancer Data Diagnosis With Transcriptome and Functional Proteomic

Abstract: The high dimension, high redundancy and class imbalance of cancer multiple omics data are the main challenges for cancer diagnosis. Existing studies have neglected the role of functional proteomics in the occurrence and development of cancer. In this study, a novel hybrid feature selection and ensemble learning framework, referred to as the three-stage feature selection and twice-competitional ensemble learning method (TSFS-TCEM), is proposed for cancer diagnosis. Firstly, we combine the transcriptome and func…

Citations: cited by 11 publications (18 citation statements)
References: 49 publications (41 reference statements)
“…Tang et al constructed multi-omics data with the transcriptome and functional proteomics data and employed a bagging-based ensemble learning method on breast cancer to avoid over-fitting problems [ 53 ]. Le et al utilized an ensemble of pre-trained ResNet50 CNN models for the classification of skin cancer to support dermatologists in skin cancer diagnosis [ 54 ].…”
Section: Overview Of the Existing Methods For The Class Imbalance Pro... | Citation type: mentioning
confidence: 99%
“…Feature selection approaches are considered dimensionality reduction techniques. Due to the high dimensionality of proteomic data, analyzing, storing, training, and classifying these data may be considered an NP-Hard problem [ 53 , 60 ]. Feature selection methods to find optimal feature subsets reduce the computational and storing cost and eliminate redundant and irrelevant information, facilitating data visualization.…”
Section: Overview Of the Existing Methods For The Class Imbalance Pro... | Citation type: mentioning
confidence: 99%
“…The framework performs significantly well in predicting colorectal cancer. Tang et al [9] proposed a hybrid framework with a combination of feature selection and ensemble-based learning called the Three-stage Feature selection and Twice-competitional Ensemble learning Method (TSFS-RCEM). This comprises of three-stage; the first stage is to perform information gain (IG) towards the imbalanced data, the second stage involves reducing its high dimensionality, and the final stage, includes feature selection to select the most relevant features.…”
Section: Related Work | Citation type: mentioning
confidence: 99%
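
The citation statement above names information gain (IG) as one step of the framework. As a rough illustration of what an IG-based feature ranking does in general, here is a minimal sketch. It is not the TSFS-TCEM implementation: the function names, the feature names `gene_a` and `gene_b`, and the toy data are all hypothetical, and it assumes features have already been discretized (continuous expression values would need binning first).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a single discrete feature X."""
    total = len(labels)
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    # Weighted average of the label entropy within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted by decreasing information gain."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data with imbalanced classes (2 positives, 6 negatives).
labels = [1, 1, 0, 0, 0, 0, 0, 0]
columns = {
    # Separates the two classes perfectly, so IG equals H(labels).
    "gene_a": ["hi", "hi", "lo", "lo", "lo", "lo", "lo", "lo"],
    # Same class mix in both groups, so IG is zero.
    "gene_b": ["hi", "lo", "hi", "lo", "hi", "lo", "hi", "lo"],
}
ranking = rank_features(columns, labels)
```

In this toy case `ranking` places `gene_a` ahead of `gene_b`. A real pipeline would typically keep only the top-ranked features before passing them to later selection stages or to an ensemble classifier.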
“…Feature extraction methods can be broadly classified into two categories: those based on structural information and those based on sequence information ( Kim et al, 2004 ; Meng and Kurgan, 2016 ; Qu et al, 2019 ; Ao et al, 2021a ; Lv et al, 2021a ; Liu et al, 2021 ; Tang et al, 2021 ; Wu and Yu, 2021 ); ( Stawiski et al, 2003 ) proposed a model based on protein structure that utilises a neural network approach incorporating information like residue and hydrogen bond potential. Liu et al ( Liu et al, 2014 ) developed a model called IDNA-prot|dis, based on the pseudo amino acid composition (PseAAC) of protein sequence information.…”
Section: Introduction | Citation type: mentioning
confidence: 99%