2021
DOI: 10.3390/cancers13174297

Relevant and Non-Redundant Feature Selection for Cancer Classification and Subtype Detection

Abstract: Biologists seek to identify a small number of significant features that are important, non-redundant, and relevant from diverse omics data. For example, statistical methods such as LIMMA and DEseq distinguish differentially expressed genes between a case and control group from the transcript profile. Researchers also apply various column subset selection algorithms on genomics datasets for a similar purpose. Unfortunately, genes selected by such statistical or machine learning methods are often highly co-regul…

Citations: cited by 14 publications (11 citation statements)
References: 32 publications (39 reference statements)
“…The expression of hundreds of transcripts is often cataloged by RNA-seq measurements, but most are redundant (i.e., strongly correlated) or noisy. In addition, the number of samples available is less than the number of features owing to the expenses associated with conducting experiments, which makes it simple for conventional machine learning and statistical algorithms to overfit the biological data [71].…”
Section: Discussion (mentioning)
confidence: 99%
“…Therefore, a technique is needed to obtain the least overlapping features. The best features are ideally free from redundancy with each other (Rana et al, 2021).…”
Section: Feature Selection (mentioning)
confidence: 99%
“…In contrast with the previous two researchers in this paper, the author conducted multi features selection for multi objects applied to the digital image identification of beef and pork. The key feature is considered relevant when having minimum overlap (Rana et al, 2021). The fundamental problem for overlap feature is that this set of features do not comprehensively represent the value of a feature as a target as it still contains the values of other features.…”
Section: Introduction (mentioning)
confidence: 99%
“…, the most differentially expressed genes between two classes of interest) and the “native” selection procedures of the most important features from some models, such as random forests, regression techniques with L1-regularization and many others (Saeys, Inza & Larranaga, 2007; Chandrashekar & Sahin, 2014; Wang, Wang & Chang, 2016). Besides that, several approaches were designed specifically for classification problems involving cancer transcriptomics data, including gene ranking, filtration and combining the most relevant genes in a single model (Arakelyan, Aslanyan & Boyajyan, 2013; Zhang et al, 2021; Rana et al, 2021).…”
Section: Introduction (mentioning)
confidence: 99%