2022
DOI: 10.3389/fmolb.2022.907150

A comprehensive survey on computational learning methods for analysis of gene expression data

Abstract: Computational analysis methods including machine learning have a significant impact in the fields of genomics and medicine. High-throughput gene expression analysis methods such as microarray technology and RNA sequencing produce enormous amounts of data. Traditionally, statistical methods are used for comparative analysis of gene expression data. However, more complex analysis for classification of sample observations, or discovery of feature genes requires sophisticated computational approaches. In this revi…

Cited by 13 publications (11 citation statements)
References 242 publications (259 reference statements)
“…We developed a unique strategy of preprocessing microarray data to build a robust model that can be used to analyze a single sample. We used the feature selection approach (DGEA) for better “explainability” in clinical settings [12] and SVM and PAM for binary classification analysis. The two algorithms use different logic for classification, and a consensus rule resulted in higher precision without any significant drop in accuracy compared to the individual models (Figure 3A and B).…”
Section: Discussion (mentioning)
confidence: 99%
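
The consensus idea in the statement above can be illustrated with a minimal sketch. It assumes one plausible rule, in which a sample is called positive only when both classifiers agree; the quoted passage does not spell out the exact rule used. The function names, label encoding, and toy predictions are hypothetical and are not taken from the cited study.

```python
def consensus_predict(svm_labels, pam_labels, positive=1, negative=0):
    """Call a sample positive only when both classifiers call it positive.

    Assumed rule for illustration; the cited study may define consensus differently.
    """
    if len(svm_labels) != len(pam_labels):
        raise ValueError("label lists must have the same length")
    return [
        positive if a == positive and b == positive else negative
        for a, b in zip(svm_labels, pam_labels)
    ]


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical toy labels (1 = positive class, 0 = negative class).
y_true = [1, 1, 1, 0, 0, 0]
svm_pred = [1, 1, 1, 1, 0, 0]  # stand-in for SVM output
pam_pred = [1, 1, 0, 0, 1, 0]  # stand-in for PAM output

both = consensus_predict(svm_pred, pam_pred)
print(precision(y_true, svm_pred), precision(y_true, pam_pred), precision(y_true, both))
```

Requiring agreement removes positives that only one model supports, which is why such a rule tends to raise precision, usually at some cost in sensitivity.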
“…Support vector machines (SVM) and nearest shrunken centroids (NSC) are popular examples of supervised learning ML algorithms used in genomics, particularly in transcriptomics. [12]…”
Section: Introduction (mentioning)
confidence: 99%
“…We used differential gene expression analysis (DGEA) for feature selection, which can be important for application of models into clinical settings. Feature selection methods perform better than extraction methods for their "explainability" in clinical settings (Bhandari et al, 2022). In addition, the simple filter methods such as DGEA are computationally less intensive as compared to embedded methods of feature selection.…”
Section: Discussion (mentioning)
confidence: 99%
“…First, the computational analysis to identify statistically significant differences; and second, the “biological” analysis to identify biologically significant differences. The computational analysis is challenging because of the large amount of data that must be handle[d] by researchers, who require adequate training and specialization (Bhandari et al, 2022); fortunately, researchers count with abundant manuals and reviews that describe computational and informatic tools used for statistical analyses of massive gene expression data (e.g., Kappelmann-Fenzl, 2021). In contrast, the articles that address how to analyze data for biological significance (e.g., Olson, 2006; Davidson, 2015) are scarce, even though statistical significance should be computed considering biologically meaningful contexts.…”
Section: Introduction (mentioning)
confidence: 99%