2019
DOI: 10.1038/s41598-019-53034-3

CPEM: Accurate cancer type classification based on somatic alterations using an ensemble of a random forest and a deep neural network

Abstract: With recent advances in DNA sequencing technologies, fast acquisition of large-scale genomic data has become commonplace. For cancer studies, in particular, there is an increasing need for the classification of cancer type based on somatic alterations detected from sequencing analyses. However, the ever-increasing size and complexity of the data make the classification task extremely challenging. In this study, we evaluate the contributions of various input features, such as mutation profiles, mutation rates, …

Cited by 30 publications (23 citation statements)
References 28 publications
“…Therefore, a large number of classification models based on ensemble learning have been proposed. For example, Lee et al [5] developed an ensemble model based on random forest and deep neural network for cancer classification and achieved an accuracy of 94%. ALzubi et al [28] used the boosted weighted optimization neural network ensemble classification algorithm to classify cancer patients, thereby improving the accuracy of cancer diagnosis.…”
Section: Related Work (mentioning)
confidence: 99%
“…At present, the application of machine learning methods to cancer classification is a significant research field in bioinformatics [4,5]. Many traditional machine learning methods have been successfully applied to the classification analysis of gene expression data [6][7][8][9], such as RF, decision tree, KNN, and neural networks.…”
Section: Introduction (mentioning)
confidence: 99%
“…While in this work we focused on tumor transcriptome data which can be measured with high precision over a wide dynamic range of transcript abundances by RNA-seq, we note that TCGA datasets of tumor somatic mutations and copy number alteration events are also available (Hutter and Zenklusen, 2018). Given the voluminous literature on the use of tumor somatic genomic data for precision cancer diagnosis (Mitchel et al., 2019; Zhang et al., 2020; Lee et al., 2019), tumor DNA datasets are fertile ground for developing a semi-supervised, multi-omics model for predicting response to chemotherapy.…”
Section: Discussion (mentioning)
confidence: 99%
“…In the context of supervised analysis, some authors have been using the cross-validation strategy to classify some types of cancer, for example Lee et al, (2019), who used the RF with 10-fold cross-validation to classify 31 types of cancer and managed to reach 84% accuracy, reaching up to 94% for the 6 most common types of cancer. As mentioned, RF was used in our study because it avoids overfitting more efficiently than decision trees, in addition to obtaining greater accuracy and being more stable (J. C. da Silva, 2018).…”
Section: Discussion (mentioning)
confidence: 99%