Breast cancer (BC), or breast neoplasm, is a major threat to the lives of women around the world. Early detection and staging of BC are central to the diagnostic protocol. This work aims to develop an automated system that combines multivariate data analysis (principal component analysis, PCA) with an ensemble recurrent neural network model (stacked OGRU-LSTM) to identify Raman spectral characteristics that can serve as spectral cancer markers for detecting BC progression and staging. Blood plasma features from histopathologically diagnosed BC candidates were compared with those from healthy controls. The same analysis was performed with other leading classification models: the stacked basic RNN, the stacked RNN-LSTM, and the RNN-GRU models. A total of 2,340 Raman spectra were evaluated. The study finds that stage 2 and stage 3 are structurally identical but can be distinguished from each other with PCA-Factorial Discriminant Analysis (FDA); the Raman spectra of blood plasma samples from the BC candidates are therefore classified efficiently, yielding potentially high specificity and sensitivity for all BC stages. Comparative classification results show that the stacked OGRU-LSTM model performs best for BC detection and, combined with the multivariate data analysis technique, better differentiates the stages of BC. The stacked OGRU-LSTM model achieved the highest classification accuracy (97.89%), Cohen’s kappa score (0.928), and F1-score (0.957), and the lowest test loss and MSE (0.037), indicating that it outperforms the other baseline classifiers.
HIGHLIGHTS
Raman spectroscopy is used in conjunction with deep learning models and multivariate data analysis to diagnose and categorize blood plasma samples as cancerous or noncancerous, and to stage breast cancer, based on their chemical composition
Spectral data augmentation techniques were implemented to address the underfitting and overfitting caused by insufficient Raman spectral data
This technique has the potential to accurately classify breast cancer samples and hence reduce the number of unnecessary excisional breast biopsies
Stage 2 and stage 3 of breast cancer were found to be structurally identical but can be distinguished from each other using PCA-Factorial Discriminant Analysis, with high specificity and sensitivity for all BC stages
The stacked OGRU-LSTM model outperformed the other baseline classifiers for breast cancer detection and, by employing the multivariate data analysis technique, better differentiated the stages of breast cancer
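The highlights above mention spectral data augmentation but do not name the specific transformations used. As a minimal sketch only, assuming three commonly used perturbations (a small shift along the wavenumber axis, a multiplicative intensity scaling, and additive Gaussian noise), augmenting a set of spectra might look like the following. The function names `augment_spectrum` and `augment_dataset` and all parameter values are hypothetical and are not taken from the study.

```python
import random


def augment_spectrum(intensities, rng, noise_sd=0.01, max_shift=2, scale_range=0.05):
    """Return one perturbed copy of a spectrum given as a list of intensities.

    All three perturbations and their default magnitudes are illustrative
    assumptions, not the settings used in the study.
    """
    n = len(intensities)
    if n == 0:
        raise ValueError("spectrum must not be empty")

    # 1) Shift by a few channels along the wavenumber axis,
    #    repeating the edge value where the shift runs off the end.
    shift = rng.randint(-max_shift, max_shift)
    shifted = [intensities[min(max(i - shift, 0), n - 1)] for i in range(n)]

    # 2) Multiplicative scaling of the whole spectrum.
    scale = 1.0 + rng.uniform(-scale_range, scale_range)

    # 3) Additive Gaussian noise, sized relative to the largest peak.
    peak = max(abs(v) for v in intensities) or 1.0
    return [scale * v + rng.gauss(0.0, noise_sd * peak) for v in shifted]


def augment_dataset(spectra, labels, copies=3, seed=0):
    """Keep every original spectrum and add `copies` perturbed versions of it."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    out_x, out_y = [], []
    for spectrum, label in zip(spectra, labels):
        out_x.append(list(spectrum))
        out_y.append(label)
        for _ in range(copies):
            out_x.append(augment_spectrum(spectrum, rng))
            out_y.append(label)
    return out_x, out_y
```

In a setup like this, augmentation would normally be applied to the training split only, so that perturbed copies of a spectrum never appear in the test set.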
Breast cancer (BC) is a serious threat to women’s health around the world. Early identification of BC is critically important to the diagnostic protocol. Several classification methods for breast cancer have been examined recently using various techniques, and Raman spectroscopy (RS) has become an effective approach for identifying the responsible metabolites. Moreover, rapid and accurate classification of BC using RS requires active processing and analysis of the Raman spectral data. This work aims to develop an efficient Hybrid Deep Learning (HDL) neural network model that differentiates breast cancer blood plasma from control samples, with the spectral features obtained used as spectral cancer markers for the detection of breast cancer. To find the best-performing HDL model, several HDL models were implemented to perform binary classification of the Raman spectral signal. A total of 62,199 Raman spectra generated from 26 blood plasma samples were evaluated. Six main HDL methods (1D-CNN-GRU, CNN-BiLSTM-AT, 1D-CNN-LSTM, GRU-LSTM, RNN-LSTM, and OGRU-LSTM) were modeled to evaluate how well hybrid models identify the two classes of Raman spectral data. Comparative classification results show that the stacked 1D-CNN-GRU model outperforms the other prominent HDL architectures for breast cancer detection on the Raman spectral dataset. The stacked 1D-CNN-GRU classifier achieved the highest classification accuracy (98.90%), Cohen’s kappa score (0.941), and F1-score (0.969), and the lowest test loss (0.102776) and MSE (0.0230), indicating that it outperforms the other HDL classifiers.
HIGHLIGHTS
Raman spectroscopy combined with hybrid deep learning (HDL) models has the potential to diagnose and classify samples, specifically blood plasma samples, as cancerous or noncancerous based on their chemical composition
Data augmentation techniques were implemented to address the underfitting and overfitting that occur when classifying spectral samples with insufficient Raman spectral data
An efficient HDL neural network model was developed to differentiate breast cancer blood plasma from control samples, with spectral features used as spectral cancer markers for breast cancer detection
Several HDL models were evaluated for binary classification of Raman spectral signals, with the stacked 1D-CNN-GRU model achieving the highest classification accuracy and the lowest test losses
This technique has the potential to accurately classify breast cancer samples and reduce the number of unnecessary excisional breast biopsies
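The abstracts above compare models by accuracy, Cohen's kappa, F1-score, and MSE. As a minimal sketch of how such scores are derived for a binary (cancer vs. control) classifier, the helpers below compute them from plain Python lists. The function names are hypothetical, and computing MSE between the true labels and predicted probabilities is an assumption; the source does not state how its MSE was obtained.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, F1 for the positive class, and Cohen's kappa from label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Accuracy is the observed agreement between labels and predictions.
    accuracy = sum(1 for t, p in pairs if t == p) / n

    # F1 is the harmonic mean of precision and recall.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0

    # Cohen's kappa corrects the observed agreement for the agreement
    # expected by chance, given the class frequencies on each side.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Kappa is undefined when chance agreement is 1; 0.0 is used here
    # as a convention (an assumption of this sketch).
    kappa = (accuracy - p_e) / (1 - p_e) if p_e < 1 else 0.0

    return {"accuracy": accuracy, "f1": f1, "kappa": kappa}


def mean_squared_error(y_true, y_prob):
    """Mean squared difference between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
```

Kappa is worth reporting alongside accuracy because it discounts the agreement a classifier would reach by chance when one class dominates the dataset.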