Abstract:DNA sequence classification is one of the major challenges in biological data processing. The identification and classification of novel viral genome sequences drastically help in reducing the dangers of a viral outbreak like COVID-19. The more accurate the classification of these viruses, the faster a vaccine can be produced to counter them. Thus, more accurate methods should be utilized to classify the viral DNA. This research proposes a hybrid deep learning model for efficient viral DNA sequence classificat… Show more
“…Furthermore, in a previous study, the Genetic Algorithm (GA) optimized Convolutional Neural Network (GA-CNN) model was introduced [9]. This model represents a synergy of two powerful computational concepts: the robust feature extraction inherent in Convolutional Neural Networks (CNNs) and the efficiency of optimization provided by Genetic Algorithms (GAs).…”
Section: Related Workmentioning
confidence: 99%
“…This model represents a synergy of two powerful computational concepts: the robust feature extraction inherent in Convolutional Neural Networks (CNNs) and the efficiency of optimization provided by Genetic Algorithms (GAs). The GA-CNN model, as detailed in previous research, has successfully harnessed these capabilities to analyze and classify specific viral sequences with remarkable accuracy [9]. El-Tohamy et al [9] developed an optimized Convolutional Neural Network (GA-CNN) for classification of viral genomes.…”
Section: Related Workmentioning
confidence: 99%
“…The GA-CNN model, as detailed in previous research, has successfully harnessed these capabilities to analyze and classify specific viral sequences with remarkable accuracy [9]. El-Tohamy et al [9] developed an optimized Convolutional Neural Network (GA-CNN) for classification of viral genomes. It was also enhanced to outperform previous work of Gunasekaran, Hemalatha, et al [21] by introducing more virus labels.…”
Section: Related Workmentioning
confidence: 99%
“…The GA-CNN model was adeptly applied to a dataset of specific viral sequences using 3 encoding methods label , one-hot and k-mer encoding, achieving the highest classification accuracy of 94.88% using label encoding . El-Tohamy et al [11] also introduced Convolutional Neural Network with Extreme Learning Machines (CNN-ELM) model with the generic diverse virus family dataset collected. This model achieved a notable accuracy of 94.54% using k-mer encoding for classifying a broad spectrum of viral families.…”
Section: Related Workmentioning
confidence: 99%
“…It effectively captures the subtle genetic variations and patterns that are key to distinguishing between different viruses. The other model is the Convolutional Neural Network with Extreme Learning Machines (CNN-ELM) [11]. It harnesses the power of Convolutional Neural Networks (CNNs) for extracting key features from complex viral sequences.…”
The field of genomic bioinformatics is continually challenged by the need for precise classification of viral DNA sequences. The challenge of accurately classifying viral sequences is crucial for the development of diagnostic and therapeutic strategies for any viral outbreaks. This study presents a comprehensive approach integrating two distinct deep learning models, namely the Genetic Algorithm (GA) optimized Convolutional Neural Networks (CNN) hybrid model and the CNN-Extreme Learning Machines (ELM) model aiming to enhance the classification of viral DNA sequences across specific viruses and viral families.A comprehensive data preprocessing strategy is employed, wherein both datasets undergo k-mer, label, and one-hot vector encoding. This allows for a uniform and comparative analysis across different models and datasets. When the optimized GA-CNN is applied to the more generic viral family dataset, it demonstrates a good adaptability with an accuracy of 95.88% achieving a higher result than the CNN-ELM. In contrast, the CNN-ELM, when tested on the specific virus dataset, maintains robust feature extraction capabilities, faster training time but lower than the optimized GA-CNN model achieving an accuracy of 92.7%. A comparative analysis of training times is also employed in this study. The CNN-ELM model shows a notable efficiency, with a 34% faster training time compared to the GA-CNN. Moreover, when both models are applied to the new generic dataset, a comparative study with other deep learning models is conducted. Remarkably, the GA-CNN outperforms other models, achieving the highest classification accuracy of 95.88%.
“…Furthermore, in a previous study, the Genetic Algorithm (GA) optimized Convolutional Neural Network (GA-CNN) model was introduced [9]. This model represents a synergy of two powerful computational concepts: the robust feature extraction inherent in Convolutional Neural Networks (CNNs) and the efficiency of optimization provided by Genetic Algorithms (GAs).…”
Section: Related Workmentioning
confidence: 99%
“…This model represents a synergy of two powerful computational concepts: the robust feature extraction inherent in Convolutional Neural Networks (CNNs) and the efficiency of optimization provided by Genetic Algorithms (GAs). The GA-CNN model, as detailed in previous research, has successfully harnessed these capabilities to analyze and classify specific viral sequences with remarkable accuracy [9]. El-Tohamy et al [9] developed an optimized Convolutional Neural Network (GA-CNN) for classification of viral genomes.…”
Section: Related Workmentioning
confidence: 99%
“…The GA-CNN model, as detailed in previous research, has successfully harnessed these capabilities to analyze and classify specific viral sequences with remarkable accuracy [9]. El-Tohamy et al [9] developed an optimized Convolutional Neural Network (GA-CNN) for classification of viral genomes. It was also enhanced to outperform previous work of Gunasekaran, Hemalatha, et al [21] by introducing more virus labels.…”
Section: Related Workmentioning
confidence: 99%
“…The GA-CNN model was adeptly applied to a dataset of specific viral sequences using 3 encoding methods label , one-hot and k-mer encoding, achieving the highest classification accuracy of 94.88% using label encoding . El-Tohamy et al [11] also introduced Convolutional Neural Network with Extreme Learning Machines (CNN-ELM) model with the generic diverse virus family dataset collected. This model achieved a notable accuracy of 94.54% using k-mer encoding for classifying a broad spectrum of viral families.…”
Section: Related Workmentioning
confidence: 99%
“…It effectively captures the subtle genetic variations and patterns that are key to distinguishing between different viruses. The other model is the Convolutional Neural Network with Extreme Learning Machines (CNN-ELM) [11]. It harnesses the power of Convolutional Neural Networks (CNNs) for extracting key features from complex viral sequences.…”
The field of genomic bioinformatics is continually challenged by the need for precise classification of viral DNA sequences. The challenge of accurately classifying viral sequences is crucial for the development of diagnostic and therapeutic strategies for any viral outbreaks. This study presents a comprehensive approach integrating two distinct deep learning models, namely the Genetic Algorithm (GA) optimized Convolutional Neural Networks (CNN) hybrid model and the CNN-Extreme Learning Machines (ELM) model aiming to enhance the classification of viral DNA sequences across specific viruses and viral families.A comprehensive data preprocessing strategy is employed, wherein both datasets undergo k-mer, label, and one-hot vector encoding. This allows for a uniform and comparative analysis across different models and datasets. When the optimized GA-CNN is applied to the more generic viral family dataset, it demonstrates a good adaptability with an accuracy of 95.88% achieving a higher result than the CNN-ELM. In contrast, the CNN-ELM, when tested on the specific virus dataset, maintains robust feature extraction capabilities, faster training time but lower than the optimized GA-CNN model achieving an accuracy of 92.7%. A comparative analysis of training times is also employed in this study. The CNN-ELM model shows a notable efficiency, with a 34% faster training time compared to the GA-CNN. Moreover, when both models are applied to the new generic dataset, a comparative study with other deep learning models is conducted. Remarkably, the GA-CNN outperforms other models, achieving the highest classification accuracy of 95.88%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.