In recent years, convolutional neural network (CNN) has attracted considerable attention since its impressive performance in various applications, such as Arabic sentence classification. However, building a powerful CNN for Arabic sentiment classification can be highly complicated and time consuming. In this paper, we address this problem by combining differential evolution (DE) algorithm and CNN, where DE algorithm is used to automatically search the optimal configuration including CNN architecture and network parameters. In order to achieve the goal, five CNN parameters are searched by the DE algorithm which include convolution filter sizes that control the CNN architecture, number of filters per convolution filter size (NFCS), number of neurons in fully connected (FC) layer, initialization mode, and dropout rate. In addition, the effect of the mutation and crossover operators in DE algorithm were investigated. The performance of the proposed framework DE-CNN is evaluated on five Arabic sentiment datasets. Experiments’ results show that DE-CNN has higher accuracy and is less time consuming than the state-of-the-art algorithms.
Developing cyber security is very necessary and has attracted considerable attention from academy and industry organizations worldwide. It is also very necessary to provide sustainable computing for the the Internet of Things (IoT). Machine learning techniques play a vital role in the cybersecurity of the IoT for intrusion detection and malicious identification. Thus, in this study, we develop new feature extraction and selection methods and for the IDS system using the advantages of the swarm intelligence (SI) algorithms. We design a feature extraction mechanism depending on the conventional neural networks (CNN). After that, we present an alternative feature selection (FS) approach using the recently developed SI algorithm, Aquila optimizer (AQU). Moreover, to assess the quality of the developed IDS approach, four well-known public datasets, CIC2017, NSL-KDD, BoT-IoT, and KDD99, were used. We also considered extensive comparisons to other optimization methods to verify the competitive performance of the developed method. The results show the high performance of the developed approach using different evaluation indicators.
Stemming is one of the most effective techniques, which has been adopted in many applications, such as machine learning, machine translation, document classification (DC), information retrieval, and natural language processing. The stemming technique is meant to be applied during the classification of documents to reduce the high dimensionality of the feature space, which, in turn, raises the functioning of the classification system, particularly with extreme modulated language, for instance, Arabic language. This paper aims to study the impact of stemming techniques, namely Information Science Research Institute (ISRI), Tashaphyne, and ARLStem on Arabic DC. The classification algorithms, namely Naïve Bayesian (NB), support vector machine (SVM), and K-nearest neighbors (KNN), are used in this paper. In addition, the chi-square feature selection is used to select the most relevant features. Experiments are conducted on CNN Arabic corpus, which is collected from Arabic websites to assess the performance of the classification system. In order to evaluate the classifiers, the K-fold cross-validation method and Micro-F1 are used. Findings of this paper indicate that the ARLStem outperforms the ISRI and Tashaphyne stemmers. The outcomes clearly showed the effectiveness of the SVM over the KNN and NB classifiers, which achieved 94.64% Micro-F1 value when using the ARLStem stemmer. INDEX TERMS Arabic text classification, text preprocessing, stemming techniques, feature extraction, feature selection.
This article studies convolutional neural networks for Tigrinya (also referred to as Tigrigna), which is a family of Semitic languages spoken in Eritrea and northern Ethiopia. Tigrinya is a “low-resource” language and is notable in terms of the absence of comprehensive and free data. Furthermore, it is characterized as one of the most semantically and syntactically complex languages in the world, similar to other Semitic languages. To the best of our knowledge, no previous research has been conducted on the state-of-the-art embedding technique that is shown here. We investigate which word representation methods perform better in terms of learning for single-label text classification problems, which are common when dealing with morphologically rich and complex languages. Manually annotated datasets are used here, where one contains 30,000 Tigrinya news texts from various sources with six categories of “sport”, “agriculture”, “politics”, “religion”, “education”, and “health” and one unannotated corpus that contains more than six million words. In this paper, we explore pretrained word embedding architectures using various convolutional neural networks (CNNs) to predict class labels. We construct a CNN with a continuous bag-of-words (CBOW) method, a CNN with a skip-gram method, and CNNs with and without word2vec and FastText to evaluate Tigrinya news articles. We also compare the CNN results with traditional machine learning models and evaluate the results in terms of the accuracy, precision, recall, and F1 scoring techniques. The CBOW CNN with word2vec achieves the best accuracy with 93.41%, significantly improving the accuracy for Tigrinya news classification.
This study proposes a novel framework to improve intrusion detection system (IDS) performance based on the data collected from the Internet of things (IoT) environments. The developed framework relies on deep learning and metaheuristic (MH) optimization algorithms to perform feature extraction and selection. A simple yet effective convolutional neural network (CNN) is implemented as the core feature extractor of the framework to learn better and more relevant representations of the input data in a lower-dimensional space. A new feature selection mechanism is proposed based on a recently developed MH method, called Reptile Search Algorithm (RSA), which is inspired by the hunting behaviors of the crocodiles. The RSA boosts the IDS system performance by selecting only the most important features (an optimal subset of features) from the extracted features using the CNN model. Several datasets, including KDDCup-99, NSL-KDD, CICIDS-2017, and BoT-IoT, were used to assess the IDS system performance. The proposed framework achieved competitive performance in classification metrics compared to other well-known optimization methods applied for feature selection problems.
The great advancements in communication, cloud computing, and the internet of things (IoT) have opened critical challenges in security. With these developments, cyberattacks are also rapidly growing since the current security mechanisms do not provide efficient solutions. Recently, various artificial intelligence (AI) based solutions have been proposed for different security applications, including intrusion detection. In this paper, we propose an efficient AI-based mechanism for intrusion detection systems (IDS) in IoT systems. We leverage the advancements of deep learnings and metaheuristics (MH) algorithms that approved their efficiency in solving complex engineering problems. We propose a feature extraction method using the convolutional neural networks (CNNs) to extract relevant features. Also, we develop a new feature selection method using a new variant of the transient search optimization (TSO) algorithm, called TSODE, using the operators of differential evolution (DE) algorithm. The proposed TSODE uses the DE to improve the process of balancing between exploitation and exploration phases. Furthermore, we use three public datasets, KDDCup-99, NSL-KDD, BoT-IoT, and CICIDS-2017 to assess the performance of the developed method, which achieved higher accuracy compared to several existing approaches.
Human activity recognition (HAR) plays a vital role in different real-world applications such as in tracking elderly activities for elderly care services, in assisted living environments, smart home interactions, healthcare monitoring applications, electronic games, and various human–computer interaction (HCI) applications, and is an essential part of the Internet of Healthcare Things (IoHT) services. However, the high dimensionality of the collected data from these applications has the largest influence on the quality of the HAR model. Therefore, in this paper, we propose an efficient HAR system using a lightweight feature selection (FS) method to enhance the HAR classification process. The developed FS method, called GBOGWO, aims to improve the performance of the Gradient-based optimizer (GBO) algorithm by using the operators of the grey wolf optimizer (GWO). First, GBOGWO is used to select the appropriate features; then, the support vector machine (SVM) is used to classify the activities. To assess the performance of GBOGWO, extensive experiments using well-known UCI-HAR and WISDM datasets were conducted. Overall outcomes show that GBOGWO improved the classification accuracy with an average accuracy of 98%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.