Arabic text classification methods have emerged as a natural response to the massive amount of varied Arabic-language text available on the web. In most text classification processes, feature selection is a crucial task because it strongly affects classification accuracy. Generally, two types of features can be used: statistical features, and semantic and concept features. The main aim of this paper is to identify the most effective semantic and concept features for Arabic text classification. Two novel feature sets that use the lexical, semantic and lexico-semantic relations of the Arabic WordNet (AWN) ontology are proposed. The first, List of Pertinent Synsets (LoPS), is a list of synsets that have a specific relation with the original terms. The second, List of Pertinent Words (LoPW), is a list of words that have a specific relation with the original terms. Fifteen different relations defined in the AWN ontology are used with both proposed feature sets. A Naïve Bayes classifier performs the classification. Experiments conducted on the BBC Arabic dataset show that the LoPS feature set improves the accuracy of Arabic text classification compared with the well-known Bag-of-Words feature and the more recent Bag-of-Concepts (synset) feature. In addition, LoPW (especially with the related-to relation) improves classification accuracy compared with LoPS, Bag-of-Words and Bag-of-Concepts.
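As a rough illustration of the LoPW idea, the sketch below counts the words linked to each document term by one chosen relation. It is only a sketch under stated assumptions: `TOY_RELATIONS` is a hypothetical stand-in for AWN (with English placeholder words instead of Arabic entries), the name `lopw_features` is invented here, and the choice to count only the related words (not the original terms) is an assumption, since the abstract does not say how the lists are turned into feature values.

```python
from collections import Counter

# Hypothetical stand-in for AWN: maps (term, relation) to related words.
# Real AWN entries are Arabic; English placeholders keep the sketch readable.
TOY_RELATIONS = {
    ("bank", "related_to"): ["finance", "money"],
    ("bank", "has_hyponym"): ["savings_bank"],
    ("loan", "related_to"): ["finance", "credit"],
}


def lopw_features(terms, relation, relations=TOY_RELATIONS):
    """Count the words linked to each document term by one relation.

    Terms with no entry for the relation contribute nothing (assumption).
    """
    counts = Counter()
    for term in terms:
        for word in relations.get((term, relation), []):
            counts[word] += 1
    return counts


# "river" has no entry in the toy map, so it adds no features.
features = lopw_features(["bank", "loan", "river"], "related_to")
```

A LoPS variant would presumably map terms to synset identifiers instead of words; the resulting counts could then be passed to any classifier that accepts bag-style features, such as Naïve Bayes.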
Fake news is one of the most widespread phenomena with considerable effects on social life, especially in the political domain. Creating fake news has become very easy because of the widespread use of the internet and social media. Detecting such deceptive news is therefore a crucial problem that deserves attention, mainly because of challenges such as the limited number of benchmark datasets and the volume of news published every second. This research proposes using two machine learning algorithms, random forest and decision tree (J48), to detect fake news. The full dataset contains 20,761 samples, and the test set contains 4,345 samples. Preprocessing starts by cleaning the data: unnecessary special characters, numbers, English letters, and white spaces are removed, and finally stop words are removed. After that, the most popular feature extraction method, TF-IDF, is applied before the two classification algorithms. The results show that the best accuracy, 89.11%, is achieved by the decision tree model, while the random forest achieves 84.97%.
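The cleaning and TF-IDF steps described above might be sketched as follows, using only the Python standard library. Several details are assumptions rather than the authors' settings: the three-word `STOP_WORDS` set is a hypothetical placeholder, the kept character range (Arabic letters and diacritics) is one possible reading of "removing special characters, numbers, English letters", and the weighting is the plain `tf * log(N / df)` form, since the abstract does not state which TF-IDF variant was used. The classifiers themselves are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list; the abstract does not give the list used.
STOP_WORDS = {"في", "من", "على"}


def preprocess(text):
    """Keep only Arabic letters and diacritics, then drop stop words."""
    # Every run of characters outside U+0621..U+0652 (digits, Latin letters,
    # punctuation, whitespace) is collapsed into a single space.
    arabic_only = re.sub(r"[^\u0621-\u0652]+", " ", text)
    return [tok for tok in arabic_only.split() if tok not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


tokens = preprocess("خبر عاجل 2023 breaking! في المدينة")
```

With this weighting, a term that occurs in every document gets weight zero, which is why very common words contribute little even if they survive stop-word removal.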
Oil price forecasting has captured the attention of researchers and academics because of the unique characteristics of crude oil prices and their substantial impact on many different parts of the economy. As a result, a wide range of forecasting methods has been proposed. Forecasting remains difficult, however, because crude oil prices are highly volatile and can be affected by many different factors. This study uses support vector regression (SVR) with technical indicators as features to improve the prediction of the monthly West Texas Intermediate (WTI) crude oil price. Model performance is measured by the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). In the experiment, the RMSE was 1.5456, the MAE was 1.3219, and the MAPE was 1.9173. The results show that technical indicators carry predictive information for WTI crude oil prices and that the model achieves good performance, outperforming most existing models.
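For reference, the three error measures named above can be computed as in the minimal sketch below. The function names are chosen here for illustration, the sample prices are made up and are not data from the study, and expressing MAPE in percent is an assumption, since the abstract does not state the unit of the reported value.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


# Made-up illustrative prices, not data from the study.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
```

RMSE and MAE are in the units of the price itself, while MAPE is scale-free, which is why the three are often reported together when comparing forecasting models.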