Hamzah A. Alsayadi scite author profile

Arabic automatic speech recognition (ASR) methods with diacritics can be integrated with other systems more readily than Arabic ASR methods without diacritics. In this work, the application of state-of-the-art end-to-end deep learning approaches is investigated to build a robust diacritised Arabic ASR. These approaches are based on the Mel-Frequency Cepstral Coefficients and the log Mel-Scale Filter Bank energies as acoustic features. To the best of our knowledge, an end-to-end deep learning approach has not been used in the task of diacritised Arabic automatic speech recognition. To fill this gap, this work presents a new CTC-based ASR, CNN-LSTM, and an attention-based end-to-end approach for improving diacritised Arabic ASR. In addition, a word-based language model is employed to achieve better results. The end-to-end approaches applied in this work are based on state-of-the-art frameworks, namely ESPnet and Espresso. Training and testing of these frameworks are performed based on the Standard Arabic Single Speaker Corpus (SASSC), which contains 7 h of modern standard Arabic speech. Experimental results show that the CNN-LSTM with an attention framework outperforms conventional ASR and the Joint CTC-attention ASR framework in the task of Arabic speech recognition. The CNN-LSTM with an attention framework could achieve a word error rate better than conventional ASR and the Joint CTC-attention ASR by 5.24% and 2.62%, respectively. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
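
The abstract above compares systems by word error rate. As a minimal sketch of how that metric is commonly computed (word-level edit distance divided by the reference length), and not the scoring code used by the authors or by ESPnet or Espresso, the function names below are illustrative assumptions:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

For example, scoring the hypothesis "a x c" against the reference "a b c d" counts one substitution and one deletion over four reference words.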

Non-diacritized Arabic speech recognition based on CNN-LSTM and attention-based models

Alsayadi, Abdelhamid, Hegazy et al. 2021. IFS

The Arabic language has a set of sound letters called diacritics, which play an essential role in the meaning of words and their articulation. A change in some diacritics leads to a change in the context of the sentence. However, the existence of these letters in the corpus transcription affects the accuracy of speech recognition. In this paper, we investigate the effect of diacritics on Arabic speech recognition based on end-to-end deep learning. The applied end-to-end approach includes CNN-LSTM and an attention-based technique presented in the state-of-the-art framework Espresso, using PyTorch. In addition, to the best of our knowledge, the approach of CNN-LSTM with attention has not been used in the task of Arabic automatic speech recognition (ASR). To fill this gap, this paper proposes a new approach based on CNN-LSTM with an attention-based method for Arabic ASR. The language model in this approach is trained using RNN-LM and LSTM-LM and is based on the nondiacritized transcription of the speech corpus. The Standard Arabic Single Speaker Corpus (SASSC), after omitting the diacritics, is used to train and test the deep learning model. Experimental results show that the removal of diacritics decreased the out-of-vocabulary rate and the perplexity of the language model. In addition, the word error rate (WER) is significantly improved when compared to diacritized data. The achieved average reduction in WER is 13.52%.
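
The preprocessing step described above, omitting diacritics from the transcription, can be sketched as follows. This is an illustration under a stated assumption, not the authors' script: "diacritics" is taken here to mean the Unicode marks U+064B to U+0652 plus the superscript alef U+0670, and the function name is hypothetical.

```python
import re

# Assumption: the diacritics to remove are the tanween, short-vowel, shadda
# and sukun marks (U+064B..U+0652) plus the superscript alef (U+0670).
ARABIC_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text):
    """Return text with Arabic diacritic marks removed."""
    return ARABIC_DIACRITICS.sub("", text)


# Two diacritised forms built on the same three letters (kaf, ta, ba).
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"
kutiba = "\u0643\u064F\u062A\u0650\u0628\u064E"

diacritised_vocab = {kataba, kutiba}
plain_vocab = {strip_diacritics(w) for w in diacritised_vocab}
```

Here the two diacritised entries collapse into a single undiacritised word, which gives some intuition for why a vocabulary built on stripped text is smaller and why fewer test words fall outside it.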

Integrating Semantic Features for Enhancing Arabic Named Entity Recognition

Alsayadi, El-Korany. 2016. IJACSA

Named Entity Recognition (NER) is currently an essential research area that supports many tasks in NLP. Its goal is to identify named entities accurately. This paper presents an integrated semantic-based machine learning (ML) model for the Arabic Named Entity Recognition (ANER) problem. The basic idea of the model is to combine several linguistic features and to utilize syntactic dependencies to infer semantic relations between named entities. The proposed model focuses on recognizing three types of named entities: person, organization and location. Accordingly, it combines internal features that represent linguistic features with external features that represent the semantics of relations between the three named entity types, to enhance the accuracy of recognizing them using an external knowledge source such as the Arabic WordNet ontology (ANW). We introduced both feature sets to a CRF classifier, which is effective for ANER. Experimental results show that this approach can achieve an overall F-measure of around 87.86% and 84.72% for the ANERCorp and ALTEC datasets, respectively.
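
The F-measure reported above is the harmonic mean of precision and recall. A minimal sketch of an entity-level version is given below; it assumes exact-match scoring over (start, end, type) spans, which the abstract does not confirm, and the function name and span format are illustrative.

```python
def f_measure(gold, predicted):
    """Entity-level F1 over exact-match (start, end, type) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: one span is correct, one has the wrong type,
# and one gold entity is missed entirely.
gold_spans = {(0, 1, "PER"), (3, 4, "LOC"), (6, 8, "ORG")}
predicted_spans = {(0, 1, "PER"), (3, 4, "ORG")}
score = f_measure(gold_spans, predicted_spans)
```

In this toy case precision is 1/2 and recall is 1/3, so a span with the right boundaries but the wrong type counts against both.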

Blog Feedback Prediction based on Ensemble Machine Learning Regression Model: Towards Data Fusion Analysis

Alsayadi, El-Kenawy, Ibrahim et al. 2022. FPA

The last decade led to an unbelievable growth in the importance of social media. Due to the huge amounts of documents appearing in social media, there is an enormous need for the automatic analysis of such documents. In this work, we propose various regression models for blog feedback prediction to be used in a data fusion environment. These models include decision tree regressor, MLP regressor, SVR, random forest regressor, and K-Neighbors regressor. The models are enhanced by average ensemble and ensemble using K-Neighbors regressor. The Blog Feedback dataset is used for training and evaluating the proposed models. The results show that there is a decrease in RMSE, MAE, MBE, R, R2, RRMSE, NSE, and WI when compared to the traditional methods.
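
The average ensemble and three of the error metrics named above can be sketched in plain Python. This is an illustration rather than the authors' pipeline: the base-model predictions are made-up numbers, and the sign convention for MBE (prediction minus target) is an assumption.

```python
import math


def average_ensemble(predictions_per_model):
    """Average the predictions of several models, sample by sample."""
    return [sum(values) / len(values) for values in zip(*predictions_per_model)]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mbe(y_true, y_pred):
    """Mean bias error; assumed here to be mean(prediction - target)."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions from two base regressors.
targets = [1.0, 2.0, 3.0]
model_a = [1.0, 2.0, 5.0]
model_b = [1.0, 2.0, 3.0]
ensemble = average_ensemble([model_a, model_b])
```

Averaging halves model_a's error on the third sample, so the ensemble's RMSE sits between those of the two base models in this toy case.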

Deep Investigation of the Recent Advances in Dialectal Arabic Speech Recognition

et al. 2022

Ensemble of Machine Learning Fusion Models for Breast Cancer Detection Based on the Regression Model

Alsayadi, Abdelhamid, El-Kenawy et al. 2022. FPA

Breast cancer is one of the deadliest cancers among women worldwide and one of the main causes of mortality for women in the United States. Breast cancer can be detected earlier and with more accuracy, extending life expectancy at a lower cost. To do this, the efficiency and precision of early breast cancer detection can be increased by evaluating the large data that is currently available utilizing technologies like machine learning fusion-based decision support systems. In this paper, we investigate the prediction performance of various regression models and a decision support system based on these models that provided the predicted category along with a prediction confidence measure. The various machine learning (ML) algorithms applied include decision tree regressor, MLP regressor, SVR, random forest regressor, and K-Neighbors regressor. The models are enhanced by average ensemble and ensemble using K-Neighbors regressor. We used the Breast Cancer Wisconsin Dataset from Wisconsin Prognostic Breast Cancer (WPBC) with 569 digitized images of a fine needle aspirate (FNA) of breast mass and 10 real-valued feature information. Among all five machine learning methods, K-Neighbors regressor had the best performance and ensemble using K-Neighbors regressor gave the best accuracy. The results show that there is a decrease in RMSE, MAE, MBE, R, R2, RRMSE, NSE, and WI when compared to the traditional methods.
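
The abstract does not say how the "ensemble using K-Neighbors regressor" is built. One plausible reading, sketched below purely as an assumption, is a stacking scheme in which a k-nearest-neighbours regressor takes the base models' predictions as its input features. All names and numbers are hypothetical.

```python
import math


def knn_predict(train_features, train_targets, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    by_distance = sorted(
        (math.dist(features, query), target)
        for features, target in zip(train_features, train_targets)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Each row holds the predictions of two hypothetical base regressors for one
# training sample; the targets are the true labels (0 or 1).
base_predictions = [(0.1, 0.2), (0.9, 0.8), (0.0, 0.1), (1.0, 0.9)]
labels = [0.0, 1.0, 0.0, 1.0]

# Base-model predictions for two new samples, combined by the k-NN meta-model.
low_case = knn_predict(base_predictions, labels, (0.05, 0.1), k=2)
high_case = knn_predict(base_predictions, labels, (0.95, 0.85), k=2)
```

Under this reading the meta-model learns which combinations of base predictions go with which outcomes, instead of weighting every base model equally as an average ensemble does.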

Automatic Speech Recognition for Qur’an Verses using Traditional Technique

Alsayadi, Hadwan. 2022. JAIM

Deep learning is one of the approaches of machine learning that uses algorithms for building a model based on complex unstructured data. The Holy Qur’an is written using Arabic diacritized text. In this paper, a traditional method to build a robust Qur’an verse recognition system is proposed. MFCC is used to extract features. These features are adapted using minimum phone error (MPE) as a discriminative model. The acoustic model was built using the deep neural network (DNN) model. We present an n-gram language model (LM). The dataset of Qur’an verses, consisting of 10 hours of .wav recitations performed by 60 reciters, is used for training and evaluating the proposed model. The experimental results showed that the proposed DNN model achieved a significantly low character error rate (CER) of 4.09% and a word error rate (WER) of 8.46%.
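
To illustrate the n-gram language model mentioned above, here is a minimal bigram sketch with add-one smoothing. The order of the model, the smoothing method and the toy training sentences are all assumptions; the abstract does not specify them.

```python
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over whitespace-split sentences."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        words = sentence.split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


contexts, bigrams, vocab = train_bigram(["a b", "a c"])
p_b_given_a = bigram_prob("a", "b", contexts, bigrams, vocab)
```

With smoothing, a word pair never seen in training still receives a small non-zero probability, and the probabilities of all possible next words after a given context sum to one.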

Improving the Regression of Air Quality Using Ensemble of Machine Learning Models

Alsayadi, Abdelhamid, El-Kenawy et al. 2022. JAIM

Air pollution is a particularly important problem in most countries right now because of its terrible effects on both the environment and human health. Big cities are most impacted because of rapid industrial and economic development. In this paper, the authors propose various regression models for the prediction of air quality, including decision tree regressor, MLP regressor, SVR, random forest regressor, and K-Neighbors regressor. The air quality dataset, from Italian cities, is used for training and evaluating the proposed models. The results show that there is a decrease in RMSE, MAE, MBE, R, R2, RRMSE, NSE, and WI when compared to the traditional methods.
