2020
DOI: 10.1007/978-3-030-51935-3_24

Speech Enhancement Based on Deep AutoEncoder for Remote Arabic Speech Recognition

Abstract: Remote applications that deal with speech require the speech signal to be compressed. First, speech coding transforms the continuous waveform into a numerical form. Then, the digitized signal is compressed with or without loss of information. This transformation alters the original waveform and degrades performance in subsequent recognition of the speech signal. The transmission channel is a further source of speech degradation. To restore the original "clean" speech, speech enhancement (SE) is widely used, and …

Cited by 12 publications (4 citation statements). References 25 publications.
“…After the denoising step with the wavelet technique, the next step is feature extraction. MFCC is one of the successful techniques for feature extraction [11,14,15]. Experiments revealed that MFCC is a commonly utilized technique, especially for a noisy dataset like the collected speech dataset produced by an EL device.…”
Section: Preprocessing and Feature Extraction
confidence: 99%
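For context on the step this citation describes, a minimal MFCC extraction sketch in Python might look like the following; librosa and the file name "utterance.wav" are assumptions made here for illustration, not details taken from the cited work.

```python
# Minimal MFCC extraction sketch (illustrative only).
# Assumptions: librosa is installed; "utterance.wav" is a hypothetical
# denoised recording, resampled here to 16 kHz mono.
import librosa

def extract_mfcc(path, n_mfcc=13, sr=16000):
    """Return an (n_frames, n_mfcc) matrix of MFCC features."""
    y, sr = librosa.load(path, sr=sr)                      # mono waveform
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T                                          # one row per frame

features = extract_mfcc("utterance.wav")
print(features.shape)
```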
“…                          Model's name           Accuracy    Word error rate
Jaber and Abdulbaqi [28]    Autoencoder (CNN)      93%         -
Eljawad et al. [7]          Fuzzy neural network   94.5%       -
Dendani et al. [11]         Autoencoder (MLP)      65.72%      -
Alsayadi et al. [14]        Autoencoder (LSTM)     71.58%      28.42%
Alsayadi et al. [15]        CNN-LSTM               -           13.52%
Proposed model              Autoencoder (GRU)      95.31%      4.69%…”
Section: Reference
confidence: 99%
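The proposed GRU autoencoder in the table above maps degraded feature sequences back toward clean ones. A minimal sketch of such a model is given below, assuming TensorFlow/Keras and MFCC inputs; the layer widths are illustrative guesses, not the configuration reported by the cited authors.

```python
# Sketch of a GRU-based denoising autoencoder over MFCC sequences.
# Assumptions: TensorFlow/Keras is available; inputs are (time, n_mfcc)
# frames; layer sizes are illustrative, not taken from the cited paper.
from tensorflow.keras import layers, models

n_mfcc = 13

inputs = layers.Input(shape=(None, n_mfcc))            # variable-length sequences
h = layers.GRU(64, return_sequences=True)(inputs)      # encoder
h = layers.GRU(32, return_sequences=True)(h)           # bottleneck
h = layers.GRU(64, return_sequences=True)(h)           # decoder
outputs = layers.TimeDistributed(layers.Dense(n_mfcc))(h)

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# Training would pair degraded and clean features, e.g.:
# autoencoder.fit(noisy_mfcc, clean_mfcc, epochs=20, batch_size=32)
```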
“…Long short-term memory (LSTM) and gated recurrent unit (GRU) networks have been used for Arabic speech recognition [35]. A deep auto-encoder was used in [36] for a remote Arabic speech recognition system. The authors used an isolated-word Arabic speech database for their experiments, which contained recordings of only 20 words.…”
Section: Literature Reviews
confidence: 99%
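Since the system described in this citation recognizes isolated words from a 20-word vocabulary, a recurrent classifier on top of the enhanced features could be as simple as the sketch below; apart from the 20-word count, every detail (feature size, padding scheme, GRU width) is an assumption made for illustration.

```python
# Illustrative GRU classifier for a 20-word isolated-word vocabulary.
# Only the class count (20) comes from the cited description; the rest
# (feature size, padding, layer width) is assumed for this sketch.
from tensorflow.keras import layers, models

n_mfcc, n_words = 13, 20

classifier = models.Sequential([
    layers.Input(shape=(None, n_mfcc)),        # zero-padded MFCC sequences
    layers.Masking(mask_value=0.0),            # skip padded frames
    layers.GRU(128),                           # summarize the utterance
    layers.Dense(n_words, activation="softmax"),
])
classifier.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
# classifier.fit(mfcc_sequences, word_labels, epochs=30, batch_size=32)
```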