Convolutional Neural Network for Arabic Speech Recognition

Abdelmaksoud, Engy Ragaei; Hassen, A.; Hassan, N.; Farouk, Mohamed Hesham

doi:10.21608/ejle.2020.47685.1015

Cited by 14 publications

(2 citation statements)

References 34 publications

“…The neurons in the convolution layer use a set of filters (weight matrix V ) to perform convolution calculations with some neurons in the previous layer. This convolution structure exploits the shift-invariance and spatial correlation of target features [34], where shift-invariance refers to the property of the convolution operation that it produces the same result regardless of the position of the input features in the receptive field, and spatial correlation refers to the statistical dependence between nearby features in the image. To reduce redundancy, a set of weight matrices is shared among all receptive fields on the same layer during convolution operations, while different feature maps employ distinct weight matrices.…”

Section: A Complex-valued Convolution Layer (mentioning)

confidence: 99%

Multiscale Complex-Valued Feature Attention Convolutional Neural Network for SAR Automatic Target Recognition

Zhou, Luo, Ren et al. 2024

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing

Synthetic aperture radar (SAR) images often suffer from inadequate attention to target features, insufficient expression of target feature information, and the neglect of phase information in traditional convolutional neural network (CNN) recognition methods. This leads to low recognition accuracy and slow processing speed, which are critical limitations for SAR automatic target recognition (ATR) systems. To overcome these challenges, this paper proposes the multi-scale complex-valued feature attention CNN (MsCvFA-CNN) for SAR ATR. MsCvFA-CNN is a model specifically designed for amplitude and phase information of SAR images. A novel complex-valued attention module (CAM) is proposed in this work to focus on the amplitude and phase characteristics of the target separately. By decoupling the amplitude and phase features, the CAM reduces the training time of the network, while preserving the relevant information. Furthermore, the MsCvFA-CNN employs multiple branches for feature extraction with different kernel sizes, which are then combined with CAM in the fusion stage to improve the network's representation of target features. The proposed MsCvFA-CNN is evaluated on both the complex-valued moving and stationary target acquisition and recognition (MSTAR) dataset, as well as the more challenging dataset for urban interpretation (OpenSARUrban). The results demonstrate that it outperforms traditional networks in terms of recognition accuracy and computational efficiency. Specifically, the use of complex-valued networks results in a 2.23% improvement in recognition accuracy compared to traditional real-valued networks. When CAM is added, the network's accuracy is further improved by 3.21%, and the number of epochs required to achieve the highest accuracy is reduced by nearly half.
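The weight sharing described in the citation statement above can be illustrated with a minimal real-valued, one-dimensional sketch. It is not the complex-valued layer of the citing paper: the kernel values, the test pattern, and the helper `conv1d_valid` are all hypothetical and chosen only for illustration. Strictly speaking, a plain convolution moves its response along with the input; a position-independent result only appears after a pooling step such as the maximum taken at the end.

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Slide one shared kernel over the signal (cross-correlation, no padding)."""
    k = len(kernel)
    return np.array([np.dot(signal[i:i + k], kernel)
                     for i in range(len(signal) - k + 1)])

# Hypothetical filter: the same weights are reused at every receptive field.
kernel = np.array([1.0, -2.0, 1.0])
# Hypothetical local feature placed at two different positions.
pattern = np.array([0.0, 3.0, 0.0])
shift = 4

signal_a = np.zeros(12)
signal_a[2:5] = pattern
signal_b = np.zeros(12)
signal_b[2 + shift:5 + shift] = pattern

out_a = conv1d_valid(signal_a, kernel)
out_b = conv1d_valid(signal_b, kernel)

# The response has the same shape in both cases, only displaced by `shift`.
same_shape_response = np.allclose(out_a[:-shift], out_b[shift:])
# After global max pooling the result no longer depends on the position.
same_peak = np.isclose(np.abs(out_a).max(), np.abs(out_b).max())

print(same_shape_response, same_peak)
```

Because one kernel serves every position, the layer needs only three weights here regardless of the signal length; a separate feature map would simply use a second kernel.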

“…CNNs are widely used in Arabic speech recognition for their ability to capture local patterns and features in speech data [115], [116]. The paper [117] focuses on Arabic ASR using MFSC and GFCC with their first and second-order derivatives. The utilization of CNN facilitates feature learning and classification, leading to enhanced performance in Arabic ASR.…”

Section: Convolutional Neural Network (CNN) (mentioning)

confidence: 99%

Arabic Speech Recognition: Advancement and Challenges

Rahman, Kabir, Mridha et al. 2024

IEEE Access

Speech recognition is a captivating process that revolutionizes human-computer interactions, allowing us to interact and control machines through spoken commands. The foundation of speech recognition lies in understanding a given language's linguistic and textual characteristics. Although automatic speech recognition (ASR) systems flawlessly convert speech into text for various international languages, their implementation for Arabic remains inadequate. In this research, we diligently explore the current state of Arabic ASR systems and unveil the challenges encountered during their development. We categorize these challenges into two groups: those specific to the Arabic language and those that are more general. Additionally, we propose strategies to overcome these obstacles and emphasize the need for ASR architectures tailored to the Arabic language's unique grammatical and phonetic structure. In addition, we provide a comprehensive and explicit description of various feature extraction methods, language models, and acoustic models utilized in the Arabic ASR system.
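The citation statement above mentions spectral features used "with their first and second-order derivatives". A minimal sketch of how such a three-channel CNN input might be assembled follows. It assumes the static features are already available as a frames-by-bands array (random placeholder data here) and approximates the derivatives with `np.gradient`; speech toolkits usually use a regression window instead. The helper name `add_deltas` and the array sizes are hypothetical and not taken from the cited paper.

```python
import numpy as np

def add_deltas(features):
    """Stack static features with first- and second-order time differences.

    features: array of shape (num_frames, num_bands).
    Returns an array of shape (3, num_frames, num_bands).
    """
    delta = np.gradient(features, axis=0)    # first-order difference over time
    delta2 = np.gradient(delta, axis=0)      # second-order difference over time
    return np.stack([features, delta, delta2], axis=0)

# Placeholder for real filterbank features: 100 frames, 40 bands (assumed sizes).
rng = np.random.default_rng(0)
static = rng.standard_normal((100, 40))

cnn_input = add_deltas(static)
print(cnn_input.shape)
```

Stacking the three arrays along a leading axis treats them like image channels, which is one common way to present such features to a 2-D convolution layer.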

Deep fusion framework for speech command recognition using acoustic and linguistic features

Mehra, Susan 2023

Multimed Tools Appl
