Distinctive Phonetic Features Modeling and Extraction Using Deep Neural Networks

Seddiq, Yasser Mohammad; Alotaibi, Yousef Ajami; Selouani, Sid‐Ahmed; Meftah, Ali H.

doi:10.1109/access.2019.2924014

Cited by 7 publications

(27 citation statements)

References 24 publications

(39 reference statements)

“…To achieve this goal, we propose the AFD-Obj system, which is a single network for AF sequence detection in Arabic and English speech. We do not use a different network for each AF, as done in some state-of-the-art systems [3, 4, 15], where a neural network is used for each AF to detect the presence or absence of an AF in speech frames or group of frames. We select the YOLOv3-tiny [27] detector because of its simplicity, fast computation property, and the fact that it supports multi-label detection.…”

Section: Methods (mentioning)

confidence: 99%

“…Each phoneme has a unique vector of AFs; hence, we can use our system for phoneme recognition by mapping the detected AFs to the corresponding phonemes, as suggested in Ref. [3]. Moreover, we propose PD-Obj, which is an end-to-end system for direct Arabic sequence phoneme recognition from the spectrogram without AF usage.…”

Section: Methods (mentioning)

confidence: 99%

“…Based on data from Arabic linguistics and researchers, [14] showed the common DPF for modern standard Arabic. According to these DPFs, a recent study of modeling and extracting them using deep neural network and multi-layer perceptron is presented in [3]. Experiments were conducted using the KACST Arabic Phonetic Database (KAPD) corpus to extract the 31 DPF elements presented in a previous study [14].…”

Section: Literature Review (mentioning)

confidence: 99%

“…Each phoneme has attributes or features describing their articulation. Based on these attributes, phonemes are described by binary vectors (e.g., ones and zeros) that indicate the existence and absence of articulatory features (AFs) [3, 4]. For example, the phonemes /m/ and /n/ in the Arabic language have similar AF vectors, except at three AFs, where the alveodental feature exists in phoneme /n/ and absent in phoneme /m/; bilabial feature exists in phoneme /m/ and absent in a phoneme /n/; and /n/ is a coronal, and /m/ is not [3], as shown in Figure 1.…”

Section: Introduction (mentioning)

confidence: 99%
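
The binary AF vectors described in the statement above can be illustrated with a minimal sketch. The feature inventory and the `AF_TABLE` entries here are assumptions made for illustration: only the three differing features (bilabial, alveodental, coronal) come from the quoted text, the two shared features are chosen as plausible examples, and the nearest-vector lookup in `phoneme_from_afs` is a hypothetical stand-in for the AF-to-phoneme mapping, not the cited authors' method.

```python
# Hypothetical, reduced feature inventory. The cited work uses a larger set;
# only "bilabial", "alveodental" and "coronal" are taken from the quoted text.
FEATURES = ["nasal", "voiced", "bilabial", "alveodental", "coronal"]

# One binary AF vector per phoneme (1 = feature present, 0 = absent).
AF_TABLE = {
    "m": (1, 1, 1, 0, 0),
    "n": (1, 1, 0, 1, 1),
}


def differing_features(a, b):
    """Return the names of the features on which phonemes a and b disagree."""
    return [f for f, x, y in zip(FEATURES, AF_TABLE[a], AF_TABLE[b]) if x != y]


def phoneme_from_afs(detected):
    """Map a detected AF vector to the phoneme with the closest stored vector.

    Closeness is measured by Hamming distance, so a single detection error
    can still resolve to the intended phoneme.
    """
    def distance(phoneme):
        return sum(x != y for x, y in zip(AF_TABLE[phoneme], detected))

    return min(AF_TABLE, key=distance)


if __name__ == "__main__":
    print(differing_features("m", "n"))
    print(phoneme_from_afs((1, 1, 0, 1, 1)))
```

With this toy table, /m/ and /n/ differ on exactly the three features named in the statement, and an exact AF vector maps back to its phoneme.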

“…Based on these attributes, phonemes are described by binary vectors (e.g., ones and zeros) that indicate the existence and absence of articulatory features (AFs) [3, 4]. For example, the phonemes /m/ and /n/ in the Arabic language have similar AF vectors, except at three AFs, where the alveodental feature exists in phoneme /n/ and absent in phoneme /m/; bilabial feature exists in phoneme /m/ and absent in a phoneme /n/; and /n/ is a coronal, and /m/ is not [3], as shown in Figure 1. AFs are used in studies related to pronunciation error detection [4, 5], speech synthesis [6], speech pathology [7], tone recognition [8], and other speech domains.…”

Section: Introduction (mentioning)

confidence: 99%

Deep Learning-Based Detection of Articulatory Features in Arabic and English Speech

Algabri, Mathkour, Alsulaiman, et al. 2021

Sensors

This study proposes using object detection techniques to recognize sequences of articulatory features (AFs) from speech utterances by treating the AFs of phonemes as multi-label objects in a speech spectrogram. The proposed system, called AFD-Obj, recognizes sequences of multi-label AFs in the speech signal and localizes them. AFD-Obj consists of two main stages. First, we formulate the problem of AF detection as an object detection problem and prepare the data to fulfill the requirements of object detectors by generating a spectral three-channel image from the speech signal and creating the corresponding annotation for each utterance. Second, we use the annotated images to train the proposed system to detect sequences of AFs and their boundaries. We test the system by feeding it spectrogram images, in which it recognizes and localizes multi-label AFs. We also investigate using these AFs to detect the phonemes of the utterance. The YOLOv3-tiny detector is selected because of its real-time property and its support for multi-label detection. We test our AFD-Obj system on Arabic and English using the KAPD and TIMIT corpora, respectively. Additionally, we propose using YOLOv3-tiny as an Arabic phoneme detection system (i.e., PD-Obj) to recognize and localize a sequence of Arabic phonemes from whole speech utterances. The proposed AFD-Obj and PD-Obj systems achieve excellent results on the Arabic corpus and results comparable to the state-of-the-art method on the English corpus. Moreover, we show that using only one-scale detection is suitable for AF detection or phoneme recognition.

Sentiment Analysis of Covid19 Tweets Using A MapReduce Fuzzified Hybrid Classifier Based On C4.5 Decision Tree and Convolutional Neural Network

Es-sabery, Garmani, et al. 2021

E3S Web Conf.

This contribution proposes a new model for sentiment analysis, which combines the convolutional neural network (CNN), the C4.5 decision tree algorithm, and a Fuzzy Rule-Based System (FRBS). Our suggested method consists of six parts. First, we apply several pre-processing techniques. Second, we use the fastText method to vectorize the analysed tweets. Third, we implement the CNN to extract and select the pertinent features from the tweets. Fourth, we fuzzify the CNN output using the Gaussian Fuzzification (GF) method to cope with vague data. Fifth, we apply fuzziness C4.5 to create the fuzziness rules. Finally, we use the General Fuzziness Reasoning (GFR) approach to classify new tweets. In summary, our method integrates the advantages of the CNN and C4.5 techniques and overcomes the shortcomings of ambiguous data in the tweets using the FRBS, which consists of three phases: a fuzzification phase using GF, an inference mechanism using fuzziness C4.5, and a defuzzification phase using GFR. Also, to give our approach the ability to deal with massive data, we have implemented it on a Hadoop framework of five computers. The experimental findings confirmed that our model performs very well compared to other chosen models from the literature.
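
The fuzzification step named in this abstract can be sketched with the standard Gaussian membership function, exp(-(x - c)^2 / (2 * sigma^2)). The fuzzy set names, centers, and widths in `FUZZY_SETS` are assumptions chosen for illustration; the abstract does not give the parameters the authors used.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree (in [0, 1]) to which x belongs to a Gaussian fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Hypothetical fuzzy sets over a feature scaled to [0, 1]: (center, sigma).
FUZZY_SETS = {
    "low": (0.0, 0.2),
    "medium": (0.5, 0.2),
    "high": (1.0, 0.2),
}


def fuzzify(x):
    """Map a crisp value to its membership degree in every fuzzy set."""
    return {
        name: gaussian_membership(x, center, sigma)
        for name, (center, sigma) in FUZZY_SETS.items()
    }


if __name__ == "__main__":
    degrees = fuzzify(0.9)
    # The set with the highest membership degree is the dominant label.
    print(max(degrees, key=degrees.get), degrees)
```

A crisp value therefore belongs to several sets at once with different degrees, which is what lets a downstream rule base reason over vague inputs.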

Optimizing Arabic Speech Distinctive Phonetic Features and Phoneme Recognition Using Genetic Algorithm

et al. 2020

Self Cite

Distinctive phonetic features have an important role in Arabic speech phoneme recognition. In a given language, distinctive phonetic features are extrapolated from acoustic features using different methods. However, exploiting a lengthy acoustic feature vector for the sake of phoneme recognition has a huge cost in terms of computational complexity, which, in turn, affects real-time applications. The aim of this work is to consider methods to reduce the size of the feature vector employed for distinctive phonetic feature and phoneme recognition. The objective is to select the relevant input features that contribute to the speech recognition process. This, in turn, will lead to reduced computational complexity of the recognition algorithm and improved recognition accuracy. In the proposed approach, a genetic algorithm is used to perform optimal feature selection. To that end, a baseline model based on feedforward neural networks is first built. This model is used to benchmark the results of the proposed feature selection method against a method that employs all elements of the feature vector. Experimental results, utilizing the King Abdulaziz City for Science and Technology Arabic Phonetic Database, show that the average overall phoneme recognition accuracy of the genetic algorithm based method is maintained slightly higher than that of the recognition method employing the full feature vector. The genetic algorithm based distinctive phonetic feature recognition method has achieved a 50% reduction in the dimension of the input vector while obtaining a recognition accuracy of 90%. Moreover, the results of the proposed method are validated using the Wilcoxon signed-rank test.
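
The feature selection idea in this abstract can be sketched as a small genetic algorithm over binary masks, where bit i says whether input feature i is kept. Everything below is an assumption for illustration: the abstract does not specify the selection, crossover, or mutation operators, and `toy_fitness` is a hypothetical stand-in for the real fitness, which would come from training and scoring the feedforward network on the masked features.

```python
import random


def run_ga(fitness, n_features, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a binary feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


# Hypothetical fitness: reward keeping the "relevant" features and
# penalize mask size, mimicking accuracy versus input dimension.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in RELEVANT)
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    best = run_ga(toy_fitness, n_features=8)
    print(best, toy_fitness(best))
```

The size penalty in the toy fitness plays the role of the dimensionality reduction goal; in a real setup the trade-off would be governed by measured recognition accuracy.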
