Duration modelling and evaluation for Arabic statistical parametric speech synthesis

Zangar, Imene; Mnasri, Zied; Colotte, Vincent; Jouvet, Denis

doi:10.1007/s11042-020-09901-7

Cited by 4 publications

(6 citation statements)

References 43 publications

“…Table XII lists the techniques used to evaluate the studies. The subjective evaluation method was commonly applied to evaluate TTS systems, and the most commonly used method was the Mean Opinion Score (MOS) test [88], [92], [93], [94], [99], [103], [104], [105], [61], [106], [108], [110], [112], [114], [115], [117], [118], [119]. Categorial estimation (CE) tests, preference test [102], [108], [117], DMOS test [108], and DRT tests [87], [96], [116] have also been used as subjective evaluation tests in some studies.…”

Section: Rq5: Evaluation Techniques and Results (mentioning)

confidence: 99%

“…The subjective evaluation method was commonly applied to evaluate TTS systems, and the most commonly used method was the Mean Opinion Score (MOS) test [88], [92], [93], [94], [99], [103], [104], [105], [61], [106], [108], [110], [112], [114], [115], [117], [118], [119]. Categorial estimation (CE) tests, preference test [102], [108], [117], DMOS test [108], and DRT tests [87], [96], [116] have also been used as subjective evaluation tests in some studies. Most studies measured intelligibility [91], [94], [95], [96], [99], [103], [104], [105], [61], [106], [109], [110], [112], [115], [116], [118], [120] and naturalness [94], [95], [99], [101],…”

Section: Rq5: Evaluation Techniques and Results (mentioning)

confidence: 99%

“…Categorial estimation (CE) tests, preference test [102], [108], [117], DMOS test [108], and DRT tests [87], [96], [116] have also been used as subjective evaluation tests in some studies. Most studies measured intelligibility [91], [94], [95], [96], [99], [103], [104], [105], [61], [106], [109], [110], [112], [115], [116], [118], [120] and naturalness [94], [95], [99], [101], [103], [104], [105], [108], [109], [112], [115], [116], [118] whereas others measured pronunciation [95], [109], sound quality [95], [109], [111], prosody [91], nasality [87], graveness [87], compactness [87], clearness ...…”

Section: Rq5: Evaluation Techniques and Results (mentioning)

confidence: 99%

“…Some studies were evaluated on words, triphones, and diphones [93], some were evaluated on both words and sentence synthesis [109], [112] while others were evaluated on one of them. Objective tests that were commonly used to evaluate include Visual Perceptual Test [96], PESQ test [61], [110], [116], [119], Mel cepstral distortion (MCD) [98], [99], [102], [106] and Root Mean Squared Error (RMSE) of energy, pitch, and duration models [99], [106], [108], [117]. Some studies have also compared their TTS systems with other existing TTS systems for a more objective evaluation [87], [91], [96], [103], [106], [111], [114], [117].…”

Section: Rq5: Evaluation Techniques and Results (mentioning)

confidence: 99%

“…Objective tests that were commonly used to evaluate include Visual Perceptual Test [96], PESQ test [61], [110], [116], [119], Mel cepstral distortion (MCD) [98], [99], [102], [106] and Root Mean Squared Error (RMSE) of energy, pitch, and duration models [99], [106], [108], [117]. Some studies have also compared their TTS systems with other existing TTS systems for a more objective evaluation [87], [91], [96], [103], [106], [111], [114], [117]. Some studies have also evaluated their TTS systems based on a success rate formula [100], [109], [120].…”

Section: Rq5: Evaluation Techniques and Results (mentioning)

confidence: 99%
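The objective measures named in the statements above (RMSE of duration models and Mel cepstral distortion) can be illustrated with a minimal sketch. This is not the implementation used in any of the cited studies. The function names, the input layout (plain Python lists, already time-aligned) and the convention of leaving out the 0th cepstral coefficient are assumptions made here for illustration.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length sequences,
    e.g. predicted vs. reference phone durations in milliseconds."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Mean MCD in dB over time-aligned frames.

    Each frame is a list of mel-cepstral coefficients. Assumption: the
    caller has already dropped the 0th (energy) coefficient and aligned
    the two utterances frame by frame.
    """
    if len(ref_frames) != len(syn_frames) or not ref_frames:
        raise ValueError("frame lists must be non-empty and of equal length")
    scale = (10.0 / math.log(10.0)) * math.sqrt(2.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        total += scale * math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, syn)))
    return total / len(ref_frames)


# Hypothetical values, for illustration only.
duration_error = rmse([100.0, 200.0], [110.0, 190.0])
mcd = mel_cepstral_distortion([[1.0, 0.5]], [[1.0, 0.5]])
```

Lower values indicate a closer match to the reference for both measures. Subjective scores such as MOS come from listener ratings and cannot be computed this way.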

Advancements in Arabic Text-to-Speech Systems: A 22-Year Literature Review

Chemnad, Othman (2023). IEEE Access.

Although there are several speech synthesis models available for different languages tailored to specific domain requirements and applications, there is currently no readily available information on the latest trends in Arabic language speech synthesis. This can make it challenging for beginners to research and develop text-to-speech (TTS) systems for Arabic languages. To address this issue, this article provides a comprehensive overview of several scholars' contributions to the field of Arabic TTS, along with an examination of the unique features of the Arabic language and the corresponding challenges in creating TTS systems. Reporting only on papers discussing Arabic TTS, this systematic review evaluated the available literature published between 2000 and 2022. We conducted a systematic review in six databases using preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines to identify studies that addressed Arabic Text-to-Speech systems. Of a total of 3719 articles identified, only 36 (0.96%) articles met our search criteria. Bibliometric analyses of these studies were conducted and reported. The results highlight the main types of speech synthesis techniques used in TTS systems: concatenative, formant, deep neural network (DNN), hybrid models, and multiagent. The corpora used to develop these systems, as well as the diacritization techniques incorporated, evaluation techniques, and the results of the performance of the systems are reported. Subjective evaluation using the mean opinion score is most applied to measure the accuracy of systems. This study also identifies gaps in the literature and makes recommendations for future research directions.

A Review on Speech Synthesis Based on Machine Learning

Kumari, Dev, Kumar (2022). Communications in Computer and Information Science.

Modern Standard Arabic Speech Corpora: A Systematic Review

et al. 2023

Speech processing applications have become integral components across various domains of modern life. The design and preparation of a reliable recognition system rely heavily on the availability of suitable speech databases. While numerous speech databases exist for English and other languages, the availability of comprehensive resources for Arabic language remains limited. In light of this, we conducted a systematic review aiming to identify, analyse, and classify existing Modern Standard Arabic speech databases. Through our review, we identified 27 publicly available databases and analysed an additional 80 subjective databases. These databases were thoroughly studied, classified based on their characteristics, and subjected to a detailed analysis of research trends in the field. This paper provides a comprehensive discussion on the diverse speech databases developed for various speech processing applications. It sheds light on the purposes and unique characteristics of Arabic speech databases, enabling researchers to easily access suitable resources for their specific applications. The findings of this review contribute to bridging the gap in available Arabic speech databases and serve as a valuable resource for researchers in the field.
