Interspeech 2020
DOI: 10.21437/interspeech.2020-2382

Deep Learning Based Assessment of Synthetic Speech Naturalness

Abstract: In this paper, we present a new objective prediction model for synthetic speech naturalness. It can be used to evaluate Text-To-Speech or Voice Conversion systems and works independently of language. The model is trained end-to-end and is based on a CNN-LSTM network that was previously shown to give good results for speech quality estimation. We trained and tested the model on 16 different datasets, including datasets from the Blizzard Challenge and the Voice Conversion Challenge. Further, we show that the reliability of deep lea…

Cited by 41 publications
(34 citation statements)
References 21 publications
“…Lo et al [7] adopted the convolutional and recurrent neural network models to build a mean opinion score predictor. Gabriel and Sebastian [2] proposed a TTS naturalness prediction model which achieved promising results on unseen datasets.…”
Section: Related Work (mentioning)
confidence: 99%
“…CNN-LSTM is a neural network structure that combines CNN and LSTM and it has been recently used for speech quality assessment [2,7,25]. In this structure, CNNs extract deep features of speech and the CNN feature vectors are then used as input for LSTM network that models time dependencies, which means that CNN-LSTM has the advantages of both CNN and LSTM.…”
Section: CNN-LSTM (mentioning)
confidence: 99%
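
The CNN-LSTM structure described in the citation statement above can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the architecture of the cited paper: the function names (`conv1d_frame`, `lstm_step`, `predict_score`, `random_params`), the layer sizes, the random weights, and the final squashing of the output to a 1-to-5 score range are all hypothetical choices made for illustration. It shows a convolution extracting features from each frame, followed by an LSTM that carries state across frames.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_frame(frame, kernels):
    """CNN part: valid 1-D convolution over one frame, ReLU, max-pool per kernel."""
    feats = []
    for k in kernels:
        width = len(k)
        responses = [
            max(0.0, sum(frame[i + j] * k[j] for j in range(width)))
            for i in range(len(frame) - width + 1)
        ]
        feats.append(max(responses))
    return feats


def lstm_step(x, h, c, W):
    """LSTM part: one time step. W maps gate name -> (weight rows over [x; h], biases)."""
    z = x + h  # list concatenation of input features and previous hidden state

    def gate(name, act):
        rows, bias = W[name]
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(rows, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


def predict_score(frames, kernels, W, w_out, b_out):
    """Run CNN features through the LSTM frame by frame, then map to one score."""
    h = [0.0] * len(w_out)
    c = [0.0] * len(w_out)
    for frame in frames:
        h, c = lstm_step(conv1d_frame(frame, kernels), h, c, W)
    raw = sum(w * v for w, v in zip(w_out, h)) + b_out
    # Assumption: squash to a 1..5 range, as on a MOS-style rating scale.
    return 1.0 + 4.0 * sigmoid(raw)


def random_params(n_kernels, kernel_width, hidden, seed=0):
    """Untrained, randomly initialised weights (illustration only)."""
    rng = random.Random(seed)
    kernels = [[rng.uniform(-0.5, 0.5) for _ in range(kernel_width)]
               for _ in range(n_kernels)]
    W = {
        name: (
            [[rng.uniform(-0.5, 0.5) for _ in range(n_kernels + hidden)]
             for _ in range(hidden)],
            [0.0] * hidden,
        )
        for name in "ifog"
    }
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return kernels, W, w_out, 0.0
```

With `frames` given as a list of equal-length lists of floats (for example, one spectral slice per time step), `predict_score(frames, *random_params(4, 3, 5))` returns a single value for the whole utterance. Because the weights are untrained, the value itself carries no meaning; the sketch only shows how the two stages connect.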