Synthesizing Speech for Communication Devices

Yarrington, Debra

doi:10.1201/9781439864869-9

Cited by 1 publication

(1 citation statement)

References 0 publications

“…TTS (text-to-speech) refers to any system that outputs a given text string as speech (Yarrington, 2007). Most prior research on autoregressive TTS has been carried out at Google. Google's Tacotron (Wang et al., 2017) produces speech using an RNN encoder, an RNN decoder, a CBHG module, and the Griffin-Lim algorithm (Griffin & Lim, 1983).…”

Section: Introduction (unclassified)

End-to-end non-autoregressive fast text-to-speech

Kim, Nam (2021). Phonetics Speech Sci.

Autoregressive Text-to-Speech (TTS) models suffer from inference instability and slow inference speed. Inference instability occurs when a poorly predicted sample at time step t affects all subsequent predictions. Slow inference speed arises from a model structure that requires the predicted samples from time steps 1 to t-1 in order to predict the sample at time step t. In this study, an end-to-end non-autoregressive fast text-to-speech model is proposed as a solution to these problems. The results show that this model's Mean Opinion Score (MOS) is close to that of Tacotron 2 - WaveNet, while its inference speed and stability are higher than those of Tacotron 2 - WaveNet. Further, this study aims to offer insight into the improvement of non-autoregressive models.
