2022
DOI: 10.11591/ijeecs.v26.i1.pp243-252

Performance analysis of different intonation models in Kannada speech synthesis

Abstract: Text to speech (TTS) is a system that generates artificial speech from text input. The prosodic models used improve the quality of the synthesized speech, especially its naturalness and intelligibility. Prosody involves intonation, which refers to the variations in pitch frequency (F0) over time in an utterance. This work mainly concentrates on building a feedback neural network model to predict the F0 contour in utterances using Fujisaki intonation model parameters as…

Citations: cited by 1 publication (1 citation statement)
References: 14 publications (15 reference statements)
“…Various concepts of intonation modelling can be found. Some researchers have tried to apply ideas of classical intonation models by predicting their meanings (Chakrasali et al, 2022; Kuczmarski, 2021; Marelli et al, 2019) or identifying intonation segments (Alvarez et al, 2022). Other intonation modelling instances include the transfer of intonation components into neutral synthesized speech (Honnet and Garner, 2016), unsupervised or supervised training of latent prosody space (Sun et al, 2020; Raitio et al, 2020, 2022).…”
Section: Introduction (mentioning)
confidence: 99%