Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
DOI: 10.1109/icslp.1996.607255

Quantitative analysis of the local speech rate and its application to speech synthesis

Abstract: On the basis of the short-time relative speech rate defined by the authors, this paper examines the optimum width of the smoothing window by perceptual experiments on the naturalness of re-synthesized speech. With the optimum window of 270 ms, relative speech rates are obtained both for 'fast' and 'slow' utterances of the same sentence, using an utterance produced at a 'normal' speech rate. The averaged results show that the speech rate control function for an utterance can be approximately decomposed into a g…
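
The abstract does not give the authors' formula for the short-time relative speech rate, so the following is only an illustrative sketch under stated assumptions. It assumes that a time alignment between the 'normal' reference utterance and a 'fast' or 'slow' target utterance is already available as paired instants, takes the raw local rate as the ratio of reference duration to target duration per aligned segment, and smooths it with a rectangular moving window. The function name `smoothed_relative_rate`, its parameters, and the rectangular window shape are hypothetical; only the 270 ms window width comes from the abstract.

```python
def smoothed_relative_rate(ref_times, tgt_times, window_s=0.270):
    """Sketch of a smoothed local relative speech rate (assumed definition).

    ref_times[i] and tgt_times[i] are aligned instants, in seconds, in the
    reference ('normal') and target ('fast' or 'slow') utterances.
    Returns (mids, rates): segment midpoints on the target time axis and
    the smoothed rate at each midpoint. A rate above 1 means the target is
    locally faster than the reference.
    """
    mids, raw = [], []
    for i in range(1, len(ref_times)):
        d_ref = ref_times[i] - ref_times[i - 1]
        d_tgt = tgt_times[i] - tgt_times[i - 1]
        if d_tgt <= 0:
            # Skip degenerate segments with no duration in the target.
            continue
        mids.append(0.5 * (tgt_times[i] + tgt_times[i - 1]))
        raw.append(d_ref / d_tgt)

    # Rectangular moving average over a window of width window_s,
    # centred on each segment midpoint (window shape is an assumption).
    half = window_s / 2.0
    rates = []
    for m in mids:
        inside = [r for t, r in zip(mids, raw) if abs(t - m) <= half]
        rates.append(sum(inside) / len(inside))
    return mids, rates
```

For a target that is uniformly twice as fast as the reference, every smoothed value in this sketch is approximately 2; a locally decelerated stretch would appear as a dip below the surrounding values.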

Cited by 7 publications
(3 citation statements)
“…A case study on contrastive focus placement in Neapolitan Italian is used to show how this can be done in the practice, as well as to let the reader appreciate the quality of the results. Two features will be analyzed, f0 and relative speech rate, the latter expressed as a continuous function of time as proposed in [7]. The results show that known facts about the prosody of Neapolitan Italian emerged from the data, but also other interesting local or cross-feature relationships between contour traits appeared.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Therefore, we rely on DTW to align segments within each word. The allowable region of the dynamic path is set within the range of [1/3,3] [41]. We use the MFCCs as features for DTW, which are estimated for the synthetic signals and the target speech.…”
Section: B Time Alignment Process; citation type: mentioning
confidence: 99%
“…In order to understand paralinguistic information using a computer, it is one of important issues to detect portions of sentences in which the speaker intentionally decelerates the speech rate. There are several studies on local speech rate variation [5][6][7]. However, there are few studies on detection of local speech rate variation.…”
Citation type: mentioning
confidence: 99%