1990
DOI: 10.1016/0167-6393(90)90021-z

Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones

Cited by 973 publications (496 citation statements)
References 10 publications
“…We hence artificially modified F0 and/or duration cues associated to the last syllable of each fragment using PSOLA (Pitch Synchronous Overlap and Add; Moulines and Charpentier 1990) in Praat (Boersma and Weenink 2009) in order to manipulate boundary strength (Table 3) for a total of 120 stimuli (20 NPs × 2 boundary levels × 3 acoustic combinations).…”
Section: Methods 321 Materials
Type: mentioning
confidence: 99%
“…The quality of the generated contours is evaluated acoustically. This is achieved by resynthesizing the original utterance with the newly generated F0 contour using the PSOLA (Pitch Synchronous Overlap and Add) resynthesis method (Moulines and Charpentier 1990). Thus, in a small-scale perceptual experiment the resynthesized utterances taken from ToBI and the Boston Radio News Corpus (the only generally available prosodically labelled corpus of American English) are assessed as to their naturalness as well as their similarity to / differences from the respective originals.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…An /aː/ vowel from one of the speaker's accented productions of sagte which had F1 values closest to the F1 median of this speaker's /a/ and /aː/ tokens was manipulated in duration to create 11 equidistant steps using Praat's (Boersma and Weenink 2012) implementation of the PSOLA algorithm (Moulines and Charpentier 1990). The 11 steps ranged from 40 ms to 112 ms; these selected durations were based on the model speaker's range of target vowel durations in two small pilot studies.…”
Section: Methods
Type: mentioning
confidence: 99%