2011
DOI: 10.1121/1.3631667

Estimating speech spectra for copy synthesis by linear prediction and by hand

Abstract: Linear prediction is a widely available technique for analyzing acoustic properties of speech, although this method is known to be error-prone. New tests assessed the adequacy of linear prediction estimates by using this method to derive synthesis parameters and testing the intelligibility of the synthetic speech that results. Matched sets of sine-wave sentences were created, one set using uncorrected linear prediction estimates of natural sentences, the other using estimates made by hand. Phoneme restrictions…

Cited by 25 publications (16 citation statements)
References: 22 publications

Citation statements
“…Estimates of the center frequency and amplitude of vocal resonances were created by hand and used as synthesis parameters for four time-varying sinusoids (see Remez et al, 2011). Temporally distorted versions were created by reversing a brief span of a waveform and composing a new waveform of reversed samples, preserving the original order.…”
Section: Methods (mentioning)
confidence: 99%
“…After telephone transmission the measurements for /i/ were approximately 12% lower. On the basis of these 1 Chen et al (2009), Deng et al (2007), Remez et al (2011), and Vallabha and Tuller (2002) discuss some of the difficulties associated with fully-automatic measurement, and Byrne and Foulkes (2004), Duckworth et al (2011), Hillenbrand et al (1995), and Künzel (2001) discuss some of the difficulties associated with human-supervised measurement.…”
Section: Introduction (mentioning)
confidence: 98%
“…In practice, the third-formant contour often corresponded to the fricative formant rather than F3 during phonetic segments with frication; these cases were not treated as errors. Gross errors in automatic estimates of the three formant frequencies were hand-corrected using a graphics tablet; artifacts are not uncommon and manual post-processing of the extracted formant tracks is often necessary (Remez et al, 2011). Amplitude contours corresponding to the corrected formant frequencies were extracted automatically from the stimulus spectrograms; these contours were used to generate synthetic analogues of each sentence.…”
Section: Stimuli and Conditions (mentioning)
confidence: 99%
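
The first citation statement above describes two operations: synthesizing speech analogues from time-varying sinusoids, and distorting a waveform by reversing brief spans while keeping the spans in their original order. The following is a minimal sketch of both ideas, not code from any of the cited papers. The function names, the per-sample layout of the frequency and amplitude tracks, and the fixed segment length are all assumptions made for illustration.

```python
import math


def synthesize_sinewaves(freq_tracks, amp_tracks, sample_rate):
    """Sum several time-varying sinusoids into one waveform.

    Assumption: each track is a list with one value per output sample
    (frequency in Hz, amplitude as a linear gain), and all tracks have
    the same length. Hand-made or automatic estimates would normally be
    interpolated to this resolution first.
    """
    n = len(freq_tracks[0])
    out = [0.0] * n
    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0
        for i in range(n):
            out[i] += amps[i] * math.sin(phase)
            # Accumulate phase so frequency changes stay continuous.
            phase += 2.0 * math.pi * freqs[i] / sample_rate
    return out


def reverse_local_segments(samples, segment_len):
    """Reverse each consecutive span of `segment_len` samples.

    The spans keep their original order; only the samples inside each
    span are reversed. A final shorter span is reversed as well.
    """
    if segment_len < 1:
        raise ValueError("segment_len must be at least 1")
    out = []
    for start in range(0, len(samples), segment_len):
        out.extend(reversed(samples[start:start + segment_len]))
    return out


if __name__ == "__main__":
    sample_rate = 8000
    n = 800  # 100 ms of signal
    # Four hypothetical constant tracks standing in for estimated resonances.
    freqs = [[f] * n for f in (500.0, 1500.0, 2500.0, 3500.0)]
    amps = [[a] * n for a in (1.0, 0.5, 0.25, 0.125)]
    tone = synthesize_sinewaves(freqs, amps, sample_rate)
    # Reverse successive 50 ms spans (400 samples at 8 kHz).
    distorted = reverse_local_segments(tone, 400)
    print(len(tone), len(distorted))
```

Reversing with a span of one sample leaves the waveform unchanged, and applying the same reversal twice restores the original, which makes the distortion easy to sanity-check.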