2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
DOI: 10.1109/icassp.2003.1198883

Evaluation of methods for parameteric formant transformation in voice conversion

Cited by 9 publications (9 citation statements)
References 5 publications
“…aspects of the speech that occur over timescales larger than individual phonemes, and the Mannerisms such as particular word choice or preferred phrases, or all kinds of other high-level behavioral characteristics. The formant structure and the vocal tract are represented by the overall spectral envelope shape of the signal, and thus are major features to be considered in voice transformation (Kain & Macon, 2001 (Abe et al (1988);Arslan (1999)), linear transformation (Kain & Macon, 2001;Ye & Young, 2003), formant transformation (Turajlic et al, 2003), vocal tract length normalization (VTLN) (Sundermann et al, 2003), and prosodic transformation (Erro et al, 2010). In text-independent voice conversion techniques, the system trains on source and target speakers uttering different text.…”
Section: Voice Transformation (mentioning)
confidence: 99%
“…In text-dependent methods, training procedures are based on parallel corpora, i.e., training data have the source and the target speakers uttering the same text. Such methods include vector quantization [2,7], linear transformation [38,84], formant transformation [77], vocal tract length normalization (VTLN) [71], and prosodic transformation [7]. In text-independent voice conversion techniques, the system is trained with source and target speakers uttering different texts.…”
Section: Speech Processing (mentioning)
confidence: 99%
“…Speaker transformation techniques [28,39,85,40,2,7,77,62,16] might involve modifications of different aspects of the speech signal that carries the speaker's identity. We can cite different methods.…”
Section: Speech Processing (mentioning)
confidence: 99%
“…the lack of control of the spectral shape, has not been solved. Frequency warping methods, such as by Turajlic et al [9], give high quality of modified speech. However, frequency warping methods meet difficulties in modifying spectral peaks, such as preserving shapes of peaks, and emphasizing spectral peaks around 3 kHz in transformation of speaking voice into singing voice, since they do not estimate spectral peaks.…”
Section: Introduction (mentioning)
confidence: 99%
“…They can be classified into two popular approaches: linear prediction (LP)-based methods [7,8] and frequency warping methods [9]. LP-based methods are often affected by the pole interaction problem suffered by pole modification techniques.…”
Section: Introduction (mentioning)
confidence: 99%