LSF mapping for voice conversion with very small training sets

Helander, Elina; Nurminen, Jukka K.; Gabbouj, Moncef

doi:10.1109/icassp.2008.4518698

Cited by 13 publications

(7 citation statements)

References 7 publications


“…This process will increase the energy of signal at higher frequency [3] as shown in figure (2). $s_2(n) = s(n) - a \cdot s(n-1)$ (1) where $s(n)$ is the speech signal, $s_2(n)$ is the output signal and the value of $a$ is usually between 0.9 and 1.0. The z-transform of the filter is …”

Section: A Preemphasis (mentioning)

confidence: 99%


Arabic speech transformation using MFCC in GMM

Elmanfaloty

Korany

Youssef

2012

2012 International Conference on Computer and Communication Engineering (ICCCE)


Voice conversion (VC) is a process which modifies the speech signal produced by one source speaker so that it sounds like another target speaker. In this paper the transformation is determined by using equal Arabic utterances from source and target speakers. A conversion function based on Gaussian mixture model (GMM) is used for transforming the spectral envelope described by Mel Frequency Cepstral Coefficients (MFCC). The quality of the transformed utterances is measured using subjective and objective evaluations.
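The pre-emphasis step quoted in the citation statement above can be sketched as follows. This is a minimal illustration that assumes the common first-order form s2(n) = s(n) - a * s(n-1); the function name `preemphasize` and the default `a=0.97` are illustrative choices, not values taken from the cited paper (which only states that a is usually between 0.9 and 1.0).

```python
import numpy as np


def preemphasize(signal, a=0.97):
    """Apply a first-order pre-emphasis filter: s2(n) = s(n) - a * s(n-1).

    `a` is an assumed default; the quoted statement only gives the
    typical range 0.9 to 1.0.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size == 0:
        return signal.copy()
    out = np.empty_like(signal)
    # The first sample has no predecessor, so it is passed through unchanged.
    out[0] = signal[0]
    # Every later sample has a scaled copy of the previous sample subtracted.
    out[1:] = signal[1:] - a * signal[:-1]
    return out


# A constant (zero-frequency) signal is strongly attenuated after the first sample.
flat = preemphasize([1.0, 1.0, 1.0, 1.0], a=0.9)
```

Because the filter subtracts most of the previous sample, slowly varying (low-frequency) content is attenuated while rapid changes pass through, which matches the stated goal of raising the relative energy at higher frequencies.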



“…Examples of such features include MFCCs (Mel Frequency Cepstral Coefficients) and LSFs (Line Spectral Frequencies) [1]. The aim of this paper is to extract the MFCC, and use them by the GMM for voice conversion of Arabic spoken words.…”

Section: Introduction (mentioning)

confidence: 99%

Arabic speech transformation using MFCC in GMM

Elmanfaloty

Korany

Youssef

2012

2012 International Conference on Computer and Communication Engineering (ICCCE)


“…The Line Spectral Frequencies (LSF) were selected as the representation of the vocal characteristics of source and target speakers due to their favorable interpolation properties and stableness [16]. A 20 ms length Hanning window overlapped by 10 ms was used to compute and extract the LPC parameters.…”

Section: Objective Evaluation (mentioning)

confidence: 99%

On using non-linear canonical correlation analysis for voice conversion based on Gaussian mixture model

Jian

Yang

2010

J. Electron.(China)


Voice conversion algorithm aims to provide high level of similarity to the target voice with an acceptable level of quality. The main object of this paper was to build a nonlinear relationship between the parameters for the acoustical features of source and target speaker using Non-Linear Canonical Correlation Analysis (NLCCA) based on jointed Gaussian mixture model. Speaker individuality transformation was achieved mainly by altering vocal tract characteristics represented by Line Spectral Frequencies (LSF). To obtain the transformed speech which sounded more like the target voices, prosody modification is involved through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrated that our proposed algorithm was effective and outperformed the conventional conversion method utilized by the Minimum Mean Square Error (MMSE) estimation.
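The analysis framing described in the citation statement for this paper (a 20 ms Hanning window with 10 ms overlap) can be sketched as below. This is a hedged illustration only: the function name `frame_signal` and the 16 kHz sample rate are assumptions not stated in the source, and the subsequent LPC and LSF computation is deliberately left out.

```python
import numpy as np


def frame_signal(signal, sample_rate=16000, frame_ms=20.0, hop_ms=10.0):
    """Split a signal into overlapping frames and apply a Hanning window.

    A 20 ms frame with 10 ms overlap corresponds to a 10 ms hop.
    The 16 kHz sample rate is an assumed default.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if signal.size < frame_len:
        # Too short for even one full frame.
        return np.empty((0, frame_len))
    n_frames = 1 + (signal.size - frame_len) // hop_len
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop_len: i * hop_len + frame_len] for i in range(n_frames)
    ])
    # Broadcasting applies the same window to every frame (row).
    return frames * window


# One second of audio at the assumed 16 kHz rate.
frames = frame_signal(np.ones(16000))
```

Each row of the result is one windowed frame, which would then be passed to an LPC analysis routine before conversion to LSFs.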


“…The joint density Gaussian mixture model (JD-GMM) [4], [5], [6] is one of the most effective approaches. Unfortunately, it requires relatively large parallel training data to avoid over-fitting [8].…”

Section: Introduction (mentioning)

confidence: 99%

Mixture of Factor Analyzers Using Priors From Non-Parallel Speech for Voice Conversion

Kinnunen

Chng

et al.

2012

IEEE Signal Process. Lett.


Abstract: A robust voice conversion function relies on a large amount of parallel training data, which is difficult to collect in practice. To tackle the sparse parallel training data problem in voice conversion, this paper describes a mixture of factor analyzers method which integrates prior knowledge from nonparallel speech into the training of conversion function. The experiments on CMU ARCTIC corpus show that the proposed method improves the quality and similarity of converted speech. With both objective and subjective evaluations, we show the proposed method outperforms the baseline GMM method.

Index Terms: voice conversion, prior knowledge, factor analysis, mixture of factor analyzers.
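For orientation, the baseline joint density GMM (JD-GMM) conversion mentioned in the citation statement for this paper is usually written as a posterior-weighted sum of per-component linear regressions. The sketch below assumes that standard formulation and an already trained model; the function names and the parameter layout (joint vectors stacked as source features followed by target features) are illustrative assumptions, not the cited authors' implementation.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def jdgmm_convert(x, weights, means, covs):
    """Convert one source frame x with a joint-density GMM.

    Each component has a joint mean [mu_x; mu_y] and a joint covariance
    [[S_xx, S_xy], [S_yx, S_yy]]; the output is the posterior-weighted
    sum of mu_y + S_yx S_xx^{-1} (x - mu_x).
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    # Posterior of each component given x, using only the source marginal.
    log_post = np.array([
        np.log(w) + gaussian_logpdf(x, mu[:d], cov[:d, :d])
        for w, mu, cov in zip(weights, means, covs)
    ])
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    y = np.zeros(means[0].shape[0] - d)
    for p, mu, cov in zip(post, means, covs):
        mu_x, mu_y = mu[:d], mu[d:]
        cov_xx, cov_yx = cov[:d, :d], cov[d:, :d]
        y += p * (mu_y + cov_yx @ np.linalg.solve(cov_xx, x - mu_x))
    return y


# Toy single-component model with 2-dimensional source and target features.
toy_mean = np.array([0.0, 0.0, 1.0, 1.0])
toy_cov = np.array([
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0, 1.0],
])
converted = jdgmm_convert(np.array([2.0, 2.0]), [1.0], [toy_mean], [toy_cov])
```

Each component carries a full cross-covariance block, so the number of free parameters grows quickly with the feature dimension and the number of components, which is consistent with the over-fitting concern raised for small parallel training sets.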

