2016 IEEE 13th International Conference on Signal Processing (ICSP) 2016
DOI: 10.1109/icsp.2016.7877819

Phone-aware LSTM-RNN for voice conversion

Cited by 14 publications (14 citation statements)
References: 12 publications
“…Mel-cepstral distortion (MCD) [246] is commonly used to measure the difference between two spectral features [62], [67], [256], [257]. It is calculated between the converted and target Mel-cepstral coefficients, or MCEPs, [258], [259], ŷ and y.…”
Section: A Objective Evaluation 1) Spectrum Conversion (mentioning)
confidence: 99%
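
The statement above names MCD but does not give its formula. As a minimal sketch, assuming the widely used definition MCD = (10 / ln 10) * sqrt(2 * sum_d (y_d - ŷ_d)^2) averaged over time-aligned frames, with the 0th (energy) coefficient excluded, it could be computed as follows. The function name `mel_cepstral_distortion` and the `skip_c0` flag are illustrative and do not come from the cited papers.

```python
import math


def mel_cepstral_distortion(converted, target, skip_c0=True):
    """Average MCD in dB between two time-aligned MCEP sequences.

    converted, target: lists of frames, each frame a list of coefficients.
    skip_c0: assumption, drop the 0th (energy) coefficient as is commonly done.
    """
    if len(converted) != len(target) or not converted:
        raise ValueError("sequences must be non-empty and frame-aligned")
    start = 1 if skip_c0 else 0
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for conv_frame, tgt_frame in zip(converted, target):
        # Squared Euclidean distance over the retained coefficients.
        squared = sum(
            (c - t) ** 2 for c, t in zip(conv_frame[start:], tgt_frame[start:])
        )
        total += scale * math.sqrt(2.0 * squared)
    return total / len(converted)
```

Identical sequences give 0 dB, and larger values indicate a larger spectral mismatch between converted and target speech.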
“…The alignment is usually applied directly by Dynamic Time Warping (DTW) [17]. Also, there are techniques to get a more accurate feature alignment with the help of automatic speech recognition (ASR) techniques [18,14,19]. The aligned feature sequences x = x1, .., xT and y = y1, .., yT are then converted frame by frame in different methods (e.g.…”
Section: Parallel Data Voice Conversion (mentioning)
confidence: 99%
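
The DTW alignment mentioned in the statement above can be illustrated with a minimal sketch. It assumes Euclidean frame distance and the basic step pattern (diagonal, vertical, horizontal); the cited systems may use other distances or constraints, and the function name `dtw_align` is hypothetical.

```python
import math


def dtw_align(x, y):
    """Align two feature sequences with basic dynamic time warping.

    x, y: lists of frames (each frame a list of floats).
    Returns (total_cost, path), where path is a list of (i, j) index pairs
    that pairs frame x[i] with frame y[j].
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path
```

Indexing both sequences along the returned path yields equal-length sequences that can then be converted frame by frame, as the statement describes.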
“…The Mel-spectrograms are extracted through a short-time Fourier transform (STFT) using a 50ms frame size, 12.5 ms frame hop and a Hann window function as in [10]. The baseline system uses the same LSTM-RNN voice conversion system in [14]. The converted acoustic features are vocoded into speech waveform using both MLSA and Mcep-based WaveNet vocoder [8].…”
Section: Experiments Setup (mentioning)
confidence: 99%
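
The framing parameters quoted above (50 ms frame, 12.5 ms hop, Hann window) translate into sample counts once a sampling rate is fixed. The sketch below assumes 16 kHz, which the statement does not specify, and stops at the STFT magnitude; the Mel filterbank step is omitted. The function name `stft_magnitude` is illustrative.

```python
import numpy as np


def stft_magnitude(signal, sample_rate=16000, frame_ms=50.0, hop_ms=12.5):
    """Magnitude STFT with a Hann window.

    sample_rate=16000 is an assumption; the quoted statement gives only
    the frame size (50 ms) and the frame hop (12.5 ms).
    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))  # 800 at 16 kHz
    hop_len = int(round(sample_rate * hop_ms / 1000.0))      # 200 at 16 kHz
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    # Slice overlapping frames and apply the window to each one.
    frames = np.stack(
        [
            signal[k * hop_len : k * hop_len + frame_len] * window
            for k in range(n_frames)
        ]
    )
    return np.abs(np.fft.rfft(frames, axis=1))
```

Under the 16 kHz assumption each frame spans 800 samples with a 200-sample hop, so consecutive frames overlap by 75%.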