Improving Post-Filtering of Artificial Speech Using Pre-Trained LSTM Neural Networks

Coto-Jiménez, Marvin

doi:10.3390/biomimetics4020039

Cited by 15 publications

(12 citation statements)

References 41 publications

(43 reference statements)


“…The parameters are processed independently, as proposed in previous references [11], and after the parametrization, we separate the parameters in voiced (with a value of ) and unvoiced (with a value of according to the Ahocoder parametrization), both in the synthesized and natural utterances. The reason of this discrimination is that voiced/unvoiced is one of the most distinctive features of the speech sounds, reflected from the source filter model of speech production [27].…”

Section: Proposed System

mentioning

confidence: 99%

“…To improve the results obtained with this technique, some researchers have implemented postfilters, by adding algorithms as a final step to enhance the quality of the sound. Some algorithms implemented are deep generative architectures [10], Restricted Boltzmann Machines, and Long Short-term Memory (LSTM) [11].…”

Section: Introduction

mentioning

confidence: 99%

“…One of the types of RNN that has worked with better results is the LSTM and its bidirectional counterpart BLSTM. For example, in enhancing the Mel-cepstral coefficients of synthetic voices [19] and the fundamental frequency [11].…”

Section: Introduction

mentioning

confidence: 99%

“…This procedure was similar to those presented in [11], but with the implementation of an additional discriminative process, that allows a further improvement in the quality of newly synthesized utterances with HTS, using distinct collections of networks as a way of refining the voiced and unvoiced sounds. Figure 3 shows the procedure followed for the enhancing of the new utterances (test set): Each frame of the utterance is labeled with a sequential number.…”

mentioning

confidence: 99%
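The discriminative step described in the statement above can be illustrated with a minimal sketch. It assumes the voiced/unvoiced decision per frame is already available as a boolean array, and it uses two placeholder callables in place of the trained networks. The names `discriminative_postfilter`, `enhance_voiced`, and `enhance_unvoiced` are hypothetical and are not taken from the cited paper.

```python
import numpy as np

def discriminative_postfilter(frames, voiced_flags, enhance_voiced, enhance_unvoiced):
    """Route voiced and unvoiced frames to separate enhancers, then restore order.

    frames: array of shape (n_frames, n_params), parameters of one utterance.
    voiced_flags: boolean array of shape (n_frames,), True for voiced frames
        (assumed to be given; how it is derived is outside this sketch).
    enhance_voiced / enhance_unvoiced: hypothetical stand-ins for the two
        collections of networks; each maps (k, n_params) -> (k, n_params).
    """
    frames = np.asarray(frames, dtype=float)
    voiced_flags = np.asarray(voiced_flags, dtype=bool)

    # Label every frame with a sequential number so the order can be restored.
    labels = np.arange(len(frames))
    voiced_idx = labels[voiced_flags]
    unvoiced_idx = labels[~voiced_flags]

    enhanced = np.empty_like(frames)
    if voiced_idx.size:
        enhanced[voiced_idx] = enhance_voiced(frames[voiced_idx])
    if unvoiced_idx.size:
        enhanced[unvoiced_idx] = enhance_unvoiced(frames[unvoiced_idx])
    return enhanced

# Toy usage with arithmetic placeholders instead of trained networks.
frames = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
flags = np.array([True, False, True])
result = discriminative_postfilter(frames, flags, lambda x: x + 1.0, lambda x: x * 0.5)
```

Writing the outputs back by sequential label keeps the enhanced utterance in its original frame order, whichever enhancer handled each frame.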


Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis

Coto-Jiménez

2021

Biomimetics

Self Cite


Statistical parametric speech synthesis based on Hidden Markov Models has been an important technique for the production of artificial voices, due to its ability to produce results with high intelligibility and sophisticated features such as voice conversion and accent modification with a small footprint, particularly for low-resource languages where deep learning-based techniques remain unexplored. Despite the progress, the quality of the results, mainly based on Hidden Markov Models (HMM), does not reach that of the predominant approaches, based on unit selection of speech segments or deep learning. One of the proposals to improve the quality of HMM-based speech has been incorporating postfiltering stages, which aim to increase the quality while preserving the advantages of the process. In this paper, we present a new approach to postfiltering synthesized voices with the application of discriminative postfilters, with several long short-term memory (LSTM) deep neural networks. Our motivation stems from modeling specific mapping from synthesized to natural speech on those segments corresponding to voiced or unvoiced sounds, due to the different qualities of those sounds and how HMM-based voices can present distinct degradation on each one. The paper analyses the discriminative postfilters obtained using five voices, evaluated using three objective measures, Mel cepstral distance and subjective tests. The results indicate the advantages of the discriminative postfilters in comparison with the HTS voice and the non-discriminative postfilters.



“…In applied contexts, the thematic divisions within artificial intelligence have had clear connections with electrical engineering, such as expert systems in robotics (Sanders, Graham-Jones and Gegov, 2010), artificial neural networks (Coto-Jiménez, 2019; Ekonomou, 2010) and evolutionary algorithms (Chan, Lee, Sudhoff and Zivi, 2008).…”

Section: On the Thematic Content to Consider

unclassified

Consideraciones para la incorporación de la Inteligencia Artificial en un programa de pregrado de Ingeniería Eléctrica

Coto-Jiménez

2021

Act. Inv. en Educ.

Self Cite


With the rapid proliferation of techniques and applications, artificial intelligence has acquired great relevance in various fields of society, including new ways of solving problems related to energy, signal, and information systems, which are studied within a degree program such as Electrical Engineering at the Universidad de Costa Rica. Given the relevance of the topic, some authors have pointed out the importance of spreading knowledge of artificial intelligence beyond the specialized laboratories and graduate programs where it has usually been developed. This essay aims to contribute to the discussion on the desirability and appropriate way of introducing artificial intelligence into the undergraduate curriculum of engineering programs, especially an electrical engineering program at the Universidad de Costa Rica, which promotes a general education. A proposal such as this must take into account the foundations that students already have, as well as the contents and strategies suitable for achieving an adequate introduction in their academic training and the benefit of the profession. The essay presents and discusses these considerations in light of the conceptualization of artificial intelligence and its current applicability in engineering.


A Performance Evaluation of Several Artificial Neural Networks for Mapping Speech Spectrum Parameters

Yeom-Song

Zeledón-Córdoba

Coto-Jiménez

2020

Communications in Computer and Information Science

