2011
DOI: 10.1016/j.specom.2011.05.001
Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency

Cited by 10 publications (5 citation statements)
References 24 publications
“…One interesting direction to investigate, in the context of human-animal interaction and sound design, is how to generate call sequences from human vocal imitation or sketching (e.g. analogously to [14], build a parallel database of animal vocalization and human imitations, and train models for both), as well as methods to embed human emotions in the synthetic vocalizations.…”
Section: Discussion
confidence: 99%
“…The reference sample was vocoded speech of the target speaker. Other detailed experimental conditions are found in [11]. The results are shown in Fig.…”
Section: Very Low Bit-rate Speech Coding Based on MSD-HMM with Qu…
confidence: 99%
“…A speaker-independent hidden Markov model (HMM)-based voice conversion technique was proposed by Nose and Kobayashi. The study used context-dependent prosodic symbols obtained through adaptive quantization of the fundamental frequency (F0) [63]. The input utterance of a source speaker was decoded into phonetic and prosodic symbol sequences, and the converted speech was generated from the decoded information.…”
Section: Published Work in the Year 2011
confidence: 99%
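The last statement describes decoding an utterance into discrete prosodic symbols via adaptive quantization of F0. The following is a minimal, hypothetical sketch of that idea, not the paper's exact scheme: the quantization range adapts to each utterance's own log-F0 statistics (mean ± 2 standard deviations is an assumption here), voiced frames map to one of `n_levels` symbols, and unvoiced frames (F0 = 0) map to a dedicated symbol `-1`.

```python
import numpy as np

def adaptive_f0_quantize(f0, n_levels=4):
    """Quantize per-frame F0 (Hz) into discrete prosodic symbols.

    Illustrative sketch only: the adaptive range (utterance-level
    log-F0 mean +/- 2 std) and the number of levels are assumptions,
    not the quantizer from Nose and Kobayashi's paper.
    """
    f0 = np.asarray(f0, dtype=float)
    voiced = f0 > 0
    logf0 = np.log(f0[voiced])
    # Adapt the quantization range to this utterance's own statistics.
    mu, sigma = logf0.mean(), logf0.std()
    lo, hi = mu - 2.0 * sigma, mu + 2.0 * sigma
    # Interior bin edges split [lo, hi] into n_levels equal bins.
    edges = np.linspace(lo, hi, n_levels + 1)[1:-1]
    # Unvoiced frames get symbol -1; voiced frames get 0..n_levels-1.
    symbols = np.full(f0.shape, -1, dtype=int)
    symbols[voiced] = np.digitize(logf0, edges)
    return symbols
```

Because the range is re-estimated per utterance, the same symbol sequence can represent comparable pitch contours from speakers with different absolute F0 ranges, which is what makes a speaker-independent symbol inventory plausible.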