Abstract: A new audio-visual speech synthesis approach based on Chinese visual triphones is proposed. The Chinese visual triphone model is constructed using a new clustering method that combines an artificial immune system with fuzzy c-means (FCM). In the analysis stage, visual triphone segments are selected from the video sequence according to the training phonetic transcription, and the corresponding lip feature vectors are extracted. In the synthesis stage, the Viterbi search algorithm is used to select the best visual triphone segments by finding a path which…
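The synthesis stage described in the abstract can be illustrated with a minimal Viterbi sketch. The abstract is truncated before it states what the selected path optimizes, so the criterion used here is an assumption: the sum of a per-segment target cost and a concatenation cost between neighbouring segments, as is common in unit-selection synthesis. The names `viterbi_select`, `target_cost`, and `concat_cost` are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def viterbi_select(
    candidates: Sequence[Sequence[object]],
    target_cost: Callable[[int, object], float],
    concat_cost: Callable[[object, object], float],
) -> Tuple[List[int], float]:
    """Pick one candidate segment per position so the total cost is minimal.

    candidates[t] holds the candidate segments for the t-th visual triphone.
    ASSUMPTION: total cost = sum of target costs + sum of concatenation costs;
    the paper's actual path criterion is not visible in the truncated abstract.
    Returns (chosen candidate index per position, total path cost).
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[j]: cheapest cost of any path ending at candidate j of the current position
    best = [target_cost(0, c) for c in candidates[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor of candidate j at position t

    for t in range(1, len(candidates)):
        new_best: List[float] = []
        pointers: List[int] = []
        for cand in candidates[t]:
            # Accumulated cost of reaching `cand` from each predecessor candidate
            costs = [
                best[i] + concat_cost(prev, cand)
                for i, prev in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min] + target_cost(t, cand))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final candidate to recover the full path
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total
```

As a usage illustration, each candidate could be a lip feature vector, with `concat_cost` measuring the distance between the features of adjacent segments so that the selected sequence changes smoothly. That choice of cost is likewise an assumption rather than the authors' stated method.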