Abstract: Segmentation is an important step in speech synthesis and recognition, and the results of all subsequent steps depend on it. Several effective segmentation methods exist in the literature, but for Vietnamese speech, researchers usually rely on experience to set the window length when using a sliding window. This often yields poor segmentation and forces them to retry with other values for the voice-segment length. In this paper, we propose a method that supports segmentation of Vietnamese speech and automatically determines suitable lengths for voice segments and silent pauses. We first estimate experimentally the minimum and average lengths of a voice segment and of a silent pause in Vietnamese speech at three main speaking rates (slow, normal, and fast). Then, based on these values, we segment voice and pause regions using a sliding window with the proposed algorithm. Experimental results show that the proposed method can effectively segment Vietnamese speech.
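
The abstract does not spell out the proposed algorithm, so the following is only an illustrative sketch of the general idea: a sliding window labels each frame as voice or pause, and runs shorter than a minimum length are merged into their neighbor. The short-time energy criterion, the function name `segment_speech`, and every threshold and duration value are assumptions chosen for illustration; they are not the values estimated in the paper.

```python
from typing import List, Sequence, Tuple


def segment_speech(
    samples: Sequence[float],
    sample_rate: int,
    window_ms: float = 10.0,         # assumed window length, not from the paper
    energy_threshold: float = 0.01,  # assumed energy cutoff, not from the paper
    min_voice_ms: float = 50.0,      # placeholder minimum voice length
    min_pause_ms: float = 30.0,      # placeholder minimum pause length
) -> List[Tuple[str, int, int]]:
    """Return (label, start_sample, end_sample) segments, label in {"voice", "pause"}."""
    win = max(1, int(sample_rate * window_ms / 1000))

    # 1. Slide a non-overlapping window and label each frame by mean energy.
    labels = []
    for start in range(0, len(samples), win):
        frame = samples[start:start + win]
        energy = sum(x * x for x in frame) / len(frame)
        labels.append("voice" if energy >= energy_threshold else "pause")

    # 2. Collapse consecutive frames with the same label into runs
    #    stored as [label, first_frame, end_frame_exclusive].
    runs = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1][2] = i + 1
        else:
            runs.append([label, i, i + 1])

    # 3. Absorb runs shorter than the minimum length into the previous run,
    #    then re-join neighbors that end up with the same label.
    min_ms = {"voice": min_voice_ms, "pause": min_pause_ms}
    merged = []
    for label, first, end in runs:
        duration_ms = (end - first) * win * 1000.0 / sample_rate
        too_short = duration_ms < min_ms[label]
        if merged and (too_short or merged[-1][0] == label):
            merged[-1][2] = end
        else:
            merged.append([label, first, end])

    # Convert frame indices back to sample indices.
    return [(label, first * win, min(end * win, len(samples)))
            for label, first, end in merged]


if __name__ == "__main__":
    # Synthetic signal at 1 kHz: 100 ms voice, 10 ms dip, 100 ms voice, 100 ms pause.
    signal = [0.5] * 100 + [0.0] * 10 + [0.5] * 100 + [0.0] * 100
    for segment in segment_speech(signal, sample_rate=1000):
        print(segment)
```

In this sketch the 10 ms dip is shorter than the assumed minimum pause, so it is absorbed into the surrounding voice segment. Replacing the placeholder minimums with per-speaking-rate values measured from data, as the abstract describes, would remove the need to tune them by hand.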