This paper studies an automatic translation method that converts text in one language (L1) into speech in an unwritten language (L2). Conventionally, written text serves as the bridge between a translation module, which translates L1 text into L2 text, and a synthesis module, which generates L2 speech from that text. For an unwritten language, an intermediate representation must be used in place of the written form of L2. This paper proposes a phoneme representation for this purpose because of the close relationship between phonemes and speech within a language. The proposed method was applied to the Viet-Muong language pair: Vietnamese text is translated into two Muong dialects, Muong Bi (Hoa Binh) and Muong Tan Son (Phu Tho), both unwritten. The paper also proposes a phoneme set for each Muong dialect and applies these sets to the task. The evaluation showed that translation quality was relatively high in both dialects (for Muong Bi, the fluency score was 4.63/5.0 and the adequacy score was 4.56/5.0). The quality of the synthesized speech in both dialects was acceptable (for Muong Bi, the MOS was 4.47/5.0 and the comprehension score was 93.55%). The results also suggest that applying the proposed system to other unwritten languages is promising.
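The two-stage architecture described above can be illustrated with a minimal sketch: a translation stage maps L1 text to an L2 phoneme sequence, and a synthesis stage maps that sequence to audio. Everything named here is an assumption made for illustration and is not taken from the paper: the function names, the toy lexicon, the placeholder phoneme labels (`PH_A`, `PH_B`, `PH_C`), and the stand-in synthesizer that returns bytes instead of a waveform. The paper's actual translation and synthesis models are not specified in this abstract.

```python
from typing import Callable, Dict, List

Phonemes = List[str]


def make_pipeline(
    translate: Callable[[str], Phonemes],
    synthesize: Callable[[Phonemes], bytes],
) -> Callable[[str], bytes]:
    """Compose text-to-phoneme translation with phoneme-to-speech synthesis."""

    def text_to_speech(l1_text: str) -> bytes:
        # The phoneme sequence is the intermediate representation that
        # replaces the (non-existent) written form of L2.
        phonemes = translate(l1_text)
        return synthesize(phonemes)

    return text_to_speech


# Hypothetical toy lexicon: L1 words mapped to placeholder L2 phoneme labels.
# These are NOT real Muong phonemes.
TOY_LEXICON: Dict[str, Phonemes] = {
    "hello": ["PH_A", "PH_B"],
    "world": ["PH_C"],
}


def toy_translate(l1_text: str) -> Phonemes:
    """Word-by-word lookup standing in for a real translation model."""
    phonemes: Phonemes = []
    for word in l1_text.lower().split():
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes


def toy_synthesize(phonemes: Phonemes) -> bytes:
    """Stand-in for a synthesizer: encodes the labels instead of producing audio."""
    return " ".join(phonemes).encode("utf-8")


pipeline = make_pipeline(toy_translate, toy_synthesize)
result = pipeline("hello world")
```

Keeping the two stages behind plain callables means either one could be swapped out (for example, a different phoneme set per dialect) without changing the other, which mirrors the modular design the abstract describes.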
University of Mining and Geology; 2 MICA International Research Institute, Hanoi University of Science and Technology; 3 Department of Phonetics, Institute of Linguistics; 4 Office of National Key Programs, Ministry of Science and Technology