Interspeech 2019
DOI: 10.21437/interspeech.2019-1365

Towards Language-Universal Mandarin-English Speech Recognition

Abstract: Multilingual and code-switching speech recognition are two challenging tasks that are studied separately in many previous works. In this work, we jointly study multilingual and code-switching problems, and present a language-universal bilingual system for Mandarin-English speech recognition. Specifically, we propose a novel bilingual acoustic model, which consists of two monolingual system initialized subnets and a shared output layer corresponding to the Character-Subword acoustic modeling units. The bilingual…

Cited by 12 publications (10 citation statements)
References: 29 publications
“…Similarly, a phoneme-based modeling unit was also studied to achieve the multilingual ASR task [44]. A network sharing approach was also developed to recognize the Chinese and English languages [45]. A multi-task learning mechanism was proposed to obtain an end-to-end multilingual task in [46].…”
Section: A. ASR Related Work (mentioning)
confidence: 99%
“…Their results showed that bytes are superior to grapheme characters over a wide variety of languages in monolingual end-to-end speech recognition. Characters are the most commonly used modeling unit for end-to-end ASR in Mandarin Chinese; sub-words have also been employed [45].…”
Section: Modeling Units in Mandarin ASR (mentioning)
confidence: 99%
“…Compared to monolingual ASR with plenty of monolingual data, CS ASR is limited by hard-to-collect speech and transcriptions, especially in the era of deep learning. Therefore, reducing the demand for CS data and making full use of rich resources monolingual data have become research hotspots [13,14,15,16,17,18]. Dual-encoder structure is an effective way to make full use of two monolingual data [15,16,17,18].…”
Section: Introduction (mentioning)
confidence: 99%