Bilingual acoustic modeling with state mapping and three-stage adaptation for transcribing unbalanced code-mixed lectures

Yeh, Ching-Feng; Sun, Liang-Che; Huang, Chih-Hsin; Lee, Lin-Shan

doi:10.1109/icassp.2011.5947484

Cited by 8 publications

(2 citation statements)

References 9 publications

“…There is comparatively less work in the literature on automated analysis of code-switched speech, partially due to the relative lack of structured corpora (as compared to those for text-based work) and also potentially because it also poses yet another significant challenge in the form of speech recognition for multiple languages. Nonetheless, some researchers have made strong strides in spoken corpus development to support such research in certain language pairs, for instance, Mandarin-English [21,22], Cantonese-English [23] and Hindi-English [24], which have in turn led to developments in automatic speech recognition [25,26] and language modeling [27]. However, these are limited; there remains a need for more code-switched speech resources in these and other languages to spur research into the automated processing and analysis of such data.…”

Section: Introduction (mentioning)

confidence: 99%

Jee haan, I’d like both, por favor: Elicitation of a Code-Switched Corpus of Hindi–English and Spanish–English Human–Machine Dialog

Ramanarayanan

Suendermann-Oeft

2017

Interspeech 2017

We present a database of code-switched conversational human-machine dialog in English-Hindi and English-Spanish. We leveraged HALEF, an open-source standards-compliant cloud-based dialog system to capture audio and video of bilingual crowd workers as they interacted with the system. We designed conversational items with intra-sentential code-switched machine prompts, and examine its efficacy in eliciting code-switched speech in a total of over 700 dialogs. We analyze various characteristics of the code-switched corpus and discuss some considerations that should be taken into account while collecting and processing such data. Such a database can be leveraged for a wide range of potential applications, including automated processing, recognition and understanding of code-switched speech and language learning applications for new language learners.

“…Usually the similarity and differences between the phonemes and the unique characteristics for some phonemes are difficult to measure quantitatively. Many approaches have been proposed to merge acoustic units on different levels to handle these problems [3][4][5][6][7]. In general, bilingual speech can be classified into two categories.…”

Section: Introduction (mentioning)

confidence: 99%

Minimum Phone Error model training on merged acoustic units for transcribing bilingual code-switched speech

Yeh

Lin

Lee

2012

2012 8th International Symposium on Chinese Spoken Language Processing

Self Cite

This paper proposes to perform Minimum Phone Error (MPE) model training on merged acoustic units for transcribing Mandarin-English code-switched lectures with highly imbalanced language distribution. Some of the acoustic events in Mandarin and English may have very similar characteristics, so the states or Gaussian mixtures representing them can be merged with identical shared parameters. When MPE is performed afterwards, these merged identical states or Gaussian mixtures can form a compact acoustic unit set. In this way MPE can better discriminate the acoustic units of both languages, because similar units are merged while distinct units are differentiated. Significant improvements in recognition accuracy were observed in the preliminary experiments on a real-world bilingual code-switched lecture corpus recorded at National Taiwan University.

Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity

Lee

Chou

Lee

2014

Computer Speech & Language
