2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2017
DOI: 10.1109/asru.2017.8268921

Language diarization for semi-supervised bilingual acoustic model training

Abstract: In this paper, we investigate several automatic transcription schemes for using raw bilingual broadcast news data in semi-supervised bilingual acoustic model training. Specifically, we compare the transcription quality provided by a bilingual ASR system with another system performing language diarization at the front-end followed by two monolingual ASR systems chosen based on the assigned language label. Our research focuses on the Frisian-Dutch code-switching (CS) speech that is extracted from the archives of…

Cited by 21 publications (16 citation statements)
References: 26 publications
“…The acoustic data augmentation relies on available monolingual acoustic resources from the high-resourced mixed language (Dutch). Using more monolingual Dutch speech for acoustic model training has found to be effective in improving the general ASR performance, only after increasing the in-domain CS data applying the semi-supervised techniques described in [17,21].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Then, we extract word-level segmentation files for each LM weight. By comparing these alignments with the ground truth word-level alignments, a time-based CS detection accuracy metric is calculated [19]. CS detection accuracy is evaluated by reporting the equal error rates (EER) calculated based on the detection error tradeoff (DET) graph [39] plotted for visualizing the CS detection performance.…”
Section: Recognition and CS Detection Experiments (citation type: mentioning)
confidence: 99%
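
The statement above reports CS detection performance as an equal error rate (EER) read off a detection error tradeoff (DET) curve. As a rough illustration of that metric only, and not the cited authors' implementation, the sketch below sweeps a decision threshold over detector scores, computes the miss and false-alarm rates at each threshold (the points a DET plot would show), and takes the EER where the two rates are closest. The function names, the toy scores, and the idea of treating each scored unit as one trial are all assumptions made for this example.

```python
def det_points(scores, labels):
    """Return (threshold, miss_rate, false_alarm_rate) per candidate threshold.

    scores: detector outputs, higher means "more likely target" (hypothetical).
    labels: 1 for target trials, 0 for non-target trials.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")
    points = []
    for thr in sorted(set(scores)):
        # A trial is accepted when its score is >= the threshold.
        misses = sum(1 for s, y in zip(scores, labels) if y == 1 and s < thr)
        false_alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= thr)
        points.append((thr, misses / n_target, false_alarms / n_nontarget))
    return points


def equal_error_rate(scores, labels):
    """Approximate the EER as the mean of the miss and false-alarm rates
    at the threshold where the two rates are closest."""
    _, miss, fa = min(det_points(scores, labels), key=lambda p: abs(p[1] - p[2]))
    return (miss + fa) / 2


# Toy data, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
eer = equal_error_rate(scores, labels)
print(f"EER = {eer:.2f}")
```

With real data the two rates rarely cross exactly at an observed threshold, so toolkits typically interpolate between neighbouring DET points; the nearest-point approximation here is a simplification.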
“…Code-switching can be broadly divided into two groups [5]: inter-sentential switching -the alternation is between sentences (also called extrasentential), and intra-sentential switching -the alternation is within sentences (it can also include intra-word). With the rapidly growing of bilingual/multilingual population, CS is no longer a phenomenon relevant for minority languages, which are affected by majority languages, but it also concerns majority languages influenced by lingua francas, such as English and French, as properly pointed out in [6]. Despite the recent significant advances witnessed in the field of automatic speech recognition (ASR) [7], ASR systems have unfortunately still limited capability in tackling the code-switching problem, especially intra-sentential switching.…”
Section: Introduction (citation type: mentioning)
confidence: 99%