2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
DOI: 10.1109/asru.2015.7404803

Multilingual representations for low resource speech recognition and keyword search

Abstract: This paper examines the impact of multilingual (ML) acoustic representations on Automatic Speech Recognition (ASR) and keyword search (KWS) for low resource languages in the context of the OpenKWS15 evaluation of the IARPA Babel program. The task is to develop Swahili ASR and KWS systems within two weeks using as little as 3 hours of transcribed data. Multilingual acoustic representations proved to be crucial for building these systems under strict time constraints. The paper discusses several key insights on …

Cited by 79 publications (48 citation statements) | References 38 publications
“…One way to re-use information extracted from other multilingual corpora is to use multilingual bottleneck features (BNFs), which have been shown to perform well in conventional ASR as well as in intrinsic evaluations [20, 21, 27-30]. These features are typically obtained by training a deep neural network jointly on several languages for which labelled data is available.…”
Section: Bottleneck Features
confidence: 99%
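The citation above summarizes how multilingual bottleneck features are produced: a DNN is trained jointly on several source languages, and the activations of a narrow hidden layer are reused as features. The PyTorch sketch below is a minimal illustration of that scheme, not the paper's exact architecture; the layer sizes, language names, and output-layer sizes are assumptions chosen for illustration.

```python
# Minimal sketch of a multilingual bottleneck-feature DNN: shared hidden layers,
# a low-dimensional bottleneck layer, and one softmax head per source language.
# All sizes and language names below are illustrative assumptions.
import torch
import torch.nn as nn


class MultilingualBottleneckDNN(nn.Module):
    def __init__(self, feat_dim, hidden_dim, bottleneck_dim, targets_per_lang):
        super().__init__()
        # Shared trunk, trained on pooled data from all source languages.
        self.shared = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, bottleneck_dim),  # bottleneck layer
        )
        # Language-specific output layers (e.g. over context-dependent states).
        self.heads = nn.ModuleDict({
            lang: nn.Linear(bottleneck_dim, n) for lang, n in targets_per_lang.items()
        })

    def forward(self, x, lang):
        bnf = self.shared(x)              # bottleneck features
        return self.heads[lang](bnf), bnf


# Joint training: each minibatch is drawn from one source language.
targets = {"cantonese": 3000, "turkish": 2800, "pashto": 2500}  # assumed sizes
model = MultilingualBottleneckDNN(40, 1024, 80, targets)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

feats = torch.randn(256, 40)                            # placeholder acoustic frames
labels = torch.randint(0, targets["turkish"], (256,))   # placeholder senone labels
logits, _ = model(feats, "turkish")
loss_fn(logits, labels).backward()
opt.step()

# For the low-resource target language, the heads are discarded and the shared
# trunk is kept as a fixed bottleneck-feature extractor.
with torch.no_grad():
    target_bnf = model.shared(torch.randn(10, 40))
```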
“…Moreover, they significantly simplify infrastructure by supporting n languages with a single speech model rather than n individual models. Successful strategies for building multilingual acoustic models (AMs) include stacked bottleneck features [1][2][3][4], shared hidden layers [5,6], knowledge distillation [7], and multitask learning [8]. Building multilingual language models (LMs) has also been attempted recently (e.g.…”
Section: Introduction
confidence: 99%
“…There are two different ways to use the transcribed data from other source languages. One way is to build a hybrid multilingual deep neural network (MDNN) and then conduct cross-lingual transfer to the target language [15], [17], [18], [21], [22]. Another way is to build a bottleneck feature (BNF) extractor.…”
Section: Introduction
confidence: 99%
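The earlier sketch covers the BNF-extractor route; the hedged sketch below illustrates the other route mentioned in this citation, cross-lingual transfer of a hybrid multilingual DNN. The layer sizes, the senone count, and the checkpoint file name are assumptions, not taken from the paper.

```python
# Minimal sketch of cross-lingual transfer: reuse the shared hidden layers of a
# jointly trained multilingual DNN, attach a fresh output layer for the target
# language's senone set, and train on the small amount of transcribed target data.
import torch
import torch.nn as nn

N_TARGET_SENONES = 2000  # assumed target-language output size

# Stand-in for the multilingual model's shared hidden layers; in practice these
# weights would be loaded from the jointly trained model, e.g.
# shared.load_state_dict(torch.load("multilingual_shared.pt"))  # hypothetical file
shared = nn.Sequential(
    nn.Linear(40, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
)
target_model = nn.Sequential(shared, nn.Linear(1024, N_TARGET_SENONES))

# Freeze the transferred layers and train only the new output layer
# (alternatively, fine-tune everything with a small learning rate).
for p in shared.parameters():
    p.requires_grad = False
opt = torch.optim.SGD(
    [p for p in target_model.parameters() if p.requires_grad], lr=0.01)
loss_fn = nn.CrossEntropyLoss()

feats = torch.randn(256, 40)                           # placeholder target-language frames
labels = torch.randint(0, N_TARGET_SENONES, (256,))    # placeholder senone labels
loss_fn(target_model(feats), labels).backward()
opt.step()
```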