Interspeech 2022
DOI: 10.21437/interspeech.2022-10916

Investigating the Impact of Crosslingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition

Abstract: Multilingual automatic speech recognition (ASR) systems mostly benefit low-resource languages but suffer degradation in performance across several languages relative to their monolingual counterparts. Limited studies have focused on understanding language behaviour in multilingual speech recognition setups. In this paper, a novel data-driven approach is proposed to investigate the cross-lingual acoustic-phonetic similarities. This technique measures the similarities between posterior distributions fro…

Cited by 3 publications (3 citation statements)
References 19 publications

“…The weights are assigned to the fusing languages on the basis of similarity of the source and the target language. The study on cross-lingual acoustic-phonetic similarities using the same mapping network approach observes that the entropy of a <source, target> mapping network shows the language similarities [28]. The same similarity measure is used along with mapping network accuracy to assign the weights.…”
Section: Acoustic Model Fusion (mentioning)
confidence: 99%
“…The number of shared phonemes is not a reliable metric to measure language similarities and each participating language in a multilingual system has a different similarity with the target language. Even the balanced language data sampling can cause degradation or improvement due to internal acoustic-phonetic unbalancing [28]. It demands very controlled language mixing for a target language ASR.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, we have proposed a technique to learn crosslingual acoustic-phonetic similarities on phoneme level [23] which has been used for multilingual and cross-lingual acoustic model fusion [24]. A model is trained to learn mappings from a source language ASR output posterior distributions to that of the target language ASR.…”
Section: Introductionmentioning
confidence: 99%