Interspeech 2018
DOI: 10.21437/interspeech.2018-1584

An Exploration towards Joint Acoustic Modeling for Indian Languages: IIIT-H Submission for Low Resource Speech Recognition Challenge for Indian Languages, INTERSPEECH 2018

Abstract: India being a multilingual society, a multilingual automatic speech recognition (ASR) system is widely appreciated. Despite different orthographies, Indian languages share the same phonetic space. To exploit this property, a joint acoustic model has been trained for developing a multilingual ASR system using a common phone-set. Three Indian languages, namely Telugu, Tamil, and Gujarati, are considered for the study. This work studies the amenability of two different acoustic modeling approaches for training a joint ac…

Cited by 17 publications (5 citation statements)
References 26 publications
“…The data sets are different. [52], [53], [49], and [55] used isolated Gujarati words, [54] used 25-word sentences, [56] did not limit to number of words in the sentences, [51], [57] used continuous speech of three Indian languages, and [50] used continuous speech of 9 Indian languages. The table highlights the accuracy achieved with Gujarati language.…”
Section: Mathematical Evaluation Of Results (mentioning)
confidence: 99%
“…There is no historical evidence of Cocktail-party scene with Gujarati language [47][48][49][50][51][52][53][54][55][56][57]. For ASR in Gujarati, methods like Statistical, Neural Networks and End-to-end recognition are used [35].…”
Section: Introduction (mentioning)
confidence: 99%
“…Comparison of word error rates % (WER) for different approaches and different multi-stream approaches evaluated on target-language (Mandarin) test set. The lattice-combination approach is described in [12], Feature-combination approach is used in [14,34]. Considering all the results above, all the three feature fusion approaches perform better than the baseline (see Table 6).…”
Section: Combination Layer Configuration WER [%] (mentioning)
confidence: 90%
“…In those experiments, both stops and nasals attributes were correctly detected, which can prove that the speech attribute can be used in cross-lingual speech recognition in English and Mandarin. There are few studies on multilingual speech recognition integrating AFs; Hari Krishna et al, trained a bank of AFs detectors using source language to predict the articulatory features for the target languages, which showed that the combination of AFs using AF-Tandem method performs better than the lattice-rescoring approach [14].…”
Section: Related Work (mentioning)
confidence: 99%
“…The 'Low Resource Speech Recognition Challenge for Indian Languages -Interspeech 2018' included 40 hours of speech corpora in Telugu, Tamil and Gujarati languages. Multilingual training was adapted wherein the acoustic model was trained in all three languages leading to an improvement of approximately 5-8% in WER [35,36,37,38,39]. However, these methods above reduce recognition errors in words already present in the ASR's lexicon.…”
Section: Data Augmentation In ASR (mentioning)
confidence: 99%