2019
DOI: 10.1016/j.specom.2019.04.007

IITG-HingCoS corpus: A Hinglish code-switching database for automatic speech recognition

Cited by 16 publications
(11 citation statements)
References 10 publications
“…According to [7], [20], [44], unlike the monolingual case, the salient challenges posed by the code-switching phenomenon are (i) influence of native language on the pronunciation of nonnative language words within an utterance, (ii) requirement of expert linguistic knowledge and dedicated tools to handle the involved languages and (iii) lack of publicly available domain-specific resources. For augmenting resources in the Indian context, we recently created the HingCoS corpus [28] containing Hindi-English code-switching text and speech data. A few example sentences from the said corpus along with their respective English translations are given in Table 1.…”
Section: A Review Of Code-switching Phenomenon (mentioning)
confidence: 99%
“…Recently, in the context of the multilingual ASR task [45], the authors successfully used the union of phone sets of the underlying languages as targets to the E2E ASR system instead of the combined character set. Motivated by that, in an earlier work [28], we had defined a common phone set having 62 labels that cover both Hindi and English languages. In the same work, that phone set was also explored for developing a hybrid Hindi-English code-switching ASR system.…”
Section: B Reduced Target Set Modeling (mentioning)
confidence: 99%