Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2 (2017)
DOI: 10.18653/v1/e17-2052

Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data

Abstract: In this paper, we propose efficient and less resource-intensive strategies for parsing of code-mixed data. These strategies are not constrained by in-domain annotations, rather they leverage pre-existing monolingual annotated resources for training. We show that these methods can produce significantly better results as compared to an informed baseline. Besides, we also present a data set of 450 Hindi and English code-mixed tweets of Hindi multilingual speakers for evaluation. The data set is manually annotated…

Cited by 25 publications (28 citation statements)
References 18 publications
“…We use the Universal Dependencies' Hindi-English codemixed data set (Bhat et al, 2017) to test the model's ability to label code-mixed data. This dataset is based on code-switching tweets of Hindi and English multilingual speakers.…”
Section: Codemixed Input
Citation type: mentioning
confidence: 99%
“…However, while some classes of dependency structures tolerating certain crossings have a very good empirical coverage [31, 42–44], these proposals still face counterexamples that fall outside the restrictions [45–47].…”
Section: A Minimization Of Crossings
Citation type: mentioning
confidence: 99%
“…The Hindi-English Code switching treebank is based on CS tweets of Hindi and English multilingual speakers (mostly Indian) (Bhat et al, 2017). The treebank is manually annotated using UD scheme.…”
Section: Hin-eng
Citation type: mentioning
confidence: 99%