Proceedings of the Second Workshop on Computational Approaches to Code Switching 2016
DOI: 10.18653/v1/w16-5811

Part-of-speech Tagging of Code-Mixed Social Media Text

Abstract: A common step in the processing of any text is the part-of-speech tagging of the input text. In this paper, we present an approach to tackle code-mixed text from three different languages, Bengali, Hindi, and Tamil, apart from English. Our system uses a Conditional Random Field, a sequence learning method that is useful for capturing patterns in sequences containing code switching, to tag each word with accurate part-of-speech information. We have used various pre-processing and post-processing modules to improve the…
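To make the abstract's setup concrete, here is a minimal sketch of the kind of per-token feature extraction that typically feeds a Conditional Random Field tagger on code-mixed text. Everything in it is an assumption for illustration: the function names (script_of, token_features, sentence_features), the feature set, and the example sentence are hypothetical and are not taken from the paper, whose actual features and pre-processing modules are not described in the text above.

```python
from typing import Dict, List


def script_of(token: str) -> str:
    """Guess a coarse script label from Unicode block ranges (illustrative only)."""
    for ch in token:
        cp = ord(ch)
        if 0x0900 <= cp <= 0x097F:
            return "devanagari"
        if 0x0980 <= cp <= 0x09FF:
            return "bengali"
        if 0x0B80 <= cp <= 0x0BFF:
            return "tamil"
    # No Indic character found: call it Latin if it has letters, else "other".
    return "latin" if any(ch.isalpha() for ch in token) else "other"


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for tokens[i] (hypothetical feature set)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "suffix3": tok[-3:].lower(),
        "is_hashtag": tok.startswith("#"),
        "is_mention": tok.startswith("@"),
        "has_digit": any(ch.isdigit() for ch in tok),
        "script": script_of(tok),
        # Neighbouring words give the sequence model context around switch points.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """One feature dictionary per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Made-up Hindi-English example sentence, not from the paper's data.
    example = ["@ravi", "yeh", "movie", "बहुत", "awesome", "thi", "#fun"]
    for tok, feats in zip(example, sentence_features(example)):
        print(tok, feats["script"], feats["prev"], feats["next"])
```

A sequence of such dictionaries, paired with a sequence of gold tags, is the usual input shape for CRF toolkits; the training step itself is omitted here because the toolkit and its settings are not stated in the abstract.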

Citations: cited by 27 publications (15 citation statements)
References: 22 publications
“…Our choice of tasks is primarily motivated by the availability of annotated CM data. There has been prior work on CM sentiment identification (Vilares and Alonso, 2016; Rudra et al, 2016; and POS tagging (Solorio and Liu, 2008; AlGhamdi et al, 2016; Ghosh et al, 2016). But we are not aware of any work that utilizes pre-trained bilingual embeddings for these tasks.…”
Section: Discussion (mentioning)
confidence: 99%
“…The work on studying the types of conceptual structures and their language correspondences started within the framework of exploring artificial intelligence [6,7]. The result of this work was the model of "conceptual dependences".…”
Section: Literature Review and Problem Statement (mentioning)
confidence: 99%
“…As is typically the case in NLP, such pipelines suffer from the problem of cascading errors; e.g., failures of the language identification will cause problems in the tag prediction (Barman et al, 2016). Other approaches have trained supervised models on POS-annotated, code-switched data (Jamatia et al, 2015;Ghosh et al, 2016;Gupta et al, 2017;Barman et al, 2016;Sequiera et al, 2015, inter alia), resources which are expensive to create and unavailable for most language pairs.…”
Section: Introduction (mentioning)
confidence: 99%