Proceedings of the Seventh Named Entities Workshop 2018
DOI: 10.18653/v1/w18-2405

Named Entity Recognition for Hindi-English Code-Mixed Social Media Text

Abstract: Named Entity Recognition (NER) is a major task in the field of Natural Language Processing (NLP), and also is a subtask of Information Extraction. The challenge of NER for tweets lies in the insufficient information available in a tweet. There has been a significant amount of work done related to entity extraction, but only for resource-rich languages and domains such as the newswire. Entity extraction is, in general, a challenging task for such an informal text, and code-mixed text further complicates the pro…

Citations: cited by 51 publications (23 citation statements)
References: 16 publications
“…They used CRF as classifier for their NER task. Singh et al presented a named entity recognition system for Hindi-English codemixed social media text (twitter) using word, character and lexical features [53]. Sabty et al proposed a NER system for identifying NEs from Arabic-English Code-Mixed Data [54].…”
Section: Related Work (mentioning)
confidence: 99%
“…Gupta et al (2014) introduced the concept of Mixed-Script Information Retrieval and the problems posed by transliterated content such as spelling variations etc. There has been a surge of data set creation for code-mixed data (Bhat et al, 2017; and application based tools such as question classification (Raghavi et al, 2015), named-entity recognition (Singh et al, 2018), sentiment analysis (Prabhu et al, 2016; Ghosh et al, 2017) and so on. We built our corpus on syntactic information obtained from dependency labels.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…Bhargava et al (2016) proposed an algorithm which uses a hybrid approach of a dictionary cum supervised classification approach for identifying entities in Code Mixed Text of Indian Languages such as Hindi-English and Tamil-English. Nelakuditi et al (2016) reported work on annotating code mixed English-Telugu data collected from social media site Facebook and creating automatic POS Taggers for this corpus, Singh et al (2018a) presented an exploration of automatic NER of Hindi-English code-mixed data, Singh et al (2018b) presented a corpus for NER in Hindi-English Code-Mixed along with experiments on their machine learning models. To the best of our knowledge the corpus we created is the first Telugu-English code-mixed corpus with named entity tags.…”
Section: T1 (mentioning)
confidence: 99%