2017
DOI: 10.5120/ijca2017913621

Survey of Named Entity Recognition Techniques for Various Indian Regional Languages

Abstract: Named entity recognition (NER) is the task of identifying entities that are proper nouns and classifying them into their appropriate pre-defined classes, also called tags. NER is also known as entity chunking, entity identification, and entity extraction. It is a subtask of information extraction, in which structured text is extracted from unstructured text. Popular applications of NER include machine translation, text mining, data classification, and question answering systems. This paper p…
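To illustrate what "identifying entities and classifying them into pre-defined classes" can look like in practice, here is a minimal sketch of the chunking step. It assumes token-level tags in the common BIO scheme (B- begins an entity, I- continues it, O is outside). The BIO scheme, the function name `bio_to_spans`, and the `PER`/`LOC` labels are assumptions for illustration; they are not taken from the paper.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, class) pairs.

    Hypothetical helper: an I- tag that does not continue an open entity
    of the same class is treated as outside any entity.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    def flush() -> None:
        # Emit the entity collected so far, if any.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            flush()
            current_tokens = []
            current_label = None
    flush()
    return spans


tokens = ["Sachin", "Tendulkar", "was", "born", "in", "Mumbai"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
```

In a real system the tags would come from a trained or rule-based tagger; here they are written by hand to keep the sketch self-contained.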

Citations: cited by 11 publications (5 citation statements)
References: 8 publications
“…But unstructured big data bring more challenges to these techniques such as scalability, automatic semantic labeling, selection of appropriate techniques for the task and requirements of user, data annotation [25,30,34,35,57,100,111]. Hence, the emergence of advanced learning-based approaches with rule-based will improve the performance of IE systems for the huge volume and variety of big data. Optimal feature extraction and selection: Feature extraction and transformation from unstructured data are more critical for data analysis as compared to structured data due to the heterogeneity and multidimensionality of unstructured documents. Features like bag-of-words, orthographic features, lexical features, and gazetteer-related features can be extracted from the text for learning-based approaches [130] that improves the data analysis process.…”
Section: Results (mentioning)
confidence: 99%
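The feature types named in the statement above (bag-of-words, orthographic, lexical, and gazetteer-related features) can be sketched as follows. This is a minimal illustration only: the function names, the feature keys, and the tiny `GAZETTEER` set are hypothetical and do not come from the cited papers.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical gazetteer: a small lookup list of known place names.
GAZETTEER: Set[str] = {"mumbai", "pune", "delhi"}


def token_features(tokens: List[str], i: int, gazetteer: Set[str]) -> Dict[str, object]:
    """Build a feature dictionary for the token at position i."""
    word = tokens[i]
    return {
        # Lexical features: the word itself and its neighbours.
        "word.lower": word.lower(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        # Orthographic features: surface shape of the word.
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "prefix2": word[:2],
        "suffix2": word[-2:],
        # Gazetteer feature: membership in a known-entity list.
        "in_gazetteer": word.lower() in gazetteer,
    }


def bag_of_words(tokens: List[str]) -> Dict[str, int]:
    """Count case-folded token occurrences, ignoring word order."""
    return dict(Counter(t.lower() for t in tokens))


sentence = ["Visit", "Mumbai", "in", "2017"]
print(token_features(sentence, 1, GAZETTEER))
print(bag_of_words(sentence))
```

Dictionaries like these are a typical input for learning-based taggers. Note that the capitalization feature is only informative for scripts that have letter case; scripts such as Devanagari do not, so other cues would carry more weight there.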
“…But unstructured big data bring more challenges to these techniques such as scalability, automatic semantic labeling, selection of appropriate techniques for the task and requirements of user, data annotation [25,30,34,35,57,100,111]. Hence, the emergence of advanced learning-based approaches with rule-based will improve the performance of IE systems for the huge volume and variety of big data.…”
Section: Limitations of Existing IE Techniques for Unstructured Data (mentioning)
confidence: 99%
“…Named entity recognition (NER) is a fundamental task of information extraction, which seeks to discover elements in a text and assign them to predefined categories (Kale and Govilkar, 2017). Abundant articles have studied the Chinese NER models in general domains which can only retrieve common information such as organizations, persons and addresses (Liu et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…The work is very mature and the functionality comes out of the box with NLP libraries like NLTK [5] and spacy [10]. In contrast, limited work is done in the Indic languages like Hindi and Marathi [14]. [25] addresses the problems faced by Indian languages like the presence of abbreviations, ambiguities in named entity categories, different dialects, spelling variations and the presence of foreign words.…”
Section: Introduction (mentioning)
confidence: 99%