Proceedings - Natural Language Processing in a Deep Learning World 2019
DOI: 10.26615/978-954-452-056-4_122

Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names

Abstract: In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian newspaper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annotation, which were in turn used to train two Named Entity Recognition (NER) systems: Stanford and spaCy. All obtained models, together with the rule- and lexicon-based system, were evaluated on two sample texts: a part of the gold standard and…

Citations: cited by 6 publications (4 citation statements)
References: 12 publications
“…The results of application of the methods and the tools used to label the terms in Serbian [24,25] are presented below, as well as the methods of supervised multi-class classification performed by the tool Weka 3.8.4. These methods were applied to our primary annotated dataset expanded with additional attributes.…”
Section: Results
type: mentioning
confidence: 99%
“…A bottom-up approach of natural language processing based on taggers and machine learning methods applied to texts in Serbian is shown in papers [24,25]. Taggers can be used to classify terms into groups with different tags.…”
Section: Related Work
type: mentioning
confidence: 99%
“…Also of great importance to our problem is the research carried out by [11], then [10], as well as [12], which addressed similar problems in the domains of their own languages. Of particular importance are the studies by [13][8], since their authors dealt with named entity recognition in Serbian. In [6], the authors present the HuggingFace library and platform, whose main advantage is the ability to share trained models and datasets with the wider public.…”
Section: Previous Solutions (Prethodna Rešenja)
type: unclassified
“…Furthermore, textual corpora with over 20 million tokens have been collected and processed in order to train language models that can be used as a basis for grammatical and semantic error detection and correction in text in Serbian [11]. Significant work has been conducted in the field of NLP on the Faculty of Philology in Belgrade, among which the most recent research was related to NER [12] and diacritization of text in Serbian [13]. However, the tools and language resources have not been open for research or application in industry.…”
Section: Introduction
type: mentioning
confidence: 99%