2013
DOI: 10.15398/jlm.v1i2.63

Populating a multilingual ontology of proper names from open sources

Abstract: Even if proper names play a central role in natural language processing (NLP) applications, they are still under-represented in lexicons, annotated corpora, and other resources dedicated to text processing. One of the main challenges is both the prevalence and the dynamicity of proper names. At the same time, large and regularly-updated knowledge sources containing partially-structured data, such as Wikipedia or GeoNames, are publicly available and contain large numbers of proper names. We present a method for…

Cited by 4 publications (3 citation statements)
References 15 publications
“…The second version of Prolexbase has been supported by the Hubert Curien Polonium project which brought a good coverage for Polish and English Savary et al (2013). The Serbian part has been significantly improved in the third version of Prolexbase, as a result of a one month visit of Professor Cvetana Krstev which was sponsored by The University of Tours.…”
Section: Motivation (mentioning)
confidence: 99%
“…In this context, the collaboratively built, semi-structured and multilingual Wikipedia resource appeared as a great relief, and several named entity dictionaries were built out of it [68,57,59]. Prolexbase [38], a manually produced multilingual ontology of proper names built up over many years, recently adopted a semi-automatic enrichment strategy based on Wikipedia [51]. All of these resources are the result of exploiting Wikipedia and, with the exception of [59] which makes use of LMF, they are not interoperable.…”
Section: Related Work (mentioning)
confidence: 99%
“…The task of adaptation of such a base to another language is far from trivial, especially for Slavonic languages with complex NE inflection (Przepiórkowski, 2007). An ontology taking into account Polish inflection (Prolexbase) has been created by Savary et al (2013), but it contains only 40,000 names, grouped into 34 types.…”
Section: Deep Entity Recognition (mentioning)
confidence: 99%