2016
DOI: 10.3233/sw-160228

JRC-Names: Multilingual entity name variants and titles as Linked Data

Abstract: Since 2004 the European Commission's Joint Research Centre (JRC) has been analysing the online version of printed media in over twenty languages and has automatically recognised and compiled large amounts of named entities (persons and organisations) and their many name variants. The collected variants not only include standard spellings in various countries, languages and scripts, but also frequently found spelling mistakes or lesser used name forms, all occurring in real-life text (e.g. Benjamin/Binyamin/Bib…

Cited by 17 publications
(10 citation statements)
References 41 publications
“…JRC-TMA-CC is a hybrid system combining a rule-based approach and machine learning techniques. It is a corpus-driven system, lightweight and highly multilingual, exploiting both automatically created lexical resources, such as JRC-Names (Ehrmann et al, 2017), and external resources, such as BabelNet (Jacquet et al, 2019a). The main focus of the approach is on generating the possible inflected variants for known names (Jacquet et al, 2019b). RIS is a modified BERT model, which uses CRF as the top-most layer (Arkhipov et al, 2019).…”
Section: Participant Systems
Citation type: mentioning
confidence: 99%
“…However, we discarded this class during the second annotation stage since person titles are not a commonly included category in named entity annotation. They can be, however, found for example in related work on semantic webs (Ehrmann et al, 2017).…”
Section: The Named Entity Classes
Citation type: mentioning
confidence: 99%
“…We submitted four system instance results, all of which are based on our in-house NER system NERONE (Ehrmann et al, 2017; Steinberger et al, 2015), which we describe first.…”
Section: Approach
Citation type: mentioning
confidence: 99%