2006
DOI: 10.1007/11751984_4

REPENTINO – A Wide-Scope Gazetteer for Entity Recognition in Portuguese

Abstract: In this paper we describe REPENTINO, a publicly available gazetteer intended to help the development of named entity recognition systems for Portuguese. REPENTINO wishes to minimize the problems developers face due to the limited availability of this type of lexical-semantic resources for Portuguese. The data stored in REPENTINO was mostly extracted from corpora and from the web using simple semi-automated methods. Currently, REPENTINO stores nearly 450k instances of named entities divided in more th…

Cited by 14 publications (13 citation statements)
References 4 publications (5 reference statements)
“…REPENTINO (REPositório para reconhecimento de ENTidades com NOme) (Sarmento et al, 2006) is a repository of monolingual (Portuguese) NEs. This resource contains 450,129 entities, which are organised according to a taxonomy made up of several top categories (abstract, art and media, nature, event, legal, localisation, organisation, product, being and substance) which in turn are subdivided into subcategories.…”
Section: Onomastica Acquisition and Creation
Citation type: mentioning
confidence: 99%
“…for the HAREM and MiniHAREM corpora, we apply a BLS that makes use of gazetteers only. We use the gazetteers presented in [20], as well as some sections of the REPENTINO gazetteer [30]. [16] shows the best result reported so far.…”
Section: Modeling
Citation type: mentioning
confidence: 99%
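The gazetteer-only baseline mentioned in the statement above can be pictured roughly as follows. This is a minimal sketch, not the cited authors' system: the toy `GAZETTEER` entries, the category labels, and the `tag` function are all hypothetical, and greedy longest-match lookup is assumed as the matching strategy.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer: lowercased token tuple -> category.
# Entries and labels are illustrative only; they are not taken from REPENTINO.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("lisboa",): "LOCATION",
    ("porto",): "LOCATION",
    ("universidade", "do", "porto"): "ORGANIZATION",
}

# Longest entry length, used to bound the lookup window.
MAX_LEN = max(len(key) for key in GAZETTEER)


def tag(tokens: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, category) spans found by greedy longest match."""
    spans: List[Tuple[int, int, str]] = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first so multi-word names take priority.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in GAZETTEER:
                match = (i, i + n, GAZETTEER[key])
                break
        if match is not None:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans
```

Trying longer candidates first means a multi-word entry such as "Universidade do Porto" is tagged as a single span rather than having only its last token matched as a location.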
“…The importance of Named Entity Recognition (NER) has grown with the spread of information extraction systems (Sarmento et al, 2006), since the pre-processing stage of most of these systems' activities has NER as its main responsibility (Nothman et al, 2013). NER can be considered a sub-task of information extraction processes and involves processing a text and identifying the occurrences of words or expressions that belong to the named entity categories (Mikheev et al, 1999).…”
Section: Reconhecimento De Entidades Nomeadas (Named Entity Recognition)
Citation type: unclassified
“…Named entities include all entities that can be identified by a proper name, such as people, organizations, places, brands, and products, among others (Sarmento et al, 2006; Kozareva, 2006). Besides these entities, it is also possible to recognize temporal expressions and numerical expressions.…”
Section: Reconhecimento De Entidades Nomeadas (Named Entity Recognition)
Citation type: unclassified