2011
DOI: 10.1007/978-3-642-24769-9_48

A Bootstrapping Approach for Training a NER with Conditional Random Fields

Cited by 12 publications (7 citation statements)
References 9 publications
“…The pattern identified is vaguely the same for both news and Wikipedia data, and follows a structure of [CapitalizedSequence], [ergonym] just like in Teixeira et al, 2011. Some examples of words present in the ergonym list are: presidente, jogador, vocalista, pai, marido, mulher.…”
Section: Methods (mentioning)
confidence: 82%
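The [CapitalizedSequence], [ergonym] pattern quoted above can be illustrated with a minimal regular-expression sketch. This is not the implementation used in the citing or cited work: the names `ERGONYMS`, `CAP_TOKEN`, `PATTERN`, and `find_candidates` are hypothetical, the ergonym list contains only the six words quoted in the statement, and the definition of a capitalized token is an assumption.

```python
import re

# Hypothetical ergonym list: only the six examples quoted in the statement.
ERGONYMS = ["presidente", "jogador", "vocalista", "pai", "marido", "mulher"]

# Assumption: a capitalized token starts with an uppercase ASCII or
# Latin-1 accented letter, followed by word characters, ' or -.
CAP_TOKEN = r"[A-ZÀ-ÖØ-Ý][\w'-]*"

# One or more capitalized tokens, a comma, then a word from the ergonym list.
PATTERN = re.compile(
    rf"\b({CAP_TOKEN}(?:\s+{CAP_TOKEN})*),\s+({'|'.join(ERGONYMS)})\b"
)


def find_candidates(text):
    """Return (capitalized sequence, ergonym) pairs found in text."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    sentence = "Ontem, Cristiano Ronaldo, jogador do Real Madrid, marcou."
    print(find_candidates(sentence))
```

In a bootstrapping setup, matches like these would presumably serve as seed annotations for person names; how the seeds are filtered or expanded is not described in the excerpt.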
“…Annotated datasets are scarce and hard to obtain for most languages. Teixeira et al, 2011 concluded that HAREM datasets are not adequate to be used on an up-to-date NER system due to the age of the articles that compose the datasets. On the other hand, non annotated data or raw text is for the most part freely available with virtually no cost and a constant stream of fresh data.…”
Section: Bootstrapping (mentioning)
confidence: 99%
“…Bootstrapping approaches were originally used as a method of extracting terms through the recognition of patterns [31]- [33]. In [34] and [35], bootstrapping algorithms are used to automatically label unlabeled data. This data is then used for an NER.…”
Section: Term Extraction With NER For NL Requirements (mentioning)
confidence: 99%
“…"Barack Obama, president of USA") and linguistic patterns well defined for the journalistic text style. We use a bootstrap approach to train the NER system [11]. Our method starts by annotating persons names on a dataset of 50,000 news items.…”
Section: News Processing Pipeline (mentioning)
confidence: 99%
“…TimeMachine, as a computational journalism tool, brings together a set of Natural Language Processing, Text Mining and Information Retrieval technologies to automatically extract and index entity related knowledge from the news articles [5][6][7][8][9][10][11]. It allows users to issue queries containing keywords and phrases about news stories or events, and retrieves the most relevant entities mentioned in the news articles through time.…”
Section: Introduction (mentioning)
confidence: 99%