2000
DOI: 10.1111/0824-7935.00129

Applying Machine Learning for High‐Performance Named‐Entity Extraction

Abstract: This paper describes a machine learning approach to building an efficient and accurate name spotting system. Finding names in free text is an important task in many text-based applications. Most previous approaches were based on hand-crafted modules encoding language and genre-specific knowledge. These approaches had at least two shortcomings: they required large amounts of time and expertise to develop and were not easily portable to new languages and genres. This paper describes an extensible system that aut…

Cited by 57 publications (36 citation statements)
References 19 publications
“…For example, they can extract peoples names, locations, vehicle types, and accidents from a passage. Information extraction techniques can be rule-based Ciravegna et al (1999) and Krupka & Hausman (1998), statistics-based Witten et al (1999), or use machine learning Baluja et al (1999). Text categorization and clustering are like categorization and clustering performed in data mining, but performed on narratives or text.…”
Section: Text Mining
Citation type: mentioning
confidence: 99%
“…Most of these improvements are based on the use of empirical methods in NLP. Given the kinds of knowledge needed by empirical approaches to NLP, machine learning techniques have been widely used for its acquisition: Sekine et al [1998]; Borthwick et al [1998]; Baluja et al [1999]; Borthwick [1999]; Takeuchi and Collier [2002]; Yarowsky [2003] for NE recognition, Cardie et al [2000] for chunking, and McCarthy and ; Aone and Bennet [1996]; Cardie and Wagstaff [1999]; Mitkov [1998]; Ng and Cardie [2003] for coreference resolution. Detailed thorough surveys on the use of ML techniques for NLP tasks can be also found Young and Bloothooft [1997]; Manning and Schütze [1999]; Mooney and Cardie [1999].…”
Section: Machine Learning For Information Extraction
Citation type: mentioning
confidence: 99%
“…While the two systems are similar, there are significant differences between them. Another system using decision trees is proposed by Baluja et al [3]. Like the systems described by both Sekine and Bennett et al, they utilized a part-of-speech tagger, dictionary lookups, and word-level features, such as all-uppercase, initial-caps, single-character, and punctuation features.…”
Section: Related Research
Citation type: mentioning
confidence: 99%
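
The word-level features named in the statement above (all-uppercase, initial-caps, single-character, and punctuation) can be illustrated with a minimal sketch. This is not the implementation from Baluja et al.; the function name `word_level_features`, the feature keys, and the exact definition of each test are assumptions made for illustration only, and the part-of-speech and dictionary-lookup features are left out.

```python
import string
from typing import Dict


def word_level_features(token: str) -> Dict[str, bool]:
    """Compute illustrative orthographic features for a single token.

    Hypothetical helper: the feature definitions below are assumptions,
    not the definitions used in the cited system.
    """
    return {
        # Every character is an uppercase letter, e.g. "IBM".
        "all_uppercase": token.isalpha() and token.isupper(),
        # First character uppercase, the rest lowercase, e.g. "Baluja".
        "initial_caps": len(token) > 1
        and token[0].isupper()
        and token[1:].islower(),
        # The token is exactly one character long, e.g. "A" or ",".
        "single_character": len(token) == 1,
        # Every character is ASCII punctuation, e.g. "," or "--".
        "punctuation": len(token) > 0
        and all(ch in string.punctuation for ch in token),
    }


if __name__ == "__main__":
    for tok in ["IBM", "Baluja", "A", ",", "the"]:
        print(tok, word_level_features(tok))
```

In a learned name spotter, boolean features like these would typically be combined with other evidence and passed to a classifier such as a decision tree; that step is outside the scope of this sketch.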