1999
DOI: 10.1023/a:1007562322031

Untitled

Abstract: A wealth of on-line text information can be made available to automatic processing by information extraction (IE) systems. Each IE application needs a separate set of rules tuned to the domain and writing style. WHISK helps to overcome this knowledge-engineering bottleneck by learning text extraction rules automatically. WHISK is designed to handle text styles ranging from highly structured to free text, including text that is neither rigidly formatted nor composed of grammatical sentences. Such semi-s…

Citations: cited by 630 publications (34 citation statements)
References: 21 publications
“…In fact, regular expressions can be considered to be a generalization of the bag-of-words representation or any n-gram representation. While there are several IE methods that use regular expressions-like representation, such as WHISK, Soderland (1999), we suggest arranging several expressions into a hierarchical structure which constitutes a decision tree.…”
Section: Discussion (mentioning)
confidence: 99%
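
The claim in the statement above, that regular expressions generalize bag-of-words and n-gram features, can be illustrated with a minimal Python sketch. The ad text, the pattern, and the names `has_word`, `has_bigram`, `rule`, and `extract` are all hypothetical and chosen for illustration; the pattern only imitates the general shape of a wildcard-and-capture extraction rule and is not WHISK's actual rule syntax or learning procedure.

```python
import re

# Hypothetical rental-ad text, invented for this illustration.
ad = "Downtown - 2 br apt. W/D, pkg incl $950."

# A bag-of-words feature is a regex that tests for a single token.
has_word = re.compile(r"\bpkg\b")

# A bigram feature is a regex over two adjacent tokens.
has_bigram = re.compile(r"\b2\s+br\b")

# A richer pattern adds wildcards and capture groups:
# skip text, capture the digits before "br", skip text,
# then capture the number that follows "$".
rule = re.compile(r".*?(\d+)\s*br\b.*?\$(\d+)", re.IGNORECASE)


def extract(text):
    """Return the captured slots as a dict, or None if the rule does not match."""
    m = rule.search(text)
    if m is None:
        return None
    return {"bedrooms": int(m.group(1)), "price": int(m.group(2))}


print(bool(has_word.search(ad)), bool(has_bigram.search(ad)))
print(extract(ad))
```

The first two patterns only report whether a token or token pair is present, while the third also says which spans of the text fill which output slots. That is the extra expressive power the quoted passage attributes to regular-expression-like representations.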
“…Califf and Mooney (1999) proposed the RAPIER system that induces pattern-match rules from rigidly structured text. Freitag (1998) described the SRV framework that exploits linguistic syntax and lexical information for corpus based learning while Soderland (1999) proposed the WHISK system for learning text extraction rules automatically. The (LP)² algorithm described in Ciravegna (2001) learns tagging rules from an annotated corpus.…”
Section: Framework For Information Extraction (mentioning)
confidence: 99%
“…Earlier studies (e.g. Soderland, 1999) showed how results from software processing free text have limited applications and they could hardly be compared to results from human processing. In more recent reviews (e.g.…”
Section: Previous Attempts [Heading] (mentioning)
confidence: 99%
“…Tang and Heidorn [22] subsequently advanced the research to the character level. They adapted Soderland’s supervised learning system, WHISK [23], to extract leaf characters and fruit/nut shapes from 1600 Flora of North America (FNA) species descriptions. The system scored 33–80% in recall and 75–100% in precision depending on the characters.…”
Section: Introduction (mentioning)
confidence: 99%