Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2000
DOI: 10.1145/347090.347125

A framework for specifying explicit bias for revision of approximate information extraction rules

Abstract: Information extraction is one of the most important techniques used in Text Mining. One of the main problems in building information extraction (IE) systems is that the knowledge elicited from domain experts tends to be only approximately correct. In addition, the knowledge acquisition phase for building IE rules usually takes a tremendous amount of time on the part of the expert and of the linguist creating the rules. We therefore need an effective means of revising our IE rules whenever we discover such an i…

Cited by 11 publications (5 citation statements)
References: 16 publications
“…Based on actual empirical evaluation, it was found that it is enough to focus just on the core constituents of sentences and use shallow parsing augmented by smart skips. These skips enable the information extraction engine to skip irrelevant parts, and focus just on the important phrases of each sentence (Appelt and Israel, 1999; Feldman et al., 2001; Feldman et al., 2000). Researchers have attempted before to use full parsing as a component in their information systems and have concluded that it was not worthwhile to invest the extra effort.…”
Section: Full Versus Shallow Parsing in IE (citation type: mentioning)
confidence: 99%
“…In keyword-based search, the user specifies some keywords, and a search engine (e.g., Yahoo, Excite, Alta Vista, and Google) finds those Web pages that contain the keywords, and ranks them according to various measures. In Web information extraction [e.g., 2,13,9,15,8], a wrapper or a specific extraction procedure is built automatically or manually for a Web page to extract some specific pieces of information requested by the user, e.g., extracting the prices of some products. User preference based approaches are commonly used in push type of systems [e.g., 26], where the user specifies what categories of information are interesting to him/her.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…In Web query based approaches, database query language such as SQL is extended and modified so that it can be used to query semi-structured information resources, XML documents and Web pages [e.g., 19,12,6,8]. Web resource discovery aims to find resources (Web pages) related to the user requests [e.g., 16,7,8,9,10,13]. This approach uses techniques such as link analysis, and text So far, we have not given extra considerations to metadata [11,22] and hyperlinks [16], which will be studied in our future work.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Web query languages allow the user to query Web pages ([3,8]). Wrapper-based information extraction enables the user to extract specific pieces of information from targeted Web sites ([6,7]). The user preference approaches provide the information to the user according to users interests and requirements [13].…”
Section: Introduction (citation type: mentioning)
confidence: 99%