Information extraction (IE) is one of the most important techniques used in text mining. One of the main problems in building IE systems is that the knowledge elicited from domain experts tends to be only approximately correct. In addition, the knowledge acquisition phase for building IE rules usually takes a tremendous amount of time on the part of both the expert and the linguist creating the rules. We therefore need an effective means of revising our IE rules whenever we discover such an inaccuracy. The IE revision problem is how best to revise deficient IE rules using the information contained in examples that expose inaccuracies. The revision process is very sensitive to the implicit and explicit biases encoded in the specific revision algorithm employed. In a sense, each revision algorithm must provide two forms of bias: bias as to the place of the revision and bias as to the type of revision that should be performed. In this paper we present a framework for writing approximate IE rules that are provided with explicit bias. The proposed framework can be used by many existing revision algorithms. The purpose of the revision bias framework is to allow users to declare their own bias in a simple and structured way, i.e., to express the conditions that the domain knowledge must satisfy for a given revision operator to be applied. This language extends and generalizes the work reported in [Feldman et al. 1993]. It attacks the problem of writing IE rules from a novel perspective, one that enables much faster development of IE systems.