Abstract. We present a new approach to learning hypertext classifiers that combines a statistical text-learning method with a relational rule learner. This approach is well suited to learning in hypertext domains because its statistical component allows it to characterize text in terms of word frequencies, whereas its relational component is able to describe how neighboring documents are related to each other by the hyperlinks that connect them. We evaluate our approach by applying it to tasks that involve learning definitions for (i) classes of pages, (ii) particular relations that exist between pairs of pages, and (iii) the location of a particular class of information in the internal structure of pages. Our experiments demonstrate that this new approach is able to learn more accurate classifiers than either of its constituent methods alone.
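The statistical component described above characterizes pages by word frequencies. A minimal sketch of such a frequency-based classifier, here a naive Bayes model with add-one smoothing over a hypothetical mini-corpus (the labels, pages, and words are invented for illustration and are not from the paper), might look like:

```python
from collections import Counter
import math

# Hypothetical training pages: (class label, page text).
train = [("faculty", "professor publications research grant"),
         ("faculty", "professor teaching publications"),
         ("student", "student coursework advisor"),
         ("student", "student homework advisor research")]

# Per-class word counts and totals (the word-frequency statistics).
counts = {}
totals = Counter()
for label, text in train:
    c = counts.setdefault(label, Counter())
    for w in text.split():
        c[w] += 1
        totals[label] += 1

vocab = {w for c in counts.values() for w in c}

def classify(text):
    """Return the class maximizing the smoothed log-likelihood of the words."""
    def score(label):
        c, n = counts[label], totals[label]
        return sum(math.log((c[w] + 1) / (n + len(vocab)))
                   for w in text.split())
    return max(counts, key=score)
```

This sketch ignores class priors (the toy classes are balanced) and, unlike the paper's approach, uses no relational information at all; it only illustrates the word-frequency half of the combination.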
Abstract. A major advantage of using a case-based approach to developing knowledge-based systems is that it can be applied to problems where a strong domain theory may be difficult to determine. However, the development of case-based reasoning (CBR) systems that set out to support a sophisticated case adaptation process does require a strong domain model. The Derivational Analogy (DA) approach to CBR is a case in point. In DA, the case representation contains a trace of the reasoning process involved in producing the solution for that case. In the adaptation process, this reasoning trace is reinstantiated in the context of the new target case; this requires a strong domain model and the encoding of problem-solving knowledge. In this paper we analyse this issue using, as an example, a CBR system called CoBRA that assists with the modelling tasks in numerical simulation.
Abstract. We present compelling evidence that the World Wide Web is a domain in which applications can benefit from using first-order learning methods, since the graph structure inherent in hypertext naturally lends itself to a relational representation. We demonstrate strong advantages for two applications: learning classifiers for Web pages, and learning rules to discover relations among pages.

Introduction. In recent years, there has been a large body of research centered around the topic of learning first-order representations. Although these representations can succinctly represent a much larger class of concepts than propositional representations, to date there have been only a few problem domains in which first-order representations have demonstrated a decided advantage over propositional representations. The graph-like structure provided by pages on the World Wide Web is one domain that seems natural for first-order representation, yet it has not been previously studied in this context. Cohen [1] has used first-order methods for text classification, but the focus was on finding relations between words rather than between documents. The lower half of Figure 1 illustrates the notion of the Web as a directed graph, where pages correspond to the nodes in the graph and hyperlinks correspond to edges. Using this representation, we address two types of learning tasks: learning definitions of page classes, and learning definitions of relations between pages. In contrast to related efforts on similar Web tasks, our work focuses on learning concepts which represent relational generalizations of the inherent graph structure. Our work on these two learning tasks has been conducted as part of a larger effort aimed at developing methods for automatically constructing knowledge bases by extracting information from the Web [2].
Given an ontology defining classes and relations of interest, such as that shown in the top half of Figure 1, along with training examples consisting of labeled Web pages, the system learns a set of information extractors for the classes and relations in the ontology, and then populates a knowledge base by exploring the Web. The task of recognizing class instances can be framed as a page-classification task. For example, we can
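The directed-graph view described above can be encoded as relational facts over pages. The sketch below is illustrative only, not the paper's system: the page names, words, and the toy rule are invented to show how a first-order-style definition can combine a page's own words with properties of its neighbors in the link graph.

```python
# link(P1, P2): page P1 contains a hyperlink to page P2 (graph edges).
links = {("advisor.html", "student1.html"),
         ("advisor.html", "student2.html"),
         ("student1.html", "advisor.html")}

# has_word(P, W): word W occurs in page P (propositional page content).
words = {"advisor.html": {"professor", "publications"},
         "student1.html": {"student", "advisor"},
         "student2.html": {"student"}}

def linked_from(page):
    """Pages with a hyperlink to `page` (the graph's incoming edges)."""
    return {src for (src, dst) in links if dst == page}

def is_student_page(page):
    """Toy first-order-style rule: P is a student page if P contains
    the word "student" AND some page linking to P contains "professor"."""
    return ("student" in words.get(page, set())
            and any("professor" in words.get(src, set())
                    for src in linked_from(page)))
```

The second condition is what a purely propositional, single-page representation cannot express: it quantifies over neighboring documents reached through hyperlinks.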