Abstract. The pruning of decision trees often relies on the classification accuracy of the decision tree. In this paper, we show how misclassification costs, a related criterion applied when errors vary in their costs, can be integrated into several well-known pruning techniques.
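To illustrate the idea of replacing accuracy by misclassification costs, the following is a minimal sketch and not the method of the paper. It assumes a reduced-error-style pruning rule in which a subtree is collapsed when the resulting leaf is no more costly than the subtree's leaves. The names `leaf_cost`, `should_prune`, `COST`, and `UNIFORM`, the class labels, and all numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# cost[(true_class, predicted_class)] is the cost of that prediction.
# Correct predictions cost 0. All values here are illustrative assumptions.
Cost = Dict[Tuple[str, str], float]

COST: Cost = {
    ("healthy", "healthy"): 0.0,
    ("healthy", "sick"): 1.0,    # false alarm: cheap
    ("sick", "healthy"): 10.0,   # missed case: expensive
    ("sick", "sick"): 0.0,
}

# Uniform costs reduce the criterion to counting errors (plain accuracy).
UNIFORM: Cost = {
    ("healthy", "healthy"): 0.0,
    ("healthy", "sick"): 1.0,
    ("sick", "healthy"): 1.0,
    ("sick", "sick"): 0.0,
}


def leaf_cost(counts: Dict[str, int], cost: Cost) -> Tuple[Optional[str], float]:
    """Return the label with the lowest total misclassification cost, and that cost."""
    best_label: Optional[str] = None
    best_cost = float("inf")
    for predicted in sorted(counts):
        total = sum(n * cost[(true, predicted)] for true, n in counts.items())
        if total < best_cost:
            best_label, best_cost = predicted, total
    return best_label, best_cost


def should_prune(parent: Dict[str, int], children: List[Dict[str, int]], cost: Cost) -> bool:
    """Prune if a single leaf at the parent costs no more than the children's leaves."""
    subtree_cost = sum(leaf_cost(c, cost)[1] for c in children)
    return leaf_cost(parent, cost)[1] <= subtree_cost


parent = {"healthy": 8, "sick": 2}
children = [{"healthy": 7, "sick": 0}, {"healthy": 1, "sick": 2}]

print(leaf_cost(parent, UNIFORM))  # label chosen when only errors are counted
print(leaf_cost(parent, COST))     # label chosen when errors differ in cost
print(should_prune(parent, children, COST))
```

With uniform costs the parent node is labeled by its majority class; with the asymmetric costs assumed above, the minority class becomes the cheaper label, so both the leaf labels and the pruning decision can change.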
Abstract. In this paper we describe two methods for improving systems that induce disjunctive Horn clause definitions. The first method is the well-known use of argument types during induction. Our novel contribution is an algorithm for extracting type information from the example set mechanically. The second method provides a set of clause heads that partition the example set into disjuncts according to structural properties. These heads can be used in top-down inductive inference systems as the starting point of the general-to-specific search, reducing the resulting space of clause bodies.
Each of the four main approaches to representing declarative bias in Inductive Logic Programming (ILP), namely representation by parameterized languages, by clause sets, by grammars, and by schemes, fails to represent all language biases used in ILP systems. We therefore present MILES-CTL, a unifying representation language for these biases that extends the scheme-based approach.
In spite of the desirable properties of Horn logic as a hypothesis language, its expressiveness leads to huge hypothesis spaces containing up to millions of hypotheses for even simple learning problems. Controlling hypothesis spaces by biases requires knowledge of the effects and applicability of biases in different domains. This knowledge can be gained experimentally by comparing the size of hypothesis spaces with respect to the language bias and the application domain. This approach contrasts with theoretical complexity comparisons, whose results are very general and mostly cannot account for small bias variations. To obtain more detailed information on small bias variations and to compare results independently of systems, their implementations, and additional more or less hidden biases, we use MILES-CTL for the experiments. As application domains, we selected a function-free domain including family relations and a non-function-free domain including list-processing programs.