Editor: Alberto Segre
1. Introduction

We are pleased to reply to Michael Pazzani's thorough review of our book on Inductive Logic Programming (ILP). The book gives an introduction to this new and fast-growing field. We would like to emphasize that it also gives an in-depth account of the most established and applicable techniques within the field and several applications of these techniques. Our reply presents the view we took when writing the book; this answers most of the specific points made by the reviewer.

Inductive Logic Programming (ILP) is concerned with learning first-order rules formulated in the language of logic programs. ILP systems use this language for representing background knowledge, examples and hypotheses. The main motivation for using this language is its clear syntax and semantics, as well as the sound theoretical and practical methods for deductive inference in it. Early relational learning systems, such as ARCHES, INDUCE, and ML-SMART, which use other representation formalisms, are not generally considered ILP systems. Thus, they are not discussed in the book.

We have consciously biased the book toward ILP techniques and systems that have reached a certain degree of applicability to practical problems. This has strongly influenced the choice of topics covered by the book. For example, while learning recursive rules (inverted implication) and predicate invention have received a lot of attention within the ILP community, few practical results exist so far. Furthermore, while learnability results are important, they are not of immediate practical interest. Consequently, we have decided not to discuss learnability, including, for example, the learnability results obtained by transforming ILP problems to propositional form (Dzeroski et al., 1992).

The practical orientation of the book explains the omission of several techniques and systems mentioned in the review, including interactive ILP systems (e.g., CLINT).
Although the first ILP systems were of an interactive nature and important developments in this area have been made, few interactive ILP systems have been applied to practical problems. Interaction with the user (oracle) is in fact demanding and so are the practical applications of interactive ILP systems. The empirical ILP setting, on the other hand, resembles the well-understood propositional learning setting of the widely used ID3 and AQ systems. In short, most existing ILP applications involve empirical ILP systems; hence the focus of our book is on empirical ILP.
This paper presents an approach to expert-guided subgroup discovery. The main
step of the subgroup discovery process, the induction of subgroup descriptions,
is performed by a heuristic beam search algorithm, using a novel parametrized
definition of rule quality which is analyzed in detail. The other important
steps of the proposed subgroup discovery process are the detection of
statistically significant properties of selected subgroups and subgroup
visualization: statistically significant properties are used to enrich the
descriptions of induced subgroups, while the visualization shows subgroup
properties in the form of distributions of the numbers of examples in the
subgroups. The approach is illustrated by the results obtained for a medical
problem of early detection of patient risk groups.
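
The abstract above does not spell out the rule quality measure, so the following is only a minimal sketch of how a heuristic beam search over subgroup descriptions might look. It assumes a quality of the form TP / (FP + g), where TP and FP count the covered positive and negative examples and g is a generalization parameter; this form, the function names, and the toy patient data are all illustrative assumptions, not the paper's definitions.

```python
# Sketch only: quality form, names, and data are assumptions for illustration.

def covers(rule, example):
    # A rule is a frozenset of (attribute, value) conditions read as a conjunction.
    return all(example.get(attr) == val for attr, val in rule)

def score(rule, examples, labels, g):
    # Assumed parametrized quality TP / (FP + g); larger g tolerates more false positives.
    tp = sum(1 for e, y in zip(examples, labels) if y and covers(rule, e))
    fp = sum(1 for e, y in zip(examples, labels) if not y and covers(rule, e))
    return tp / (fp + g)

def beam_search(examples, labels, g=1.0, beam_width=3, max_len=2):
    conditions = sorted({(a, v) for e in examples for a, v in e.items()})

    def rank(rules):
        # Best score first; ties broken by shorter rule, then alphabetically.
        return sorted(rules, key=lambda r: (-score(r, examples, labels, g), len(r), sorted(r)))

    beam, kept = [frozenset()], set()
    for _ in range(max_len):
        candidates = set()
        for rule in beam:
            used = {attr for attr, _ in rule}
            for cond in conditions:
                if cond[0] not in used:  # at most one condition per attribute
                    candidates.add(rule | {cond})
        beam = rank(candidates)[:beam_width]
        kept.update(beam)
    return rank(kept)[:beam_width]

# Hypothetical toy data: six patients, label 1 means "at risk".
patients = [
    {"smoker": "yes", "bmi": "high"},
    {"smoker": "yes", "bmi": "high"},
    {"smoker": "yes", "bmi": "low"},
    {"smoker": "no", "bmi": "high"},
    {"smoker": "no", "bmi": "low"},
    {"smoker": "yes", "bmi": "low"},
]
at_risk = [1, 1, 1, 0, 0, 0]

specific = beam_search(patients, at_risk, g=1.0)[0]
general = beam_search(patients, at_risk, g=10.0)[0]
```

On this toy data, a small g favors the more specific description (smoker = yes and bmi = high), while a large g favors the more general one (smoker = yes), which is the kind of trade-off an expert could steer through the parameter.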
Closed sets are being successfully applied in the context of compacted data representation for association rule learning. However, their use is mainly descriptive. This paper shows that, when considering labeled data, closed sets can be adapted for prediction and discrimination purposes by conveniently contrasting covering properties on positive and negative examples. We formally justify that these sets characterize the space of relevant combinations of features for discriminating the target class. In practice, identifying relevant/irrelevant combinations of features through closed sets is useful in many applications. Here we apply it to compacting emerging patterns and essential rules and to learning descriptions for subgroup discovery.
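
To make the notion of a closed set concrete, here is a small sketch using the standard definition: the closure of an itemset is the intersection of all transactions that contain it, and an itemset is closed when it equals its own closure. Computing closures on the positive examples and then counting coverage on the negatives is one plausible reading of "contrasting covering properties"; the function names and toy data are hypothetical and not taken from the paper.

```python
# Sketch only: names and data are assumptions for illustration.

def support(itemset, transactions):
    # Number of transactions that contain every item of the itemset.
    return sum(1 for t in transactions if itemset <= t)

def closure(itemset, transactions):
    # Intersection of all transactions covering the itemset.
    covered = [t for t in transactions if itemset <= t]
    if not covered:
        # Convention for this sketch: an uncovered itemset is returned unchanged.
        return frozenset(itemset)
    return frozenset.intersection(*covered)

def is_closed(itemset, transactions):
    return closure(itemset, transactions) == frozenset(itemset)

# Hypothetical labeled data: each example is a set of feature names.
positives = [frozenset("abc"), frozenset("ab"), frozenset("abd")]
negatives = [frozenset("ac"), frozenset("cd")]

a = frozenset("a")
ab = closure(a, positives)  # every positive containing "a" also contains "b"
```

Here `{a}` and its closure `{a, b}` cover the same three positive examples, but `{a}` also covers one negative example while `{a, b}` covers none, so the closed set is the more useful of the two for discriminating the target class.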