David Perry Greene scite author profile

Competition-based induction of decision models from examples

Greene¹, Smith²

1994

Abstract. Symbolic induction is a promising approach to constructing decision models by extracting regularities from a data set of examples. The predominant type of model is a classification rule (or set of rules) that maps a set of relevant environmental features into specific categories or values. Classifying loan risk based on borrower profiles, consumer choice from purchase data, or supply levels based on operating conditions are all examples of this type of model-building task. Although current inductive approaches, such as ID3 and CN2, perform well on certain problems, their potential is limited by the incremental nature of their search. Genetic algorithms (GAs) have shown great promise on complex search domains, and hence suggest a means for overcoming these limitations. However, effective use of genetic search in this context requires a framework that promotes the fundamental model-building objectives of predictive accuracy and model simplicity. In this article we describe COGIN, a GA-based inductive system that exploits the conventions of induction from examples to provide this framework. The novelty of COGIN lies in its use of training set coverage to simultaneously promote competition in various classification niches within the model and constrain overall model complexity. Experimental comparisons with NewID and CN2 provide evidence of the effectiveness of the COGIN framework and the viability of the GA approach.

Competition-Based Induction of Decision Models from Examples

Greene¹, Smith²

1993

A research design for generalizing from multiple case studies

Greene¹, David²

1984

Evaluation and Program Planning

Using Coverage as a Model Building Constraint in Learning Classifier Systems

Greene¹, Smith²

1994

Evolutionary Computation

Promoting and maintaining diversity is a critical requirement of search in learning classifier systems (LCSs). What is required of the genetic algorithm (GA) in an LCS context is not convergence to a single global maximum, as in the standard optimization framework, but instead the generation of individuals (i.e., rules) that collectively cover the overall problem space. COGIN (COverage-based Genetic INduction) is a system designed to exploit genetic recombination for the purpose of constructing rule-based classification models from examples. The distinguishing characteristic of COGIN is its use of coverage of training set examples as an explicit constraint on the search, which acts to promote appropriate diversity in the population of rules over time. By treating training examples as limited resources, COGIN creates an ecological model that simultaneously accommodates a dynamic range of niches while encouraging superior individuals within a niche, leading to concise and accurate decision models. Previous experimental studies with COGIN have demonstrated its performance advantages over several well-known symbolic induction approaches. In this paper, we examine the effects of two modifications to the original system configuration, each designed to inject additional diversity into the search: increasing the carrying capacity of training set examples (i.e., increasing coverage redundancy) and increasing the level of disruption in the recombination operator used to generate new rules. Experimental results are given that show both types of modifications to yield substantial improvements to previously published results.
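
The idea of treating training examples as limited resources can be illustrated with a small sketch. This is not the authors' implementation: the `Rule` class, the `accuracy` ranking, the `coverage_filter` function, and the toy loan-risk data are all assumptions made for illustration, and the genetic recombination step is omitted entirely. Rules are ranked by accuracy, each training example can be claimed by at most `capacity` rules, and a rule survives only if it claims at least one example that still has capacity left.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], str]  # (feature values, class label)


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a conjunction of feature tests predicting one label."""
    conditions: Tuple[Tuple[str, str], ...]  # (feature, required value) pairs
    label: str

    def matches(self, features: Dict[str, str]) -> bool:
        return all(features.get(k) == v for k, v in self.conditions)


def accuracy(rule: Rule, examples: List[Example]) -> float:
    """Fraction of matched examples whose label agrees with the rule."""
    matched = [label for feats, label in examples if rule.matches(feats)]
    if not matched:
        return 0.0
    return sum(lab == rule.label for lab in matched) / len(matched)


def coverage_filter(rules: List[Rule], examples: List[Example],
                    capacity: int = 1) -> List[Rule]:
    """Keep only rules that claim at least one example with capacity left.

    `capacity` is an assumed stand-in for the carrying capacity of each
    training example (how many rules may cover it).
    """
    remaining = [capacity] * len(examples)
    survivors = []
    # Better rules claim examples first; sorted() is stable, so ties keep
    # their input order.
    ranked = sorted(rules, key=lambda r: accuracy(r, examples), reverse=True)
    for rule in ranked:
        claimed = [i for i, (feats, _) in enumerate(examples)
                   if remaining[i] > 0 and rule.matches(feats)]
        if claimed:
            survivors.append(rule)
            for i in claimed:
                remaining[i] -= 1
    return survivors


# Toy data, invented for illustration only.
examples = [
    ({"income": "high", "debt": "low"}, "low_risk"),
    ({"income": "high", "debt": "high"}, "high_risk"),
    ({"income": "low", "debt": "high"}, "high_risk"),
]
r1 = Rule((("debt", "high"),), "high_risk")
r2 = Rule((("debt", "low"),), "low_risk")
r3 = Rule((("income", "low"),), "high_risk")   # redundant with r1
r4 = Rule((("income", "high"),), "low_risk")   # only 50% accurate
rules = [r1, r2, r3, r4]

concise = coverage_filter(rules, examples, capacity=1)
redundant = coverage_filter(rules, examples, capacity=2)
```

With `capacity=1` the redundant and inaccurate rules find no unclaimed examples and drop out, leaving a two-rule model; raising the capacity to 2 lets all four rules survive, which loosely mirrors the abstract's point that increasing coverage redundancy injects additional diversity.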
