1992
DOI: 10.1007/bf00994007
On the handling of continuous-valued attributes in decision tree generation

Abstract: We present a result applicable to classification learning algorithms that generate decision trees or rules using the information entropy minimization heuristic for discretizing continuous-valued attributes. The result serves to give a better understanding of the entropy measure, to point out that the behavior of the information entropy heuristic possesses desirable properties that justify its usage in a formal sense, and to improve the efficiency of evaluating continuous-valued attributes for cut value selecti…
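The entropy-minimization heuristic the abstract refers to can be sketched as follows. This is an illustrative reconstruction, not the authors' code: `entropy` and `best_cut` are hypothetical names, and the sketch evaluates every midpoint between distinct sorted values rather than only the boundary points the paper's result would permit.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a multiset of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return the cut value minimizing the weighted class-information
    entropy of the induced binary partition. Candidate cuts are the
    midpoints between adjacent distinct sorted attribute values."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_entropy, best_value = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut can separate equal attribute values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        e = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if e < best_entropy:
            best_entropy = e
            best_value = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_value
```

On a cleanly separated sample such as values `[1, 2, 3, 10, 11, 12]` with labels `[0, 0, 0, 1, 1, 1]`, the minimizing cut is the midpoint 6.5, where the weighted entropy of the partition is zero.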

Cited by 498 publications (427 citation statements)
References 5 publications
“…For determining these intervals we follow the general scheme of the discretization technique described by Ching et al [14] and Fayyad et al [19] using the training set. The parameters such as thresholds and weights are determined by the interaction with the decision-maker.…”
Section: The Developed Methods
confidence: 99%
“…Most empirical research on symbolic concept induction has focussed on learning decision trees (Quinlan, 1986;Breiman et al, 1984;Buntine & Niblett, 1992;Fayyad & Irani, 1992) or disjunctive normal form (DNF) expressions (Michalski & Chilausky, 1980;Michalski et al, 1986;Clark & Niblett, 1989;Pagallo & Haussler, 1990). Very little experimental research has been done on learning conjunctive normal form (CNF).…”
Section: Introduction
confidence: 99%
“…Decision-tree methods for discretizing continuous attributes (Quinlan, 1986;Fayyad & Irani, 1992) could be employed to handle real-valued features. The effect of using numerical thresholds and internal disjunction in DNF formulae needs to be determined.…”
Section: Future Research Issues
confidence: 99%
“…Intuitively, a boundary point is a value V in between two sorted attribute values U and W so that all examples having attribute value U have a different class label compared to the examples having attribute value W, or U and W have a different class frequency distribution. Previous work [Fayyad and Irani 1992] has contributed substantially in identifying potential cutpoints. They proved that it is sufficient to consider boundary points as potential cutpoints, because optimal splits always fall on boundary points.…”
Section: Cost Sensitive Discretization
confidence: 99%
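The boundary-point notion quoted above — a value between two sorted attribute values whose surrounding examples differ in class — can be sketched in a few lines. This is a simplified illustration (the `boundary_points` name is hypothetical, and it assumes each attribute value carries a single class label, whereas the paper's definition compares class frequency distributions when values repeat):

```python
def boundary_points(values, labels):
    """Midpoints between adjacent distinct sorted attribute values
    whose examples carry different class labels. Fayyad & Irani's
    result says optimal entropy-minimizing cuts fall only on such
    points, so only these need be evaluated."""
    pairs = sorted(zip(values, labels))
    cuts = []
    for (u, label_u), (w, label_w) in zip(pairs, pairs[1:]):
        if u != w and label_u != label_w:
            cuts.append((u + w) / 2)
    return cuts
```

Restricting the search to these points is what yields the efficiency gain the abstract mentions: in the common case of long runs of same-class examples, far fewer candidate cuts need their entropy computed.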