2013
DOI: 10.5121/ijist.2013.3103

Extracting Useful Rules Through Improved Decision Tree Induction Using Information Entropy

Abstract: Classification is a widely used technique in the data mining domain, where scalability and efficiency are immediate problems for classification algorithms on large databases. We suggest improvements to the existing C4.5 decision tree algorithm. In this paper, attribute-oriented induction (AOI) and relevance analysis are incorporated with concept-hierarchy knowledge and a HeightBalancePriority algorithm for construction of the decision tree, along with multi-level mining. The assignment of priorities to attributes …
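The abstract builds on C4.5, whose attribute-selection heuristic is information entropy. As a minimal sketch of that heuristic (not the paper's full HeightBalancePriority procedure), the following Python computes entropy, information gain, and the gain ratio C4.5 uses to pick a split attribute; the toy dataset and attribute positions are made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Reduction in entropy from splitting `rows` on the attribute at `attr_index`."""
    base = entropy(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in partitions.values())
    return base - remainder

def gain_ratio(rows, labels, attr_index):
    """C4.5 normalises information gain by the split information of the attribute."""
    split_info = entropy([row[attr_index] for row in rows])
    gain = information_gain(rows, labels, attr_index)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: pick the attribute with the highest gain ratio as the split.
rows = [("sunny", "high"), ("sunny", "low"), ("rain", "high"), ("rain", "low")]
labels = ["no", "yes", "yes", "yes"]
best = max(range(2), key=lambda i: gain_ratio(rows, labels, i))
print("split on attribute", best)
```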


Cited by 6 publications (3 citation statements). References 7 publications (16 reference statements).

Citation statements (ordered by relevance):
“…Hence, relevance analysis of data features is indispensable in feature selection. At present, the main methods include the Chi-square test, information gain, the Pearson correlation coefficient and CfsSubsetEval [13]. The limitation of the Chi-square test is the "low-frequency defect", which exaggerates the role of low-frequency features.…”
Section: Association Rules Mining (mentioning)
confidence: 99%
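The citing work lists chi-square, information gain, Pearson correlation and CfsSubsetEval as common relevance-analysis methods. Below is a small sketch of the first three using scipy and scikit-learn on synthetic data; the data, feature count, and label rule are invented purely for illustration (CfsSubsetEval is a WEKA evaluator and is not shown here).

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.feature_selection import chi2, mutual_info_classif

rng = np.random.default_rng(0)
# Synthetic, non-negative feature matrix (chi2 requires non-negative values).
X = rng.integers(0, 5, size=(200, 3)).astype(float)
# Hypothetical label that depends mostly on feature 0.
y = (X[:, 0] + rng.integers(0, 2, size=200) > 3).astype(int)

chi2_scores, _ = chi2(X, y)                             # chi-square statistic per feature
mi_scores = mutual_info_classif(X, y, random_state=0)   # information-gain style score
pearson = [pearsonr(X[:, j], y)[0] for j in range(X.shape[1])]

for j in range(X.shape[1]):
    print(f"feature {j}: chi2={chi2_scores[j]:.2f}  MI={mi_scores[j]:.3f}  r={pearson[j]:.2f}")
```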
“…(13) where y is the crisp value after defuzzification, y_i denotes a fuzzy quantity in the fuzzy set Y, and μ(y_i) denotes the membership value of y_i in Y. The association rules with the largest value of y are screened out and placed in the set of association rules (MAXVALUE_r) to determine the features. The features contained in MAXVALUE_r are used as the features for malicious traffic detection.…”
(mentioning)
confidence: 99%
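The quoted passage ranks rules by the defuzzified value y. A plain-Python sketch of centroid (centre-of-gravity) defuzzification, assuming the fuzzy output set is sampled at discrete points; the sample values and membership degrees below are made up for illustration.

```python
def centroid_defuzzify(values, memberships):
    """Centroid defuzzification: y = sum(mu(y_i) * y_i) / sum(mu(y_i))."""
    num = sum(m * v for v, m in zip(values, memberships))
    den = sum(memberships)
    return num / den if den else 0.0

# Hypothetical fuzzy output set sampled at a few points.
values      = [0.1, 0.3, 0.5, 0.7, 0.9]
memberships = [0.0, 0.2, 0.8, 0.6, 0.1]
y = centroid_defuzzify(values, memberships)
print(f"defuzzified y = {y:.3f}")
# Rules would then be ranked by y, keeping those with the largest value (MAXVALUE_r).
```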
“…Nowadays C4.5 is implemented as the J48 classifier in WEKA, an open-source data mining tool. The heuristic function used in this classifier is based on the concept of information entropy [39]. We used WEKA to build our classifiers.…”
Section: Feature Ideas Details (mentioning)
confidence: 99%
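The citing authors use J48, WEKA's Java implementation of C4.5. As a rough, plainly substituted Python analogue (not J48 itself), scikit-learn's decision tree can be trained with an entropy criterion, so that splits are chosen by information gain much like the heuristic described above; the Iris dataset here is only a stand-in for the authors' data.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion="entropy" selects splits by information gain, like C4.5/J48's heuristic.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```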