2020
DOI: 10.48550/arxiv.2006.14118
Preprint

The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees

Abstract: Decision trees are a widely used method for classification, both by themselves and as the building blocks of multiple different ensemble learning methods. The Max-Cut decision tree involves novel modifications to a standard, baseline model of classification decision tree construction, precisely CART Gini. One modification involves an alternative splitting metric, maximum cut, based on maximizing the distance between all pairs of observations belonging to separate classes and separate sides of the threshold val…
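To make the splitting metric in the abstract concrete, the sketch below scores one candidate feature-threshold split by summing the distances between observations that lie on opposite sides of the threshold and belong to different classes. This is a minimal illustration of the idea as stated in the truncated abstract, not the paper's implementation: the distance measure, any normalization or weighting of pairs, and the threshold scan are assumptions, and the function names are illustrative.

```python
import numpy as np

def max_cut_score(x, y, threshold):
    """Max-cut-style score for one candidate split on a single feature.

    x : 1-D array of values for one feature
    y : 1-D array of class labels
    Sums the (assumed absolute) distances over all pairs of observations
    that fall on opposite sides of the threshold and carry different labels.
    """
    left = x <= threshold
    right = ~left
    score = 0.0
    for i in np.where(left)[0]:
        for j in np.where(right)[0]:
            if y[i] != y[j]:                  # separate classes
                score += abs(x[i] - x[j])     # distance across the cut
    return score

def best_threshold(x, y):
    """Scan midpoints between consecutive sorted feature values (assumed)."""
    xs = np.unique(x)
    candidates = (xs[:-1] + xs[1:]) / 2.0
    return max(candidates, key=lambda t: max_cut_score(x, y, t))
```

On a toy one-dimensional sample such as x = [1.0, 2.0, 3.5, 4.0] with labels [0, 0, 1, 1], the scan selects the midpoint 2.75 that separates the two classes, since that cut maximizes the summed cross-class distances.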

Cited by 1 publication (1 citation statement, published 2022)
References 1 publication
“…Each branch represents the outcome of the test made and each leaf node obtained represents the final decision taken. A variant of the tree algorithm known as classification and regression tree works by using the Gini index to split the training set into subsets using a feature-threshold value pair, selecting the subset with more decrease in impurity [7], [8]. The two models described above have been employed over the years in the classification of a variety of datasets.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
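For comparison with the baseline described in the citation statement, the following sketch shows the standard CART-style Gini criterion: for a feature-threshold pair, score the split by the weighted decrease in Gini impurity. This is a generic illustration under common assumptions (binary splits, proportion-weighted child impurities); function names are illustrative rather than taken from the paper.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def impurity_decrease(x, y, threshold):
    """Weighted decrease in Gini impurity for one feature-threshold split."""
    left = x <= threshold
    n, n_left = len(y), left.sum()
    if n_left == 0 or n_left == n:            # degenerate split: no gain
        return 0.0
    w_left = n_left / n
    w_right = 1.0 - w_left
    return gini(y) - (w_left * gini(y[left]) + w_right * gini(y[~left]))
```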