2022
DOI: 10.1609/aaai.v36i9.21194

Fast Sparse Decision Tree Optimization via Reference Ensembles

Abstract: Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960s, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time…

Cited by 9 publications (11 citation statements) | References 21 publications
“…Rule-based classification (Lakkaraju, Bach, and Leskovec 2016; Dash, Gunluk, and Wei 2018; Chen and Rudin 2018; Proença and van Leeuwen 2020; Hüllermeier, Fürnkranz, and Loza Mencía 2020; McTavish et al. 2022; Huynh, Fürnkranz, and Beck 2023; Lin et al. 2022) aims to find interpretable classification rules of the form if X1 = 1 ∧ X5 = 1 then Y = 0. While such results are interpretable, these methods primarily focus on prediction rather than description and, hence, miss out on important details.…”
Section: Related Work
confidence: 99%
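To make the quoted rule form concrete, here is a minimal sketch of evaluating such a rule on binary feature data; the data and helper name are hypothetical and not taken from any of the cited systems.

```python
import numpy as np

# Hypothetical binary feature matrix: each column is a {0, 1} feature X1..X5.
X = np.array([
    [1, 0, 0, 0, 1],   # X1 = 1 and X5 = 1 -> rule fires
    [1, 0, 0, 0, 0],   # X5 = 0 -> rule abstains
    [0, 1, 1, 0, 1],   # X1 = 0 -> rule abstains
])

def rule_x1_and_x5(row):
    """If X1 = 1 and X5 = 1 then predict Y = 0 (columns are 0-indexed)."""
    return 0 if (row[0] == 1 and row[4] == 1) else None  # None: rule does not apply

predictions = [rule_x1_and_x5(row) for row in X]
print(predictions)  # [0, None, None]
```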
“…Decision tree algorithms have a long history [17,18,19], but the vast majority of work on trees has used greedy induction [20,21] to avoid solving the NP-complete problem of finding an optimal tree [22]. However, greedy tree induction provides suboptimal trees, which has propelled research since the 1990s on mathematical optimization for finding optimal decision trees [4,23,24,25,26,27,28,29,30], as well as dynamic programming with branch-and-bound [31,32,33,34]. We refer readers to two recent reviews of this area [35,36].…”
Section: Related Work
confidence: 99%
“…Typically, the reference model is an empirical risk minimizer t_ref ∈ argmin_{t ∈ trees} Obj(t, x, y). Recent advances in decision tree optimization have allowed us to find this empirical risk minimizer, specifically using the GOSDT algorithm [31,32]. Our goal is to store R_set(·, t_ref, T), sample from it, and compute statistics from it.…”
Section: Bounds For Reducing the Search Space
confidence: 99%
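As a rough illustration of the regularized objective such a reference model minimizes, the sketch below scores a candidate tree by misclassification loss plus a per-leaf sparsity penalty. It uses scikit-learn's greedy DecisionTreeClassifier only as a stand-in; the quoted work finds the true minimizer with the GOSDT optimizer. The penalty weight, toy data, and helper name are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def regularized_objective(tree, X, y, lam=0.01):
    """Obj(t, x, y): misclassification loss plus lam per leaf.
    Mirrors the objective form quoted above; this is not the GOSDT code."""
    loss = np.mean(tree.predict(X) != y)
    return loss + lam * tree.get_n_leaves()

# Toy binary data (hypothetical).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 5))
y = (X[:, 0] & X[:, 1]).astype(int)

# A greedy tree as a stand-in reference model; GOSDT would instead
# search for t_ref that minimizes this objective exactly.
t_ref = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(regularized_objective(t_ref, X, y))
```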
“…[13] show useful computational gains by using pre-computed information from sub-trees and hash functions. [22] design smart guessing strategies to improve the performance of BnB-based approaches.…”
Section: Paper
confidence: 99%
“…[27] report that a MIP solver cannot solve an optimal tree of depth 2 with less than 1000 observations and 10 features in 10 minutes. (ii) Most state-of-the-art algorithms [13,2,22] consider datasets with binary features (i.e., every feature is in {0, 1}) rather than continuous ones. Algorithms for optimal trees with continuous features are much less developed - our goal in this paper is to bridge this gap in the literature.…”
Section: Introduction
confidence: 99%
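To make the binary-feature assumption noted in the statement above concrete, here is a small sketch of expanding continuous features into {0, 1} threshold-indicator columns, the kind of preprocessing those algorithms expect; the thresholds, data, and function name are hypothetical.

```python
import numpy as np

def binarize(X_cont, thresholds):
    """Expand each continuous column into one {0, 1} column per threshold:
    new feature (j, k) is 1 if X_cont[:, j] <= thresholds[j][k], else 0."""
    cols = []
    for j, ts in enumerate(thresholds):
        for t in ts:
            cols.append((X_cont[:, j] <= t).astype(int))
    return np.column_stack(cols)

# Hypothetical continuous data: two features, three samples.
X_cont = np.array([[2.5, 10.0],
                   [7.1,  3.2],
                   [4.0,  8.8]])

# Candidate split thresholds per feature (e.g., midpoints between sorted values).
thresholds = [[3.0, 5.0], [5.0, 9.0]]
print(binarize(X_cont, thresholds))
# [[1 1 0 0]
#  [0 0 1 1]
#  [0 1 0 1]]
```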