2020
DOI: 10.48550/arxiv.2010.11034
Preprint

On Explaining Decision Trees

Yacine Izza,
Alexey Ignatiev,
Joao Marques-Silva

Abstract: Decision trees (DTs) epitomize what have come to be known as interpretable machine learning (ML) models. This is informally motivated by paths in DTs often being much smaller than the total number of features. This paper shows that in some settings DTs can hardly be deemed interpretable, with paths in a DT being arbitrarily larger than a PI-explanation, i.e. a subset-minimal set of feature values that entails the prediction. As a result, the paper proposes a novel model for computing PI-explanations of DTs, …
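To make the notion concrete, below is a minimal sketch, not the model proposed in the paper, of how a PI-explanation could be extracted from a decision tree by deletion-based minimization: start from the full instance and drop each feature value whose removal still leaves the prediction entailed. The path-set representation of the tree, the helper names (`path_matches`, `entails`, `pi_explanation`) and the toy tree are illustrative assumptions, and the entailment check simply enumerates root-to-leaf paths rather than using any dedicated algorithm.

```python
# Illustrative sketch only: deletion-based extraction of a PI-explanation
# (a subset-minimal set of feature values that entails the prediction) for a
# decision tree given as its set of root-to-leaf paths. This is NOT the model
# proposed in the paper; the representation and names are assumptions.

def path_matches(conditions, assignment):
    """A path stays reachable under a partial assignment if every test on an
    assigned feature is satisfied; unassigned features are left free."""
    return all(f not in assignment or test(assignment[f])
               for f, test in conditions.items())

def entails(paths, assignment, prediction):
    """The partial assignment entails `prediction` iff every path that remains
    reachable under it predicts that same class."""
    return all(pred == prediction
               for conditions, pred in paths
               if path_matches(conditions, assignment))

def pi_explanation(paths, instance, prediction):
    """Greedy deletion: drop each feature value whose removal keeps the
    prediction entailed; the surviving set is subset-minimal."""
    explanation = dict(instance)
    for feature in list(explanation):
        candidate = {f: v for f, v in explanation.items() if f != feature}
        if entails(paths, candidate, prediction):
            explanation = candidate  # this feature value was not needed
    return explanation

# Toy tree as root-to-leaf paths over binary features 0, 1, 2. The instance
# (0=1, 1=1, 2=1) follows a path that tests all three features, yet a single
# feature value is enough to entail the prediction.
paths = [
    ({0: lambda v: v == 0}, "yes"),
    ({0: lambda v: v == 1, 1: lambda v: v == 0}, "yes"),
    ({0: lambda v: v == 1, 1: lambda v: v == 1, 2: lambda v: v == 1}, "yes"),
    ({0: lambda v: v == 1, 1: lambda v: v == 1, 2: lambda v: v == 0}, "no"),
]
print(pi_explanation(paths, {0: 1, 1: 1, 2: 1}, "yes"))  # -> {2: 1}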

Cited by 14 publications (14 citation statements) · References 45 publications (74 reference statements)

“…A commonly criticised point, however, is that the resulting trees may not be the best representation of the data in terms of accuracy and size. [19,20] show that in some settings decision trees can hardly be deemed interpretable, with paths being arbitrarily larger than a minimal set of features that entails the prediction (i.e., a so-called PI-explanation). This motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions.…”
Section: Other Related Work (mentioning)
confidence: 99%
“…For linear models, promoting interpretability essentially corresponds to reducing the number of features [43,53,54,63]. For decision trees and decision rules, besides reducing the number of features, approaches exist to restrict model size, prune unnecessary parts [5,26], aggregate local models in a hierarchy [48], or promote a trade-off between accuracy and complexity by means of loss functions [30,52] or prior distributions [33,60,61]. Regarding GP (and close relatives like grammatical evolution), perhaps the most simple and popular strategy to favor interpretability is to restrain the number of model components [16,31,57], sometimes in elaborate ways or particular settings [6,32,40,49,56].…”
Section: Related Work (mentioning)
confidence: 99%
“…Additionally, it should be noted that when forming these local explanations, only the function in the leaf node is taken into consideration, even though the path from the root node to this leaf node is not irrelevant and most likely should be considered. However, including the paths in (both local and global) explanations should not be done carelessly since even irreducible DTs can have irrelevant splits [36].…”
Section: Extracting Feature Attributions From the Leaf Nodes (mentioning)
confidence: 99%