2018
DOI: 10.1007/978-3-319-91458-9_48
Unpack Local Model Interpretation for GBDT

Abstract: A gradient boosting decision tree (GBDT), which aggregates a collection of single weak learners (i.e., decision trees), is widely used for data-mining tasks. Because GBDT inherits good performance from its ensemble nature, much attention has been paid to optimizing this model. With its popularization, an increasing need for model interpretation has arisen. Besides the commonly used feature importance as a global interpretation, feature contribution is a local measure that reveals the relationship bet…

Cited by 8 publications (7 citation statements)
References 9 publications
“…The proportion of positive samples among all training samples contained in this node is denoted as r_t^k(y), which can also be considered the probability that a training sample contained in node k belongs to the predicted sample category y. The difference in the proportion of positive samples between a child node and its parent node can be viewed as the node importance of the child node [40][41][42]. The larger the difference, the higher the purity of the samples split into the child node compared to that of the parent node, and thus the higher the importance of the child node for the classification problem.…”
Section: Traditional Random Forest Algorithm
confidence: 99%
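The node-importance measure described in the quoted passage can be sketched in a few lines. This is a minimal illustration, not code from the cited work: the function names and the toy label lists below are assumptions.

```python
# Node importance sketch: a child's importance is the increase in the
# positive-sample proportion relative to its parent node.

def positive_proportion(labels):
    """Fraction of positive (label == 1) samples at a node."""
    return sum(labels) / len(labels) if labels else 0.0

def node_importance(parent_labels, child_labels):
    """Difference in positive-sample proportion between child and parent."""
    return positive_proportion(child_labels) - positive_proportion(parent_labels)

parent = [1, 0, 1, 0, 0, 1]   # 50% positive at the parent
left_child = [1, 1, 1, 0]     # 75% positive after the split
print(node_importance(parent, left_child))  # 0.25
```

A larger value means the split produced a purer child node, matching the passage's reading that such a child matters more for the classification problem.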
“…GBDT is a common choice in machine learning tasks. Besides high performance and efficiency, GBDT and its variants also provide model interpretability [47] and ease of parameter tuning. The most direct transfer is to first train a model on the source dataset.…”
Section: B Model-based Transfer
confidence: 99%
“…The interpretability of the boosted tree model at both the global and local level has been shown in [3]. In our work, since the whole model of each task consists of a common part and a specific part, we collect them all to get the overall importance of each feature.…”
Section: Interpretability
confidence: 99%
“…In our work, since the whole model of each task consists of a common part and a specific part, we collect them all to get the overall importance of each feature. For each instance, the contribution of each feature to the final prediction can be calculated with the method in [3]. An example of the top 20 important features in task2 of Scene1 is shown in Figure 2.…”
Section: Interpretability
confidence: 99%
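The per-instance feature-contribution idea referenced above can be sketched as follows: walk the instance's decision path in a tree and attribute each split's change in the node's mean prediction to the splitting feature. The tree encoding and helper names here are hypothetical illustrations, not the actual implementation of the method in [3].

```python
# Sketch: local feature contributions for one instance in one decision tree.
# Internal node: {"feature": i, "threshold": t, "value": mean_pred, "left": ..., "right": ...}
# Leaf node:     {"value": mean_pred}

def feature_contributions(tree, x):
    """Return {feature_index: contribution} for instance x in one tree."""
    contrib = {}
    node = tree
    while "feature" in node:  # descend until a leaf
        child = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
        delta = child["value"] - node["value"]  # prediction change caused by this split
        contrib[node["feature"]] = contrib.get(node["feature"], 0.0) + delta
        node = child
    return contrib

toy_tree = {
    "feature": 0, "threshold": 0.5, "value": 0.4,
    "left":  {"value": 0.1},
    "right": {"feature": 1, "threshold": 2.0, "value": 0.7,
              "left":  {"value": 0.6},
              "right": {"value": 0.9}},
}
print(feature_contributions(toy_tree, [0.8, 3.0]))  # contributions of features 0 and 1
```

Summing a feature's contributions across all trees in the ensemble, plus the baseline prediction, recovers the final prediction, which is what makes this a local (per-instance) decomposition rather than a global importance score.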